Thursday, October 11, 2018

Adding static ip route when interface is down or Network is unreachable

Adding static route on interface fails when interface is down


root@r-118-JAY:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 0e:00:a9:fe:00:84 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.132/16 brd 169.254.255.255 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 1e:00:7c:00:00:11 brd ff:ff:ff:ff:ff:ff
    inet 10.147.46.105/24 brd 10.147.46.255 scope global eth1
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 1e:00:7c:00:00:22 brd ff:ff:ff:ff:ff:ff
    inet 10.147.52.202/24 brd 10.147.52.255 scope global eth2
       valid_lft forever preferred_lft forever
root@r-118-JAY:~#
root@r-118-JAY:~# ip route show table Table_eth2
throw 10.147.46.0/24  proto static
throw 10.147.70.0/24  proto static
throw 10.147.80.0/24
throw 10.147.81.0/24
root@r-118-JAY:~# ip route show table static_route
root@r-118-JAY:~#
root@r-118-JAY:~#  ip route add throw 10.147.81.0/24 table static_route
ip route throw on static_route table is success but below route add fails.
root@r-118-JAY:~# ip route add 10.147.70.0/24 via 10.147.52.1 table static_route
RTNETLINK answers: Network is unreachable


This route add is failing because ip interface eth2 (10.147.52.0/24) subnet is down. Here the route using via 10.147.52.1 which is not reachable now. Due to this the route add is failing.
 
root@r-118-JAY:~#
root@r-118-JAY:~# ip route show table static_route
throw 10.147.81.0/24
root@r-118-JAY:~#
 

Wednesday, October 10, 2018

keepalived configuration


global_defs {
    router_id r-17-VM
}

vrrp_script heartbeat {
    script "/ramdisk/rrouter/heartbeat.sh"
    interval 5
}

vrrp_instance inside_network {
    state BACKUP
    interface eth0
    virtual_router_id 51
    nopreempt

    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 9YasrrhZ
    }

    virtual_ipaddress {
        10.1.1.1 brd 10.1.1.255 dev eth0
    }

    track_script {
        heartbeat
    }

    !That's the correct path of the master.py file.
    notify_backup "/opt/cloud/bin/master.py --backup"
    notify_master "/opt/cloud/bin/master.py --master"
    notify_fault "/opt/cloud/bin/master.py --fault"
}root@r-17-VM:~#




conntrackd.conf:

Sync {
    Mode FTFW {
        #
        # Size of the resend queue (in objects). This is the maximum
        # number of objects that can be stored waiting to be confirmed
        # via acknoledgment. If you keep this value low, the daemon
        # will have less chances to recover state-changes under message
        # omission. On the other hand, if you keep this value high,
        # the daemon will consume more memory to store dead objects.
        # Default is 131072 objects.
        #
        # ResendQueueSize 131072

        #
        # This parameter allows you to set an initial fixed timeout
        # for the committed entries when this node goes from backup
        # to primary. This mechanism provides a way to purge entries
        # that were not recovered appropriately after the specified
        # fixed timeout. If you set a low value, TCP entries in
        # Established states with no traffic may hang. For example,
        # an SSH connection without KeepAlive enabled. If not set,
        # the daemon uses an approximate timeout value calculation
        # mechanism. By default, this option is not set.
        #
        # CommitTimeout 180

        #
        # If the firewall replica goes from primary to backup,
        # the conntrackd -t command is invoked in the script.
        # This command schedules a flush of the table in N seconds.
        # This is useful to purge the connection tracking table of
        # zombie entries and avoid clashes with old entries if you
        # trigger several consecutive hand-overs. Default is 60 seconds.
        #
        # PurgeTimeout 60

        # Set the acknowledgement window size. If you decrease this
        # value, the number of acknowlegdments increases. More
        # acknowledgments means more overhead as conntrackd has to
        # handle more control messages. On the other hand, if you
        # increase this value, the resend queue gets more populated.
        # This results in more overhead in the queue releasing.
        # The following value is based on some practical experiments
        # measuring the cycles spent by the acknowledgment handling
        # with oprofile. If not set, default window size is 300.
        #
        # ACKWindowSize 300

        #
        # This clause allows you to disable the external cache. Thus,
        # the state entries are directly injected into the kernel
        # conntrack table. As a result, you save memory in user-space
        # but you consume slots in the kernel conntrack table for
        # backup state entries. Moreover, disabling the external cache
        # means more CPU consumption. You need a Linux kernel
        # >= 2.6.29 to use this feature. By default, this clause is
        # set off. If you are installing conntrackd for first time,
        # please read the user manual and I encourage you to consider
        # using the fail-over scripts instead of enabling this option!
        #
        # DisableExternalCache Off
    }

    #
    # Multicast IP and interface where messages are
    # broadcasted (dedicated link). IMPORTANT: Make sure
    # that iptables accepts traffic for destination
    # 225.0.0.50, eg:
    #
    #    iptables -I INPUT -d 225.0.0.50 -j ACCEPT
    #    iptables -I OUTPUT -d 225.0.0.50 -j ACCEPT
    #
    Multicast {
IPv4_address 225.0.0.50
Group 3780
IPv4_interface 10.1.1.197
Interface eth0
SndSocketBuffer 1249280
RcvSocketBuffer 1249280
Checksum on
    }
    #
    # You can specify more than one dedicated link. Thus, if one dedicated
    # link fails, conntrackd can fail-over to another. Note that adding
    # more than one dedicated link does not mean that state-updates will
    # be sent to all of them. There is only one active dedicated link at
    # a given moment. The `Default' keyword indicates that this interface
    # will be selected as the initial dedicated link. You can have
    # up to 4 redundant dedicated links. Note: Use different multicast
    # groups for every redundant link.
    #
    # Multicast Default {
    #    IPv4_address 225.0.0.51
    #    Group 3781
    #    IPv4_interface 192.168.100.101
    #    Interface eth3
    #    # SndSocketBuffer 1249280
    #    # RcvSocketBuffer 1249280
    #    Checksum on
    # }

    #
    # You can use Unicast UDP instead of Multicast to propagate events.
    # Note that you cannot use unicast UDP and Multicast at the same
    # time, you can only select one.
    #
    # UDP {
        #
        # UDP address that this firewall uses to listen to events.
        #
        # IPv4_address 192.168.2.100
        #
        # or you may want to use an IPv6 address:
        #
        # IPv6_address fe80::215:58ff:fe28:5a27

        #
        # Destination UDP address that receives events, ie. the other
        # firewall's dedicated link address.
        #
        # IPv4_Destination_Address 192.168.2.101
        #
        # or you may want to use an IPv6 address:
        #
        # IPv6_Destination_Address fe80::2d0:59ff:fe2a:775c

        #
        # UDP port used
        #
        # Port 3780

        #
        # The name of the interface that you are going to use to
        # send the synchronization messages.
        #
        # Interface eth2

        #
        # The sender socket buffer size
        #
        # SndSocketBuffer 1249280

        #
        # The receiver socket buffer size
        #
        # RcvSocketBuffer 1249280

        #
        # Enable/Disable message checksumming.
        #
        # Checksum on
    # }

}

#
# General settings
#
General {
    #
    # Set the nice value of the daemon, this value goes from -20
    # (most favorable scheduling) to 19 (least favorable). Using a
    # very low value reduces the chances to lose state-change events.
    # Default is 0 but this example file sets it to most favourable
    # scheduling as this is generally a good idea. See man nice(1) for
    # more information.
    #
    Nice -20

    #
    # Select a different scheduler for the daemon, you can select between
    # RR and FIFO and the process priority (minimum is 0, maximum is 99).
    # See man sched_setscheduler(2) for more information. Using a RT
    # scheduler reduces the chances to overrun the Netlink buffer.
    #
    # Scheduler {
    #    Type FIFO
    #    Priority 99
    # }

    #
    # Number of buckets in the cache hashtable. The bigger it is,
    # the closer it gets to O(1) at the cost of consuming more memory.
    # Read some documents about tuning hashtables for further reference.
    #
    HashSize 32768

    #
    # Maximum number of conntracks, it should be double of:
    # $ cat /proc/sys/net/netfilter/nf_conntrack_max
    # since the daemon may keep some dead entries cached for possible
    # retransmission during state synchronization.
    #
    HashLimit 131072

    #
    # Logfile: on (/var/log/conntrackd.log), off, or a filename
    # Default: off
    #
    LogFile on

    #
    # Syslog: on, off or a facility name (daemon (default) or local0..7)
    # Default: off
    #
    #Syslog on

    #
    # Lockfile
    #
    LockFile /var/lock/conntrack.lock

    #
    # Unix socket configuration
    #
    UNIX {
        Path /var/run/conntrackd.ctl
        Backlog 20
    }

    #
    # Netlink event socket buffer size. If you do not specify this clause,
    # the default buffer size value in /proc/net/core/rmem_default is
    # used. This default value is usually around 100 Kbytes which is
    # fairly small for busy firewalls. This leads to event message dropping
    # and high CPU consumption. This example configuration file sets the
    # size to 2 MBytes to avoid this sort of problems.
    #
    NetlinkBufferSize 2097152

    #
    # The daemon doubles the size of the netlink event socket buffer size
    # if it detects netlink event message dropping. This clause sets the
    # maximum buffer size growth that can be reached. This example file
    # sets the size to 8 MBytes.
    #
    NetlinkBufferSizeMaxGrowth 8388608

    #
    # If the daemon detects that Netlink is dropping state-change events,
    # it automatically schedules a resynchronization against the Kernel
    # after 30 seconds (default value). Resynchronizations are expensive
    # in terms of CPU consumption since the daemon has to get the full
    # kernel state-table and purge state-entries that do not exist anymore.
    # Be careful of setting a very small value here. You have the following
    # choices: On (enabled, use default 30 seconds value), Off (disabled)
    # or Value (in seconds, to set a specific amount of time). If not
    # specified, the daemon assumes that this option is enabled.
    #
    # NetlinkOverrunResync On

    #
    # If you want reliable event reporting over Netlink, set on this
    # option. If you set on this clause, it is a good idea to set off
    # NetlinkOverrunResync. This option is off by default and you need
    # a Linux kernel >= 2.6.31.
    #
    # NetlinkEventsReliable Off

    #
    # By default, the daemon receives state updates following an
    # event-driven model. You can modify this behaviour by switching to
    # polling mode with the PollSecs clause. This clause tells conntrackd
    # to dump the states in the kernel every N seconds. With regards to
    # synchronization mode, the polling mode can only guarantee that
    # long-lifetime states are recovered. The main advantage of this method
    # is the reduction in the state replication at the cost of reducing the
    # chances of recovering connections.
    #
    # PollSecs 15

    #
    # The daemon prioritizes the handling of state-change events coming
    # from the core. With this clause, you can set the maximum number of
    # state-change events (those coming from kernel-space) that the daemon
    # will handle after which it will handle other events coming from the
    # network or userspace. A low value improves interactivity (in terms of
    # real-time behaviour) at the cost of extra CPU consumption.
    # Default (if not set) is 100.
    #
    # EventIterationLimit 100

    #
    # Event filtering: This clause allows you to filter certain traffic,
    # There are currently three filter-sets: Protocol, Address and
    # State. The filter is attached to an action that can be: Accept or
    # Ignore. Thus, you can define the event filtering policy of the
    # filter-sets in positive or negative logic depending on your needs.
    # You can select if conntrackd filters the event messages from
    # user-space or kernel-space. The kernel-space event filtering
    # saves some CPU cycles by avoiding the copy of the event message
    # from kernel-space to user-space. The kernel-space event filtering
    # is prefered, however, you require a Linux kernel >= 2.6.29 to
    # filter from kernel-space. If you want to select kernel-space
    # event filtering, use the keyword 'Kernelspace' instead of
    # 'Userspace'.
    #
    Filter From Userspace {
        #
        # Accept only certain protocols: You may want to replicate
        # the state of flows depending on their layer 4 protocol.
        #
        Protocol Accept {
            TCP
            SCTP
            DCCP
            # UDP
            # ICMP # This requires a Linux kernel >= 2.6.31
        }

        #
        # Ignore traffic for a certain set of IP's: Usually all the
        # IP assigned to the firewall since local traffic must be
        # ignored, only forwarded connections are worth to replicate.
        # Note that these values depends on the local IPs that are
        # assigned to the firewall.
        #
        Address Ignore {
            IPv4_address 127.0.0.1
            IPv4_address 169.254.0.4
        }

        #
        # Uncomment this line below if you want to filter by flow state.
        # This option introduces a trade-off in the replication: it
        # reduces CPU consumption at the cost of having lazy backup
        # firewall replicas. The existing TCP states are: SYN_SENT,
        # SYN_RECV, ESTABLISHED, FIN_WAIT, CLOSE_WAIT, LAST_ACK,
        # TIME_WAIT, CLOSED, LISTEN.
        #
        # State Accept {
        #    ESTABLISHED CLOSED TIME_WAIT CLOSE_WAIT for TCP
        # }
    }
}
root@r-17-VM:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:63:d3:00:05 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.197/24 brd 10.1.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.1.1.1/32 brd 10.1.1.255 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 0e:00:a9:fe:00:04 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.4/16 brd 169.254.255.255 scope global eth1
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 1e:00:3d:00:00:10 brd ff:ff:ff:ff:ff:ff
    inet 10.147.48.42/24 brd 10.147.48.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet 10.147.48.45/24 brd 10.147.48.255 scope global secondary eth2
       valid_lft forever preferred_lft forever
root@r-17-VM:~#

Why both VRRP instances both become backup or master

1. When both Master/primay and Backup become master

One of the reason for both VRRP instances become master is
A. There is no reachability (connection issue) between the two. When there is no reachability both instances send VRRP packets and assume master as themselves.


B. The auth type protocol mismatch in VRRP instances in keepalived.conf

When there is auth_type mismatch between the two then both VRRP instances does not understand VRRP packets themselves and both become master.




2. When VRRP instance become fault
If keepavlied service stop on the instance.

Stateful firewall conntrack states transistion

This blog shows how to check the conntrack state changes when a connection is made or connection got dropped.

Using conntrack command we can states update information.

Example conntrack command to show connection states.
Conntrack command example filter:
# conntrack -E -p tcp --dport 33

1. First ssh connection request sent to firewall public ip address.

    [NEW] tcp      6 120 SYN_SENT src=10.233.89.39 dst=10.147.46.107 sport=64887 dport=33 [UNREPLIED] src=10.1.1.44 dst=10.233.89.39 sport=22 dport=64887 mark=2
 [UPDATE] tcp      6 60 SYN_RECV src=10.233.89.39 dst=10.147.46.107 sport=64887 dport=33 src=10.1.1.44 dst=10.233.89.39 sport=22 dport=64887 mark=2
 [UPDATE] tcp      6 432000 ESTABLISHED src=10.233.89.39 dst=10.147.46.107 sport=64887 dport=33 src=10.1.1.44 dst=10.233.89.39 sport=22 dport=64887 [ASSURED] mark=2


 2. ssh session got closed.
[UPDATE] tcp      6 120 FIN_WAIT src=10.233.89.39 dst=10.147.46.107 sport=64887 dport=33 src=10.1.1.44 dst=10.233.89.39 sport=22 dport=64887 [ASSURED] mark=2
 [UPDATE] tcp      6 30 LAST_ACK src=10.233.89.39 dst=10.147.46.107 sport=64887 dport=33 src=10.1.1.44 dst=10.233.89.39 sport=22 dport=64887 [ASSURED] mark=2
 [UPDATE] tcp      6 120 TIME_WAIT src=10.233.89.39 dst=10.147.46.107 sport=64887 dport=33 src=10.1.1.44 dst=10.233.89.39 sport=22 dport=64887 [ASSURED] mark=2