JNCIS-SEC – Juniper SRX100 Cluster Configuration

Started by SimonV, March 01, 2016, 09:01:33 PM


In this post I will go through the basics of cluster configuration on the SRX. I still have a couple of SRX100s laying around, which is perfect to cover the clustering topics of the JNCIS-SEC blueprint!


Before you start configuring the cluster, always verify that both your boxes are on the same software version.



root@FW01A> show system software
Information for junos:

Comment:
JUNOS Software Release [12.1X44-D40.2]

Physical Wiring


Here is how the cluster will be cabled up. Because there’s a lot to remember during the configuration, it’s best to make this sort of diagram before you begin.
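
In text form, the cabling used throughout this post (node1's ports show up as fe-1/0/x once the cluster forms):

    fxp1 (control):  FW01A fe-0/0/7  <-->  FW01B fe-0/0/7
    fab (fabric):    FW01A fe-0/0/5  <-->  FW01B fe-0/0/5
    fxp0 (mgmt):     FW01A fe-0/0/6,       FW01B fe-0/0/6
    reth0 (untrust): FW01A fe-0/0/0  and   FW01B fe-0/0/0 (fe-1/0/0)
    reth1 (trust):   FW01A fe-0/0/1  and   FW01B fe-0/0/1 (fe-1/0/1)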



Connect the fxp1 and Fab ports


The control link (fxp1) is used to synchronize the configuration and to perform cluster health checks by exchanging heartbeat messages. The physical port location depends on the SRX model, and is also configurable on the high-end models. In my case, on the branch SRX100B, the fe-0/0/7 interfaces are predetermined as fxp1.


The fab interface is used to exchange all the session state information between both devices. This provides stateful failover if anything happens to the primary cluster node. You can choose which interface to assign; I will use fe-0/0/5 so all the first ports stay available.


Setting the Cluster-ID and Node ID


First, wipe all the old configuration and put both devices in cluster mode (a minimal wipe sketch follows the list below). Some terminology:



  • The cluster ID ranges from 1 to 15 and uniquely identifies the cluster if you have multiple clusters across the network. I will use cluster ID 1.

  • The node ID identifies the members within the cluster. A cluster can only ever have two members, so the options are 0 and 1.
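
A minimal sketch of the wipe, assuming console access (request system zeroize is a more drastic alternative that also erases logs before rebooting):

root@FW01A> configure
root@FW01A# load factory-default
root@FW01A# set system root-authentication plain-text-password
New password:
Retype new password:
root@FW01A# commit and-quit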


The commands below are entered in operational mode:



root@FW01A> set chassis cluster cluster-id 1 node 0 reboot
Successfully enabled chassis cluster. Going to reboot now


root@FW01B> set chassis cluster cluster-id 1 node 1 reboot
Successfully enabled chassis cluster. Going to reboot now

Pay attention when you enter the commands above, and make sure you are actually enabling the cluster rather than disabling it. Disabling would return the following message:



Successfully disabled chassis cluster. Going to reboot now
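
For reference, the message above comes from the disable variant of the same command, which is how you would break the cluster apart again:

root@FW01A> set chassis cluster disable reboot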

Configuring the management interfaces


Once the devices have restarted we can move on to the configuration part.


To get out-of-band access to your firewalls, you should configure both members with a management IP on the fxp0 interface.

All member-specific configuration is applied under the groups stanza, in the node0 and node1 groups. This is also where the hostnames are configured.



{primary:node0}[edit groups]
root@FW01A# show
node0 {
    system {
        host-name FW01A;
    }
    interfaces {
        fxp0 {
            unit 0 {
                family inet {
                    address 192.168.1.1/24;
                }
            }
        }
    }
}
node1 {
    system {
        host-name FW01B;
    }
    interfaces {
        fxp0 {
            unit 0 {
                family inet {
                    address 192.168.1.2/24;
                }
            }
        }
    }
}

On the SRX100B, the fxp0 interface is automatically mapped to the fe-0/0/6 interface. Be sure to check the documentation for your specific model.


Apply Group


Before committing, don’t forget to include the command below. It ensures that node-specific configuration is only applied to that particular node.



{primary:node0}[edit]
root@FW01A# set apply-groups "${node}"
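
Once apply-groups is in place, you can check that node-specific values are really being inherited with the display inheritance pipe; the exact annotations vary by release, but the output looks roughly like this:

{primary:node0}[edit]
root@FW01A# show system host-name | display inheritance
##
## 'FW01A' was inherited from group 'node0'
##
host-name FW01A;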

Configuring the fabric interface


The next step is to configure your fabric links, which are used to exchange the session state. Node0 has the fab0 interface and Node1 has the fab1 interface.



{primary:node0}[edit interfaces]
root@FW01A# show
fab0 {
    fabric-options {
        member-interfaces {
            fe-0/0/5;
        }
    }
}
fab1 {
    fabric-options {
        member-interfaces {
            fe-1/0/5;
        }
    }
}

After a commit, we can see both the control and fabric links are up.



root@FW01A# run show chassis cluster interfaces
Control link status: Up

Control interfaces:
    Index   Interface        Status
    0       fxp1             Up

Fabric link status: Up

Fabric interfaces:
    Name    Child-interface    Status
                               (Physical/Monitored)
    fab0    fe-0/0/5           Up   / Up
    fab0
    fab1    fe-1/0/5           Up   / Up
    fab1

Redundant-pseudo-interface Information:
    Name         Status      Redundancy-group
    lo0          Up          0
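
Besides the interface status, you can confirm that heartbeats are actually being exchanged over the control and fabric links – the counters in this output should keep incrementing on both nodes:

root@FW01A# run show chassis cluster control-plane statistics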

Configuring the Redundancy Groups


The redundancy group is where you configure the cluster’s failover properties relating to a collection of interfaces or other objects. RG0 is configured by default when you activate the cluster, and manages the redundancy for the routing engines. Let’s create a new RG 1 for our interfaces.



{secondary:node0}[edit chassis]
root@FW01A# show
cluster {
    redundancy-group 0 {
        node 0 priority 100;
        node 1 priority 1;
    }
    redundancy-group 1 {
        node 0 priority 100;
        node 1 priority 1;
    }
}

Configuring Redundant Ethernet interfaces


The reth interfaces are bundles of physical ports across both cluster members. The child interfaces inherit their configuration from the overlying reth interface – think of it as similar to an 802.3ad EtherChannel. In fact, you can bundle more than one physical port on each node (see the sketch after the child-port configuration below).



{secondary:node0}[edit chassis cluster]
root@FW01A# set reth-count 2

After entering this command, you can do a quick commit, which makes the reth interfaces visible in the output of show interfaces terse.



root@FW01A# run show interfaces terse | match reth
reth0                   up    down
reth1                   up    down

Now you can configure the reth interfaces as you would any other interface: give them an IP address and assign them to the redundancy group. In the terse output above they still show link down, because no physical child interfaces have been attached yet.


reth0 is our outside interface, and reth1 is the inside.



{secondary:node0}[edit interfaces]
root@FW01A# show reth0
redundant-ether-options {
    redundancy-group 1;
}
unit 0 {
    family inet {
        address 1.1.1.1/24;
    }
}


{secondary:node0}[edit interfaces]
root@FW01A# show reth1
redundant-ether-options {
    redundancy-group 1;
}
unit 0 {
    family inet {
        address 10.0.0.1/24;
    }
}

With our reths configured, we can add the physical ports: fe-0/0/0 (node0) and fe-1/0/0 (node1) will join reth0, while fe-0/0/1 and fe-1/0/1 will join reth1.



{secondary:node0}[edit interfaces]
root@FW01A# show
fe-0/0/0 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-0/0/1 {
    fastether-options {
        redundant-parent reth1;
    }
}
fe-1/0/0 {
    fastether-options {
        redundant-parent reth0;
    }
}
fe-1/0/1 {
    fastether-options {
        redundant-parent reth1;
    }
}
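
As a hypothetical sketch – assuming spare ports fe-0/0/2 and fe-1/0/2, and a platform and release that support multiple child links per node – a second port per node would simply get the same redundant-parent:

{secondary:node0}[edit interfaces]
root@FW01A# set fe-0/0/2 fastether-options redundant-parent reth1
root@FW01A# set fe-1/0/2 fastether-options redundant-parent reth1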

Interface monitoring


We can use interface monitoring to subtract a configured weight from the redundancy group's failover threshold of 255 when a monitored link goes physically down. Once the accumulated weight of failed links reaches the threshold, the node's priority for that group drops to zero and a failover is triggered.


For example, node0 is primary for RG1 with a priority of 100. With an interface-monitor weight of 255, a single link-down event reaches the threshold, drops the priority to zero and triggers the failover. Note the preempt statement, which lets the original primary reclaim the group once its priority recovers, and gratuitous-arp-count, which sets how many gratuitous ARPs are sent on failover so neighbours update their MAC tables. Configuration is applied under the redundancy groups:



{primary:node0}[edit chassis cluster]
root@FW01A# show
reth-count 2;
redundancy-group 0 {
    node 0 priority 100;
    node 1 priority 1;
}
redundancy-group 1 {
    node 0 priority 100;
    node 1 priority 1;
    preempt;
    gratuitous-arp-count 5;
    interface-monitor {
        fe-0/0/0 weight 255;
        fe-0/0/1 weight 255;
        fe-1/0/0 weight 255;
        fe-1/0/1 weight 255;
    }
}

Finally, we add the interfaces to security zones.



root@FW01A# show security zones
security-zone untrust {
    interfaces {
        reth0.0;
    }
}
security-zone trust {
    host-inbound-traffic {
        system-services {
            ping;
        }
    }
    interfaces {
        reth1.0;
    }
}

Verification


After cabling it up, we can verify that the cluster is fully operational.



root@FW01A> show chassis cluster status
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 0
    node0                   100         primary        no       no
    node1                   1           secondary      no       no

Redundancy group: 1 , Failover count: 0
    node0                   100         primary        yes      no
    node1                   1           secondary      yes      no


root@FW01A> show chassis cluster interfaces
Control link status: Up

Control interfaces:
    Index   Interface        Status
    0       fxp1             Up

Fabric link status: Up

Fabric interfaces:
    Name    Child-interface    Status
                               (Physical/Monitored)
    fab0    fe-0/0/5           Up   / Up
    fab0
    fab1    fe-1/0/5           Up   / Up
    fab1

Redundant-ethernet Information:
    Name         Status      Redundancy-group
    reth0        Up          1
    reth1        Up          1

Redundant-pseudo-interface Information:
    Name         Status      Redundancy-group
    lo0          Up          0

Interface Monitoring:
    Interface         Weight    Status    Redundancy-group
    fe-1/0/1          255       Up        1
    fe-1/0/0          255       Up        1
    fe-0/0/1          255       Up        1
    fe-0/0/0          255       Up        1

Let’s see how long the failover takes by unplugging one of the links in the trust zone.


Pinging from the inside switch to the reth1 address 10.0.0.1



theswitch#ping 10.0.0.1 repeat 10000

Type escape sequence to abort.
Sending 10000, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Excellent – we only lost one ping while failing over to node1, which corresponds to the 2-second timeout.


We can see that node1 is now the primary for redundancy group 1, which holds our interfaces:



root@FW01A> show chassis cluster status redundancy-group 1
Cluster ID: 1
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 1 , Failover count: 3
    node0                   0           secondary      yes      no
    node1                   1           primary        yes      no

{primary:node0}
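
Incidentally, you don't have to pull cables to test this: a failover can also be forced (and later cleared) from the CLI, in which case the Manual failover column above would show yes:

root@FW01A> request chassis cluster failover redundancy-group 1 node 1
root@FW01A> request chassis cluster failover reset redundancy-group 1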

In the next article, I will dive a bit deeper and integrate the SRX cluster in a real-world topology.


Source: JNCIS-SEC – Juniper SRX100 Cluster Configuration