Let’s assume you have an error on your primary (master) unit: a service has failed and, for example, you can’t create object groups. What do you do then? The first thing that comes to mind is a reboot, correct?
But how do you do that with no, or at least minimal, interruption in a production environment?
For example, FortiOS prior to 6.0.7 has a known bug, ID 567487 (CPU usage goes to 100% when modifying members of an addrgrp object).
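If you are not sure whether a given firmware string counts as "prior to 6.0.7", a quick comparison like the sketch below can help. It is only an illustration: the function names are made up for this post, it compares the numbers literally, and it knows nothing about which FortiOS branches actually carry bug 567487, so treat the result as a hint and check the release notes.

```python
# Minimal sketch: compare a dotted FortiOS version against 6.0.7.
# The names below are hypothetical helpers, not part of any Fortinet tool.

FIXED_IN = (6, 0, 7)  # the post says versions prior to 6.0.7 are affected


def parse_version(text):
    """Turn a dotted version such as '6.0.5' into a tuple of ints: (6, 0, 5)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_prior_to_fix(version):
    """Return True if the version sorts before 6.0.7 (numeric comparison only)."""
    return parse_version(version) < FIXED_IN


if __name__ == "__main__":
    for candidate in ("6.0.5", "6.0.7", "6.0.10"):
        print(candidate, is_prior_to_fix(candidate))
```

Comparing tuples of integers rather than raw strings matters here: as plain text, "6.0.10" would sort before "6.0.7".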
Step 1: Initial assumptions
Assuming you have an active-passive (A-P) cluster made of 2 FortiGate firewalls (running FortiOS prior to 6.0.7).
Assuming you have dedicated management IP addresses for each firewall.
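Assuming you have “set override enable” configured on the HA cluster. Note that with override enabled, FortiOS compares device priority before uptime when electing the master, so the reset-uptime failover used in Step 2 relies on both units having the same priority.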
Assuming you had an error on the master firewall that prevents you from making some changes (e.g. creating a new firewall address group).
Step 2: Getting the job done
=== This should be done in a test environment first. I can’t be held responsible if something breaks. ===
Log on using SSH or the console on each firewall first (not on the cluster VIP address, but on each firewall’s management address), from a management machine in the same management L2 subnet.
Log into the web management interface and check on the Dashboard that the HA cluster is synchronized before running steps 1 to 4 below.
FW1 - the current Master
FW2 - the current Slave
You can double-check which firewall is the master and which is the slave by running ‘get system ha status’. Then note the serial number (SN) of each firewall.
1. Run ‘execute reboot’ on FW2 to reload the firewall. Press Y to confirm, then wait for it to come back online.
2. On FW1 run ‘diagnose sys ha reset-uptime’ (this fails the traffic over to the slave, FW2, which becomes master).
3. Run ‘execute reboot’ on FW1 to reload the firewall, and wait for it to come back online.
4. On FW2 run ‘diagnose sys ha reset-uptime’ (this fails the traffic back over to FW1, so FW1 regains its previous role of master).
Now you should have both master and slave up and running.
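To keep the order straight during a change window, the four steps can be written down as a small checklist generator. This is just a sketch: the Step type and the rolling_reboot_plan function are names invented for this post, the commands are the ones quoted above, and the script only prints the plan. It does not connect to any firewall.

```python
# Minimal sketch: print the rolling-reboot order for an A-P pair.
# Step and rolling_reboot_plan are hypothetical names used for illustration.
from typing import List, NamedTuple


class Step(NamedTuple):
    host: str      # which firewall to run the command on
    command: str   # FortiOS CLI command, as quoted in the post
    note: str      # what to check before moving on


def rolling_reboot_plan(master: str = "FW1", slave: str = "FW2") -> List[Step]:
    """Return the four steps in order: slave first, then the master."""
    return [
        Step(slave, "execute reboot",
             "press Y, wait until it is back online"),
        Step(master, "diagnose sys ha reset-uptime",
             f"traffic should fail over to {slave}"),
        Step(master, "execute reboot",
             "press Y, wait until it is back online"),
        Step(slave, "diagnose sys ha reset-uptime",
             f"traffic should fail back to {master}"),
    ]


if __name__ == "__main__":
    for number, step in enumerate(rolling_reboot_plan(), start=1):
        print(f"{number}. [{step.host}] {step.command}  ({step.note})")
```

If your units have different names, pass them in, for example rolling_reboot_plan(master="fw-a", slave="fw-b"). Confirm the roles with ‘get system ha status’ first, since the plan is only right if you tell it which unit is currently master.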