Dylan's Main Page

Redundant Bridging Firewall on OpenBSD

Summary for the A.D.H.D. crowd


This is my first time writing a paper that tries to help other people do something. As such, I expect it is riddled with flaws. If you have any feedback that could improve this document, please send it to me at:



This document summarizes what I've learned setting up redundant bridging firewalls on OpenBSD 3.5.

Big Frickin Disclaimer

While this is working and has been for over a year, I can't guarantee that I know what I'm doing. There might be a better way to do this, and this way might actually suck. Make sure you understand the implications of what you're doing before trying to follow these directions.

What you should already know

I assume you can set up a bridging firewall on OpenBSD and you just want to know about the redundancy stuff. You should also know how to wobble around in Cisco IOS. If you don't know how to set up a bridging firewall on OpenBSD this document might help you decide if a redundant bridging firewall is right for you. Or it might not.

What the heck is a Redundant Bridging Firewall on OpenBSD?

The firewall is the thing in front of your accelerator pedal...

OK, a bridging firewall is different from a normal firewall or routing firewall in that it hangs out at the data link layer of the OSI seven layer burrito while a routing firewall hangs out at the network layer. The bridge thinks in terms of Ethernet frames (if you use Ethernet) and the router thinks in terms of IP packets.

I like the bridging firewall because you don't need to understand how it works to bypass it. If I or one of my co-workers need to, we can just unplug the cables to bypass it. This is handy for a number of reasons. It's easy to take the firewall out of the picture if we're diagnosing network problems. My co-workers can bypass the firewall while I'm away. If I need to do maintenance that affects the firewall I can do so without affecting uptime (though it does affect security).

Since I'd rather not leave my network unprotected, and I'd rather not have the 30 seconds of interruption caused by me dancing between switches with Ethernet cables in my teeth when I bypass the bridging firewall for maintenance, I want a redundant firewall.

A redundant firewall is two firewalls. One is happily filtering packets while the other is waiting as a backup. If the one dies, the second quickly and automatically takes over.

I got my system to automatically fail over by using spanning tree (STP). Spanning tree can get funky (or so I hear) when it's trying to talk to too many entities, but this system only uses 4, so it's pretty stable. If you're in an environment that uses a lot of STP and you're thinking of setting up a firewall like mine, I recommend isolating these STP entities from the rest of your STP network. (Cisco's Root Guard feature is good for that.)

OpenBSD 3.5 is the operating system I'm doing this with. If you want to use a different OS, make sure it has a way to share connection state information (pfsync in OpenBSD.)

It's important to note that this does not provide twice the throughput of a single bridging firewall. One of the firewalls is totally idle while the other is working. It may be possible to make them work in parallel (in fact, I think it is), but I haven't looked into it because more throughput isn't a priority for me.

Another really important thing to note is that it only fails over when the data link layer has problems (e.g. you unplug a cable or the bridge crashes completely). If there's a problem at the network layer (e.g. you load a bunch of bad firewall rules), the non-root switch won't know, and won't do the failover dance.

A Quick Note About CARP

If you're an OpenBSD user who knows what carp is, and you're all excited about using it on your bridging firewall, I'm sorry, but you don't get to.

I used Spanning Tree

Building It

What it looks like


I have two Cisco Catalyst 2950 switches running IOS 12.1 and two PCs running OpenBSD 3.5. The PCs each have three nics. Essentially, the Internet goes into one switch, which connects to both firewalls, which both connect to the other switch, which is connected to my protected network. The two firewalls are connected to each other with a crossover cable.

How it works

It works thanks to the magic of spanning tree and pfsync. Spanning tree knows when Ethernet goes down and switches to alternative paths when necessary. Spanning tree has a reputation for getting weird in environments with lots of entities, but luckily, this system only requires 4, which seems to be stable. Pfsync keeps the state tables of the two firewalls in sync. As long as the state tables are in sync and the bridge's mac address table isn't full of wonky addresses, the firewalls don't mind if the traffic switches between the two.

How to set it up

Get spanning tree working without any bridges
[Diagram: direct redundant cables]

Set up a test environment with two switches and a few computers plugged into each. The computers plugged into one switch represent the Internet, and the computers plugged into the other represent your own network. Strangely, they're all on the same IP subnet. You just don't trust half of them. That's fine.

Here are links to the Cisco docs I found useful

Connect the two switches together with two Xover cables.

Get consoles onto the two switches

Decide which switch should be the root switch. There are two big things to think about when you decide which switch should be the root switch.

The bridges will always be connected to the root switch. The non-root switch will block its connection to one of the two bridges to prevent a loop. If you want to be able to SSH to the inactive bridge and that bridge has its IP address on one of the bridging interfaces (that's what I have) you'll have to SSH through the root switch. If your firewall rules don't let SSH connections to your bridges from outside your bridging firewalls, your root switch had better be behind your firewall.

If you want to change which bridge is active and which is idle without unplugging anyone, you need to change the path cost on the interfaces on the non-root switch. That means you need access to make those kinds of changes on the non-root switch, and you should probably be comfortable changing the configuration on that switch often. This is a good argument against using your company backbone router as your non-root switch. That's everything about deciding on a root switch.

If you're not using Cisco switches with IOS 12.1, you'll need some different documentation for this part. The key is, get spanning tree running so you can unplug the active port and have the alternate port quickly take over. Once that's done, you can go on to the next section. For those of you using similar equipment to mine, here's what you do.

Standard out-of-the-box STP on Cisco switches should work, but a little slowly.

Apply uplinkfast to the non-root switch. This not only makes failover really fast when you unplug from the non-root switch, it also designates this switch as the non-root switch, making the other switch root by default.

Make sure portfast isn't enabled on any of the interfaces that participate in the spanning tree. If it is, it's locked open and will cause loops.

Run 'show spanning-tree' or just 'sh sp' to see how your switch's spanning tree is doing.

One switch should say "This bridge is the root" with both lines in the Role of Desg and the Sts of FWD. This is the STP root, and it's pretty boring.

The other switch should have one line in the Role of Root with the Sts of FWD and the other line should be Altn with the Sts of BLK. This is the switch you play with the most.

When it's correctly configured, it should say 'Uplinkfast enabled' on the non-root switch.

Here's the output from 'sh sp' on my root switch

  Spanning tree enabled protocol ieee                                           
  Root ID    Priority    24577                                               
             Address     0008.a447.6f40   
             This bridge is the root
             Hello Time   2 sec  Max Age 12 sec  Forward Delay  9 sec   
  Bridge ID  Priority    24577  (priority 24576 sys-id-ext 1)
             Address     0008.a447.6f40
             Hello Time   2 sec  Max Age 12 sec  Forward Delay  9 sec
             Aging Time 300                                                     
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/1            Desg FWD 19        128.1    P2p                                
Fa0/5            Desg FWD 19        128.5    P2p            

Here's the output from 'sh sp' on my non-root switch

  Spanning tree enabled protocol ieee
  Root ID    Priority    24577
             Address     0008.a447.6f40
             Cost        3074
             Port        10 (FastEthernet0/10)
             Hello Time   2 sec  Max Age 12 sec  Forward Delay  9 sec

  Bridge ID  Priority    49153  (priority 49152 sys-id-ext 1)
             Address     000d.28dd.90c0
             Hello Time   2 sec  Max Age 12 sec  Forward Delay  9 sec           
             Aging Time 300                                                     
  Uplinkfast enabled                                                            
Interface        Role Sts Cost      Prio.Nbr Type                               
---------------- ---- --- --------- -------- --------------------------------   
Fa0/6            Desg FWD 3019      128.6    P2p                                
Fa0/10           Root FWD 3019      128.10   P2p                                

If you notice your Max Age and Forward Delay are higher than mine, that's because I've optimized mine for a spanning tree network with a diameter of 4. To learn more, read the 'Tuning STP for faster failover' document linked above.

Test by pinging back and forth between the two test networks you've set up. Unplug the active cable (you can see which is active from 'sh sp')

Ping should keep running with a barely noticeable pause.
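
If you want to put a number on that pause, one approach is to send one ping at a time in a loop while you pull the cable and count the longest run of lost replies. The following is only a minimal sketch: the function names are made up for illustration, the target address in the usage comment is a placeholder, and a lost ping can take longer than a second to time out, so the count is in probes rather than seconds.

```shell
#!/bin/sh
# Sketch: measure the longest run of lost pings during a failover test.
# Assumes your ping accepts -c (count); add a timeout flag if yours has one.

# probe HOST: print "ok" or "fail" for a single ping
probe() {
    if ping -c 1 "$1" >/dev/null 2>&1; then echo ok; else echo fail; fi
}

# longest_outage: read ok/fail lines on stdin, print the longest run of "fail"
longest_outage() {
    awk '
        $1 == "fail" { run++; if (run > max) max = run; next }
        { run = 0 }
        END { print max + 0 }
    '
}

# Usage (192.0.2.10 is a placeholder for a machine on the other switch):
#   i=0
#   while [ $i -lt 60 ]; do probe 192.0.2.10; sleep 1; i=$((i+1)); done | longest_outage
```

Start the loop, unplug the active cable partway through, and read the number when it finishes.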

If your switches don't look like this, you should consult someone's spanning tree documents. I found Cisco's very helpful (see the links above)

Insert Bridges without firewalls
[Diagram: bridges w/out xover cable]

Set up two machines as bridges (I assume you know how to do this, remember). Make sure pf is disabled.

Updated 4/15/05

Disable learning on the interior bridge interface of each bridge (keep discover turned on). If you leave learning on, the backup bridge will think every external mac address is connected to the internal interface. Then when your primary bridge fails, the backup will suddenly find all that traffic coming from the external interface and block traffic from those mac addresses.

Updated 4/15/05

You will have to make a static table that tells the bridge what mac addresses connect to the internal interface. Because of the way ethernet works, it's possible to get a temporary situation where the switch confuses one of the firewall machines for any of the other attached machines. The only solution I know of is to manually enter those mac addresses into either the switch or the firewall machines. I think the firewall machines are easier to manage.

To tell your bridge that a particular mac address can be reached through the internal interface, you run a command like this:

# brconfig bridgename static interface macaddress

So if my bridge is bridge0, my internal interface is fxp0 and a computer attached to the internal switch has the mac address aa:bb:cc:dd:ee:ff, that command would look like this:

# brconfig bridge0 static fxp0 aa:bb:cc:dd:ee:ff

The output of brconfig bridge0 should look something like this:

bridge0: flags=41
                priority 32768 hellotime 2 fwddelay 10 maxage 12
                fxp1 flags=b
                        port 3 ifpriority 128 ifcost 55 forwarding
                fxp0 flags=a
                        port 2 ifpriority 128 ifcost 55 forwarding
        Addresses (max cache: 100, timeout: 240):
                aa:bb:cc:dd:ee:ff fxp0 1 flags=1

Entering a lot of mac addresses by hand would suck (I only have to handle about 10), but it should be trivial to write a script to automate this process.
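
Here is a minimal sketch of what such a script could look like. It only prints the brconfig commands so you can review them before running anything. The function name and the input file name are made up for illustration, and the bridge and interface default to the names used in the example above.

```shell
#!/bin/sh
# Sketch: turn a list of MAC addresses into brconfig static commands.
# BRIDGE and INT_IF default to the names used in the example above.
BRIDGE=${BRIDGE:-bridge0}
INT_IF=${INT_IF:-fxp0}

# static_cmds: read one MAC address per line on stdin (blank lines and
# lines starting with # are skipped; anything after the address is
# ignored) and print one brconfig command per address.
static_cmds() {
    while read -r mac rest; do
        case "$mac" in
            ''|'#'*) continue ;;
        esac
        echo "brconfig $BRIDGE static $INT_IF $mac"
    done
}

# Usage (internal-macs.txt is a hypothetical file name):
#   static_cmds < internal-macs.txt        # review the commands
#   static_cmds < internal-macs.txt | sh   # run them as root
```

Keeping the address list in one file means both firewall machines can be fed from the same source.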

The brconfig man page is very helpful.

Enable STP on both interfaces of each bridge.
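
Put together, the bridge settings described so far come down to a handful of brconfig commands. This sketch prints them for one machine. It assumes fxp0 is the internal interface and fxp1 the external one (the names from the brconfig output above), and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the brconfig commands for one bridge machine.
# bridge_cmds BRIDGE INT_IF EXT_IF
bridge_cmds() {
    bridge=$1 int_if=$2 ext_if=$3
    echo "brconfig $bridge add $int_if add $ext_if"
    # no learning on the internal interface; discover stays on (the default)
    echo "brconfig $bridge -learn $int_if"
    # spanning tree on both bridge interfaces
    echo "brconfig $bridge stp $int_if stp $ext_if"
    echo "brconfig $bridge up"
}

bridge_cmds bridge0 fxp0 fxp1
```

Review the output, then pipe it to sh as root on each bridge machine.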

Replace the two Xover cables in the earlier version with two bridges.

By running 'sh sp' on your non-root switch, you should be able to tell which bridge is actively passing traffic and which is not.

Run the ping test from above, but this time unplug the cable between the active bridge and the non-root switch (you have 4 cables to choose from instead of 2.)

Try firewall without shared state information

Install some simple firewall rules, and try the test above again. A simple pf.conf might look like this:

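# set some variables
# (interface names are examples taken from elsewhere in this document;
# substitute your own)
ext_if="fxp1"
int_if="fxp0"
state_if="xl2"
# $one and $two (the two bridge machines) must be defined before this line
bridges="{ $one $two }"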

scrub in all

#block by default
block all

#ignore everything on these interfaces
pass quick on pfsync0 proto pfsync 
pass quick on $state_if  
pass quick on $int_if

# out on ext_if 
pass out on $ext_if keep state
block out on $ext_if inet proto tcp all
pass out on $ext_if inet proto tcp all flags S/SA keep state

# in on ext_if - only allowing ssh
block drop in log on $ext_if all
pass in log on $ext_if inet proto tcp from any to any port 22 \
	flags S/SA keep state

This is the pf.conf I used in my test lab when I was trying this out. It's not good for securing a network. It's good for testing this redundant bridging firewall setup.

Ping should still work after a failover, but ssh/telnet/ftp sessions should all die (if your firewall keeps state and blocks packets that aren't either part of an already established connection or the initiating packets of a connection), because the backup firewall has no state for them yet.

Share State Information
[Diagram: complete setup]

(Updated 9/3/05) You don't need to set state-policy if-bound in both machines' pf.conf.

If the machines have different types of network interfaces, if-bound will screw things up badly. For instance, if one machine has fxp0 as the external interface and the second machine has bge0, the second machine would wind up with state information bound to fxp0, which it doesn't have.

In earlier versions of this document, I had recommended if-bound because of difficulties when the inactive machine tried to communicate with hosts outside the firewall. Packets would leave the inactive machine, creating a state table entry, and then arrive at the active machine which would, thanks to pfsync, already have state information for that packet, but the state data would be from the wrong machine's point of view, so the active machine would block that packet.

I recently discovered that you can add no-sync to any 'keep state' rules to prevent them from being transmitted over pfsync. So, all you have to do is add another set of your 'pass out on $ext_if all keep state' type rules with that machine's IP address as the source instead of 'any' and '(no-sync)' on the end. For instance:

# this firewall machine uses 

# default allow out
pass out on fxp0 inet proto udp all keep state
pass out on fxp0 inet proto icmp all keep state
block out on fxp0 inet proto tcp all
pass out on fxp0 inet proto tcp all flags S/SA modulate state

# don't pass state on my own packets - this way I don't need if-bound any more!
# ($my_ip is a placeholder macro; define it as this machine's own IP address)
pass out on fxp0 inet proto udp from $my_ip to any keep state (no-sync)
pass out on fxp0 inet proto icmp from $my_ip to any keep state (no-sync)
pass out on fxp0 inet proto tcp from $my_ip to any flags S/SA modulate state (no-sync)

Now that we've dealt with that problem, we can get back to sharing state.

Run a Xover cable between the two machines. Give these interfaces goofy addresses that aren't likely to cause commotion (private addresses you don't use anywhere else, for example). Make sure they work (ping, ssh etc). This is your state sharing network. Just in case you didn't know, this MUST be a secure network shared only by your firewalls.

Turn on pfsync. You use ifconfig and the interface name, not the IP address you just gave it. In my setup, the interface for the state sharing network is xl2 on both machines, so substitute the correct interface name for your machines.

ifconfig pfsync0 up
ifconfig pfsync0 syncif xl2

Test that state information is being shared by clearing your state tables and then doing something that makes one of the machines note some state information. (Any TCP connection across the firewall will cause the active bridge's firewall to track that connection's state.)

Then check both state tables to see if they're the same. If they are, you're basically done.
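
To make that check less of an eyeball exercise, you can clear the tables with 'pfctl -F state', dump each one to a file with 'pfctl -s state' and compare the sorted dumps. The sketch below assumes you have copied both dumps to one machine. The function name and file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare state table dumps taken on the two firewalls.
# Take the dumps with "pfctl -s state > file" on each machine (as root)
# and copy them to one place; the file names below are hypothetical.

# state_diff FILE_A FILE_B: print entries present in only one dump.
# Prints nothing and returns 0 when both dumps hold the same entries.
state_diff() {
    a=$(mktemp /tmp/states.XXXXXXXXXX) && b=$(mktemp /tmp/states.XXXXXXXXXX) || return 2
    sort "$1" > "$a"
    sort "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Usage:
#   state_diff fw1.states fw2.states && echo "state tables match"
```

States created by rules marked (no-sync) exist on one machine only, so expect those to show up as differences, along with connections that opened or closed between the two dumps.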

Now you should test the living @#$% out of it by running all sorts of state-sensitive connections over the firewall and trying to make it fail. It can be sensitive to how much time you give it to figure out the spanning tree topology, so you should wait about 30 seconds to a minute between events that cause spanning tree to re-think its place in the world.

Syncif Watcher

When I originally wrote this document, I forgot to mention the problem of keeping syncif running. It seems, at least in the 3.5 days, that in order for pfsync to bind, both machines had to be running and ready. Also, if you unplugged the pfsync interface, it wouldn't re-bind itself after getting plugged back in. Worse, both machines needed to be re-bound. I needed a way to make the correct interface bind to pfsync after rebooting or after interrupting the connection in any way.

I solved this by writing a little Perl script that essentially checks the state of the interfaces every minute and performs corrections as needed. While I don't want to share my Perl script and consequently maintain it, I can give you an outline of what it does.

pfsync0 is the pfsync interface, not to be confused with a hardware ethernet interface. In this example xl0 will be the hardware interface we're going to bind to pfsync0. You can use ifconfig to gather all the data and perform all the corrections. You just have to automate it.

  1. Find out if xl0 is active (has a cable attached etc..)
  2. Find out if pfsync0 lists xl0 as bound (syncif)
  3. If xl0 IS active and pfsync0 does NOT list xl0 as a syncif, then bind 'em.
  4. If xl0 is NOT active and pfsync0 DOES list xl0 as a syncif, then unbind 'em and then bind 'em again.
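
The outline above can be sketched in plain sh. This is not the original Perl script, just one possible reading of the four steps. The grep patterns are assumptions about what ifconfig prints ("status: active" for the hardware interface, "syncif: xl0" for pfsync0), and the '-syncif' unbind option is also an assumption, so check both against your own ifconfig and its man page.

```shell
#!/bin/sh
# Sketch of the four steps above. The grep patterns and the -syncif
# option are assumptions; verify them on your own system first.
SYNC_IF=${SYNC_IF:-xl0}

# decide IF_OUTPUT PFSYNC_OUTPUT: print "bind", "rebind" or "ok"
decide() {
    active=no bound=no
    echo "$1" | grep -q 'status: active' && active=yes   # step 1
    echo "$2" | grep -q "syncif: $SYNC_IF" && bound=yes  # step 2
    case "$active/$bound" in
        yes/no) echo bind ;;      # step 3
        no/yes) echo rebind ;;    # step 4
        *)      echo ok ;;
    esac
}

# check_once: gather the data with ifconfig and apply the correction
check_once() {
    case "$(decide "$(ifconfig "$SYNC_IF" 2>&1)" "$(ifconfig pfsync0 2>&1)")" in
        bind)   ifconfig pfsync0 syncif "$SYNC_IF" ;;
        rebind) ifconfig pfsync0 -syncif
                ifconfig pfsync0 syncif "$SYNC_IF" ;;
    esac
}

# Run check_once from root's crontab every minute, or loop:
#   while :; do check_once; sleep 60; done
```

Keeping the decision separate from the ifconfig calls makes it possible to test the logic with canned ifconfig output before letting it touch a live firewall.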

Make sure your bridges come up right when they reboot

You need a pf.conf file, a hostname.if for each network card, a hostname.pfsync0, and a bridgename.bridge0 file. Read the pf.conf, hostname.if, and pfsync man pages. (The hostname.if man page is also the bridgename.bridge man page.)
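
As a rough illustration, this sketch writes candidate versions of those files into a staging directory so you can review them before copying anything into /etc. It assumes the interface names used in this document's examples (fxp0 internal, fxp1 external, xl2 for state sharing). STATE_ADDR and STATE_MASK are placeholders you must replace, and the file contents should be checked against the man pages for your release.

```shell
#!/bin/sh
# Sketch: write candidate boot files into a staging directory for review.
# Interface names follow this document's examples; STATE_ADDR and
# STATE_MASK are placeholders, not real values.
STAGE=${STAGE:-./etc-staging}
mkdir -p "$STAGE"

# bridge member interfaces only need to be up
echo "up" > "$STAGE/hostname.fxp0"
echo "up" > "$STAGE/hostname.fxp1"

# state sharing interface: replace STATE_ADDR and STATE_MASK
echo "inet STATE_ADDR STATE_MASK NONE" > "$STAGE/hostname.xl2"

# bind pfsync to the state sharing interface
echo "up syncif xl2" > "$STAGE/hostname.pfsync0"

# each line is handed to brconfig as arguments for bridge0
cat > "$STAGE/bridgename.bridge0" <<'EOF'
add fxp0
add fxp1
-learn fxp0
stp fxp0
stp fxp1
static fxp0 aa:bb:cc:dd:ee:ff
up
EOF

ls "$STAGE"
```

If one of the bridge interfaces also carries the machine's IP address, its hostname file needs an inet line instead of a bare "up".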


How many nics is that?

If you have a whole lot of nics lying around, and your hardware has a whole lot of available slots, you can put 4 in each machine. That's two for bridging, one for state sharing and one to give the machine an IP address you can ssh to.

If you're limited to 3 nics (as I was due to my 1U machines having insufficient slots) you can give one of the bridge interfaces an IP address and ssh in that way. You cannot forgo the xover cable to the other firewall with plans of sharing state over ppp/ipsec/whatever. If you really want to share state over ipsec, you need to use the -current branch of OpenBSD 3.5 or wait for 3.6 to come out.

If you're really limited to only 2 nics, you can use IPSec and the syncpeer option when you set up pfsync (see the ifconfig man page). You'll have to put an IP address on one of the bridge interfaces like the option above.

Another option, if you're out of slots but you want more interfaces, is to buy a card with more than one port on it.


Ramsey Tantawi built a redundant bridging firewall using some of these same techniques, but with some very significant differences.

His system has a longer failover time but a lower cost to build. It uses non-managed switches instead of the managed switches that I use.

Rather than have four spanning tree devices, he has only two: his two bridges. I'll refer to them as the root and non-root.

If both bridges are running, the non-root bridge will realize there is a loop and block one of its ports, making traffic run through the root bridge. If the root bridge dies, the non-root bridge will realize there is no possible loop and open both ports, making traffic run through the non-root bridge. If both bridges are running and the non-root bridge dies, the root bridge doesn't care. It just keeps processing traffic normally.

Mr. Tantawi says his failover takes between 45 seconds and 3.5 minutes, which is acceptable to him.

Update Jan 17 2006: You can find his writeup here.

Updated Tue Jan 17 10:46:12 PST 2006
