Building High-Availability SD-WAN Architectures

High Availability in Fortinet SD-WAN Architectures: A Practical Guide

Still under-caffeinated at my desk on my third cup of coffee of the day (and yes, that’s the right number), I find myself thinking about high availability (HA) in SD-WAN architectures, especially Fortinet SD-WAN. I’ve been in networking since 1993, when the internet was first coming into homes, starting out as a network admin. Back then it was about muxing voice and data over PSTN lines, and this was even before the first worms slithered into our systems (the Slammer worm was a slap in the face that taught me early on that downtime wasn’t an option).

Fast forward to today: I run my own security firm, PJ Networks, and we supported three banks through their most recent migrations to zero-trust architectures. Constant connectivity through an HA SD-WAN setup is not something you merely SHOULD do; it is something you NEED to do to stay alive.

HA Concepts: Why It Still Matters

High availability means your network (or pieces of it) keeps operating even when some parts of it fail. Sounds simple, right? But there’s nuance here, particularly when you’re talking about SD-WANs, which, let’s face it, are more complex animals than a good old router.

Here’s the catch: in the old world of networking, HA was all about hardware redundancy: two routers, one active and one on standby. With SD-WAN, it’s also about software intelligence picking the best path dynamically.

Redundancy is the core. If one of those connections goes down (MPLS, broadband, LTE), traffic has to route around it without your users ever noticing. That’s the endgame.
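To make “route around it” concrete, here is a minimal Python sketch of priority-based failover. The `Link` class, the link names, and the priority values are hypothetical illustrations, not Fortinet’s implementation; a real SD-WAN makes this decision from much richer health data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Link:
    name: str       # hypothetical label, e.g. "mpls"
    priority: int   # lower number = more preferred
    up: bool        # result of whatever health check you trust


def pick_link(links: Sequence[Link]) -> Optional[Link]:
    """Return the most preferred link that is currently up, or None."""
    healthy = [link for link in links if link.up]
    if not healthy:
        return None
    return min(healthy, key=lambda link: link.priority)


links = [
    Link("mpls", priority=1, up=False),      # primary has failed
    Link("broadband", priority=2, up=True),
    Link("lte", priority=3, up=True),
]
best = pick_link(links)
print(best.name if best else "no path available")  # prints "broadband"
```

The interesting part is not the selection itself but how trustworthy the `up` flag is, which is where most real-world HA designs succeed or fail.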

Active-Passive vs Active-Active: What’s Your Pick?

This debate is as old as networking itself. Spoiler: neither is a silver bullet.

On Fortinet SD-WAN, I find active-active gets better bandwidth utilization but can complicate failure detection, and it also exposes you to jitter and packet re-ordering penalties (more so for voice).

Active-passive? Simpler failover mechanics. But sometimes your passive link is just a fancy paperweight waiting for the worst to happen.

For the banks that we assisted, a hybrid model was most effective: active-active for the critical data centers, and intelligent active-passive for the remote branches, where broadband is predictably unreliable. Flexibility here is key.
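One common way to get active-active utilization without re-ordering packets inside a single conversation is to pin each flow to one link by hashing its 5-tuple. The sketch below illustrates that general technique in Python; the function name and the flow tuple layout are assumptions for illustration, and it is not a description of how FortiOS balances traffic internally.

```python
import zlib
from typing import Optional, Sequence, Tuple

# Hypothetical flow key: (src_ip, dst_ip, src_port, dst_port, protocol)
Flow = Tuple[str, str, int, int, str]


def pick_member(flow: Flow, healthy_links: Sequence[str]) -> Optional[str]:
    """Map a flow to one healthy link so all of its packets share a path."""
    if not healthy_links:
        return None
    key = "|".join(str(field) for field in flow).encode("utf-8")
    # crc32 is deterministic across runs, unlike Python's built-in hash()
    return healthy_links[zlib.crc32(key) % len(healthy_links)]


voice_call: Flow = ("10.0.0.5", "10.1.0.9", 16384, 16384, "udp")
members = ["mpls", "broadband"]

first = pick_member(voice_call, members)
second = pick_member(voice_call, members)
print(first == second)  # prints True: the same flow always lands on the same link
```

Note the trade-off: when a member drops out of `healthy_links`, the modulo changes and some surviving flows get remapped too. That is part of why active-active failure handling is harder to reason about than active-passive.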

Link Redundancy: It’s Not The Number Of Links, It’s What You Do With Them

You would perhaps assume more links means greater redundancy. Yes and no.

Some links are more critical than others:

PJ Networks will always evaluate these important links prior to designing your topology. The questions we ask ourselves: what happens if that wire gets cut? What if your ISP has an outage?

Sometimes, redundancy isn’t another link or a better spanning tree, but better tuning of your existing paths, with better monitoring and fault resilience.

Is this basic? Maybe. But so many people rely on ping alone, and ping lies.
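Here is a rough sketch of why a bare reachability check is not enough. It scores a link from a batch of probe round-trip times using latency, jitter, and loss together. The function name and the threshold defaults are assumptions picked for illustration; tune them to your own applications rather than treating them as recommended values.

```python
from statistics import mean
from typing import Dict, Sequence, Union


def link_health(
    rtts_ms: Sequence[float],
    sent: int,
    max_latency_ms: float = 150.0,   # illustrative threshold
    max_jitter_ms: float = 30.0,     # illustrative threshold
    max_loss_pct: float = 1.0,       # illustrative threshold
) -> Dict[str, Union[float, bool]]:
    """Summarize probe results; rtts_ms holds one RTT per answered probe."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    loss_pct = 100.0 * (sent - len(rtts_ms)) / sent
    latency = mean(rtts_ms) if rtts_ms else float("inf")
    # Jitter here = mean absolute difference between consecutive RTTs
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    jitter = mean(diffs) if diffs else 0.0
    ok = (
        latency <= max_latency_ms
        and jitter <= max_jitter_ms
        and loss_pct <= max_loss_pct
    )
    return {"latency_ms": latency, "jitter_ms": jitter, "loss_pct": loss_pct, "ok": ok}


# Every probe is answered, so a simple ping check would call this link "up",
# yet the RTT swings would wreck a voice call.
print(link_health([20, 200, 25, 180], sent=4)["ok"])  # prints False
```

A link that answers every ping can still be unusable for voice, which is exactly the case a reachability-only check misses.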

What Keeps Me Up At Night: Failure Scenarios (A Non-Exhaustive List)

Been there. I’ve architected dozens of Fortinet SD-WAN deployments with HA and still found faults where you wouldn’t expect them.

For example:

I have a client, a mid-tier bank, that lost 15 minutes during a peak hour because their HA existed pretty much only in theory. We revamped the design, added more aggressive failover timers, and tuned the SD-WAN path health checks.
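For a feel of what “more aggressive failover timers” buys you, here is some back-of-the-envelope arithmetic in Python. It assumes a simple model where a link is declared dead after a fixed number of consecutive failed probes; the parameter names and numbers are hypothetical, not FortiOS setting names or the values used for that client.

```python
def worst_case_detection_ms(
    probe_interval_ms: int,
    fail_threshold: int,
    probe_timeout_ms: int,
) -> int:
    """Worst case: the link dies right after a successful probe.

    The next `fail_threshold` probes go out one interval apart, and the
    last one is only counted as failed once its timeout expires.
    """
    return probe_interval_ms * fail_threshold + probe_timeout_ms


# Relaxed timers: 1 s interval, 5 failures, 500 ms timeout
print(worst_case_detection_ms(1000, 5, 500))  # prints 5500

# Aggressive timers: 200 ms interval, 3 failures, 200 ms timeout
print(worst_case_detection_ms(200, 3, 200))   # prints 800
```

Detection time is only one part of an outage; route convergence and session re-establishment come on top, so treat these numbers as a lower bound on what users will feel.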

This is my takeaway: test your failure scenarios BEFORE deploying. Or regret it later.

Testing: Failover Drills Are Not Optional

PJ Networks didn’t write the book on HA topologies, but we do live by them. We run failover drills routinely (yes, like fire drills, but for your network). It is the only way to detect those subtle issues that sneak through simulations.

What we do:

Our 24×7 monitoring feeds into that. After all, the question isn’t just whether failover works, but when it kicks in. Too late = bad. Too early = annoying flapping.
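The “too late versus too early” tension is usually handled with hysteresis: require several consecutive failures before declaring a link down, and several consecutive successes before trusting it again. The Python sketch below shows that general idea; the class name and threshold values are assumptions for illustration, not a specific vendor’s state machine.

```python
class LinkMonitor:
    """Tracks link state with separate fail and recover thresholds."""

    def __init__(self, fail_threshold: int = 3, recover_threshold: int = 5):
        self.fail_threshold = fail_threshold        # consecutive failures to go down
        self.recover_threshold = recover_threshold  # consecutive successes to come back
        self.up = True
        self._fails = 0
        self._oks = 0

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the current link state."""
        if probe_ok:
            self._oks += 1
            self._fails = 0
            if not self.up and self._oks >= self.recover_threshold:
                self.up = True
        else:
            self._fails += 1
            self._oks = 0
            if self.up and self._fails >= self.fail_threshold:
                self.up = False
        return self.up


monitor = LinkMonitor(fail_threshold=2, recover_threshold=3)
# One lost probe between good ones: the link stays up, no flap
print([monitor.record(ok) for ok in (True, False, True)])  # prints [True, True, True]
```

Setting the recover threshold higher than the fail threshold is a deliberate choice: it is usually cheaper to stay on the backup a little longer than to bounce traffic back onto a link that is still unstable.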

PJ Network Drills: Our Secret Sauce

Look, I admit it: hard lessons have value too. Early in my career, failure scenarios all felt like theory until I was woken up in the middle of the night because the system I’d built, the one that had just stopped my customer from sending anyone money, was a rather large US bank’s entire payment rail. As a result, at PJ Networks, we’ve made failover drills a standard part of our operating procedures.

We’ve seen some eye-openers:

Our drills are what allow us to catch them before our clients experience them.

A Few Hard Truths

Here’s the deal: Fortinet SD-WAN HA is a strong but nuanced feature that requires careful planning, real-world testing, and continual tuning.


Quick Take


Final Thought

Running a secure and highly available SD-WAN is about more than tech. It’s knowing that failure will occur, and being prepared for it. After all, if your network is down, how is your security even doing?

If you have questions or want to talk about your HA design, let me know! Believe me — I’ve learned the hard way that getting it right can save you a lot of headaches, money and even, yes, sleep.
