The Race Condition #

Deploy multiple gateways to the same Azure Virtual Hub in parallel, and you are likely to hit an error like this:

Error: The virtual hub is in a failed state
Code: VirtualHubFailed
Details: Conflicting gateway operations

Each gateway type (Point-to-Site VPN, Site-to-Site VPN, ExpressRoute) modifies Virtual Hub routing tables. Deploy them concurrently, and they step on each other’s updates.

The lesson: Virtual Hub state is shared across all gateways. Treat gateway deployment as a serialized operation with Routing Intent as the final commit that locks in the routing topology.

The Problem: Shared Routing State #

When you deploy a gateway to a Virtual Hub:

  1. Gateway resource gets created
  2. Virtual Hub routing tables get updated to include gateway routes
  3. Hub propagates routing changes to existing gateways
  4. BGP peering establishes (if applicable)

If two gateways try to update routing tables simultaneously:

  • Gateway A: “I’m updating the default route table”
  • Gateway B: “I’m also updating the default route table”
  • Hub: “Conflict! One of you is going to fail”

Result: Intermittent failures that depend on timing, with no clear indication of which gateway caused the issue.

The Solution: Sequential Gateway Deployment #

Deploy gateways in a specific order, with each waiting for the previous to complete:

1. Deploy Site-to-Site VPN First #

module s2svpn '../customModules/s2sVpnGateway/main.bicep' = {
  name: '${s2svpn_config.name}-${uniqueString(deployment().name)}'
  scope: resourceGroup(network_rg_config.targetSubscriptionId, network_rg_config.name)
  dependsOn: [ fw ]  // Wait for firewall to be stable
  params: {
    name: s2svpn_config.name
    peer: {
      name: s2svpn_config.peer.name
      bgpSettings: {
        asn: s2svpn_config.peer.asn
        bgpPeeringAddress: s2svpn_config.peer.bgpPeeringAddress
      }
    }
    vhubName: vhub.outputs.name
    vpnConnection_name: s2svpn_config.connectionName
    vpnConnection_linkPsk: s2svpn_config.vpnConnection_linkPsk
  }
}

Why first? S2S VPN typically has the most complex routing requirements (BGP, custom routes). Get it stable before adding other gateways.

2. Deploy Point-to-Site VPN Second #

module p2svpn '../customModules/p2sVpnGateway/main.bicep' = {
  name: '${p2svpn_config.name}-${uniqueString(deployment().name)}'
  scope: resourceGroup(network_rg_config.targetSubscriptionId, network_rg_config.name)

  // CRITICAL: Wait for S2S VPN to complete
  dependsOn: [ s2svpn ]  // ← Explicit ordering

  params: {
    name: p2svpn_config.name
    aadParams: {
      tenant: p2svpn_config.aadParams.tenant
      audience: p2svpn_config.aadParams.audience
      issuer: p2svpn_config.aadParams.issuer
    }
    vpnClientIPAddress: p2svpn_config.addressSpace
    fwName: fw.outputs.name
    virtualHubId: vhub.outputs.resourceId
    vpnGatewayScaleUnit: p2svpn_config.vpnGatewayScaleUnit
  }
}

Why second? P2S VPN modifies routing for client address ranges. Deploying after S2S ensures S2S routes are already in place.

3. Deploy ExpressRoute Gateway Last #

module expressRouteGateway 'br/public:avm/res/network/express-route-gateway:0.8.0' = {
  name: '${expressRouteGateway_config.name}-${uniqueString(deployment().name)}'
  scope: resourceGroup(network_rg_config.targetSubscriptionId, network_rg_config.name)

  // CRITICAL: Wait for P2S VPN to complete, plus the ExpressRoute circuit and firewall
  dependsOn: [ expressRouteCircuit, fw, p2svpn ]

  params: {
    name: expressRouteGateway_config.name
    virtualHubResourceId: vhub.outputs.resourceId
    autoScaleConfigurationBoundsMin: 1
    autoScaleConfigurationBoundsMax: 10
  }
}

Why last? ExpressRoute has the longest deployment time (20-30 minutes) and the most critical SLA. Get other gateways working first, then add ExpressRoute.

4. Deploy Routing Intent AFTER All Gateways #

module routingintent '../customModules/routing-intent/main.bicep' = {
  name: '${routingintent_config.name}-${uniqueString(deployment().name)}'
  scope: resourceGroup(network_rg_config.targetSubscriptionId, network_rg_config.name)

  // CRITICAL: Wait for ALL gateways
  dependsOn: [ vwan, vhub, fw, s2svpn, p2svpn, expressRouteGateway ]

  params: {
    name: routingintent_config.name
    virtualHubName: vhub.outputs.name
    routingIntentDestinations: [
      'Internet'
      'PrivateTraffic'
    ]
    routingPolicyName: 'routing-policy-prod'
    routingIntentNextHop: fw.outputs.resourceId  // Send all traffic through firewall
    internetTrafficRoutingPolicy: true
    privateTrafficRoutingPolicy: true
  }
}

Why Routing Intent Must Be Last #

Routing Intent is a final commit operation that:

  1. Reconfigures all Virtual Hub routing tables
  2. Sets the firewall as the next hop for Internet and Private traffic
  3. Propagates routes to all connected gateways and spoke VNets

If a gateway is mid-deployment when Routing Intent runs:

  • Gateway deployment modifies routing tables
  • Routing Intent overwrites those changes
  • Gateway deployment fails with Conflict or ProvisioningFailed

The dependency chain ensures:

  • All gateways are fully deployed and stable
  • Routing Intent runs once with complete topology information
  • No conflicting routing table updates
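
As a rough illustration of what the final commit involves, the sketch below declares a routing intent resource directly instead of going through the custom module shown above. The parameter names and policy names are hypothetical and the API version is an assumption, so the article’s own module may be structured differently.

```bicep
// Sketch only: parameter names are hypothetical, not taken from the custom module.
param virtualHubName string
param routingIntentName string
param firewallResourceId string

// Reference the hub that already exists in this resource group.
resource hub 'Microsoft.Network/virtualHubs@2023-09-01' existing = {
  name: virtualHubName
}

// One routing policy per traffic class, both pointing at the firewall as next hop.
resource routingIntent 'Microsoft.Network/virtualHubs/routingIntent@2023-09-01' = {
  parent: hub
  name: routingIntentName
  properties: {
    routingPolicies: [
      {
        name: 'InternetTraffic'
        destinations: [
          'Internet'
        ]
        nextHop: firewallResourceId
      }
      {
        name: 'PrivateTraffic'
        destinations: [
          'PrivateTraffic'
        ]
        nextHop: firewallResourceId
      }
    ]
  }
}
```

Because this single resource rewrites routing for the whole hub, it is the one that should carry dependsOn entries for every gateway.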

The Deployment Timeline #

Here’s what the deployment looks like with proper orchestration (times in h:mm):

Time    Resource                    Status
0:00    Virtual Hub                 Deploying...
0:15    Virtual Hub                 ✓ Complete
0:15    Azure Firewall              Deploying...
0:25    Azure Firewall              ✓ Complete
0:25    S2S VPN Gateway             Deploying...
0:55    S2S VPN Gateway             ✓ Complete
0:55    P2S VPN Gateway             Deploying...
1:20    P2S VPN Gateway             ✓ Complete
1:20    ExpressRoute Gateway        Deploying...
1:50    ExpressRoute Gateway        ✓ Complete
1:50    Routing Intent              Deploying...
1:53    Routing Intent              ✓ Complete

Total: ~2 hours for a complete hub deployment with all gateway types.

The Visualization #

┌─────────────────┐
│  Virtual Hub    │
│   + Firewall    │
└────────┬────────┘
         │
         ├─► S2S VPN Gateway (wait)
         │        │
         │        ├─► P2S VPN Gateway (wait)
         │        │        │
         │        │        ├─► ExpressRoute Gateway (wait)
         │        │        │        │
         └────────┴────────┴────────┴─► Routing Intent (final commit)

What Happens If You Skip This #

Parallel deployment:

// WRONG: All gateways deploy at once
dependsOn: [ vhub, fw ]  // All three depend only on hub/firewall

Result:

  • Random failures on gateway 2 or 3
  • Errors like “Virtual Hub is in a failed state”
  • Deployment retries make it worse (more conflicts)
  • No clear indication of which gateway caused the failure

Serial deployment with proper dependencies:

// CORRECT: Explicit ordering
s2svpn: dependsOn [ fw ]
p2svpn: dependsOn [ s2svpn ]  // ← Waits for S2S
expressRouteGateway: dependsOn [ expressRouteCircuit, fw, p2svpn ]  // ← Waits for P2S
routingintent: dependsOn [ s2svpn, p2svpn, expressRouteGateway ]  // ← Waits for ALL

Result:

  • 100% reliable deployments
  • Clear failure modes (one gateway at a time)
  • Predictable deployment timeline
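
For reference, the serial ordering can be written as a minimal Bicep skeleton. The module paths and the single vhubName parameter are placeholders, not the modules used in this article, and the firewall and ExpressRoute circuit dependencies are left out for brevity. Only the dependsOn wiring is the point. The sketch chains ExpressRoute after P2S, matching the timeline and diagram shown earlier.

```bicep
// Placeholder paths and parameters; only the dependsOn chain matters here.
param vhubName string

module s2svpn './modules/s2sVpn.bicep' = {
  name: 's2svpn'
  params: {
    vhubName: vhubName
  }
}

module p2svpn './modules/p2sVpn.bicep' = {
  name: 'p2svpn'
  dependsOn: [
    s2svpn // starts only after S2S has finished
  ]
  params: {
    vhubName: vhubName
  }
}

module expressRouteGateway './modules/erGateway.bicep' = {
  name: 'expressRouteGateway'
  dependsOn: [
    p2svpn // starts only after P2S has finished
  ]
  params: {
    vhubName: vhubName
  }
}

module routingintent './modules/routingIntent.bicep' = {
  name: 'routingintent'
  dependsOn: [
    s2svpn
    p2svpn
    expressRouteGateway // final commit waits for every gateway
  ]
  params: {
    vhubName: vhubName
  }
}
```

Listing all three gateways on the routing intent module is redundant given the chain, but it keeps the intent obvious if the chain is later reordered.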

Real-World Impact #

Before orchestration (parallel):

  • 40% deployment failure rate
  • Average 3 retries to succeed
  • 4+ hours including retries
  • Unclear which gateway caused failures

After orchestration (sequential):

  • 100% deployment success rate
  • Zero retries needed
  • 2 hours predictable timeline
  • Clear logs showing each gateway completing

The Broader Pattern #

This same pattern applies to any resources that modify shared state in Azure:

  • VNet peerings connecting to the same hub
  • NSG rules applied to the same network interface
  • Route table updates affecting the same subnet
  • BGP peer connections to the same Virtual Hub (covered in The @batchSize(1) Pattern)

When multiple resources modify shared state, orchestrate explicitly instead of relying on Azure Resource Manager’s default behavior, which deploys resources in parallel whenever no dependency exists between them.
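
For resources created in a loop, Bicep’s @batchSize(1) decorator gives the same serialization without hand-written dependsOn chains. Below is a minimal sketch for BGP peers on a hub. The parameter names and the shape of each peer object (name, asn, ip, connectionId) are assumptions made for illustration, as is the API version.

```bicep
// Sketch only: parameter names and the peer object shape are hypothetical.
param virtualHubName string
param peers array // each item: { name, asn, ip, connectionId }

resource hub 'Microsoft.Network/virtualHubs@2023-09-01' existing = {
  name: virtualHubName
}

// @batchSize(1) makes the loop deploy one item at a time instead of in parallel.
@batchSize(1)
resource bgpPeers 'Microsoft.Network/virtualHubs/bgpConnections@2023-09-01' = [for peer in peers: {
  parent: hub
  name: peer.name
  properties: {
    peerAsn: peer.asn
    peerIp: peer.ip
    hubVirtualNetworkConnection: {
      id: peer.connectionId
    }
  }
}]
```

Use dependsOn when the resources are of different types, as with the gateways above, and @batchSize(1) when many instances of one type are generated from a list.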

The Lesson #

Virtual Hub state is shared across all gateways.

Don’t trust parallel deployment for resources with shared state. Use dependsOn to create an explicit deployment sequence:

  1. S2S VPN (complex routing)
  2. P2S VPN (client routes)
  3. ExpressRoute (longest deployment)
  4. Routing Intent (final commit)

The extra deployment time is worth the 100% reliability.


Next in series: The Wrapper Pattern: Taming Azure Verified Modules

Related: The @batchSize(1) Pattern, Breaking Circular Dependencies

Part of the 13 Critical Azure IaC Patterns series