From b817a6e959b8ca93ced60ecda8378119347bcfb1 Mon Sep 17 00:00:00 2001 From: Nick Liu Date: Thu, 12 Feb 2026 11:32:44 -0800 Subject: [PATCH 1/6] update and refine the fully converged design after team discussion --- ...st-StorageConnections-ConnectivityCheck.md | 2 +- ...Overview-Azure-Local-Deployment-Pattern.md | 31 +++++------- .../Reference-TOR-Fully-Converged-Storage.md | 50 +++++++++---------- 3 files changed, 36 insertions(+), 47 deletions(-) diff --git a/TSG/EnvironmentValidator/Networking/Troubleshoot-Network-Test-StorageConnections-ConnectivityCheck.md b/TSG/EnvironmentValidator/Networking/Troubleshoot-Network-Test-StorageConnections-ConnectivityCheck.md index 40511c7a..bc56e69a 100644 --- a/TSG/EnvironmentValidator/Networking/Troubleshoot-Network-Test-StorageConnections-ConnectivityCheck.md +++ b/TSG/EnvironmentValidator/Networking/Troubleshoot-Network-Test-StorageConnections-ConnectivityCheck.md @@ -252,7 +252,7 @@ In converged deployments, the Storage Connections validator will create a tempor 4. If any ping fails, check the following: - - That the VLANs are correctly configured on the TOR switches. In a converged deployment, both storage VLANs should be configured on the interface. + - That the VLANs are correctly configured on the TOR switches. In a converged deployment, each storage VLAN should be configured on its respective ToR switch (Storage VLAN A on ToR-A, Storage VLAN B on ToR-B). - That physical NICs are connected to the correct ports on the TOR switches. - That no VLANs are configured on the physical NICs. - That no firewall rules or other configuration are blocking APIPA traffic. 
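On Cisco NX-OS ToR switches, the first two checks in the list above can be confirmed directly from the switch CLI. This is an illustrative sketch only — the switch name, interface, and VLAN ID below are placeholders, so substitute the values from your own deployment:

```console
# Confirm the storage VLAN exists and lists the host-facing port as a member
TOR-A# show vlan id 711

# Confirm the host-facing trunk actually tags the storage VLAN
TOR-A# show interface ethernet 1/1 trunk
```

If the storage VLAN is absent from the trunk's allowed-VLAN list on its respective ToR, the validator's APIPA ping will fail even though the physical link is up.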
diff --git a/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md b/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md index 59ac33bc..5d884d34 100644 --- a/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md +++ b/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md @@ -72,7 +72,7 @@ A high-performance design utilizing dedicated NICs for management/compute and st ![Switched with 2 ToRs](images/AzureLocalPhysicalNetworkDiagram_Switched.png) **Fully Converged Deployment** -A balanced design where all traffic types (management, compute, storage) share the same physical NICs through VLAN segmentation. This pattern minimizes hardware footprint while maintaining high scalability. **Both storage VLANs must be configured on both ToR switches** because SET (Switch Embedded Teaming) may route either storage VLAN through either physical NIC. +A balanced design where all traffic types (management, compute, storage) share the same physical NICs through VLAN segmentation. This pattern minimizes hardware footprint while maintaining high scalability. The **recommended** configuration uses **one storage VLAN per ToR switch**: Storage VLAN A on ToR-A (mapped to one physical NIC) and Storage VLAN B on ToR-B (mapped to the other physical NIC). In failure scenarios (NIC or ToR), SMB/RDMA traffic automatically fails over to the remaining path. 
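Concretely, the recommended per-ToR mapping reduces to a one-line difference between the two switches' host-facing trunk configurations. The sketch below uses the VLAN IDs from the reference configuration elsewhere in these docs (7 = management, 201 = compute, 711/712 = storage); adjust them to your environment:

```console
# TOR-A host-facing trunk: carries management, compute, and Storage VLAN A only
switchport trunk allowed vlan 7,201,711

# TOR-B host-facing trunk: carries management, compute, and Storage VLAN B only
switchport trunk allowed vlan 7,201,712
```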
![Fully-Converged with 2 ToRs](images/AzureLocalPhysicalNetworkDiagram_FullyConverged.png) @@ -83,10 +83,10 @@ A balanced design where all traffic types (management, compute, storage) share t |---------------------|------------------------|-------------------------------|-------------------| | **Switchless** | 2 NICs to switches (M+C traffic) + (Nāˆ’1) direct inter-node NICs (S traffic) | Trunk ports with M, C VLANs only; no storage VLANs on ToRs | Edge deployments, remote sites, cost-sensitive environments | | **Switched** | 4 NICs per host: 2 for M+C traffic, 2 dedicated for storage | M and C VLANs on both ToRs; S1 VLAN on ToR1 only, S2 VLAN on ToR2 only (dedicated storage NICs) | Enterprise deployments requiring dedicated storage performance and traffic isolation | -| **Fully Converged** | 2 NICs per host carrying all traffic types (M+C+S) via VLAN segmentation | Both storage VLANs (S1, S2) on both ToRs (required for SET) | General-purpose deployments balancing performance, simplicity, and hardware efficiency | +| **Fully Converged** | 2 NICs per host carrying all traffic types (M+C+S) via VLAN segmentation | S1 VLAN on ToR-A only, S2 VLAN on ToR-B only (recommended) | General-purpose deployments balancing performance, simplicity, and hardware efficiency | > [!NOTE] -> **Storage VLAN Configuration**: Storage VLANs can be configured as either **Layer 3 (L3) networks with IP subnets** or **Layer 2 (L2) networks without IP subnets**. **Layer 2 configuration is recommended** because it simplifies VLAN tagging, allowing Azure Local hosts to use any IP addresses without hardcoding subnet configurations on the switch or requiring predefined IP ranges. Since Azure Local nodes handle storage traffic tagging, ensure these VLANs are configured as **tagged VLANs on trunk ports** across all ToR switches. +> **Storage VLAN Configuration**: Storage VLANs can be configured as either **Layer 3 (L3) networks with IP subnets** or **Layer 2 (L2) networks without IP subnets**. 
**Layer 2 configuration is recommended** because it simplifies VLAN tagging, allowing Azure Local hosts to use any IP addresses without hardcoding subnet configurations on the switch or requiring predefined IP ranges. Since Azure Local nodes handle storage traffic tagging, ensure these VLANs are configured as **tagged VLANs on trunk ports** on their respective ToR switches. --- @@ -131,27 +131,20 @@ This tool is designed to automate the generation of Azure Local switch configura ### Q: How should Storage VLANs be configured across ToR switches? **A:** -Storage VLAN configuration depends on the **deployment pattern**: +The recommended baseline design uses **one storage VLAN per ToR switch** for both Switched and Fully Converged deployments: | Deployment Pattern | ToR VLAN Configuration | Why | |-------------------|------------------------|-----| -| **Switched** | S1 on ToR1 only, S2 on ToR2 only | Dedicated storage NICs connect to specific ToRs | -| **Fully Converged** | Both S1 & S2 on both ToRs | SET may route either storage VLAN through either physical NIC | +| **Switched** | S1 on ToR-A only, S2 on ToR-B only | Dedicated storage NICs connect to specific ToRs | +| **Fully Converged** | S1 on ToR-A only, S2 on ToR-B only | Each storage VLAN is mapped to one physical NIC; failover occurs automatically | -**Switched Deployment (One Storage VLAN per ToR):** -- Each host has **dedicated storage NICs** (4 NICs total) -- Storage NIC1 connects to ToR1 → only needs VLAN 711 -- Storage NIC2 connects to ToR2 → only needs VLAN 712 -- This reduces MC-LAG utilization and optimizes RDMA performance +**Storage VLAN Configuration:** +- Storage VLAN A is configured only on ToR-A and mapped to one physical NIC +- Storage VLAN B is configured only on ToR-B and mapped to the other physical NIC +- In failure scenarios (NIC or ToR failure), SMB/RDMA traffic automatically fails over to the remaining path with reduced bandwidth but no functional impact -**Fully Converged Deployment (Both 
Storage VLANs on Both ToRs):** -- Each host has only **2 NICs** shared for all traffic -- SET (Switch Embedded Teaming) handles vNIC-to-pNIC mapping -- SET may route either storage VLAN through either physical NIC -- **Both ToRs must carry both storage VLANs** to support SET's flexibility - -> [!IMPORTANT] -> In Fully Converged deployments, configuring only one storage VLAN per ToR will cause connectivity issues when SET routes a storage vNIC to a physical NIC connected to a ToR that doesn't have that VLAN configured. +> [!NOTE] +> Configuring both storage VLANs on both ToR switches is also supported but optional. Testing has confirmed there is no meaningful resiliency or failover benefit from this configuration, and it increases complexity without improving availability. ### Q: Are **DCB (Data Center Bridging)** features like **PFC** and **ETS** required for RDMA in Azure Local deployments? diff --git a/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md b/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md index 66776d0f..c52d8cd4 100644 --- a/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md +++ b/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md @@ -27,7 +27,7 @@ This document provides a comprehensive reference for implementing a fully conver - [Quality of Service (QoS)](#quality-of-service-qos) - [BGP Routing](#bgp-routing) - [Frequently Asked Questions](#frequently-asked-questions) - - [Q: Why must both Storage VLANs be on both ToR switches in Fully Converged?](#q-why-must-both-storage-vlans-be-on-both-tor-switches-in-fully-converged) + - [Q: How should Storage VLANs be configured in Fully Converged deployments?](#q-how-should-storage-vlans-be-configured-in-fully-converged-deployments) - [Additional Resources](#additional-resources) - [Official Documentation](#official-documentation) - [Technical Deep Dives](#technical-deep-dives) @@ -44,7 +44,7 @@ Azure Local's 
fully converged network design provides a unified approach to hand The fully converged physical network architecture integrates **management**, **compute**, and **storage** traffic over the same physical Ethernet interfaces. This design minimizes hardware footprint while maximizing scalability and deployment simplicity. -**Key Design Principle**: In Fully Converged deployments, **both storage VLANs must be configured on both ToR switches**. This is because each host has only 2 NICs (shared for all traffic), and SET (Switch Embedded Teaming) may route either storage VLAN through either physical NIC based on its load balancing algorithm. +**Key Design Principle**: In Fully Converged deployments, the **recommended** baseline design uses **one storage VLAN per ToR switch**: Storage VLAN A is configured only on ToR-A and mapped to one physical NIC, while Storage VLAN B is configured only on ToR-B and mapped to the other physical NIC. In failure scenarios (NIC or ToR), SMB/RDMA traffic automatically fails over to the remaining path with reduced bandwidth but no functional impact. Configuring both storage VLANs on both ToR switches is also supported but optional. ## Architecture Components @@ -132,14 +132,14 @@ The fully converged design uses VLAN segmentation to isolate different traffic t | Storage 1 | SMB storage over RDMA (first path) | 711 | Tagged VLAN, L2 only (no SVI) | | Storage 2 | SMB storage over RDMA (second path) | 712 | Tagged VLAN, L2 only (no SVI) | -> [!IMPORTANT] -> **Storage VLAN Design Pattern for Fully Converged**: In Fully Converged deployments, **both storage VLANs (711 and 712) must be configured on both ToR switches**. 
This is because: +> [!NOTE] +> **Storage VLAN Design Pattern for Fully Converged**: The **recommended** baseline design uses **one storage VLAN per ToR switch**: > -> - Each host has only **2 NICs** connecting to both ToRs (no dedicated storage NICs) -> - **SET (Switch Embedded Teaming)** handles vNIC-to-pNIC mapping at the host level -> - SET may route either storage VLAN through either physical NIC based on its load balancing algorithm +> - Storage VLAN 711 is configured only on ToR-A and mapped to one physical NIC +> - Storage VLAN 712 is configured only on ToR-B and mapped to the other physical NIC +> - In failure scenarios (NIC or ToR), SMB/RDMA traffic automatically fails over to the remaining path > -> This differs from **Switched** deployments where dedicated storage NICs connect to specific ToRs, allowing one storage VLAN per ToR. +> Configuring both storage VLANs on both ToR switches is also supported but optional. Testing has confirmed no meaningful resiliency benefit from this configuration. ### Top-of-Rack Switch Configuration @@ -168,7 +168,7 @@ This section provides configuration guidance using **Cisco Nexus 93180YC-FX3 (NX - **VLAN 712 (Storage 2)**: Layer 2 only VLAN (no SVI), tagged on trunk ports for RDMA traffic > [!NOTE] -> In Fully Converged deployments, **both storage VLANs must be configured on both ToR switches** because SET handles vNIC-to-pNIC mapping at the host level and may route either storage VLAN through either physical NIC. +> In Fully Converged deployments, the recommended design uses **one storage VLAN per ToR switch**: Storage VLAN 711 on ToR-A only, Storage VLAN 712 on ToR-B only. This simplifies configuration while automatic failover handles NIC or ToR failures. > [!IMPORTANT] > Storage VLANs 711 and 712 should **NOT** be permitted on the ToR-to-ToR peer-link (vPC peer-link, MLAG inter-switch trunk, or any L2 interconnect between ToR switches). 
Storage traffic must flow directly from host to ToR to destination host to maintain optimal RDMA performance. Allowing storage VLANs on peer links can cause performance degradation. @@ -213,7 +213,7 @@ interface Ethernet1/1-3 switchport switchport mode trunk switchport trunk native vlan 7 - switchport trunk allowed vlan 7,201,711,712 + switchport trunk allowed vlan 7,201,711 priority-flow-control mode on send-tlv spanning-tree port type edge trunk mtu 9216 @@ -227,8 +227,6 @@ vlan 7 name Management_7 vlan 201 name Compute_201 -vlan 711 - name Storage_711 vlan 712 name Storage_712 @@ -253,7 +251,7 @@ interface Ethernet1/1-3 switchport switchport mode trunk switchport trunk native vlan 7 - switchport trunk allowed vlan 7,201,711,712 + switchport trunk allowed vlan 7,201,712 priority-flow-control mode on send-tlv spanning-tree port type edge trunk mtu 9216 @@ -262,8 +260,8 @@ interface Ethernet1/1-3 ``` > [!NOTE] -> - Both ToR switches have **identical VLAN configurations** (7, 201, 711, 712) in Fully Converged deployments -> - SET at the host level handles vNIC-to-pNIC mapping to optimize storage traffic paths +> - ToR-A has Storage VLAN 711 only, ToR-B has Storage VLAN 712 only (one storage VLAN per ToR) +> - In failure scenarios, SMB/RDMA traffic automatically fails over to the remaining path > - QoS policies and routing design (e.g., uplinks, BGP/OSPF, default gateway) will be introduced in a separate document @@ -326,7 +324,7 @@ Host4 c$ Administrator Administrator 3.1.1 2 > [!NOTE] > **SMB Multichannel Validation Key Points:** -> - Both storage VLANs (711 and 712) are operational with RDMA enabled +> - Storage VLANs 711 and 712 are operational with RDMA enabled (each mapped to its respective ToR) > - `RdmaConnectionCount = 2` confirms RDMA is being used for storage traffic > - `TcpConnectionCount = 0` shows no fallback to regular TCP > - SMB 3.1.1 dialect is being used for optimal performance @@ -405,26 +403,24 @@ For BGP routing configuration and best practices 
in Azure Local deployments: ## Frequently Asked Questions -### Q: Why must both Storage VLANs be on both ToR switches in Fully Converged? +### Q: How should Storage VLANs be configured in Fully Converged deployments? **A:** -In Fully Converged deployments, **both storage VLANs (711 and 712) must be configured on both ToR switches**. This is required because: +The recommended baseline design uses **one storage VLAN per ToR switch** for Fully Converged deployments: -1. **Only 2 NICs per host**: Each host connects one NIC to ToR1 and one to ToR2 -2. **SET handles traffic routing**: Switch Embedded Teaming maps storage vNICs to physical NICs at the host level -3. **Either VLAN through either NIC**: SET's load balancing may route Storage VLAN 711 or 712 through either physical NIC +- Storage VLAN A (711) is configured only on ToR-A and mapped to one physical NIC +- Storage VLAN B (712) is configured only on ToR-B and mapped to the other physical NIC +- In failure scenarios (NIC or ToR failure), SMB/RDMA traffic automatically fails over to the remaining path with reduced bandwidth but no functional impact -**How it differs from Switched deployment:** +**Storage VLAN Configuration:** | Deployment Pattern | Storage NICs | ToR VLAN Config | Why | |-------------------|--------------|-----------------|-----| -| **Fully Converged** | Shared (2 NICs total) | Both VLANs on both ToRs | SET may route either VLAN through either NIC | -| **Switched** | Dedicated (4 NICs total) | One VLAN per ToR | Each storage NIC connects to a specific ToR | - -**Key Point:** The "one storage VLAN per ToR" optimization applies to **Switched** deployments where dedicated storage NICs connect to specific ToRs. In Fully Converged, SET's flexibility requires both VLANs on both switches. 
+| **Fully Converged** | Shared (2 NICs total) | S1 on ToR-A only, S2 on ToR-B only | One storage VLAN per NIC; failover occurs automatically | +| **Switched** | Dedicated (4 NICs total) | S1 on ToR-A only, S2 on ToR-B only | Each storage NIC connects to a specific ToR | > [!NOTE] -> SET uses vNIC-to-pNIC affinity mapping to optimize traffic paths, but the switches must still be configured to carry both storage VLANs to handle any mapping SET chooses. +> Configuring both storage VLANs on both ToR switches is also supported but optional. Testing has confirmed there is no meaningful resiliency or failover benefit from this configuration, and it increases complexity without improving availability. ## Additional Resources From 8bc945e7135d4ed4017f54a37b4f947d3fb7d30a Mon Sep 17 00:00:00 2001 From: Copilot <198982749+Copilot@users.noreply.github.com> Date: Mon, 16 Mar 2026 08:35:03 -0700 Subject: [PATCH 2/6] [WIP] [WIP] Address feedback from team review on fully converged design update (#262) * Initial plan * Remove VLAN 712 from ToR1 configuration block (one storage VLAN per ToR) Co-authored-by: liunick-msft <105009141+liunick-msft@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: liunick-msft <105009141+liunick-msft@users.noreply.github.com> --- .../Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md b/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md index c52d8cd4..d17c4a1c 100644 --- a/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md +++ b/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md @@ -189,8 +189,6 @@ vlan 201 name Compute_201 vlan 711 name Storage_711 -vlan 712 - name Storage_712 interface Vlan7 description Management From b4310668a319cf8e63be7c44b404df6130cf1abc Mon 
Sep 17 00:00:00 2001 From: Copilot <198982749+Copilot@users.noreply.github.com> Date: Mon, 16 Mar 2026 08:40:37 -0700 Subject: [PATCH 3/6] Unify TOR switch naming to TOR-A/TOR-B in Fully Converged Storage reference doc (#263) * Initial plan * Unify all TOR switch naming to TOR-A and TOR-B in Reference-TOR-Fully-Converged-Storage.md Co-authored-by: liunick-msft <105009141+liunick-msft@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: liunick-msft <105009141+liunick-msft@users.noreply.github.com> --- .../Reference-TOR-Fully-Converged-Storage.md | 42 +++++++++---------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md b/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md index d17c4a1c..d84e382f 100644 --- a/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md +++ b/TSG/Networking/Top-Of-Rack-Switch/Reference-TOR-Fully-Converged-Storage.md @@ -44,7 +44,7 @@ Azure Local's fully converged network design provides a unified approach to hand The fully converged physical network architecture integrates **management**, **compute**, and **storage** traffic over the same physical Ethernet interfaces. This design minimizes hardware footprint while maximizing scalability and deployment simplicity. -**Key Design Principle**: In Fully Converged deployments, the **recommended** baseline design uses **one storage VLAN per ToR switch**: Storage VLAN A is configured only on ToR-A and mapped to one physical NIC, while Storage VLAN B is configured only on ToR-B and mapped to the other physical NIC. In failure scenarios (NIC or ToR), SMB/RDMA traffic automatically fails over to the remaining path with reduced bandwidth but no functional impact. Configuring both storage VLANs on both ToR switches is also supported but optional. 
+**Key Design Principle**: In Fully Converged deployments, the **recommended** baseline design uses **one storage VLAN per ToR switch**: Storage VLAN A is configured only on TOR-A and mapped to one physical NIC, while Storage VLAN B is configured only on TOR-B and mapped to the other physical NIC. In failure scenarios (NIC or ToR), SMB/RDMA traffic automatically fails over to the remaining path with reduced bandwidth but no functional impact. Configuring both storage VLANs on both ToR switches is also supported but optional. ## Architecture Components @@ -82,7 +82,7 @@ This section demonstrates a **fully converged Azure Local deployment** where man ### Design Characteristics - **Fully Converged**: All traffic types (Management, Compute, Storage) utilize the same physical links -- **Redundant Infrastructure**: Each node connects to both ToR1 and ToR2 for high availability +- **Redundant Infrastructure**: Each node connects to both TOR-A and TOR-B for high availability - **Switch Embedded Teaming**: Host-level NIC bonding provides fault tolerance and load balancing - **VLAN Segmentation**: Traffic isolation using IEEE 802.1Q VLAN tagging @@ -103,22 +103,22 @@ The following tables demonstrate physical connectivity between Azure Local nodes | Azure Local Node | Interface | ToR Switch | Interface | |------------------|-----------|------------|-------------| -| **Host1** | NIC A | ToR1 | Ethernet1/1 | -| **Host1** | NIC B | ToR2 | Ethernet1/1 | +| **Host1** | NIC A | TOR-A | Ethernet1/1 | +| **Host1** | NIC B | TOR-B | Ethernet1/1 | #### Host 2 | Azure Local Node | Interface | ToR Switch | Interface | |------------------|-----------|------------|-------------| -| **Host2** | NIC A | ToR1 | Ethernet1/2 | -| **Host2** | NIC B | ToR2 | Ethernet1/2 | +| **Host2** | NIC A | TOR-A | Ethernet1/2 | +| **Host2** | NIC B | TOR-B | Ethernet1/2 | #### Host 3 | Azure Local Node | Interface | ToR Switch | Interface | |------------------|-----------|------------|-------------| -| 
**Host3** | NIC A | ToR1 | Ethernet1/3 | -| **Host3** | NIC B | ToR2 | Ethernet1/3 | +| **Host3** | NIC A | TOR-A | Ethernet1/3 | +| **Host3** | NIC B | TOR-B | Ethernet1/3 | ### VLAN Architecture @@ -135,8 +135,8 @@ The fully converged design uses VLAN segmentation to isolate different traffic t > [!NOTE] > **Storage VLAN Design Pattern for Fully Converged**: The **recommended** baseline design uses **one storage VLAN per ToR switch**: > -> - Storage VLAN 711 is configured only on ToR-A and mapped to one physical NIC -> - Storage VLAN 712 is configured only on ToR-B and mapped to the other physical NIC +> - Storage VLAN 711 is configured only on TOR-A and mapped to one physical NIC +> - Storage VLAN 712 is configured only on TOR-B and mapped to the other physical NIC > - In failure scenarios (NIC or ToR), SMB/RDMA traffic automatically fails over to the remaining path > > Configuring both storage VLANs on both ToR switches is also supported but optional. Testing has confirmed no meaningful resiliency benefit from this configuration. @@ -168,7 +168,7 @@ This section provides configuration guidance using **Cisco Nexus 93180YC-FX3 (NX - **VLAN 712 (Storage 2)**: Layer 2 only VLAN (no SVI), tagged on trunk ports for RDMA traffic > [!NOTE] -> In Fully Converged deployments, the recommended design uses **one storage VLAN per ToR switch**: Storage VLAN 711 on ToR-A only, Storage VLAN 712 on ToR-B only. This simplifies configuration while automatic failover handles NIC or ToR failures. +> In Fully Converged deployments, the recommended design uses **one storage VLAN per ToR switch**: Storage VLAN 711 on TOR-A only, Storage VLAN 712 on TOR-B only. This simplifies configuration while automatic failover handles NIC or ToR failures. > [!IMPORTANT] > Storage VLANs 711 and 712 should **NOT** be permitted on the ToR-to-ToR peer-link (vPC peer-link, MLAG inter-switch trunk, or any L2 interconnect between ToR switches). 
Storage traffic must flow directly from host to ToR to destination host to maintain optimal RDMA performance. Allowing storage VLANs on peer links can cause performance degradation. @@ -181,7 +181,7 @@ This section provides configuration guidance using **Cisco Nexus 93180YC-FX3 (NX ##### Sample NX-OS Configuration -**ToR1 Configuration:** +**TOR-A Configuration:** ```console vlan 7 name Management_7 @@ -219,7 +219,7 @@ interface Ethernet1/1-3 no shutdown ``` -**ToR2 Configuration:** +**TOR-B Configuration:** ```console vlan 7 name Management_7 @@ -258,7 +258,7 @@ interface Ethernet1/1-3 ``` > [!NOTE] -> - ToR-A has Storage VLAN 711 only, ToR-B has Storage VLAN 712 only (one storage VLAN per ToR) +> - TOR-A has Storage VLAN 711 only, TOR-B has Storage VLAN 712 only (one storage VLAN per ToR) > - In failure scenarios, SMB/RDMA traffic automatically fails over to the remaining path > - QoS policies and routing design (e.g., uplinks, BGP/OSPF, default gateway) will be introduced in a separate document @@ -341,7 +341,7 @@ Confirm that storage VLANs 711 and 712 are allowed on the trunk to the host: ```console # Verify VLANs are allowed on the interface trunk -ToR1# show interface ethernet 1/3 trunk +TOR-A# show interface ethernet 1/3 trunk Port Native Status Port Vlan Channel @@ -360,7 +360,7 @@ Check MAC address table entries for storage VLANs. 
The example below shows one p ```console # Check per-VLAN MAC table entries across the ToR -ToR1# show mac address-table vlan 711 +TOR-A# show mac address-table vlan 711 Legend: * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC age - seconds since last seen,+ - primary entry using vPC Peer-Link, @@ -369,7 +369,7 @@ Legend: ---------+-----------------+--------+---------+------+----+------------------ * 711 0015.5dc8.2006 dynamic 0 F F Eth1/3 -ToR1# show mac address-table vlan 712 +TOR-A# show mac address-table vlan 712 Legend: * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC age - seconds since last seen,+ - primary entry using vPC Peer-Link, @@ -406,16 +406,16 @@ For BGP routing configuration and best practices in Azure Local deployments: **A:** The recommended baseline design uses **one storage VLAN per ToR switch** for Fully Converged deployments: -- Storage VLAN A (711) is configured only on ToR-A and mapped to one physical NIC -- Storage VLAN B (712) is configured only on ToR-B and mapped to the other physical NIC +- Storage VLAN A (711) is configured only on TOR-A and mapped to one physical NIC +- Storage VLAN B (712) is configured only on TOR-B and mapped to the other physical NIC - In failure scenarios (NIC or ToR failure), SMB/RDMA traffic automatically fails over to the remaining path with reduced bandwidth but no functional impact **Storage VLAN Configuration:** | Deployment Pattern | Storage NICs | ToR VLAN Config | Why | |-------------------|--------------|-----------------|-----| -| **Fully Converged** | Shared (2 NICs total) | S1 on ToR-A only, S2 on ToR-B only | One storage VLAN per NIC; failover occurs automatically | -| **Switched** | Dedicated (4 NICs total) | S1 on ToR-A only, S2 on ToR-B only | Each storage NIC connects to a specific ToR | +| **Fully Converged** | Shared (2 NICs total) | S1 on TOR-A only, S2 on TOR-B only | One storage VLAN per NIC; failover occurs automatically | +| **Switched** | 
Dedicated (4 NICs total) | S1 on TOR-A only, S2 on TOR-B only | Each storage NIC connects to a specific ToR | > [!NOTE] > Configuring both storage VLANs on both ToR switches is also supported but optional. Testing has confirmed there is no meaningful resiliency or failover benefit from this configuration, and it increases complexity without improving availability. From 5a2968a3c390cdb9361ac47a10340f6accbc5164 Mon Sep 17 00:00:00 2001 From: Nick Liu <105009141+liunick-msft@users.noreply.github.com> Date: Mon, 16 Mar 2026 08:49:55 -0700 Subject: [PATCH 4/6] Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --- .../Overview-Azure-Local-Deployment-Pattern.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md b/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md index 5d884d34..207b78ae 100644 --- a/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md +++ b/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md @@ -86,7 +86,7 @@ A balanced design where all traffic types (management, compute, storage) share t | **Fully Converged** | 2 NICs per host carrying all traffic types (M+C+S) via VLAN segmentation | S1 VLAN on ToR-A only, S2 VLAN on ToR-B only (recommended) | General-purpose deployments balancing performance, simplicity, and hardware efficiency | > [!NOTE] -> **Storage VLAN Configuration**: Storage VLANs can be configured as either **Layer 3 (L3) networks with IP subnets** or **Layer 2 (L2) networks without IP subnets**. **Layer 2 configuration is recommended** because it simplifies VLAN tagging, allowing Azure Local hosts to use any IP addresses without hardcoding subnet configurations on the switch or requiring predefined IP ranges. 
Since Azure Local nodes handle storage traffic tagging, ensure these VLANs are configured as **tagged VLANs on trunk ports** on their respective ToR switches. +> **Storage VLAN Configuration**: Storage VLANs can be configured as either **Layer 3 (L3) networks with IP subnets** or **Layer 2 (L2) networks without IP subnets**. **Layer 2 configuration is recommended** because it simplifies VLAN tagging, allowing Azure Local hosts to use any IP addresses without hardcoding subnet configurations on the switch or requiring predefined IP ranges. For the recommended deployment patterns in this document, storage VLANs must be configured as **tagged VLANs on trunk ports only on their respective ToR switches**, and **must not be tagged across all ToR switches** unless you are intentionally implementing a non-recommended, legacy, or special-case design that explicitly requires global storage VLAN reachability. --- From 10ae26bdb2076a67506e4fb2f9b03fbc78d5bc5b Mon Sep 17 00:00:00 2001 From: Copilot <198982749+Copilot@users.noreply.github.com> Date: Mon, 16 Mar 2026 08:53:20 -0700 Subject: [PATCH 5/6] Initial plan (#264) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> From bf20dc66520664b0842a3fa2a0651a4e479ca5a0 Mon Sep 17 00:00:00 2001 From: Nick Liu <105009141+liunick-msft@users.noreply.github.com> Date: Mon, 16 Mar 2026 08:54:23 -0700 Subject: [PATCH 6/6] Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --- .../Overview-Azure-Local-Deployment-Pattern.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md b/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md index 207b78ae..2b0fed79 100644 --- a/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md +++ b/TSG/Networking/Top-Of-Rack-Switch/Overview-Azure-Local-Deployment-Pattern.md @@ -82,7 +82,7 @@ A 
balanced design where all traffic types (management, compute, storage) share t | Deployment Pattern | Host NIC Configuration | ToR Switch VLAN Configuration | Primary Use Cases | |---------------------|------------------------|-------------------------------|-------------------| | **Switchless** | 2 NICs to switches (M+C traffic) + (Nāˆ’1) direct inter-node NICs (S traffic) | Trunk ports with M, C VLANs only; no storage VLANs on ToRs | Edge deployments, remote sites, cost-sensitive environments | -| **Switched** | 4 NICs per host: 2 for M+C traffic, 2 dedicated for storage | M and C VLANs on both ToRs; S1 VLAN on ToR1 only, S2 VLAN on ToR2 only (dedicated storage NICs) | Enterprise deployments requiring dedicated storage performance and traffic isolation | +| **Switched** | 4 NICs per host: 2 for M+C traffic, 2 dedicated for storage | M and C VLANs on both ToRs; S1 VLAN on ToR-A only, S2 VLAN on ToR-B only (dedicated storage NICs) | Enterprise deployments requiring dedicated storage performance and traffic isolation | | **Fully Converged** | 2 NICs per host carrying all traffic types (M+C+S) via VLAN segmentation | S1 VLAN on ToR-A only, S2 VLAN on ToR-B only (recommended) | General-purpose deployments balancing performance, simplicity, and hardware efficiency | > [!NOTE]