Logical Diagram of VMware vSAN Stretched Cluster
Physical Diagram of VMware vSAN Stretched Cluster
Last week I deployed a test environment of a VMware vSAN Stretched Cluster running on Dell EMC VxRail Appliances. In this post we will describe how to set up a VMware vSAN Stretched Cluster on the Dell EMC VxRail Appliance. The figure above is the high-level physical system diagram. Across Sites A and B there are six VxRail Appliances and two 10Gb network switches, which are interconnected by two 10Gb links; each VxRail Appliance has one 10Gb uplink to each network switch. At Site C there are one vSAN Witness host and one 10Gb network switch. The configuration details of each piece of hardware in this environment are as follows.
Site A (Preferred Site)
3 x VxRail E460 Appliance
Each node includes 1 x SSD, 3 x SAS HDDs, and 2 x 10Gb SFP+ ports
1 x 10Gb network switch
Site B (Secondary Site)
3 x VxRail E460 Appliance
Each node includes 1 x SSD, 3 x SAS HDDs, and 2 x 10Gb SFP+ ports
1 x 10Gb network switch
Site C (Remote Site)
1 x vSAN witness host with 2 x 10Gb SFP+ ports
1 x 10Gb network switch
Top-of-Rack Switch A/B/C
Mgmt 192.168.10.x/24, VLAN 100 (Native)
vSAN 192.168.11.x/24, VLAN 110
vMotion 192.168.12.x/24, VLAN 120
VM Network 192.168.13.x/24, VLAN 130
VMware and EMC Software
VxRail Manager 4.5.070
VMware vCenter 6.5 U1 Appliance
VMware ESXi 6.5 U1
VMware vSAN 6.6 Enterprise
VMware vSAN Witness Host 6.5
NOTE: For production network planning, please check the details of the "VxRail Planning Guide for Virtual SAN Stretched Cluster". The network design above is intended only for a test environment.
Click to access h15275-vxrail-planning-guide-virtual-san-stretched-cluster.pdf
NOTE: As a best practice, the iDRAC port on each VxRail node should be connected to a separate 1Gb network switch.
When all the physical network connections and requirements are ready at each site, we can start building the VxRail cluster. Before the VxRail installation, please note that we need to predefine the required network IP addresses and VLANs on each 10Gb network switch, i.e. ESXi Management, vMotion, vSAN, and VM Network. For the details of the VxRail network design, please check the "Dell EMC VxRail Network Guide".
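As a quick sanity check before configuring the switches, the IP and VLAN plan above can be validated programmatically. The sketch below (illustrative only, using Python's standard `ipaddress` module; the plan values are the test-environment ones from this post) confirms the subnets do not overlap and the VLAN IDs are unique and within the 802.1Q range:

```python
import ipaddress

# Network plan from this test environment; adjust for your own deployment.
plan = {
    "Management": ("192.168.10.0/24", 100),
    "vSAN":       ("192.168.11.0/24", 110),
    "vMotion":    ("192.168.12.0/24", 120),
    "VM Network": ("192.168.13.0/24", 130),
}

def validate_plan(plan):
    """Check that subnets do not overlap and VLAN IDs are unique and valid."""
    nets = {name: ipaddress.ip_network(cidr) for name, (cidr, _) in plan.items()}
    names = list(nets)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if nets[a].overlaps(nets[b]):
                raise ValueError(f"Subnets for {a} and {b} overlap")
    vlans = [vlan for _, vlan in plan.values()]
    if len(set(vlans)) != len(vlans):
        raise ValueError("Duplicate VLAN IDs in plan")
    if any(not 1 <= v <= 4094 for v in vlans):
        raise ValueError("VLAN ID outside the 802.1Q range (1-4094)")
    return True

validate_plan(plan)  # raises ValueError if the plan is inconsistent
```

This only checks the plan on paper; the VLANs still have to be trunked on every switch port facing the VxRail nodes.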
Click to access h15300-vxrail-network-guide.pdf
Browse to the VxRail Manager IP address and start the initial configuration.
For a vSAN stretched cluster deployment on VxRail, we need to use an external vCenter Server.
Once validation of all the requirements passes, we can build the VxRail cluster.
After the VxRail cluster is built successfully, we can access VxRail Manager.
After the VxRail cluster is built, we log in to vCenter with the vSphere Web Client. We can see different network port groups on the vDS for each node. The following table shows how VxRail traffic is separated across the E Series 10GbE NICs:
Next we need to prepare the vSAN Witness host at Site C. It is deployed from an OVA file, which can be downloaded from the VMware website.
The following is the VxRail Release Compatibility Table. In this test environment, we will set up a 3+3+1 vSAN stretched cluster.
When all nodes and the witness host are ready, we can start configuring the vSAN Stretched Cluster. First we configure the two fault domains, the preferred site and the secondary site; each fault domain includes three VxRail nodes. Then we select the host at Site C as the witness host. Finally we select the cache disk (SSD) and the capacity disks, and the vSAN Stretched Cluster builds automatically.
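The layout rules behind these steps can be summarized in a small illustrative check (host names here are made up for the example, not taken from the lab): two symmetric fault domains, with the witness host kept outside both.

```python
# Illustrative validation of the stretched cluster layout described above:
# two equal-size fault domains (preferred and secondary sites) plus a
# witness host at a third site that belongs to neither domain.

def check_stretched_layout(preferred, secondary, witness):
    """Validate an N+N+1 vSAN stretched cluster layout."""
    if witness in preferred or witness in secondary:
        raise ValueError("Witness host must not belong to either fault domain")
    if set(preferred) & set(secondary):
        raise ValueError("A host cannot be in both fault domains")
    if len(preferred) != len(secondary):
        raise ValueError("Fault domains should be symmetric (N+N+1)")
    if not preferred:
        raise ValueError("Each fault domain needs at least one host")
    return f"{len(preferred)}+{len(secondary)}+1"

layout = check_stretched_layout(
    preferred=["esxi-a1", "esxi-a2", "esxi-a3"],   # Site A (preferred), hypothetical names
    secondary=["esxi-b1", "esxi-b2", "esxi-b3"],   # Site B (secondary)
    witness="witness-c1",                          # Site C
)
print(layout)  # -> 3+3+1
```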
When the vSAN Stretched Cluster is ready on the VxRail cluster, we can see six hosts running in one ESXi cluster, with the witness host running at the other data center. Next we create the vSAN storage policy to protect the virtual machines. Since this vSAN Stretched Cluster is 3+3+1, it can only support RAID-1 mirroring protection. RAID-5/6 (erasure coding) requires more VxRail nodes at the preferred and secondary sites; please reference the VxRail Release Compatibility Table above for details.
The VM storage policies for RAID-1 (PFTT=1, SFTT=1, Failure tolerance method=RAID-1):
Primary level of failures to tolerate = 1 (data is replicated across the two site fault domains)
Secondary level of failures to tolerate = 1 (specifies the local protection within each site)
Failure tolerance method = RAID-1 (Mirroring) (specifies RAID-1 protection at both sites)
The VM storage policies for RAID-5/6 (PFTT=1, SFTT=2, Failure tolerance method=RAID-5/6):
Primary level of failures to tolerate = 1 (data is replicated across the two site fault domains)
Secondary level of failures to tolerate = 2 (specifies the local protection within each site)
Failure tolerance method = RAID-5/6 (specifies erasure coding)
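It is worth working out what these policies cost in raw capacity. The per-site overheads are the standard vSAN numbers (RAID-1 FTT=1 stores two copies, RAID-5 3+1 stores 4/3, RAID-6 4+2 stores 3/2), and PFTT=1 replicates the whole object to both sites, doubling the total. A minimal sketch of that arithmetic:

```python
# Raw-capacity multipliers implied by the storage policies above.
PER_SITE_OVERHEAD = {
    ("RAID-1", 1): 2.0,       # 2 full copies per site
    ("RAID-5", 1): 4.0 / 3,   # 3 data + 1 parity
    ("RAID-6", 2): 3.0 / 2,   # 4 data + 2 parity
}

def raw_capacity(vm_gb, method, sftt, pftt=1):
    """Raw vSAN capacity consumed by vm_gb of VM data under a policy."""
    site_copies = pftt + 1  # PFTT=1 -> the data exists at both sites
    return vm_gb * PER_SITE_OVERHEAD[(method, sftt)] * site_copies

print(raw_capacity(100, "RAID-1", 1))  # 100 GB VM -> 400.0 GB raw (4x)
print(raw_capacity(100, "RAID-6", 2))  # 100 GB VM -> 300.0 GB raw (3x)
```

So even though RAID-6 needs more hosts per site, it consumes less raw capacity than site-local mirroring for the same VM data.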
Once you have created the VM storage policies, you can apply them to the virtual machines.
For more information on VMware vSAN and the Dell EMC VxRail Appliance, you can reference my previous posts.
Create VM Storage Policies on vSAN 6.6 Stretched Cluster
vSAN Management Pack for vRealize Operations
VMware vSAN 6.6 Stretched Cluster
VxRail 4.0 – Scale Out
Dell EMC RecoverPoint for Virtual Machines 5.1 – Protect VM
EMC RecoverPoint For VMs 5.0 Deployment
EMC vRPVM Cluster configuration
Victor Wu
Chief Architect, Blogger, Author at Dell EMC Knowledge Sharing & Packt
The latest update:
PFTT=1 and SFTT=2 with erasure coding requires a 6+6+1 configuration, not 5+5+1 (RAID-6 rules), as posted below.
Click to access h15275-vxrail-planning-guide-virtual-san-stretched-cluster.pdf
Hi, thanks a lot for sharing this great post!
Have you checked the latest VxRail Stretched Cluster Planning Guide? I see this note:
“Starting with VxRail 4.5.200, either a VxRail or a Customer Supplied vCenter Server can be used for stretched clusters. Prior to VxRail 4.5.200, only a Customer Supplied vCenter can be used for stretched clusters. (Note: An RPQ is required for using the VxRail vCenter.)
Customer Supplied vCenter Server Appliance is the recommended choice.”
Since the hard requirement for a customer-supplied vCenter has been removed, you now have the option to use the VxRail-supplied vCenter with a stretched cluster, meaning no customer-supplied vCenter license or external physical server is required. This can be useful when customers are replacing all their infrastructure with new VxRail nodes in a stretched cluster topology.
Any possibility to test this new design option in your lab environment?
Thanks a lot for your time.
Best Regards
Hi Jose,
Click to access h15275-vxrail-planning-guide-virtual-san-stretched-cluster.pdf
I have checked this planning guide (released in Sep 2018). Please note one important thing on page 6: "the Customer Supplied vCenter can NOT be hosted on and manage the VxRail Cluster that is also in its own Stretched Cluster". The Customer Supplied vCenter (VCSA) can be hosted on another VxRail cluster outside of the vSAN stretched cluster.
Based on this scenario, I cannot test it at the moment. But I will try to test this scenario soon.
Regards
Victor Wu
Hi Victor. Do you know if it's still a requirement to have the customer-supplied vCenter hosted outside of the VxRail cluster in a stretched cluster configuration with VxRail 4.7?
Cheers
Suka
According to the guide below (updated in Jan 2019), starting with VxRail 4.5.200, either a VxRail vCenter Server or a Customer Supplied vCenter Server can be used for stretched clusters.
DELL EMC VxRAIL™ vSAN STRETCHED CLUSTERS PLANNING GUIDE
Click to access h15275-vxrail-planning-guide-virtual-san-stretched-cluster.pdf
I was looking at the same guide; however, it doesn't mention whether your customer-supplied vCenter can be hosted inside the VxRail cluster in a stretched cluster setup.
I suggest hosting the vCenter Server outside of the VxRail cluster. In my stretched cluster projects, I follow this configuration.
Thanks Victor. I’ll keep my eye on this page for an update.
We currently have a VxRail stretched cluster with an external vCenter deployed; however, we are looking into bringing vCenter into the VxRail cluster.
Back when we deployed VxRail 4.5, the "DELL EMC VxRAIL™ VCENTER SERVER PLANNING GUIDE" mentioned that a customer-supplied vCenter cannot be hosted in the VxRail cluster.
With VxRail 4.7, the updated guide contains the notes below:
- Prior to VxRail 4.5.200, the customer-supplied vCenter can NOT be hosted on the VxRail cluster it is managing.
- Note: Starting with VxRail 4.5.200, you can deploy a customer-supplied vCenter Server on an existing VxRail cluster, even the one it is managing. You must still provide a vCenter Server license.
- With stretched clusters, if an Inter-Switch Link (ISL) failure occurs, all virtual machines that are not on the same site as the vCenter will be powered off. Thus, special attention is needed when planning to deploy an internal vCenter.
The last note is the confusing part, as it points out an issue if vCenter is hosted on a VxRail stretched cluster.
What do you think?
Hello Victor,
It is really good that you share this with us. I have one question, if you could help me get an answer.
I have a 3+3+1 setup and we are planning to build a VxRail stretched cluster. I was able to install the witness appliance.
We are using a Cisco ACI stretched fabric between the two sites. Now a question related to the primary and secondary sites:
How will I discover the secondary site nodes? Will they be discovered from the primary site and added into the cluster?
Or should I build a 3-node cluster at the primary site first and then later add the secondary site as an expansion?
I am asking because I am facing some challenges: the secondary site nodes cannot be discovered from the primary side. Both sites can be discovered and seen from their own respective sites, but they cannot discover each other. Do we need to do something here? Any idea or suggestion will be highly appreciated.
VxRail: P570F, 4 x 10Gb
Version: 7.0.214
ESXi 7
Switch configuration: all ports connected to the VxRail nodes are trunks
Management VLAN: 724
VM Network VLAN: 724
Discovery VLAN: 3939
I suggest you complete the deployment and configuration of the stretched cluster (3+3+1) at the primary site first. Then put the secondary nodes into maintenance mode and shut them down. You can then relocate these nodes to the secondary site, power them on, and exit maintenance mode.