In an NSX for vSphere environment, the management plane provides the GUI and the REST API entry point for managing the NSX environment.
The control plane is a three-node controller cluster running the control plane protocols required to capture the system configuration and push it down to the data plane. The data plane consists of the VIB kernel modules installed in the hypervisor during host preparation.
NSX Controller stores the following types of tables:
- VTEP table – keeps track of which virtual network (VNI) is present on which VTEP/hypervisor.
- MAC table – keeps track of VM MAC to VTEP IP mappings.
- ARP table – keeps track of VM IP to VM MAC mappings.
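To make these mappings concrete, here is a minimal sketch in Python of how the three tables chain together: the VTEP table scopes a VNI to hypervisors, the MAC table maps a VM MAC to the VTEP behind it, and the ARP table maps a VM IP to its MAC. The table layouts and all addresses are invented for illustration; none of these names are NSX APIs.

```python
# Toy model of the three NSX Controller tables (illustrative only;
# the table names mirror the text, the data is made up).
vtep_table = {5001: {"192.168.10.1", "192.168.10.2"}}       # VNI -> VTEP IPs hosting it
mac_table = {(5001, "00:50:56:aa:bb:01"): "192.168.10.1"}   # (VNI, VM MAC) -> VTEP IP
arp_table = {(5001, "10.0.0.5"): "00:50:56:aa:bb:01"}       # (VNI, VM IP) -> VM MAC

def locate_vm(vni, vm_ip):
    """Resolve a VM IP on a VNI down to the VTEP that hosts it."""
    mac = arp_table.get((vni, vm_ip))       # ARP table: IP -> MAC
    if mac is None:
        return None
    return mac, mac_table.get((vni, mac))   # MAC table: MAC -> VTEP

print(locate_vm(5001, "10.0.0.5"))  # ('00:50:56:aa:bb:01', '192.168.10.1')
```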
The controllers maintain routing information by distributing the routing data learned from the control VM to the routing kernel module in each ESXi host. The use of the controller cluster eliminates the need for multicast support in the physical network infrastructure: customers no longer have to provision multicast group IP addresses, and they no longer need to enable PIM routing or IGMP snooping on physical switches or routers. Logical switches must be configured in unicast mode to take advantage of this feature.
NSX Controllers support an ARP suppression mechanism that reduces the need to flood ARP broadcast requests across the L2 network domain where virtual machines are connected. This is achieved by converting the ARP broadcasts into Controller lookups. If the controller lookup fails, then normal flooding will be used.
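The lookup-with-fallback behaviour can be sketched as follows. This is a simplification in Python; `controller_arp_lookup`, `resolve_arp`, and `flood` are hypothetical stand-ins for illustration, not NSX functions.

```python
# Simplified model of ARP suppression: the host converts an ARP broadcast
# into a controller lookup and only floods when the lookup fails.
controller_arp_table = {("vni-5001", "10.0.0.5"): "00:50:56:aa:bb:01"}

def controller_arp_lookup(vni, target_ip):
    return controller_arp_table.get((vni, target_ip))

def resolve_arp(vni, target_ip, flood):
    mac = controller_arp_lookup(vni, target_ip)
    if mac is not None:
        return mac                  # answered locally, broadcast suppressed
    return flood(vni, target_ip)    # cache miss: fall back to normal flooding

# A flood stand-in that just records it was invoked.
flooded = []
def flood(vni, ip):
    flooded.append((vni, ip))
    return None

resolve_arp("vni-5001", "10.0.0.5", flood)  # known IP: suppressed
resolve_arp("vni-5001", "10.0.0.9", flood)  # unknown IP: flooded
```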
The ESXi host, with the NSX Virtual Switch, intercepts the following types of virtual machine traffic and queries the NSX Controller instance to retrieve the correct response to those requests:
- Broadcast Ethernet frames
- Unicast Ethernet frames
- Multicast Ethernet frames
Each controller node is assigned a set of roles that define the tasks it can implement. By default, each controller is assigned all the following roles:
- API Provider: Handles HTTP requests from NSX Manager
- Persistence Server: Persistently stores network state information
- Logical Manager: Computes policies and network topology
- Switch Manager: Manages the hypervisors and pushes configuration to the hosts
- Directory Server: Manages VXLAN and distributed logical routing information
One of the controller nodes is elected as a leader for each role. For example, controller 1 might be elected as the leader for the API Provider and Logical Manager roles, controller 2 as the leader for the Persistence Server and Directory Server roles, and controller 3 as the leader for the Switch Manager role.
The leader for each role is responsible for allocating tasks to the individual nodes in the cluster. This mechanism is called slicing, and it improves the scalability of the NSX architecture by ensuring that all controller nodes can be active at any given time.
The leader of each role maintains a sharding db table to keep track of the workload. The sharding db table is calculated by the leader and replicated to every controller node. It is used by both VXLAN and the distributed logical router (DLR). The sharding db table may be recalculated when cluster membership or role mastership changes, or periodically to rebalance the workload.
In case of the failure of a controller node, the slices for a given role that were owned by the failed node are reassigned to the remaining members of the cluster. Node failure triggers a new leader election for the roles originally led by the failed node.
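As a rough illustration, slicing can be modelled as the role leader dealing slices of the workload out round-robin across the active nodes, and recomputing the table over the survivors when a node fails. This is a sketch only; the real sharding algorithm is internal to NSX.

```python
# Toy slicing model: a role's workload is cut into slices and dealt
# round-robin across the active controller nodes; on node failure the
# table is recalculated over the remaining members (real NSX differs).
def compute_sharding_table(slices, nodes):
    nodes = sorted(nodes)
    return {s: nodes[i % len(nodes)] for i, s in enumerate(slices)}

slices = [f"vni-{5000 + i}" for i in range(6)]
table = compute_sharding_table(slices, ["ctrl-1", "ctrl-2", "ctrl-3"])
# all three nodes are active: each owns a share of the slices

# ctrl-2 fails: its slices are redistributed among the remaining members
table_after = compute_sharding_table(slices, ["ctrl-1", "ctrl-3"])
assert "ctrl-2" not in table_after.values()
```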
Control Plane Interaction
- ESXi hosts and NSX logical router virtual machines learn network information and send it to NSX Controller through the User World Agent (UWA).
- The NSX Controller CLI provides a consistent interface to verify VXLAN and logical routing network state information.
- NSX Manager also provides APIs to retrieve data from the NSX Controller nodes programmatically.
Controller Internal Communication
The management plane communicates with the controller cluster over TCP/443. It also communicates directly with the vsfwd agent in the ESXi host over TCP/5671, using RabbitMQ, to push down firewall configuration changes.
The controllers communicate with the netcpa agent running in the ESXi host over TCP/1234 to propagate L2 and L3 changes. Netcpa then internally propagates these changes to the respective routing and VXLAN kernel modules in the ESXi host. Netcpa also acts as a middleman between the vsfwd agent and the ESXi kernel modules.
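For quick reference, the channels described above can be captured in a small lookup structure. The port numbers come from the text; the helper itself is purely illustrative.

```python
# Control/management plane channels as described in the text.
CHANNELS = {
    ("NSX Manager", "Controller cluster"): ("TCP", 443),
    ("NSX Manager", "vsfwd"): ("TCP", 5671),   # RabbitMQ
    ("Controllers", "netcpa"): ("TCP", 1234),  # L2/L3 changes
}

def port_for(src, dst):
    """Format the protocol/port used between two components."""
    proto, port = CHANNELS[(src, dst)]
    return f"{proto}/{port}"

print(port_for("Controllers", "netcpa"))  # TCP/1234
```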
NSX Manager chooses a single controller node to start a REST API call. Once the connection is established, the NSX Manager transmits the host certificate thumbprint, VNI and logical interface information to the NSX Controller Cluster.
All the data transmitted by NSX Manager can be found in the file config-by-vsm.xml in the directory /etc/vmware/netcpa on the ESXi host. The log file /var/log/netcpa.log can be helpful when troubleshooting the communication path between NSX Manager, vsfwd, and netcpa.
Netcpa randomly chooses a controller to establish the initial connection, called the core session. This core session is used to transmit the controller sharding table to the hosts, so they are aware of who is responsible for a particular VNI or routing instance.
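Conceptually, once a host has received the sharding table over the core session, finding the owner of a VNI or routing instance is a simple lookup. The sketch below is hypothetical (invented names and data), not actual netcpa behaviour.

```python
# Sketch: a host consulting the replicated sharding table it received
# over the core session to find the controller responsible for an instance.
sharding_table = {
    "vni-5001": "ctrl-1",
    "vni-5002": "ctrl-2",
    "dlr-instance-1": "ctrl-3",
}

def controller_for(instance):
    """Return the controller node that owns this VNI or DLR instance."""
    return sharding_table[instance]

print(controller_for("vni-5002"))  # ctrl-2
```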
Hope this helps you in understanding NSX Controllers. Happy Learning 🙂