VMware PKS, NSX-T & Kubernetes Networking & Security explained

In continuation of my last post, Getting Started with VMware PKS & NSX-T, which covered getting started with VMware PKS and NSX-T, here is the next one, explaining the behind-the-scenes NSX-T automation that the VMware NSX CNI plugin and PKS perform for Kubernetes.

NSX-T addresses all of the Kubernetes networking needs: load balancing, IPAM, routing and firewalling. It supports complete automation and dynamic provisioning of the network objects required for Kubernetes and its workloads, and that is what I am going to uncover in this blog post.

Other features include support for different topology choices for the pod and node networks (NAT or no-NAT), network security policies for Kubernetes clusters, namespaces and individual services, and network traceability/visibility for Kubernetes using the built-in NSX-T operational tools.

I will cover the PKS deployment procedure later; for now I just want to explain what happens on the NSX-T side when you run "pks create cluster" on the PKS command line, and then when you create Kubernetes namespaces and pods.

pks create cluster

When you run pks create cluster with its arguments, PKS goes to vCenter and deploys the Kubernetes master and worker VMs based on the specification you chose during deployment. On the NSX-T side, a new logical switch is created for these VMs and connected to them (in this example one Kubernetes master and two worker nodes have been deployed). Along with the logical switch, a Tier-1 cluster router is created and connected to your organisation's Tier-0 router.
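For reference, the call looks roughly like this from the PKS CLI (the cluster name, hostname, plan, node count and password below are placeholders for your own values; check pks create-cluster --help for the exact options in your PKS version):

# log in to the PKS API and request a new cluster
pks login -a pks-api.corp.local -u pks-admin -p 'VMware1!' -k
pks create-cluster k8s-cluster-01 --external-hostname k8s-cluster-01.corp.local --plan small --num-nodes 2

# watch provisioning status; the NSX-T objects described below appear while this runs
pks cluster k8s-cluster-01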

K8s Master and Node Logical Switches

13.png

K8s Cluster connectivity towards Tier-0

12.png

Kubernetes Namespaces and NSX-T

If the Kubernetes cluster deploys successfully, it creates three namespaces by default:

  • Default – The default namespace for objects with no other namespace.
  • kube-system – The namespace for objects created by the Kubernetes system.
  • kube-public – The namespace is created automatically and readable by all users (including those not authenticated). This namespace is mostly reserved for cluster usage, in case that some resources should be visible and readable publicly throughout the whole cluster. The public aspect of this namespace is only a convention, not a requirement.

For each of these default namespaces, PKS automatically deploys and configures an NSX-T logical switch, and each logical switch gets its own Tier-1 router connected to the Tier-0 router.
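You can confirm these namespaces from the client VM once kubectl points at the new cluster; a quick check, assuming the example cluster name used earlier and that pks get-credentials has populated your kubeconfig:

# fetch kubeconfig for the cluster and list its namespaces
pks get-credentials k8s-cluster-01
kubectl get namespaces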

14.png

pks-infrastructure Namespace and NSX-T

In the figure above you can clearly see the "default", "kube-public" and "kube-system" logical switches. There is another logical switch, "pks-infrastructure", which is created for the PKS-specific namespace that runs PKS-related components such as the NSX-T CNI; "pks-infrastructure" runs the NSX-T NCP plugin that integrates NSX-T with Kubernetes.

15.png

kube-system Namespace & NSX-T

This namespace typically runs pods such as heapster, kube-dns, kubernetes-dashboard, the monitoring database, the telemetry agent, and add-ons such as ingress controllers if you deploy them.
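A quick way to see the system pods that end up behind those logical ports (the exact list depends on the add-ons your plan deploys):

# list the system pods running in the kube-system namespace
kubectl get pods -n kube-system -o wide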

16.png

On the NSX-T side, as explained earlier, a logical switch is created for this namespace, and for each system pod PKS creates a logical port on NSX-T.

17.png

Default Namespace & NSX-T

This is the cluster's default namespace, which holds the default set of pods, services and deployments used by the cluster. When you deploy a pod without creating or specifying a namespace, the default namespace becomes the container for those pods, and as explained earlier it also has its own NSX-T logical switch with an uplink port to a Tier-1 router.

18.png

So when you deploy a Kubernetes pod without specifying a namespace, the pod becomes part of the default namespace and PKS creates an NSX-T logical port on the default logical switch. Let's create a simple pod:
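A minimal example, using nginx purely for illustration; any pod created without a namespace behaves the same way:

# create a simple nginx deployment (one pod) in the default namespace
kubectl create deployment webserver --image=nginx
kubectl get pods -o wide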

19.png

let’s go back to NSX-T’s  “Default Namespace” logical switch:

20.png

As you can see, a new logical port has been created on the default logical switch.

New Namespace & NSX-T

Kubernetes supports multiple virtual clusters backed by the same physical cluster; these virtual clusters are called namespaces. In simple terms, namespaces are like org VDCs in vCloud Director, and the Kubernetes best practice is to organise pods into namespaces. So what happens in NSX-T when we create a new namespace?

I have created a new namespace called "demo".
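Creating the namespace is a one-liner with kubectl; the deployment name below is just an illustration to show that pods created in "demo" land on its logical switch:

# create the new namespace
kubectl create namespace demo
# any pod created in this namespace attaches to the namespace's new logical switch
kubectl create deployment demo-web --image=nginx -n demo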

23.png

In the images below, the left image shows the default logical switches and the right image shows the logical switches after the new namespace was created; as you can see, a new logical switch has been created for the new namespace.

If you create pods in the default namespace, they all attach to the default logical switch. If you create a namespace (the Kubernetes best practice), a new logical switch is created, any pod deployed into that namespace attaches to that logical switch, and the new logical switch also gets its own Tier-1 router connected to the Tier-0 router.

24.png

Expose PODs to Outer World

In this example we deployed a pod and it has internal network connectivity, but internal-only connectivity does not expose this web server to the outside world; by default Kubernetes does not allow external access. We therefore need to expose the deployment through a load balancer on a public interface and a specific port. Let's do that:
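A hedged example of exposing the illustrative nginx deployment created earlier (adjust the deployment name and port to your own application):

# expose the deployment as a LoadBalancer service;
# NCP programs a virtual server for it on the cluster's NSX-T load balancer
kubectl expose deployment webserver --type=LoadBalancer --port=80 --target-port=80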

26.png

27.png

Let's browse the app using the EXTERNAL-IP; as you may know, the CLUSTER-IP is internal only.
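For example (EXTERNAL-IP here stands for whatever address the service shows in your environment; it is allocated from the floating IP pool):

# the EXTERNAL-IP column shows the VIP allocated on the NSX-T load balancer
kubectl get service webserver
curl http://<EXTERNAL-IP>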

33.png

Kubernetes Cluster and NSX-T Load Balancer

As shown above, when we expose the deployment as a service, NSX-T uses the cluster load balancer that was deployed automatically when the cluster was created; on this load balancer, the NSX CNI adds the pods as pool members under a new load balancer virtual server (VIP).

28.png

If we drill down into the pool members of this VIP, we see the endpoint IPs of our Kubernetes pods.

29.png

Behind the scenes, when you deploy a cluster, a load balancer logical switch and a load balancer Tier-1 router are created with logical connectivity to the load balancer and the Tier-0 router, so that you can reach the deployment externally.

3031

This is what your Tier-0 router looks like: it has connectivity to all the Tier-1 routers, and the Tier-1 routers connect to the namespace logical switches.

32.png

All of these logical switches, Tier-1 routers, the Tier-0 connectivity and the load balancer configuration are created automatically by the NSX-T container (CNI) plugin and PKS. I was really thrilled the first time I tried this; it is simple once you understand the concept.

Kubernetes and Micro-segmentation

The NSX-T container plugin exposes container pods as NSX-T logical switches/ports, and because of this we can easily implement micro-segmentation rules. Once pods are exposed to the NSX ecosystem, we can use the same approach we use for virtual machines to implement micro-segmentation and other security measures.

3435

Alternatively, you can use tag-based security groups to achieve micro-segmentation.

3637
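If you prefer to drive these rules from the Kubernetes side, NCP can also translate standard Kubernetes NetworkPolicy objects into NSX-T distributed firewall rules (check the support matrix for your NCP version). A minimal sketch that only allows pods labelled role=frontend to reach the illustrative demo-web pods in the "demo" namespace on port 80 (all names and labels here are hypothetical):

# apply a simple network policy; NCP renders it as NSX-T DFW rules
cat <<'EOF' | kubectl apply -n demo -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-web
spec:
  podSelector:
    matchLabels:
      app: demo-web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 80
EOF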

 

In this post I have tried to explain what happens behind the scenes in the NSX-T networking stack when you deploy and expose applications on Kubernetes, and how we can bring proven NSX-T based micro-segmentation to them.

Enjoy learning PKS, NSX-T and Kubernetes, one of the best combinations for Day-1 and Day-2 operation of Kubernetes 🙂 and feel free to leave comments and suggestions.

 

 

 


Getting Started with VMware PKS & NSX-T

VMware Pivotal Container Service (PKS) provides a Kubernetes-based container service for deploying and operating modern applications across private and public clouds. It is essentially managed Kubernetes for multiple Kubernetes clusters, aimed at Day-2 operations, and is designed with a focus on high availability, auto-scaling and rolling upgrades.

PKS integrates with VMware NSX-T for advanced container networking, including micro-segmentation, ingress control, load balancing and security policy, and by using VMware Harbor it secures container images through vulnerability scanning, image signing and auditing. A PKS deployment consists of multiple VM instances classified into two categories:

PKS Management Plane –

The PKS management plane consists of the following VMs:

tiles.png

  • PCF Ops Manager

Pivotal Operations Manager (Ops Manager) is a graphical interface for deploying and managing Pivotal BOSH, PKS Control Plane, and VMware Harbor application tiles. Ops Manager also provides a programmatic interface for performing lifecycle management of Ops Manager and application tiles.

  • VMware BOSH Director

Pivotal BOSH is an open-source tool for release engineering for the deployment and lifecycle management of large distributed systems. By using BOSH, developers can version, package, and deploy software in a consistent and reproducible manner.

BOSH is the first component installed by Ops Manager and is a primary PKS tile. BOSH was originally designed to deploy open-source Cloud Foundry. Internally, BOSH has the following components:

  1. Director: the core orchestration engine that controls the provisioning of VMs, the required software, and service lifecycle events.
  2. Blobstore: The Blobstore stores the source forms of releases and the compiled images of releases. An operator uploads a release using the CLI, and the Director inserts the release into the Blobstore. When you deploy a release, BOSH orchestrates the compilation of packages and stores the result in the Blobstore.
  3. Postgres DB: Bosh director uses a postgres database to store information about the desired state of deployment including information about stemcells, releases and deployments. DB is internal to the Director VM.
  • Pivotal Container Service (PKS Control Plane)

The PKS control plane is the self-service API for on-demand deployment and lifecycle management of Kubernetes clusters. The API submits requests to BOSH, which automates the creation, deletion and update of Kubernetes clusters.

  • VMware Harbor

VMware Harbor is an open-source, enterprise-class container registry service that stores and distributes container images in a private, on-premises registry. In addition to providing Role-Based Access Control (RBAC), Lightweight Directory Access Protocol (LDAP), and Active Directory (AD) support, Harbor provides container image vulnerability scanning, policy-based image replication, a notary, and an auditing service.

PKS Data Plane

  • Kubernetes

K8s is an open-source container orchestration framework. Containers package applications and their dependencies in container images. A container image is a distributable artifact that provides portability across multiple environments, streamlining the development and deployment of software. Kubernetes orchestrates these containers to manage and automate resource use, failure handling, availability, configuration, scalability, and desired state of the application.

Integration with NSX-T

VMware NSX-T helps simplify networking and security for Kubernetes by automating the implementation of network policies, network object creation, network isolation, and micro-segmentation. NSX-T also provides flexible network topology choices and end-to-end network visibility.

PKS integrates with VMware NSX-T for production-grade container networking and security. A new capability introduced in NSX-T 2.2 allows you to perform workload SSL termination using Load Balancing services. PKS can leverage this capability to provide better security and workload protection.

The major benefit of using NSX-T with PKS and Kubernetes is automation: dynamic provisioning and association of network objects for unified VM and pod networking. The automation includes the following:

  • On-demand provisioning of routers and logical switches for each Kubernetes cluster
  • Allocation of a unique IP address segment per logical switch
  • Automatic creation of SNAT rules for external connectivity
  • Dynamic assignment of IP addresses from an IPAM IP block for each pod
  • On-demand creation of load balancers and associated virtual servers for HTTPS and HTTP
  • Automatic creation of routers and logical switches per Kubernetes namespace, which can isolate environments for production, development, and test.

PKS Networking

  • PKS Management Network

This network is used to deploy the PKS management components. It can be a dvSwitch port group or an NSX-T logical switch; since my lab uses a no-NAT topology with a virtual switch, I will be using a dvSwitch port group on the 192.168.110.x/24 segment.

  • Kubernetes Node Network

This network is used for the Kubernetes cluster nodes; it is allocated to the master and worker nodes, which run the Node Agent to monitor the liveness of the cluster.

  • Kubernetes Pod Network

This network is used when an application is deployed into a new Kubernetes namespace. A /24 network is taken from the Pods IP Block and allocated to the specific Kubernetes namespace, allowing network isolation and policies to be applied between namespaces. The NSX-T Container Plugin automatically creates the NSX-T logical switch and Tier-1 router for each namespace.

  • Load Balancer and NAT Subnet

This network pool, also known as the Floating IP Pool, provides IP addresses for load balancing and NAT services which are required as a part of an application deployment in Kubernetes.

PKS deployment Network Topologies – Refer Here

PKS Deployment Planning

Before you install PKS on vSphere with NSX-T integration, you must prepare your vSphere and NSX-T environment, ensure that vCenter, the NSX-T components and the ESXi hosts can communicate with each other, and ensure that you have adequate resources.

  • PKS Management VM Sizing

When you size the vSphere resources, consider the compute and storage requirements for each PKS management component.

VM Name           vCPU   Memory (GB)   Storage   No. of VMs
Ops Manager       1      8             160 GB    1
BOSH              2      8             103 GB    1
PKS Control VM    2      8             29 GB     1
Compilation VMs   4      4             10 GB     4
Client VM         1      2             8 GB      1
VMware Harbor     2      8             169 GB    1

Compilation VMs are created when the initial Kubernetes cluster is deployed: software packages are compiled, four additional service VMs are deployed automatically as a single task, and these VMs are deleted once the compilation process completes. To manage and configure PKS, the PKS and Kubernetes CLI command-line utilities are required; these can be installed locally on a workstation referred to as the Client VM.
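Setting up the client VM is just a matter of putting the two CLIs on the PATH; a sketch assuming you have already downloaded the Linux binaries for the PKS and Kubernetes CLIs from Pivotal Network (the file names below are illustrative and vary per release):

# install the pks and kubectl binaries on the client VM
chmod +x pks-linux-amd64 kubectl-linux-amd64
sudo mv pks-linux-amd64 /usr/local/bin/pks
sudo mv kubectl-linux-amd64 /usr/local/bin/kubectl
pks --version
kubectl version --client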

  • Plan your CIDR block

Before you install PKS on vSphere with NSX-T, you should plan the CIDRs and IP blocks you will use in your deployment, as explained above. These are the CIDR blocks that need planning:

  • PKS MANAGEMENT CIDR
  • PKS LB CIDR
  • Pods IP Block
  • Nodes IP Block

The following CIDR blocks cannot be used, for the reasons below:

The Docker daemon on the Kubernetes worker node uses the subnet in the following CIDR range:

  • 172.17.0.1/16
  • 172.18.0.1/16
  • 172.19.0.1/16
  • 172.20.0.1/16
  • 172.21.0.1/16
  • 172.22.0.1/16

If PKS is deployed with Harbor, Harbor uses the following CIDR ranges for its internal Docker bridges:

  • 172.18.0.0/16
  • 172.19.0.0/16
  • 172.20.0.0/16
  • 172.21.0.0/16
  • 172.22.0.0/16

Each Kubernetes cluster uses the following subnet for Kubernetes services, so do not use this IP block for the Nodes IP Block:

  • 10.100.200.0/24

In this blog post series I will deploy a NO-NAT topology and walk you through the step-by-step process of PKS deployment with NSX-T integration.

The next post in this series is VMware PKS, NSX-T & Kubernetes Networking & Security explained, which will help you understand what happens behind the scenes in the networking and security stack when PKS and NSX-T deploy Kubernetes.

 

 

Upgrade NSX-T 2.1 to NSX-T 2.3

I am working on a PKS deployment and will share my deployment procedure soon, but before proceeding with PKS I need to upgrade my NSX-T lab environment to support the latest PKS, as per the compatibility matrix below.

PKS Version   Compatible NSX-T Versions   Compatible Ops Manager Versions
v1.2          v2.2, v2.3                  v2.2.2+, v2.3.1+
v1.1.6        v2.1, v2.2                  v2.1.x, v2.2.x
v1.1.5        v2.1, v2.2                  v2.1.x, v2.2.x
v1.1.4        v2.1                        v2.1.x, v2.2.x
v1.1.3        v2.1                        v2.1.0 – v2.1.6
v1.1.2        v2.1                        v2.1.x, v2.2.x
v1.1.1        v2.1 – Advanced Edition     v2.1.0 – v2.1.6

In this post I will cover the procedure to upgrade NSX-T 2.1 to NSX-T 2.3.

Before proceeding with the upgrade, let's check the health of the current deployment. This is important because if something is not working after the upgrade, we would otherwise not know whether it was working before the upgrade. So let's get into the health and version checks.

Validate Current Version Components Health

First, check the Management Cluster and Controller connectivity and ensure they are up.

Next, validate the host deployment status and connectivity.

34

Check the Edge health

5.png

Let's check the transport node health.

6

Upgrade Procedure

Now download the upgrade bundle.

7.png

Go to NSX Manager and browse to Upgrade

8.png

Upload the downloaded upgrade bundle file in NSX Manager

9.png

Since the upgrade bundle is very large, the upload, extraction and verification take quite a while. Once the package has uploaded, click "BEGIN UPGRADE".

11

The upgrade coordinator will then check the installation for any potential issues. In my environment there is one warning for the Edge that connectivity is degraded; this is because I have disconnected the fourth NIC, which is safe to ignore. When doing this in your environment, assess all warnings and take the necessary actions before proceeding with the upgrade.

12

Clicking Next takes you to the Hosts Upgrade page. Here you can define the order and method of upgrade for each host, and define host groups to control the upgrade order. I have gone with the defaults, serial (one at a time) upgrades rather than parallel, because I have only two hosts in each cluster.

Click START to begin the upgrade; the hosts will be put into maintenance mode, then upgraded and rebooted if necessary. Ensure DRS is enabled and that the VMs on each host can vMotion off the host being placed into maintenance mode. Once a host has been upgraded and its Management Plane Agent has reported back to the Manager, the Upgrade Coordinator moves on to the next host in the group.

13.png

Once the hosts are upgraded, click NEXT to move to the Edge Upgrade page. Edge clusters can be upgraded in parallel if you have multiple edge clusters, but the Edges within an Edge cluster are upgraded serially so that connectivity is maintained. In my lab I have a single Edge cluster with two Edge VMs, so they will be upgraded one at a time. Click START to start the Edge upgrade process.

14

15

Once the Edge cluster has been upgraded successfully, click NEXT to move to the Controller Node Upgrade page. You cannot change the upgrade sequence for controllers; they are upgraded in parallel by default. (In my lab I am running a single controller because of resource constraints, but in production you will see three controllers deployed in a cluster.) Click START to begin the upgrade process.

16.png

Once the controller upgrade has been completed, click NEXT to move to the NSX Manager upgrade page. The NSX Manager will become unavailable for about 5 minutes after you click START and it might take 15 to 20 minutes to upgrade the manager.

17.png

Once the Manager upgrade has completed, review the upgrade cycle.

19.png

You can re-validate the installation as we did at the start of the upgrade, checking that everything shows green and that the component versions have increased.

vCloud Availability Cloud-to-Cloud Design and Deploy Guide

a.png

The vCloud Architecture Toolkit white paper that I have written has now been published on the cloudsolutions.vmware.com website. This design and deploy guide helps cloud providers design and deploy the vCloud Availability Cloud-to-Cloud DR solution. It is based on a real-life example and helps cloud providers successfully plan, design and deploy vCloud Availability Cloud-to-Cloud DR based on version 1.5.

White Paper Download Link

This white paper includes the following chapters to plan your deployment:

  • Introduction
  •  Use Cases
  • vCloud Availability Cloud-to-Cloud DR Components
  • vCloud Availability Cloud-to-Cloud DR Node Types and Sizing
  • vCloud Availability Cloud-to-Cloud DR Deployment Requirements
  • vCloud Availability Cloud-to-Cloud DR Architecture Design
  • Physical Design
  • Certificate
  • Network Communication and Firewalls
  • Deployment
  • Replication Policy
  • Services Management Interface Addresses
  • Log Files
  • Configuration Files
  •  References

I hope this helps you plan, design and deploy vCloud Availability Cloud-to-Cloud DR version 1.5. Please feel free to share feedback to make this white paper more effective and helpful.

What is VMware vCloud Availability Cloud-to-Cloud DR

The VMware vCloud Availability for Cloud-to-Cloud DR solution extends the existing hybrid cloud offerings of VMware Cloud Providers™ on top of VMware vCloud Director with disaster recovery and application continuity between vCloud Director virtual data centers or cloud environments. vCloud Availability Cloud-to-Cloud fills a much-needed gap by providing native disaster recovery between vCloud Director instances. It can help VMware Cloud Providers further monetize existing VMware vCloud Director multi-tenant cloud environments with DR services, including replication and failover capabilities for workloads at both the VM and vApp level.

1.png

Features:

  • Each vCloud Availability Cloud-to-Cloud DR deployment can serve as both a source and a recovery site. There are no dedicated source and destination sites; the same set of appliances acts as source or destination.
  • Replication and recovery of vApps (VMs) between organization Virtual Data Centers (orgVDC) as well as two instances of vCloud Director for migration, DR, and planned migration.
  • it offers complete self-serviceability for the provider and tenant administrator via a unified HTML5 portal that can be used alongside vCloud Director. Replication, migration, and failover can be managed completely by the tenant or provided as a managed service by the provider.
  • Symmetrical replication flow that can be started from either the source or the recovery vCD instance.
  • Built-in encryption or encryption and compression of replication traffic.
  • Enhanced control with white-listing of DR-enabled vCloud Director organizations, enforcement of specific min Recovery Point Objective (RPO) at an organization (org) level, maximum snapshots per org and max replications per tenant.
  • Provides non-disruptive, on-demand disaster recovery testing.
  • Policies that allow service provider administrators to control the following system attributes for one or multiple vCloud Director organizations:
    • Limit the number of replications at the vCloud Director organization level
    • Limit the minimum Recovery Point Objective (RPO)
    • Limit number of retained snapshots per VM replication
    • Limit the total number of VM replications

Use Cases:

Though the most obvious use case for VMware vCloud Availability Cloud-to-Cloud DR is disaster recovery from one cloud availability zone to another cloud availability zone, it can handle a number of different use cases and provide significant capability and flexibility to service providers. For all use cases and situations, VMware vCloud Availability Cloud-to-Cloud DR supports non-disruptive testing of protected cloud workload in network and storage isolated environments. This provides the ability to test disaster recovery, disaster avoidance, or planned migrations as frequently as desired to ensure confidence in the configuration and operation of recovery on cloud. The use cases are as below:

Migration:

A tenant or provider administrator can use Cloud-to-Cloud DR to migrate workloads from one organization VDC to another with minimal disruption from a self-service portal; the end benefit is re-organizing workloads through an easy-to-use workflow.

  • Easy to use workflow mechanism
  • Organize workloads in different orgVDCs
  • Ability to migrate between vCD instances or within the same vCD instance

Disaster Recovery:

A service provider has multiple sites with a vCloud Director based multi-tenant environment, and a customer wants to fail over from one provider cloud site to another. Disaster recovery, or a planned/unplanned failover, is what VMware vCloud Availability Cloud-to-Cloud DR was specifically designed to accomplish for cloud providers. It helps providers and customers achieve:

  • Fastest RTO
  • Recover from unexpected failure
  • Full or partial site recovery

Disaster Avoidance:

Preventive failover is another common use case for VMware vCloud Availability Cloud-to-Cloud DR. This can be anything from an oncoming storm to the threat of power issues.

VMware vCloud Availability Cloud-to-Cloud DR allows for the graceful shutdown of virtual machines at the protected site, full replication of data, and startup of virtual machines and applications at the recovery site, ensuring application consistency and zero data loss. The solution helps providers and customers to:

  • Anticipate outages
  • Preventive failover
  • Graceful shutdown ensuring no data loss

Upgrade and Patch Testing:

The VMware vCloud Availability Cloud-to-Cloud DR test environment provides a perfect location for conducting operating system and application upgrade and patch testing. Test environments are complete copies of production environments configured in an isolated network segment which ensures that testing is as realistic as possible while at the same time not impacting production workloads or replication.

This gives you a basic idea of what vCloud Availability Cloud-to-Cloud DR solves for providers.

 

Features of VMware Cloud on AWS

VMware Cloud on AWS enables operational consistency for customers of all sizes, whether their workloads run on-premises or in the public cloud. Here I will cover some of the features I like most, which will give you an opportunity to understand and explore more.

Automated Cluster Remediation:

Suppose that in our on-premises environment we have an 8-node cluster and one node goes down because of a hardware failure. That is where the struggle starts: getting the required hardware from the vendor and so on. More importantly, we lose one host in our HA cluster, and if the cluster was highly utilised, application VMs may start facing a resource crunch. In my experience this can take at least 3-4 days before the hardware is fixed and the host is back in the cluster.

Now see the power of VMware Cloud on AWS: failed hosts in a VMware SDDC are automatically detected by VMware and replaced with healthy hosts. The process runs as follows:

  • VMware detects the host failure or identifies a problem.
  • A new host is added to the cluster, and data from the problematic host is rebuilt or migrated.
  • The old host is evacuated from the cluster and replaced by the new host.

Scale as per your convenience:

One of the major challenges in traditional data centers is finding the right balance between hardware and workload utilization.

VMware Cloud on AWS enables you to quickly scale up to ensure that you always have enough capacity to run your workloads during volume spikes and quickly scale down to ensure that you are not paying for hardware that is not being used. This feature provides higher availability with lower overall costs.

aws4

You have the option to add and remove clusters as well as hosts, or you can enable the Elastic Distributed Resources Scheduler (EDRS), a policy-based solution that automatically scales a vSphere cluster in VMware Cloud on AWS based on utilization. EDRS monitors CPU, memory, and storage resources for scaling operations; it monitors the vSphere cluster continuously and every 5 minutes runs its algorithm to determine whether a scale-out or scale-in operation is required.

vCenter Hybrid Linked Mode:

Hybrid Linked Mode allows you to link your VMware Cloud on AWS vCenter Server instance with an on-premises vCenter Single Sign-On domain. If you link your cloud vCenter Server to a domain that contains multiple vCenter Server instances linked using Enhanced Linked Mode, all of those instances are linked to your cloud SDDC.

You have two options for configuring Hybrid Linked Mode. You can use only one of these options at a time.

  • You can install the Cloud Gateway Appliance and use it to link from your on-premises data center to your cloud SDDC. In this case, Active Directory groups are mapped from your on-premises environment to the cloud.

  • you can link from your cloud SDDC to your on-premises data center. In this case, you must add Active Directory as an identity source to the cloud vCenter Server.

Using Hybrid Linked Mode, you can:

  • View and manage the inventories of both your on-premises and VMware Cloud on AWS data centers from a single vSphere Client interface, accessed using your on-premises credentials.

  • Migrate workloads between your on-premises data center and cloud SDDC.

  • Share tags and tag categories across vCenter Server instances.

Well Defined Separation of Duty for VMware and Customer Teams:

Amazon, in coordination with VMware, performs the following tasks:

Hardware refresh, failed component replacement, BIOS upgrades and underlying firmware patching are done by AWS based on the VMware compatibility list. This means customers do not have to worry about this tedious exercise, compatibility issues, or dedicated skilled resources.

VMware Experts perform the following maintenance tasks:

  • Backup and restore of VMware appliances and infrastructure such as vCenter, NSX Manager, PSC, etc.
  • Patching of VMware Cloud on AWS components such as vSphere, ESXi drivers, vSAN, NSX, and the SDDC console; this lets customers focus on their application VMs and business and leave virtual infrastructure maintenance to the experts.
  • Providing VMware Tools patches through vSphere, which are then made available to your virtual machines.
  • Host and infrastructure VM monitoring.

The customer's administrators are responsible for the following tasks:

  • Backup and restoration of workload VMs and applications.
  • Patching inside the VMs, such as the guest OS and applications.
  • Upgrading VMware Tools installed on workload VMs.
  • Monitoring of workload VMs and applications.
  • Keeping VM templates and content library files updated so that new VMs are deployed from the latest patched master templates.
  • Managing and monitoring user access, resource utilization, and charges for integrated AWS services, if consumed.

Outages, Scheduled Maintenance, and Health Service Information:

VMware hosts a separate website that displays the current status of VMware Cloud services at https://status.vmware-services.io/, and you can subscribe to updates.

Apart from the VMware Cloud on AWS service, this website also reports on the following services:

  • VMware AppDefense
  • VMware Cost Insight
  • VMware Discovery
  • VMware Kubernetes Engine
  • Log Intelligence
  • VMware Network Insight

NSX Hybrid Connect

NSX Hybrid Connect enables cloud on-boarding without retrofitting the source infrastructure and supports migration from vSphere 5.1 or later to VMware Cloud on AWS without introducing application risk or complex migration assessments. NSX Hybrid Connect includes:

  • vSphere vMotion
  • bulk migration
  • high throughput network extension
  • WAN optimization
  • traffic engineering
  • load balancing
  • automated VPN with strong encryption
  • secured data center interconnectivity with built-in hybrid abstraction and hybrid interconnects.

aws1.png

VMware Site Recovery

VMware Site Recovery for VMware Cloud on AWS is a separately purchased item that communicates with separately licensed VMware Site Recovery Manager and VMware vSphere Replication instances. Recovery can occur from on-premises to AWS or from one AWS SDDC to another AWS SDDC. VMware Site Recovery can protect vCenter Server versions 6.7, 6.5, and 6.0 U3.

aws2.png

Consumption of AWS Native Services with VMware Cloud on AWS

The partnership between VMware and Amazon increases the catalog of solutions readily available to all VMware Cloud on AWS users. Some of the popular AWS solutions are listed below:

  • Simple Storage Service (S3): Highly available, highly durable object storage service.
  • Glacier: Highly durable, high latency archive storage used mostly for backup.
  • EC2: AWS flagship compute platform.
  • VPC: Networking for AWS solutions, both internal and external.
  • CloudWatch: Monitoring for AWS solutions.
  • IAM: Identity and Access Management solution of AWS.
  • AWS Database Services: Wide range of  DB service like: Relational Database Service (RDS), DynamoDB (NoSQL Database Service), RedShift (data warehouse for data from relational databases for analytics)
  • Simple Queue Service (SQS): Fully managed message queues for microservices, distributed systems, and server-less applications.
  • Route 53: (DNS) Domain name provider and services.
  • Elasti-Cache: Managed, in-memory data store services.

Simple and feature-rich Web Interface for Network Services

Customers can easily consume network services with a few clicks; you do not need to be a network expert or have strong command-line experience. With just a few clicks your IPsec VPN, L2 VPN, NAT, Edge firewall rules and public IPs from Amazon are ready to consume.

aws3.png

I have covered a few features of VMware Cloud on AWS. If you want to get your hands dirty, go ahead and log in to http://labs.hol.vmware.com, and if your organisation wants to test the features and ease of consumption, there is a single-host option: by deploying a 1-node SDDC you can test the features and functionality of VMware Cloud on AWS at a fraction of the cost. These 1-node SDDCs are fully self-service, paid for by credit card (or HPP/SPP credits), and deployed in just under two hours.

I hope this helps you understand the features of VMware Cloud on AWS better 🙂

 

 

Getting Started with VMware Cloud on AWS

What is VMware Cloud on AWS ?

VMware Cloud on AWS allows the use of familiar VMware products while leveraging the benefits of a public cloud. A hybrid infrastructure can be created between an on-premises VMware vSphere software-defined data center (SDDC) and a VMware Cloud on AWS SDDC.

VMware Cloud on AWS allows you to create vSphere data centers on Amazon Web Services; these vSphere data centers include vCenter Server for managing your data center, vSAN for storage, and VMware NSX for networking. You can use Hybrid Linked Mode if you want to connect an on-premises data center to your cloud SDDC and manage both from a single vSphere Client interface. Hybrid Linked Mode is like the existing Enhanced Linked Mode, but it additionally supports cross-SSO connections and vMotion.

VMware Cloud on AWS offers the following benefits:

  • It reduces capital and operational expenditures.
  • It reduces time to market for new applications.
  • It helps in enhanced scalability of applications in reduced time frames.
  • It helps in achieving greater availability of applications.
  • Your Application will have reduced recovery time objective (RTO).
  • And most importantly, it helps you reduce staff time spent on maintenance operations.

VMware Cloud Foundation

VMware Cloud Foundation is the unified SDDC platform that bundles vSphere, vSAN, and VMware NSX into a natively integrated stack to deliver enterprise-ready cloud infrastructure for the private and public cloud.

The secret sauce behind Cloud Foundation is VMware SDDC Manager, which manages the initial configuration of the Cloud Foundation system, creates and manages workload domains, and performs lifecycle management to ensure that the software components remain up to date. SDDC Manager also monitors the logical and physical resources of Cloud Foundation.

VMware Cloud on AWS is powered by VMware Cloud Foundation.

So, in a nutshell, VMware Cloud on AWS uses VMware Cloud Foundation and VMware Validated Designs to provide a VMware SDDC and migration solutions on AWS hardware.

All components of this solution are delivered, operated, and supported by VMware Global Support Services. VMware fully certifies and supports all hardware and software components of this service. Customers face challenges managing firmware, patches and upgrades of the underlying infrastructure; with VMware Cloud on AWS, VMware removes the burden of managing software patches, updates and upgrades, as all of this is managed and maintained by VMware itself.

Use Cases

Data Center Extension:

  • Extension of the on-premises data center to the public cloud to expand resource capacity, increase disaster avoidance and recovery options, or localize application instances in new geographic regions. For example, an organisation that is successful in one region and wants to grow its footprint in another region can, instead of arranging data center space and hardware, focus on its core business, order IT infrastructure on VMware Cloud on AWS, and simply clone or migrate application VMs to this localized data center.

Data Center Consolidation:

  • Maintaining a data center is not easy: you have to take care of redundant power, cooling, power backups, people, access management, BMS operations, real estate and so on. So instead of managing the data center yourself, let VMware maintain it: consolidate on-premises data center costs by migrating applications from the on-premises data center to the public cloud to reduce costs, prevent costs from growing, or close data centers entirely.

Data Center Peering:

  • Peering private and public clouds to allow workloads to move between them. For example, moving applications from development or test to production and vice versa, or running CI/CD across private and public clouds.

This gives you a basic understanding of what VMware Cloud on AWS is; in the next few posts I will cover how to install and configure this service.

vCloud Director – Chargeback

There have been frequent asks from VMware-based cloud providers for robust metering capabilities. VMware has launched the new vRealize Operations Manager Tenant App for vCloud Director 2.0, which, in conjunction with vROps, now has built-in chargeback and metering capabilities.

Here I am going to discuss a few of its features with detailed screenshots. Go ahead and try these new features in your environment and build a robust cloud infrastructure with native chargeback.

Creation of pricing policies based on chargeback strategy: with this new release, the provider administrator can create one or more pricing policies based on how they want to charge their consumers. Based on the vCloud Director allocation models, each pricing policy is of type allocation, reservation, or pay-as-you-go (PAYG).

policy01

The new Tenant App for vCloud Director 2.0 provides the following ways to create pricing policies:

  • Base prices for primary resources:

    Pricing policy can be created to charge for primary resources, CPU, memory, storage, and network.

    • CPU & Memory ->

      • Users can be charged based on GHz or vCPU, on an hourly, daily, or monthly basis. policy02 policy03
      • Charge flexibility: users can be charged based on allocation, use, reservation, or an advanced methodology such as taking the maximum of usage and allocation. A fixed cost is also available. policy04 policy06
    • Storage ->

      • You can create various policies based on storage tiers to charge differential pricing; these map to your storage policies. storagepolicy01
        • If you are not using policy-based storage, charge based on a standard rate, as below: storagepolicy02.png
    • Network ->

      • Data transmitted/received (MB) and network transmit/receive rate (MBPS) can be charged. Network01.png
    • Advanced Network ->

      • Pricing configurations: the pricing policy provides the flexibility to configure advanced chargeback for network services, apart from charging for primary network resources. Using advanced network pricing, users can apply variable and fixed charges for the following network services associated with an edge: BGP routing, DHCP, firewall, high availability, IP, IPv6, IPsec, load balancer, L2 VPN, NAT, OSPF routing, static routing, and SSL VPN. Base rates and fixed costs can be applied per Edge Gateway size.

         

    • Guest OS pricing ->

      • The guest OS can be charged uniquely. The charge can be applied based on VM uptime, regardless of uptime, or if the VM is powered on at least once. gos01.png

    • Tag based and vCD metadata-based chargeback mechanism -> 

      •  Differential pricing can be established using tags or vCD metadata. Using vCenter tags or vCD metadata, tag key and key value can be referenced to apply base rate or fixed cost for VMs
  • Apply Policy ->

    • The new Tenant App gives the service provider administrator the flexibility to map the created pricing policies to specific organization VDCs. By doing this, the service provider can holistically define how each of their customers is charged. The following vCloud Director allocation models are supported as part of the chargeback mechanism: reservation pool, pay-as-you-go, and allocation pool. assign.png
  • Exhaustive set of templates – >

    • The service provider administrator can generate reports at various levels for different sets of objects. The following out-of-the-box default templates are available:

  • Detailed Billing for Each Tenant ->

    • Every tenant/customer of the service provider can review their bills using the vCD Tenant App interface. The service provider administrator can generate bills for a tenant by selecting a specific resource and a pricing policy to be applied for a defined period, and tenants can also log in to review their bill details.
    • bill.png

This completes the walkthrough of the features available with the vRealize Operations Manager Tenant App for vCloud Director 2.0. Go ahead, deploy it, and add native chargeback power to your cloud. 🙂

VMware vCloud Availability Installation-Part-10-Fully Automated Deployment

What I have learnt during deployments is that an automated installation and configuration of the vCloud Availability components is simpler, faster, and less error-prone than a manual deployment. So let's deploy it automatically with a few clicks.

For the automated installation of vCloud Availability, we need to create a registry file containing information about the infrastructure and the vCloud Availability components we are about to deploy.

The registry template file is located on the vCloud Availability Installer appliance at /root/.vcav/ and is named .registry.tmpl. The file is self-explanatory about which options you need to change and which you do not.

Open this file with a text editor and save it as "registry". Here is my "registry" file for your convenience, which you can modify for your environment.
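In practice that just means copying the shipped template on the installer appliance and editing the copy (paths as mentioned above):

# copy the template to a file named "registry" and edit it for your environment
cp /root/.vcav/.registry.tmpl /root/.vcav/registry
vi /root/.vcav/registry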

General Options:

Disable all certificate validation and specify the NTP server and SSH_PASSWORD for the entire environment:

1

Cloud Provider Management vCenter Information:

  1. This identifier must remain the same and is used again in other commands; if you change it, make sure you update it in those commands as well.
  2. placement-locator – this parameter defines the cluster on which your vCAV management VMs will be deployed, so specify it correctly.
  3. Make sure you have a network profile/pool created (I created one named "default") and specify the IP information accordingly.

2.png

Cloud Provider Resource (Tenant ) vCenter Information:

This is the tenant vCenter where your tenant VMs reside; in my case I have a single vCenter with a separate cluster. Notice the identifier vsphere vc.0 – you will reference it when deploying components. The other information is as described above.

3.png

vCloud Director Information:

  1. Notice the identifier vcd vcd.0.
  2. In the amqp parameter we specify amqp.1, which means we need to create an identifier called amqp.1 in the next section; since that identifier lives on the Docker host, we first need to create the Docker host.

4.png

Docker Host Information:

  1. Again, notice the identifier docker docker.0.
  2. placement-vsphere vc.mgmt – this is your vc.mgmt identifier, meaning this Docker VM will be deployed on the management vCenter.
  3. placement-address – this is the IP address of the VM.
  4. The other options are self-explanatory.

5.png

Message queue container on Docker Host Information:

  1. Again, ensure the identifier is written and noted properly.
  2. Notice placement-docker – here we specify docker.0, the Docker host identifier we created in the previous step.
  3. user – the user name that vCD will use to talk to the message queue server.
  4. password – the password that vCD will use to talk to the message queue server.

6.png

Cassandra container on Docker Host Information:

  1. Notice the cassandra identifier.
  2. Notice placement-docker – here we specify docker.0, the Docker host identifier created earlier; the Cassandra container will be deployed on that Docker host.
  3. hcs-list – here we specify the vSphere Replication Cloud Service appliance identifier, which will be deployed in a later step.

7.png

vSphere Replication Manager Appliance Information:

  1. Again, make a note of the hms identifier.
  2. This host will be deployed on vc.mgmt.
  3. This VM will have the IP address 192.168.110.161.
  4. This VM will have the hostname hms01.corp.local.
  5. This HMS will be registered with the management vCenter.
  6. This HMS will be registered with the vCloud Director instance specified by the vcd.0 identifier.

8.png

vSphere Replication Cloud Service Appliance Information

  1. Make a note of the hcs identifier.
  2. placement-vsphere is where this appliance will be deployed.
  3. placement-address is the IP address that will be assigned to this VM.
  4. hostname is the name of this VM.
  5. vcd specifies the vCloud Director instance this appliance will be registered to.
  6. Here we specify the number of Cassandra servers.
  7. The message queuing server to register with.

9.png

vSphere Replication Server Appliance Information:

  1. Make a note of the hbr identifier.
  2. placement-vsphere is where this appliance will be deployed.
  3. placement-address is the IP address that will be assigned to this VM.
  4. hostname is the name of this VM.
  5. vsphere specifies the vCenter it will be registered with.
  6. vcd specifies the vCloud Director instance this appliance will be registered to.

10

vCloud Availability Portal Host Information:

  1. Make a note of the ui identifier.
  2. placement-vsphere is where this appliance will be deployed.
  3. placement-address is the IP address that will be assigned to this VM.
  4. hostname is the name of this VM.
  5. vcd specifies the vCloud Director instance this appliance will be registered to.

11

vCloud Availability Administration Portal Host Information:

  1. Make a note of the smp identifier.
  2. placement-vsphere is where this appliance will be deployed.
  3. placement-address is the IP address that will be assigned to this VM.
  4. hostname is the name of this VM.
  5. vcd specifies the vCloud Director instance this appliance will be registered to.
  6. The mongodb-database property value is optional; the default value is vcav-smp, but you can use a custom one.
  7. The mongodb-user property value is optional. The default value is vcav-smp.
  8. The AMQP server specified by the amqp.1 identifier will be used.
  9. This appliance will be registered with the tenant UI deployed in the previous step under the ui.1 identifier.

12

Save the file, ensure it has no extension, and copy it to the /root/.vcav/ directory on the vCAV installer appliance. Then run the command below to validate your registry file; if the output looks like the following, your registry file has been created correctly.

1

If you have configured the registry file correctly and all goes well, after around 20-30 minutes the appliance returns "OK", which means we have successfully deployed vCloud Availability.

2

Deployment of vCAV is simpler and less time-consuming using the automated approach; the only effort you need to put in is creating a proper registry file.

You can run a single task by running the vcav next command. The vCloud Availability Installer appliance detects the first task that is not completed and runs it. You can indicate which task you want to run by adding the --task=Task-Number argument.
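For example (a sketch assuming the installer CLI is invoked as vcav; the task number is purely illustrative):

# run the next incomplete task from the registry
vcav next
# or run one specific task by number
vcav next --task=5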

then follow my existing post number 9 

VMware vCloud Availability Installation-Part-9-Tenant On-Boarding

for tenant on-boarding. This completes the installation of vCAV; now you can work with your customers on a DRaaS demo.

Here is my registry file for your reference.

 

 

 

 

VMware vCloud Availability Installation-Part-9-Tenant On-Boarding

Let's deploy the familiar vSphere Replication appliance. Before we do, ensure that the tenant/customer has vSphere and the vSphere Web Client installed. If vSphere is installed properly, then in the vSphere Web Client select the vCenter Server instance on which you are deploying vSphere Replication, click Manage > Settings > Advanced Settings, and verify that the VirtualCenter.FQDN value is set to a fully qualified domain name.

Let's on-board the tenant. Download the vSphere Replication appliance ISO, mount it, and choose the three required files during OVF deployment from the vSphere Client. We have deployed OVFs many times before, so I am not covering the entire process in detail; here are the screenshots of the installation.

123456

There are two configurations; I am choosing the minimum with 2 vCPUs. For your environment, choose based on the production recommendations.

78

Enter the IP address and other details, and ensure that this IP address is reachable from the cloud (you can use NAT, etc.).

910

Register your vSphere Replication appliance with vCenter SSO, restart the services after registering, and ensure the services are up and running.

11

Pair Sites

Log in to vCenter and click "Site Recovery", which takes you to the screen below; on this screen click "Configure".

12

Configure opens a new window; click "NEW SITE PAIR".

pair1

The first site must be your current vCenter; for "Second Site", choose "Cloud Provider".

Cloud Provider Address – Enter the IP address or URL like (vcd.provider.com) of the vCD without /Cloud.

Enter Organization name which is configured on the cloud and your org cloud credentials and click Next.

pair2

If you do not have any connectivity issues, you should see a certificate warning. Accept it by clicking "CONNECT".

pair3

Select your VDC and click Next.

pair4

Configure the network mappings for your VMs in the provider environment. The best part is that you can select two networks: one for testing DR and another for actual DR. (How many cloud providers have this option?)

pair5

Configuring and enabling replication tasks.

pair6pair7

This completes tenant on-boarding; the tenant can now choose which VMs they want to protect to the cloud.

 

VMware vCloud Availability Installation-Part-8-Integration and DRaaS portal access

As we have completed the deployment of all the individual components of vCloud Availability, we now need to configure them to talk to and register with each other to support DRaaS.

1- Configure the vSphere Replication Manager

Configure vSphere Replication Manager with vCD using Below command.

hms01

Run below command to check if the HMS service started successfully.

hms02

2 – Configure Cassandra

First import  vSphere Replication Cloud Service host certificates to Cassandra host.

cassa02

Next, register the Cassandra hosts with the lookup service. Run the command below to register; you should see a success message.

cassa04

3- Configure vSphere Replication Cloud Service

Next, configure the vSphere Replication Cloud Service VM. Use the command below to register the vSphere Replication Cloud Service appliance with vCD, the resource vCenter Server, and the RabbitMQ server.

hms02

Run below command to check the status of service. it should return “OK” if service started successfully.

hms04

4- Configure vSphere Replication Server

this step is to attach vSphere Replication Server to vSphere Replication Manager and vCenter Server.

hbr01

5- Configure vCloud Availability portal host

Use the command below to configure the vCloud Availability Portal host; if it returns "OK", the vCloud Availability Portal host has been configured successfully.

UI01

6- Configure vCloud Availability Administration Portal

This portal runs a small MongoDB instance. We must configure the vCloud Availability Administration Portal host with the vCloud Director server and its embedded MongoDB server; only then will the services start.

UI02

7 – Assign vSphere Replication Cloud Service Rights to the vCD Org Admin Role

Before we enable a VDC for replication, we must assign the vSphere Replication Cloud Service rights to the vCD org administrator role.

SSH to VCAV appliance and run below command

3

Note the --org parameter: I have set it to "*", which means the administrators of all organisations will get the vSphere Replication Cloud Service rights. If you want to enable it for a particular organisation only, put the organisation name instead of "*".

8 – Enable org VDC for Replication

This step enables a particular VDC for replication. Run the command below to get the list of organisations; the output shows that we have four organisations.

4

With the next command, let's find out which VDC belongs to organisation "T1" so we know where to enable DRaaS; you can check the same thing in the GUI as well.

5

This is the actual step that enables replication for organisation "T1" with VDC "T1-VDC". If everything goes right, we should see "OK", which means our VDC is ready to use DRaaS.

6

This completes the configuration. Let's log in to the DR tenant portal using the tenant portal URL; use the tenant credentials for which this service has been enabled.

ui01-01

ui01

This is the service provider portal, where you can check which orgs have been configured for DRaaS; here you use your administration credentials.

ui02UI022

This completes the service provider side of the configuration; in the next post we will configure the client side and see how to enable replication from the customer data center.

 

 

VMware vCloud Availability Installation-Part-7-Create vCloud Availability Tenant and Administration Portal

The vCloud Availability Portal provides a graphic user interface to facilitate the management of vCloud Availability operations.

The vCloud Availability Portal back end (PBE) scales horizontally. You can deploy a new vCloud Availability Portal instance on demand connected to the same load balancer that all the vCloud Availability Portal instances are under. The load balancer must support sticky sessions, so that the same PBE instance processes user requests within a session. This setting ensures that all the information displayed in the vCloud Availability Portal is consistent.

vCloud Availability Portal Sizing

Deployment type and sizing:

  • Small – Appliance size is 2 CPUs, 2 GB of memory, 10 GB of disk space, and 512 MB of Java Virtual Memory. Suitable for hosting up to 150 concurrent sessions.
  • Medium – Appliance size is 2 CPUs, 4 GB of memory, 10 GB of disk space, and 1.5 GB of Java Virtual Memory. Suitable for hosting up to 400 concurrent sessions.
  • Large – Appliance size is 4 CPUs, 6 GB of memory, 10 GB of disk space, and 3 GB of Java Virtual Memory. Suitable for hosting up to 800 concurrent sessions.

Deploy the appliance using below command.

UI01.png

UI02.png

UI03.png

Create a new Variable with below information.

#export UI01_ADDRESS=192.168.110.164

We will use this variable in subsequent commands. Next, configure trust.

UI04.png

vCloud Availability Administration Portal

The vCloud Availability Administration Portal is a graphical user interface that helps service providers monitor and manage their DR environments. It also needs to be deployed using the appliance sizing considerations below.

vCloud Availability Administration Portal Sizing

Deployment type and sizing:

  • Small – Appliance size is 2 CPUs, 2 GB of memory, 10 GB of disk space, and 512 MB of Java Virtual Memory. Suitable for hosting up to 150 concurrent sessions.
  • Medium – Appliance size is 2 CPUs, 4 GB of memory, 10 GB of disk space, and 1.5 GB of Java Virtual Memory. Suitable for hosting up to 400 concurrent sessions.
  • Large – Appliance size is 4 CPUs, 6 GB of memory, 10 GB of disk space, and 3 GB of Java Virtual Memory. Suitable for hosting up to 800 concurrent sessions.

Now let's create the vCloud Availability Administration Portal host by running the following command.

UI021.png

UI022

Create a new variable with the information below.

#export UI02_ADDRESS=192.168.110.165

If the deployment succeeds, the command returns the IP address of the deployed appliance, which indicates that the appliance was deployed successfully.

UI023

Update the truststore file with the vCloud Availability Administration Portal virtual machine credentials using the command below:

#echo 'VMware1!' > ~/.ssh/.truststore

Run the trust-ssh command to trust the certificate of the vCAV FQDN.

UI0204
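For reference, the trust-ssh call has roughly the shape below; the --address flag is an assumption (only the trust-ssh subcommand and the UI02 variable appear in this post), so verify it against the installation guide.

# hedged sketch -- trust the SSH certificate of the newly deployed portal VM
vcav trust-ssh --address=$UI02_ADDRESS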

 

Now, to validate that our deployments are ready for configuration, run the commands below; they must return "OK".

validate01 validation02

"OK" means the components deployed so far are ready for configuration. This successfully completes the installation of all the appliances and components for vCAV. Now we need to integrate these components with each other and with vCD.

VMware vCloud Availability Installation-Part-6-Create vSphere Replication Cloud Service Host & Replication Server

The vSphere Replication Cloud Service is a tenancy-aware replication manager that provides the required API for managing the service and all its components. The vSphere Replication Cloud Service registers itself as a vCloud Director extension and is accessible through the vCloud Director interface.

Let's deploy the vSphere Replication Cloud Service host using the command below.

hcs01.png

hcs02.png

hcs03

Create a new variable with the information below.

#export HCS01_ADDRESS=192.168.110.162

We will use this variable in subsequent commands.

Next, configure vSphere to trust the vSphere Replication certificate using the command below.

hcs04

If the command responds with "OK", we have successfully deployed the vSphere Replication Cloud Service host.

Create vSphere Replication Server

As we know, the vSphere Replication Server handles the replication process for each protected virtual machine; ideally one should be deployed per Replication Manager instance. Run the command below to deploy HBR01.

hbr01 hbr02

If the deployment completes successfully, the command returns the VM IP address as the success message.

hbr03

Next, create a variable with the IP address of the VM deployed above; you can create additional variables if you have deployed multiple replication servers. We will use this variable in further commands. (Variables are case-sensitive.)

#export HBR01_ADDRESS=192.168.110.163

The next step is to have vSphere trust the vSphere Replication certificate using the command below; it should return "OK".

hbr05

This completes the deployment of the vSphere Replication Server appliance for vCAV.

 

 

VMware vCloud Availability Installation-Part-5-Deploy vSphere Replication Manager

The vSphere Replication Manager manages and monitors the replication process from tenant VMs to the cloud provider environment. A vSphere Replication management service runs for each vCenter Server and tracks changes to VMs and to the replication-related infrastructure. These appliances can be scaled horizontally based on the requirement.

In a production environment we must deploy one vSphere Replication Manager for each resource vCenter Server, but in this lab I will deploy it in my management vCenter only, as I don't have two separate vCenters (one for management and another for tenants, called the resource vCenter).

Let's start the deployment. Again, make an SSH connection to the vCAV appliance and run the command below to deploy the Replication Manager.

You do not need to specify the location of the Replication Manager appliance as described in the documentation; the command picks it up automatically from within the appliance.

The appliances are located on the vCAV appliance at /opt/vmware/share/vCAvForVCD/latest

hms04.png

Run the command below to deploy HMS01 from the vCAV appliance.

hms01.png

I am using --debug just to understand what is happening behind the scenes, but you can omit it if you want and monitor the progress in vCenter; it should be deploying a VM named "hms01" with IP "192.168.110.161", as specified in the --vm-address option.
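For reference, the deploy command in the screenshot has roughly the following shape. Only --debug, --vm-address and the hms01 name come from this post; the remaining flag names are assumptions based on the management vSphere variables created in Part-3, so check the installation guide for the exact spelling.

# hedged sketch of the HMS deployment call
vcav hms create --debug \
    --vm-name=hms01 \
    --vm-address=192.168.110.161 \
    --vsphere-address=$MGMT_VSPHERE_ADDRESS \
    --network=$MGMT_VSPHERE_NETWORK \
    --datastore=$MGMT_VSPHERE_DATASTORE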

hms02

Once it succeeds, the result displayed on the appliance is the deployed virtual machine's IP address, which means the virtual machine was deployed successfully.

hms03

Repeat the same process to deploy additional replication managers if you have multiple resource vCenters; ideally you should have one per vCenter.

Next, create a variable with the IP address of the VM deployed above; you can create additional variables if you have deployed multiple HMS appliances. We will use this variable in further commands.

#export HMS01_ADDRESS=192.168.110.161

The next step is to have vSphere trust the vSphere Replication certificate using the command below; it should return "OK".

hms06

This completes the deployment of the vSphere Replication Manager appliance for vCAV, which will help us manage and monitor the replication process from tenant VMs to the service provider environment.

VMware vCloud Availability Installation-Part-4-vCD Configuration and IP Plan

Continuing the vCloud Availability deployment and configuration: so far we have deployed the vCAV appliance and prepared its dependencies. In this post we will configure vCD to be used as the DR site and plan the IP schema for the vCAV appliances to be deployed next.

First, set up a trusted connection between the RabbitMQ host and the vCloud Availability Installer Appliance.

1.png

Register the RabbitMQ host with vCloud Director by running the following command on the vCloud Availability Installer Appliance.

2.png

If the command responds with "OK", the configuration has been applied successfully. You can also verify it in the vCD UI.

3.png

Restart the vCloud Director service after configuring the AMQP settings by using:

#service vmware-vcd restart
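After the restart, you can confirm that the cell has come back up before moving on; the service script and the log path below are the standard ones on a vCloud Director cell, and you can watch cell.log until the cell reports that startup is complete.

#service vmware-vcd status

#tail -f /opt/vmware/vcloud-director/logs/cell.log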

Check vCD Endpoints:

This step verifies that our environment is properly configured for the vCloud Availability installation by checking the vCloud Director endpoints for known problems.

4.png

If everything has been done properly, we should get an "OK" response. This completes the pre-configuration required before installing the vCAV replication/UI virtual machines; but before we get into the appliance installation, we need to plan the IP addresses and DNS names for those appliances.

Here is my IP planning sheet for your reference.

Planning Sheet
Machine Type DNS Name IP Address
vCloud Availability Portal vcav.corp.local 192.168.110.150
Docker Host for Cassandra and RabbitMQ docker01.corp.local 192.168.110.181
HMS hms01.corp.local 192.168.110.161
HCS hcs01.corp.local 192.168.110.162
HBR hbr01.corp.local 192.168.110.163
UI01 ui01.corp.local 192.168.110.164
UI02 ui02.corp.local 192.168.110.165

This completes this post; in the next post we will install the appliances using the table above.

 

VMware vCloud Availability Installation-Part-3-Install Cassandra and RabbitMQ

RabbitMQ

RabbitMQ is an open-source AMQP server that can be used to exchange messages within a vCloud Director environment. In production environments, for high availability and scalability, you can configure the RabbitMQ servers in a cluster.

Cassandra

Cassandra is a free and open-source distributed NoSQL database management system; here it stores the metadata for the replication services. For high availability you must deploy 3 clustered nodes.

Since I don't have enough resources in my lab, I am going to deploy Cassandra and RabbitMQ in a single VM using containers, which is sufficient for our lab deployment.

In Part-1 we deployed the vCAV appliance. Connect to it with SSH and run the command below to start the Docker service on the vCAV host.

#systemctl start docker

Once the command succeeds, check the status as shown below.

docker01
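For reference, the standard systemd commands for this step are shown below; enabling the service is optional but keeps Docker running across appliance reboots.

#systemctl start docker

#systemctl enable docker

#systemctl status docker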

Create Password Files on Your vCloud Availability Installer Appliance

  • # mkdir ~/.ssh – creates a directory called ".ssh".
  • # chmod 0700 ~/.ssh – changes the directory permissions so only the owner can access it.
  • # echo 'VMware1!' > ~/.ssh/.root – creates a file named ".root" holding the appliance root password ("VMware1!").
  • # echo 'VMware1!' > ~/.ssh/.vcd – creates a file named ".vcd" holding the vCD admin password.
  • # echo 'VMware1!' > ~/.ssh/.sso – this file stores the "SSO" password.
  • # echo 'VMware1!' > ~/.ssh/.vsphere.mgmt – this file stores the "vSphere" password.
  • # echo 'VMware1!' > ~/.ssh/.cassandra.root.password – this file stores the Cassandra root password.
  • # find ~/.ssh -type f -name '.*' -print0 | xargs -0 chmod 0600 – restricts the password files to owner read/write only.

docker02

This completes the creation of the password files. Now let's create an IP pool.

Add a Network Protocol Profile

A vSphere network protocol profile contains a pool of IPv4 and IPv6 addresses, an IP subnet, DNS, and an HTTP proxy server. vCenter assigns these resources to vApps, or to virtual machines with vApp functionality, that are connected to port groups associated with the profile. Let's create a network profile that our VMs will use during their deployment.

  1. Go to the data center, click the Configure tab, click Network Protocol Profiles and edit the default profile.
  2. docker03
  3. Associate a port group with the profile, the one to which you want your deployed VMs connected.
  4. docker04
  5. Enter your subnet, gateway and DNS server address; don't forget to enable the pool and specify the IP range. In my case I have assigned 20 IPs starting at .160.
  6. docker05
  7. Specify the DNS domain name and DNS search path.
  8. docker06

This completes the creation of the network IP pool and the settings that the VMs will use while the vCAV component VMs are being deployed.

Deploy a Docker Host

To deploy a Docker host in the vSphere management cluster, run the command below on the vCAV appliance.

docker08

Before running this command, note that certain variables are used in it, so first let's create those variables.

  • $MGMT_VSPHERE_ADDRESS -> export MGMT_VSPHERE_ADDRESS=vcsa-01a.corp.local
  • $MGMT_VSPHERE_USER -> export MGMT_VSPHERE_USER=administrator@vsphere.local
  • $MGMT_VSPHERE_NETWORK -> export MGMT_VSPHERE_NETWORK=VM-RegionA01-vDS-MGMT
  • $MGMT_VSPHERE_LOCATOR -> export MGMT_VSPHERE_LOCATOR=RegionA01/host/RegionA01-MGMT01
  • $MGMT_VSPHERE_DATASTORE -> export MGMT_VSPHERE_DATASTORE=RegionA01-ISCSI01-COMP01

docker07.png

After creating the variables, we run the above vcav docker create command on the vCAV appliance, which successfully creates a Docker host VM in our management cluster.
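For reference, the vcav docker create call has roughly the shape below. Only the vcav docker create subcommand and the exported variables come from this post; the flag names and the VM name/address are assumptions (the address is the one exported as DOCKER01_ADDRESS in the next step), so confirm the exact flags in the vCloud Availability installation guide.

# hedged sketch of the Docker host deployment
vcav docker create \
    --vsphere-address=$MGMT_VSPHERE_ADDRESS \
    --vsphere-user=$MGMT_VSPHERE_USER \
    --network=$MGMT_VSPHERE_NETWORK \
    --vsphere-locator=$MGMT_VSPHERE_LOCATOR \
    --datastore=$MGMT_VSPHERE_DATASTORE \
    --vm-name=docker01 \
    --vm-address=192.168.110.180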

docker10 docker09

Download the RabbitMQ container image on the vCAV appliance using the command below. For this step your vCAV appliance must be able to reach the internet, or, if you have your own registry such as VMware Harbor, you can pull the image from there.

docker11

Download the Cassandra container image on the vCAV appliance using the command below. The same internet or private registry requirement applies here.

docker12

Create three new variables and a password file as below:

  • export AMQP_ADDRESS=192.168.110.180
  • export CASSANDRA_ADDRESS=192.168.110.180
  • export DOCKER01_ADDRESS=192.168.110.180
  • echo 'VMware1!' > ~/.ssh/.amqp

Create RabbitMQ Container

Now let's create the RabbitMQ container using the command below on the vCAV appliance. The command returned "OK", which means the container creation was successful.

docker13

Trust the vCAV connection with RabbitMQ as shown below.

15

Create Cassandra Container

Now let's create the Cassandra container using the command below on the vCAV appliance. Again, the command returned "OK", which means the container creation was successful.

docker14

This post completes the RabbitMQ and Cassandra container deployment; we will configure them in subsequent posts. In the meantime, you can check connectivity to the RabbitMQ and Cassandra servers using telnet against their respective ports, for example:
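(The port numbers below are the services' defaults, 5671 for RabbitMQ over SSL and 9042 for the Cassandra CQL transport; adjust them if you mapped different ports on the Docker host.)

telnet $AMQP_ADDRESS 5671

telnet $CASSANDRA_ADDRESS 9042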

 

 

VMware vCloud Availability Installation-Part-2-Configure SAML Federation

Using the vSphere SSO service as the SAML identity provider for the vCloud Director System organisation can be a more secure alternative to LDAP or a local account. When vCloud Director is federated with vCenter SSO, you can import system administrators from vSphere, and this is required for vCAV to work properly. So let's configure it.

Log in to vCD as a system admin user, navigate to Administration > System Settings > Federation, click Metadata (3) and download the metadata. It will look like this:

2.png

1.png

Then go to vSphere and upload the downloaded vCD metadata.

03.png

Choose the downloaded file by clicking "Import from File" and click "Import". This completes the metadata import from vCD to vSphere.

04

Now we need to download the SSO metadata file and import it into vCD. Log in to vSphere, go to "Configuration" -> SAML Service Providers -> click "Download".

05.png

Go to vCD, log in as administrator, then go to "Administration" -> "Federation" -> tick "Use SAML Identity Provider" -> browse to the file downloaded in the previous step -> click "Upload" and click "Finish".

06.png

07.png

Once the mutual metadata exchange is completed, in vCD go to Administration -> Users -> Import Users; you will see a new source called "SAML".

08.png

Choose SAML, manually enter "administrator@vsphere.local" and click OK.

09

The new user has been added to vCD with the System Administrator role.

10

Log out and log back in with a vSphere SSO credential such as "administrator@vsphere.local" and its password; the login should succeed.

There is one more important setting that we need to apply on the vCD appliances: open /opt/vmware/vcloud-director/etc/global.properties and add extensibility.timeout=60.

11.png
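One way to apply this on the cell is shown below; the property and path come from the step above, and the cell needs a service restart for the change to take effect.

#echo 'extensibility.timeout=60' >> /opt/vmware/vcloud-director/etc/global.properties

#service vmware-vcd restart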

This completes our vCD prerequisite configuration; in the next post I will deploy Cassandra and RabbitMQ.

 

 

 

 

 

 

VMware vCloud Availability Installation-Part1-Deploy Appliance

In the previous post I tried to explain what problem VMware is solving with vCAV; now let's get into installing the components. There are two ways to install vCAV: a fully automated way, or a semi-automated way in which you run a few commands on the vCAV appliance that install and configure the components for you. In the next few posts I will install it the semi-automated way, as this gives me a better understanding of which component gets installed and what it integrates with.

So let's get into installation mode. The first thing we need is a few Linux VMs for:

Cassandra 01 Nodes
RabbitMQ 01 Nodes
Cloud Proxy 02 Nodes

Since this is a demo environment, I am not considering HA for any of the VMs. So first let's create a CentOS VM with all the required prerequisites installed and then turn it into a template, which will save a considerable amount of time. The same approach can be used for your production deployment with a customization specification.

For this demo, I am creating a new Linux VM based on CentOS-7-x86_64-Minimal-1804.iso and installing the OS. Once the OS installation is completed, connect to the VM with SSH and first update yum:

  • #yum update yum – then reboot the guest OS.

Install the packages required by vCD.

packages.png

#yum install alsa-lib bash chkconfig coreutils findutils glibc grep initscripts krb5-libs libgcc libICE libSM libstdc++ libX11 libXau libXdmcp libXext libXi libXt libXtst module-init-tools net-tools pciutils procps redhat-lsb sed tar wget which

I would suggest installing NTP to keep the VM clock in sync:

  • #yum install ntp

Configure the NTP servers: using vi, change the lines beginning with "server" to point to your NTP servers. All components connecting to vCD should share the same NTP servers for accurate timekeeping:

  • #vi /etc/ntp.conf
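If you prefer to script the change, the lines below comment out the default CentOS pool entries and append your own sources; the ntp1/ntp2 hostnames are placeholders for your environment.

#sed -i 's/^server /#server /' /etc/ntp.conf

#echo 'server ntp1.corp.local iburst' >> /etc/ntp.conf

#echo 'server ntp2.corp.local iburst' >> /etc/ntp.conf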

Start the ntpd service

  • #systemctl start ntpd
  • #systemctl enable ntpd

Check that ntpd is syncing with the correct NTP servers using #ntpq -p

A lot of features depend on DNS, so I would suggest installing the DNS bind utilities and verifying that the VM can resolve DNS queries.

  • #yum install bind-utils (to install nslookup)
  • #nslookup VMNAME
  • #nslookup VMNAME.DOMAIN.COM
  • #nslookup 192.168.110.1

For this lab environment, I turn off SELinux as well as the firewall; for a production deployment, please choose the appropriate configuration.

Edit the SELinux config with #vi /etc/sysconfig/selinux and change SELINUX=enforcing to SELINUX=disabled.

1.png
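The same change can be made with a one-liner; --follow-symlinks keeps the edit in the real config file (on CentOS 7, /etc/sysconfig/selinux is a symlink to /etc/selinux/config), and setenforce turns SELinux off for the running system so you don't have to wait for the reboot.

#sed -i --follow-symlinks 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/sysconfig/selinux

#setenforce 0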

To disable firewalld, run the following command as root: #systemctl disable firewalld. Then stop firewalld by running: #systemctl stop firewalld
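As a copy-paste block, with a final check that the firewall is really down (it should report "inactive"):

#systemctl disable firewalld

#systemctl stop firewalld

#systemctl is-active firewalld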

Install VMware Tools and reboot the VM so that all the above changes take effect. Now we are done with the OS configuration: shut down the VM, convert it to a template and deploy 7 VMs from this template. While these external VMs are being deployed, we also need to deploy the vCAV appliance. Download the appliance from here and deploy it by following the steps below:

Choose the OVF.

12

Select the appropriate cluster/host location for the deployment.

35

Accept the EULA.

67

Enter the domain name, IP address and other settings as per your requirement.

8

Click Finish to deploy.

9

This completes the template preparation and the deployment of the vCAV appliance.

 

What is VMware vCloud Availability ?

In previous posts I talked about VMware cloud-to-cloud DR, which covers the case where your VMs run in one VMware Cloud Provider data center and you need disaster recovery to another data center of the same provider; that is where vCAV C-to-C helps. But what about customers who have lots of VMs inside their own data center and want to move them to a VMware Cloud Provider data center, or want DR as a Service on a VMware provider data center? This is where VMware vCloud Availability helps.

vCloud Availability is a Disaster Recovery-as-a-Service (DRaaS) and migration solution that provides simple and secure asynchronous replication, failover and failback for vSphere-managed workloads. It supports vSphere 5.5, 6.0, 6.5 and 6.7 environments, but only vSphere 6.x provides replication in both directions.

1.png

Features:-

  • Multi-tenant support for the provider environment.
  • Self-service portal for migration, protection, failover and failback on a per-VM basis.
  • Initial data seeding can be done by shipping disks between the tenant and the provider.
  • RPO from 15 minutes to 24 hours (the latest vSphere versions also support a 5-minute RPO).
  • Built-in support for encryption of replication traffic.
  • Simple to deploy, manage and support compared to stitching together other solutions and dealing with integration and support issues.
  • Supports up to 24 previous restore points.
  • Efficient, robust and proven, as it uses the existing vSphere Replication engine.

Architecture & Components:-

The vCAV architecture spans the service provider environment, which is considered the replication target, and the customer/tenant environment, which deploys vSphere Replication to move the data to the service provider and is considered the source. In the service provider environment, multiple components operate together to support replication, secure communication, and storage of the replicated data. Each service provider can support recovery for multiple customer environments and can scale to handle increasing loads for each tenant, and for multiple tenants.

The cloud/service provider environment runs the components listed below, which together support replication and secure communication with compute and storage. The provider must be running a production-grade vCD environment to successfully run vCloud Availability. I am not listing the vCD components here, as I assume vCD is already deployed and configured successfully; I will cover only the vCAV components.

vCAv Components and Their Use

  • Cloud Proxy – Creates a public listening TCP port to which vCloud Tunneling Agents connect and communicate using secure web sockets. Multiple instances can be deployed behind a load balancer to support scaling.
  • vCloud Tunneling Agent – Communicates with the Cloud Proxy and is responsible for orchestrating tunnel creation for both to-the-cloud and from-the-cloud tunnels.
  • Cassandra – An open-source distributed database that stores metadata related to the configured replications.
  • RabbitMQ – An open-source message queue service. All requests to the Cloud Service are routed through the messaging service.
  • vSphere Replication Cloud Service – REST API entry point for replication operations; interacts with vCloud Director for multi-tenancy and placement.
  • vSphere Replication Manager – Existing component of vSphere Replication; manages and monitors replication.
  • vSphere Replication Server – Existing component of vSphere Replication; handles replication of each protected VM.
  • vCloud Availability Portal – Tenant UI portal to manage replications in the cloud.

3.png

Above is the vCAV diagram in conjunction with vCD and its components for your reference; that covers the basic information about vCAV. Below is the list of service provider components that need to be deployed for a successful DRaaS offering from your cloud.

vCAv Components No of Nodes to be Installed
Cassandra 03 Nodes
RabbitMQ 02 Nodes
Cloud Proxy 02 Nodes
vSphere Replication Cloud Services 02 Nodes
vCloud Availability Administration Portal 01 Nodes
vSphere Replication Manager 01 Node
vSphere Replication Server 02 Nodes
vCloud Availability Portal 01 Node

In the next few days I will be installing this in my lab based on the table above and posting the steps here as a series of posts. In the meantime, please refer to the installation and configuration documentation on the VMware website and get ready for the installation and configuration.

 

 

Installing and Configuring VMware vCloud Availability for Cloud-to-Cloud DR – Part2

In the previous post we successfully installed vCAV-C2C on Site 1, and following the same steps I have installed it on Site 2 as well. The next step is to establish trust between the vCloud Availability vApp Replication Service/Manager instances in these two different sites. We can initiate the site pairing from either of the sites. In this post I will cover the pairing between the sites and a walkthrough of the administrator GUI, so let's start.

In your browser, log in to https://Appliance-IP-Address:8046.

1

Go to Sites, click New Site, and in the Sites administration window enter the Site B vCloud Availability vApp Replication Service/Manager URL, enter the Site B appliance password and click OK.

2.png

Accept the certificate; if everything was entered correctly, we will see a "Pairing Successful" message :).

3.png

To verify the trust, let's go to Sites > Show all sites. The new site is displayed in the Sites administration window.

4.png

We can verify it again by going to "Diagnostics" -> Health -> Site Status.

5.png

This completes the pairing; here is the Site 2 view.

6

Login as Service Provider (SP admin GUI Walkthrough)

Cloud provider system administrators can log in to the vCAV-C2C portal to view information about DR workloads from the vCloud Director instances, monitor service health status, and work as tenant organisation administrators.

Let's log in to https://Appliance_IP_Address:8443 as administrator@system or adminuser@system.

7    9

After logging in to the vCloud Availability Portal as a service provider, you can review the Home dashboard. It contains read-only information that presents a site health summary.

A rolling bar on top of the dashboard view gives the overall health of the solution, and we can pause/resume this rotation.

10

The Cloud Topology pane shows the incoming and outgoing vApp replications for both the source and destination sites.

11.png

The vApp Workload Status window lists the current health of vApp replications.

12.png

The VM Workload Information window shows the health of incoming and outgoing VM replications for the site you are currently logged in to.

13.png

The Provider vDC System Resource Utilization window tracks how much storage, memory, and CPU is used from the allocated system resources.

14.png

The Top 3 Orgs window shows the overall vCloud Director storage used by those organizations, and the replication states of their vApps and VMs.

15.png

On the Orgs page we can track the number of vCD orgs, their total storage usage, total pre-allocated CPU, total pre-allocated memory, and other information.

16.png

The DR Workloads page lists the workloads protected on the source Site 1 and the workloads protected to the destination Site 2, respectively.

17.png

On the Administration page you can add a service component for monitoring, and on the Configuration tab you can edit the vDC resource usage thresholds for CPU, memory and storage. Optionally, you can enable tenant vDC monitoring.

1819

This post covered the pairing and a walkthrough of the admin interface; in the next post I will cover the configuration of vApp/VM DR and a walkthrough of the tenant portal.