NFS DataStore on VMware Cloud on AWS using Amazon FSx for NetApp

Amazon FSx for NetApp ONTAP integration with VMware Cloud on AWS is an AWS-managed external NFS datastore built on NetApp’s ONTAP file system that can be attached to a cluster in your SDDC. It provides customers with flexible, high-performance virtualized storage infrastructure that scales independently of compute resources.

PROCESS

  • Make sure the SDDC has been deployed on VMware Cloud on AWS at version 1.20 or later.
  • The SDDC is added to an SDDC Group. When you create the SDDC Group, a VMware Managed Transit Gateway (vTGW) is automatically deployed and configured.
  • A Multi-AZ file system powered by Amazon FSx for NetApp ONTAP is deployed across two AWS Availability Zones (AZs). (You can also deploy in a single AZ, but this is not recommended for production.)

DEPLOY VMWARE MANAGED TRANSIT GATEWAY

To use FSx for ONTAP as an external datastore, an SDDC must be a member of an SDDC group so that it can use the group’s vTGW. To configure this, you must be logged in to the VMC Console as a user with a VMC service role of Administrator. Then follow the steps below:

  • Log in to the VMC Console and, on the Inventory page, click SDDC Groups.
  • On the SDDC Groups tab, click ACTIONS and select Create SDDC Group
  • Give the group a Name and optional Description, then click NEXT
  • On the Membership grid, select the SDDCs to include as group members. The grid displays a list of all SDDCs in your organization. To qualify for membership in the group, an SDDC must meet several criteria:
    • It must be at SDDC version 1.11 or later. Members of a multi-region group must be at SDDC version 1.15 or later.
    • Its management network CIDR block cannot overlap the management CIDR block of any other group member.
    • It cannot be a member of another SDDC Group.
    When you have finished selecting members, click NEXT. You can edit the group later to add or remove members.
  • Acknowledge that you understand and take responsibility for the costs you incur when you create an SDDC group, then click CREATE GROUP to create the SDDC Group and its VMware Transit Connect network.
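
The non-overlap rule for management CIDR blocks is easy to check before you open the console. The following is a minimal sketch using only Python's standard `ipaddress` module; the SDDC names and CIDR values are hypothetical and only illustrate the check, they are not taken from any real organization.

```python
import ipaddress
from itertools import combinations


def find_overlapping_management_cidrs(sddc_cidrs):
    """Return pairs of SDDC names whose management CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in sddc_cidrs.items()}
    overlaps = []
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical SDDC names and management CIDRs, for illustration only.
candidates = {
    "sddc-a": "10.2.0.0/16",
    "sddc-b": "10.3.0.0/16",
    "sddc-c": "10.2.128.0/20",
}

for first, second in find_overlapping_management_cidrs(candidates):
    print(f"{first} and {second} overlap and cannot join the same SDDC Group")
```

An empty result means the candidate SDDCs pass this one criterion; the version and single-group criteria still have to be checked separately.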

ATTACH VPC TO VMWARE MANAGED TRANSIT GATEWAY

After the SDDC Group is created, it shows up in your list of SDDC Groups. Select the SDDC Group, go to the External VPC tab, and click the ADD ACCOUNT button. Provide the AWS account that will be used to provision the FSx file system, and then click Add.

Now go back to the AWS console and sign in to the same AWS account where you will create the Amazon FSx file system. Navigate to the Resource Access Manager service page and click the Accept resource share button.

Next, attach the VMC Transit Gateway to the FSx VPC as follows:

ATTACH VMWARE MANAGED TRANSIT GATEWAY TO VPC

  • Open the Amazon VPC console and navigate to Transit Gateway Attachments.
  • Choose Create transit gateway attachment
  • For Name tag, optionally enter a name for the transit gateway attachment.
  • For Transit gateway ID, choose the transit gateway for the attachment. Make sure you choose the transit gateway that was shared with you.
  • For Attachment type, choose VPC.
  • For VPC ID, choose the VPC to attach to the transit gateway. This VPC must have at least one subnet associated with it.
  • For Subnet IDs, select one subnet for each Availability Zone to be used by the transit gateway to route traffic. You must select at least one subnet. You can select only one subnet per Availability Zone.
  • Choose Create transit gateway attachment.
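
If you prefer to script this step, the subnet rules above (at least one subnet, only one per Availability Zone) can be validated before any API call is made. The sketch below only builds and checks the request parameters; the key names mirror the EC2 CreateTransitGatewayVpcAttachment API, while the function name and all IDs are hypothetical. Passing the result to an SDK call such as boto3's `create_transit_gateway_vpc_attachment(**request)` is an assumption about how you would use it, not something shown in this post.

```python
def build_tgw_attachment_request(transit_gateway_id, vpc_id, subnets, name=None):
    """Build request parameters for a transit gateway VPC attachment.

    subnets is a list of (subnet_id, availability_zone) tuples.
    """
    if not subnets:
        raise ValueError("select at least one subnet")

    chosen_per_az = {}
    for subnet_id, az in subnets:
        if az in chosen_per_az:
            raise ValueError(
                f"{subnet_id} and {chosen_per_az[az]} are both in {az}; "
                "only one subnet per Availability Zone is allowed"
            )
        chosen_per_az[az] = subnet_id

    request = {
        "TransitGatewayId": transit_gateway_id,
        "VpcId": vpc_id,
        "SubnetIds": [subnet_id for subnet_id, _ in subnets],
    }
    if name:
        # The Name tag is optional, as in the console.
        request["TagSpecifications"] = [
            {
                "ResourceType": "transit-gateway-attachment",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ]
    return request


# Hypothetical IDs, for illustration only.
request = build_tgw_attachment_request(
    "tgw-0123456789abcdef0",
    "vpc-0123456789abcdef0",
    [("subnet-aaa", "us-west-2a"), ("subnet-bbb", "us-west-2b")],
    name="fsx-vpc-attachment",
)
print(request["SubnetIds"])
```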

Accept the Transit Gateway attachment as follows:

  • Navigate back to the SDDC Group, open the External VPC tab, select the AWS account ID used for creating your FSx for NetApp ONTAP file system, and click Accept. This process takes some time.
  • Next, add the routes so that the SDDC can see the FSx file system. This is done on the same External VPC tab, where you will find a table listing the VPC. In that table, click the Add Routes button. In the Add Route section, add the CIDR of the VPC where the FSx file system will be deployed.

In the AWS console, create the route back to the SDDC by locating the VPC on the VPC service page and navigating to its Route Table.

Also ensure that the security group has inbound rules that allow traffic from the SDDC Group CIDR. In this case I am using the entire SDDC CIDR. In addition to this security group, the ENI security group also needs the NFS port ranges added as inbound and outbound rules to allow communication between VMware Cloud on AWS and the FSx service.

Deploy FSx for NetApp ONTAP file system in your AWS account

The next step is to create an FSx for NetApp ONTAP file system in your AWS account. To connect FSx to the VMware Cloud on AWS SDDC, we have two options:

  • Create a new Amazon VPC under the same connected AWS account and connect it using VMware Transit Connect.
  • Or create a new AWS account in the same Region, along with a VPC, and connect it using VMware Transit Connect.

In this blog, I am deploying in the same connected VPC. To deploy, go to the Amazon FSx service page, click Create File System and, on the Select file system type page, select Amazon FSx for NetApp ONTAP.

On the next page, select the Standard create method and enter the required details, such as:

  • Select Deployment type (Multi-AZ) and Storage capacity
  • Select correct VPC, Security group and Subnet

After the file system is created, check the NFS IP address under the Storage virtual machines tab. The NFS IP address is the floating IP that is used to manage access between file system nodes. We will use this IP when configuring VMware Transit Connect to allow access to the volume from the SDDC.

We are now done creating the FSx for NetApp ONTAP file system.

MOUNT NFS EXTERNAL STORAGE TO SDDC CLUSTER

Now it’s time for you to go back to the VMware Cloud on AWS console and open the Storage tab of your SDDC. Click ATTACH DATASTORE and fill in the required values.

  • Select a cluster. Cluster-1 is preselected if there are no other clusters.
  • Choose Attach a new datastore
  • Enter the NFS IP address shown in the Endpoints section of the FSx Storage Virtual Machine tab. Click VALIDATE to validate the address and retrieve the list of mount points (NFS exports) from the server.
  • Pick one from the list of mount points exported by the server at the NFS server address. Each mount point must be added as a separate datastore.
  • For the storage vendor, select AWS FSx ONTAP.
  • Give the datastore a name. Datastore names must be unique within an SDDC.
  • Click ATTACH DATASTORE.

VMware Cloud on AWS supports external storage starting with SDDC version 1.20. To request an upgrade to an existing SDDC, please contact VMware support or notify your Customer Success Manager.

Cross-Cloud Disaster Recovery with VMware Cloud on AWS and Azure VMware Solution

Disaster Recovery is an important aspect of any cloud deployment. It is always possible that an entire cloud data center or region of a cloud provider goes down. This has already happened to most cloud providers, including Amazon AWS, Microsoft Azure and Google Cloud, and will surely happen again in the future. These providers will readily suggest that you have a Disaster Recovery and Business Continuity strategy that spans multiple regions, so that if a single geographic region goes down, business can continue to operate from another region. This sounds good in theory, but there are several issues with using another region of the same cloud provider. Here are some of the key reasons why I think a single cloud provider’s cross-region DR will not be that effective:

  • A single cloud region failure might cause huge capacity issues in the other regions used for DR.
  • Cloud regions are not fully independent. For example, AWS RDS allows read replicas in other regions, but one wrong entry will get replicated across those read replicas, which breaks the notion that “cloud regions are independent”.
  • Data is better protected from accidental deletion when stored across clouds. For example, if malicious code, an employee, or a cloud provider’s employee runs a script that deletes all the data, in most cases this will not impact the copy in the other cloud.

In this blog post we will see how VMware cross cloud disaster recovery solution can help customers/partners to overcome BC/DR challenges.

Deployment Architecture

Here is my deployment architecture and connectivity:

  • One VMware Cloud on AWS SDDC
  • One Azure VMware Solution SDDC
  • Both SDDCs are connected over Megaport MCR

Activate VMware Site Recovery on VMware Cloud on AWS

To configure Site Recovery on the VMware Cloud on AWS SDDC, go to the SDDC page, click the Add Ons tab and, under the Site Recovery add-on, click the ACTIVATE button.

In the pop-up window, click ACTIVATE again.

This deploys SRM on the SDDC. Wait for the deployment to finish.

Deploy VMware Site Recovery Manager on Azure VMware Solution

In your Azure VMware Solution private cloud, under Manage, select Add-ons > Disaster recovery and click on “Get Started”

From the Disaster Recovery Solution drop-down, select VMware Site Recovery Manager (SRM) and provide the License key, select agree with terms and conditions, and then select Install

After the SRM appliance installs successfully, you’ll need to install the vSphere Replication appliances. Each replication server accommodates up to 200 protected VMs. Scale in or scale out as per your needs.

Move the vSphere server slider to indicate the number of replication servers you want based on the number of VMs to be protected. Then select Install
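
As a rough sizing aid for the slider, the 200-VMs-per-server figure above turns into a simple ceiling division. This is a minimal sketch; the function name is hypothetical, and the assumption that at least one replication server is always deployed is mine rather than something stated by the product.

```python
import math

VMS_PER_REPLICATION_SERVER = 200  # capacity per replication server, from the text above


def replication_servers_needed(protected_vms):
    """Estimate how many vSphere Replication servers a VM count requires."""
    if protected_vms < 0:
        raise ValueError("protected_vms cannot be negative")
    # Assumption: keep at least one server even for very small environments.
    return max(1, math.ceil(protected_vms / VMS_PER_REPLICATION_SERVER))


for vm_count in (150, 200, 201, 450):
    print(vm_count, "VMs ->", replication_servers_needed(vm_count), "replication server(s)")
```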

Once installed, verify that both SRM and the vSphere Replication appliances are installed. After installing VMware SRM and vSphere Replication, you need to complete the configuration and site pairing in vCenter Server.

  1. Sign in to vCenter Server as cloudadmin@vsphere.local.
  2. Navigate to Site Recovery, check the status of both vSphere Replication and VMware SRM, and then select OPEN Site Recovery to launch the client.

Configure site pairing in vCenter Server

Before starting the site pairing, make sure the firewall rules between VMware Cloud on AWS and Azure VMware Solution have been opened as described here and here.

To start pairing, select NEW SITE PAIR in the Site Recovery (SR) client in the new tab that opens.

Enter the remote site details, select FIND VCENTER SERVER INSTANCES, then select the remote vCenter and click NEXT. At this point, the client should discover the VRM and SRM appliances on both sides as services to pair.

Select the appliances to pair and then select NEXT.

Review the settings and then select FINISH. If successful, the client displays another panel for the pairing. However, if unsuccessful, an alarm will be reported.

After you’ve created the site pairing, you can view the site pairs and other related details, and you are ready to plan for Disaster Recovery.

Planning

Mappings allow you to specify how Site Recovery Manager maps virtual machine resources on the protected site to resources on the recovery site. You can configure site-wide mappings to map objects in the vCenter Server inventory on the protected site to corresponding objects in the vCenter Server inventory on the recovery site.

  • Network Mapping
  • IP Customization
  • Folder Mapping
  • Resource Mapping
  • Storage Policy Mapping
  • Placeholder Datastores

Creating Protection Groups

A protection group is a collection of virtual machines that Site Recovery Manager protects together. Protection groups are configured per SDDC and need to be created on each SDDC if VMs are replicated bi-directionally.

Recovery Plan

A recovery plan is like an automated run book. It controls every step of the recovery process, including the order in which Site Recovery Manager powers on and powers off virtual machines, the network addresses that recovered virtual machines use, and so on. Recovery plans are flexible and customizable.

A recovery plan runs a series of steps that must be performed in a specific order for a given workflow such as a planned migration or re-protection. You cannot change the order or purpose of the steps, but you can insert your own steps that display messages and run commands.

A recovery plan includes one or more protection groups. Conversely, you can include a protection group in more than one recovery plan. For example, you can create one recovery plan to handle a planned migration of services from the protected site to the recovery site for the whole SDDC and another set of plans per individual departments. Thus, having multiple recovery plans referencing one protection group allows you to decide how to perform recovery.
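
The many-to-many relationship between recovery plans and protection groups can be pictured with a small data model. This is only an illustrative sketch of the relationship described above, not the Site Recovery Manager API; all class, group, plan and VM names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectionGroup:
    name: str
    vms: list = field(default_factory=list)


@dataclass
class RecoveryPlan:
    name: str
    protection_groups: list = field(default_factory=list)

    def vms_in_scope(self):
        """Return the VMs covered by this plan, in order and without duplicates."""
        seen = []
        for group in self.protection_groups:
            for vm in group.vms:
                if vm not in seen:
                    seen.append(vm)
        return seen


def plans_referencing(group, plans):
    """Return the names of all plans that include the given protection group."""
    return [plan.name for plan in plans if group in plan.protection_groups]


# Hypothetical groups and plans: one plan for the whole SDDC, one per department.
finance = ProtectionGroup("finance", ["fin-db01", "fin-app01"])
hr = ProtectionGroup("hr", ["hr-app01"])
whole_sddc = RecoveryPlan("whole-sddc", [finance, hr])
finance_only = RecoveryPlan("finance-only", [finance])

print(plans_referencing(finance, [whole_sddc, finance_only]))
print(whole_sddc.vms_in_scope())
```

Here the finance group is referenced by both plans, so you can choose at recovery time whether to fail over everything or only that department.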

Steps to add a VM for replication:

There are multiple ways to do this. Here is one:

  • Right-click the VM, select All Site Recovery actions and click Configure Replication.
  • Choose the target site and the replication server that will handle replication.
  • VM validation runs; then choose the target datastore.
  • Under Replication settings, choose the RPO, point-in-time instances, and so on.
  • Choose the protection group to which you want to add this VM, check the summary and click Finish.

Cross-cloud disaster recovery is one of the most secure and reliable solutions for service availability. It is often the best route for businesses because it provides IT resilience and business continuity. This continuity is most important when you consider how companies operate, how customers and clients rely on them for continuous service, and how critical your company’s data is, which you do not want to be exposed or compromised.

Frankly speaking, IT disasters happen everywhere, including in public clouds, and much more frequently than you might think. When they occur, they present stressful situations that require fast action. Even with a strategic method in place for addressing these occurrences, things can seem to spin out of control. In these situations, IT leaders must remain calm and be able to fully rely on the system they have in place, or the partner they are working with, for disaster recovery measures.

Customers and partners with VMware Cloud on AWS and Azure VMware Solution can build a cross-cloud disaster recovery solution to simplify disaster recovery with the only VMware-integrated solution that runs on any cloud. VMware Site Recovery Manager (SRM) provides policy-based management, minimizes downtime in case of disasters via automated orchestration, and enables non-disruptive testing of your disaster recovery plans.

Persistent Volumes for Tanzu on VMware Cloud on AWS using Amazon FSx for NetApp ONTAP

Amazon FSx for NetApp ONTAP provides fully managed shared storage in the AWS Cloud with the popular data access and management capabilities of ONTAP. In this blog post we are going to mount these volumes as Persistent Volumes on Tanzu Kubernetes clusters running on VMware Cloud on AWS.

With Amazon FSx for NetApp ONTAP, you pay only for the resources you use. There are no minimum fees or set-up charges. There are five Amazon FSx for NetApp ONTAP components to consider when storing and managing your data: SSD storage, SSD IOPS, capacity pool usage, throughput capacity, and backups.

The Amazon FSx console has two options for creating a file system – Quick create option and Standard create option. To rapidly and easily create an Amazon FSx for NetApp ONTAP file system with the service recommended configuration, I use the Quick create option.

The Quick create option creates a file system with a single storage virtual machine (SVM) and one volume. The Quick create option configures this file system to allow data access from Linux instances over the Network File System (NFS) protocol.

In the Quick configuration section, for File system name – optional, enter a name for your file system.

For Deployment type choose Multi-AZ or Single-AZ.

  • Multi-AZ file systems replicate your data and support failover across multiple Availability Zones in the same AWS Region.
  • Single-AZ file systems replicate your data and offer automatic failover within a single Availability Zone. For this post I am creating a Single-AZ file system.
  • For SSD storage capacity, specify the storage capacity of your file system, in gibibytes (GiB). Enter any whole number in the range of 1,024–196,608.
  • For Virtual Private Cloud (VPC), choose the Amazon VPC that is associated with your VMware Cloud on AWS SDDC.
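
If you generate these settings from a script or a form, the SSD capacity rule above (a whole number of GiB between 1,024 and 196,608) is worth validating up front. A minimal sketch with hypothetical function and constant names:

```python
MIN_SSD_GIB = 1024
MAX_SSD_GIB = 196608


def validate_ssd_capacity(gib):
    """Return gib unchanged if it is a whole number inside the allowed range."""
    if isinstance(gib, bool) or not isinstance(gib, int):
        raise ValueError("SSD storage capacity must be a whole number of GiB")
    if not MIN_SSD_GIB <= gib <= MAX_SSD_GIB:
        raise ValueError(
            f"SSD storage capacity must be between {MIN_SSD_GIB} and {MAX_SSD_GIB} GiB"
        )
    return gib


print(validate_ssd_capacity(2048))
```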

Review the file system configuration shown on the Create ONTAP file system page. For your reference, note which file system settings you can modify after the file system is created.

Choose Create file system.

Quick create creates a file system with one SVM (named fsx) and one volume (named vol1). The volume has a junction path of /vol1 and a capacity pool tiering policy of Auto.

To use this SVM, we need the IP address of the SVM for NFS. Click the SVM ID and take note of this IP. We will use it in our NFS configuration for Tanzu.

Kubernetes NFS-Client Provisioner

NFS subdir external provisioner is an automatic provisioner that uses your existing, already configured NFS server to support dynamic provisioning of Kubernetes Persistent Volumes via Persistent Volume Claims. Persistent volumes are provisioned as ${namespace}-${pvcName}-${pvName}.
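
To make the naming pattern concrete, here is a minimal sketch that reproduces the ${namespace}-${pvcName}-${pvName} directory name and joins it to an export path. The function names, the claim name and the PV name are hypothetical; /vol1 is the junction path used later in this post.

```python
import posixpath


def provisioned_directory_name(namespace, pvc_name, pv_name):
    """Build the sub-directory name following ${namespace}-${pvcName}-${pvName}."""
    return f"{namespace}-{pvc_name}-{pv_name}"


def provisioned_path(export_path, namespace, pvc_name, pv_name):
    """Join the NFS export path with the provisioned sub-directory name."""
    return posixpath.join(export_path, provisioned_directory_name(namespace, pvc_name, pv_name))


# Hypothetical claim and PV names, for illustration only.
print(provisioned_path("/vol1", "default", "test-claim", "pvc-1234"))
```

Each claim therefore gets its own sub-directory under the single exported volume, which is why one FSx volume can back many Persistent Volumes.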

More details are available at https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner

I am deploying this on my Tanzu Kubernetes cluster which is deployed on VMware Cloud on AWS.

  • Add the helm repo –
#helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
  • Install the chart as below:
#helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=<IP address of Service> \
    --set nfs.path=/<Volume Name>
My command looks like this:
#helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=172.31.1.234 \
    --set nfs.path=/vol1

After installing the chart, check the status of the Pod. If it is not in the Running state, describe it to see where it is stuck.

Finally, Test Your Environment!

Now we’ll test your NFS subdir external provisioner by creating a persistent volume claim and a pod that writes a test file to the volume. This will make sure that the provisioner is provisioning and that the Amazon FSx for NetApp ONTAP service is reachable and writable.
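
One way to run this test is to generate the claim and the pod as a single JSON manifest list, which kubectl accepts just like YAML. The sketch below is loosely modeled on the kind of test claim and test pod commonly used with this provisioner. The resource names, the busybox image tag and the storage class name nfs-client are assumptions; check the actual class name with kubectl get storageclass before applying.

```python
import json


def build_test_manifests(storage_class="nfs-client", namespace="default"):
    """Return a v1 List holding a test PVC and a pod that writes a file to it."""
    claim = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "test-claim", "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,  # assumed class name, verify in your cluster
            "accessModes": ["ReadWriteMany"],
            "resources": {"requests": {"storage": "1Mi"}},
        },
    }
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "test-pod", "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "test-pod",
                    "image": "busybox:stable",
                    # Write a marker file to the mounted volume, then exit.
                    "command": ["/bin/sh", "-c", "touch /mnt/SUCCESS && exit 0 || exit 1"],
                    "volumeMounts": [{"name": "nfs-pvc", "mountPath": "/mnt"}],
                }
            ],
            "volumes": [
                {"name": "nfs-pvc", "persistentVolumeClaim": {"claimName": "test-claim"}}
            ],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [claim, pod]}


manifests = build_test_manifests()
print(json.dumps(manifests, indent=2))
```

Saved under a hypothetical name such as make_test.py, the output can be piped straight into kubectl apply -f - on the workload cluster. A SUCCESS file appearing under the provisioned directory on the volume indicates the provisioner is working.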

As you can see, the deployed application created a PV and PVC successfully on Amazon FSx for NetApp ONTAP.

Describe the Persistent Volume to see its source. It is created on the NFS server running on the SVM with IP 172.31.1.234.

This is the power of VMware Cloud on AWS combined with AWS native services: customers can use any AWS native service without worrying about egress charges or security, because everything is configured and accessed over private connections.

Building Windows Custom Machine Image for Creating Tanzu Workload Clusters

If your organisation is building an application based on Windows components (such as .NET Framework) and wants to deploy Windows containers on VMware Tanzu, this blog post shows how to build a Windows custom machine image and deploy a Windows Kubernetes cluster.

Windows Image Prerequisites 

  • vSphere 6.7 Update 3 or greater
  • A macOS or Linux workstation with Docker Desktop and Ansible installed
  • Tanzu Kubernetes Grid v1.5.x or greater
  • Tanzu CLI
  • A recent image of Windows 2019 (newer than April 2021), downloaded from a Microsoft Developer Network (MSDN) or Volume Licensing (VL) account
  • The latest VMware Tools Windows ISO image, downloaded from VMware Tools
  • In vCenter, a folder (such as iso) created inside a datastore, with the Windows ISO and the VMware Tools ISO uploaded to it

Build a Windows Image 

  • Deploy Tanzu Management Cluster with Ubuntu 2004 Kubernetes v1.22.9 OVA
  • Create a YAML file named builder.yaml with the following configuration. I have saved it on my local system as builder.yaml.
apiVersion: v1
kind: Namespace
metadata:
 name: imagebuilder
---
apiVersion: v1
kind: Service
metadata:
 name: imagebuilder-wrs
 namespace: imagebuilder
spec:
 selector:
   app: image-builder-resource-kit
 type: NodePort
 ports:
 - port: 3000
   targetPort: 3000
   nodePort: 30008
---
apiVersion: apps/v1
kind: Deployment
metadata:
 name: image-builder-resource-kit
 namespace: imagebuilder
spec:
 selector:
   matchLabels:
     app: image-builder-resource-kit
 template:
   metadata:
     labels:
       app: image-builder-resource-kit
   spec:
     nodeSelector:
       kubernetes.io/os: linux
     containers:
     - name: windows-imagebuilder-resourcekit
       image: projects.registry.vmware.com/tkg/windows-resource-bundle:v1.22.9_vmware.1-tkg.1
       imagePullPolicy: Always
       ports:
         - containerPort: 3000

Connect the Kubernetes CLI to your management cluster by running:

#kubectl config use-context MY-MGMT-CLUSTER-admin@MY-MGMT-CLUSTER

Apply the builder.yaml file as below:

#kubectl apply -f ./builder.yaml

To ensure the container is running, run the command below:

#kubectl get pods --namespace imagebuilder

List the cluster’s nodes with wide output, and take note of the Internal IP address of the node whose ROLE is listed as control-plane,master:

#kubectl get nodes -o wide

Retrieve the containerd component’s URL and SHA by querying the control plane’s nodePort endpoint:

#curl http://CONTROLPLANENODE-IP:30008

Take note of containerd.path and containerd.sha256 values. The containerd.path value ends with something like containerd/cri-containerd-v1.5.9+vmware.2.windows-amd64.tar.

Create a JSON file in an empty folder named windows.json with the following configuration:

{
 "unattend_timezone": "WINDOWS-TIMEZONE",
 "windows_updates_categories": "CriticalUpdates SecurityUpdates UpdateRollups",
 "windows_updates_kbs": "",
 "kubernetes_semver": "v1.22.9",
 "cluster": "VSPHERE-CLUSTER-NAME",
 "template": "",
 "password": "VCENTER-PASSWORD",
 "folder": "",
 "runtime": "containerd",
 "username": "VCENTER-USERNAME",
 "datastore": "DATASTORE-NAME",
 "datacenter": "DATACENTER-NAME",
 "convert_to_template": "true",
 "vmtools_iso_path": "VMTOOLS-ISO-PATH",
 "insecure_connection": "true",
 "disable_hypervisor": "false",
 "network": "NETWORK",
 "linked_clone": "false",
 "os_iso_path": "OS-ISO-PATH",
 "resource_pool": "",
 "vcenter_server": "VCENTER-IP",
 "create_snapshot": "false",
 "netbios_host_name_compatibility": "false",
 "kubernetes_base_url": "http://CONTROLPLANE-IP:30008/files/kubernetes/",
 "containerd_url": "CONTAINERD-URL",
 "containerd_sha256_windows": "CONTAINERD-SHA",
 "pause_image": "mcr.microsoft.com/oss/kubernetes/pause:3.5",
 "prepull": "false",
 "additional_prepull_images": "mcr.microsoft.com/windows/servercore:ltsc2019",
 "additional_download_files": "",
 "additional_executables": "true",
 "additional_executables_destination_path": "c:/k/antrea/",
 "additional_executables_list": "http://CONTROLPLANE-IP:30008/files/antrea-windows/antrea-windows-advanced.zip",
 "load_additional_components": "true"
}

Update the upper-case placeholder values in the file (for example VCENTER-IP and DATASTORE-NAME) to match your environment.
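
With this many placeholders it is easy to leave one behind, and the build then fails late. The following is a minimal sketch that substitutes placeholder tokens in a windows.json style template and reports any that are still unfilled. The function names are hypothetical, the template is a trimmed three-key excerpt of the file above, and the control plane IP is made up.

```python
import json


def fill_placeholders(template_text, values):
    """Replace placeholder tokens inside every string value of a JSON object."""
    config = json.loads(template_text)
    # Longest tokens first, so a short token cannot clobber part of a longer one.
    ordered = sorted(values.items(), key=lambda item: len(item[0]), reverse=True)
    for key, current in config.items():
        if not isinstance(current, str):
            continue
        for placeholder, replacement in ordered:
            current = current.replace(placeholder, replacement)
        config[key] = current
    return config


def unfilled_placeholders(config, placeholders):
    """Return the placeholders that still appear in any string value."""
    text_values = [value for value in config.values() if isinstance(value, str)]
    return sorted(p for p in placeholders if any(p in value for value in text_values))


# Trimmed excerpt of windows.json; the replacement values are examples only.
template = json.dumps({
    "vcenter_server": "VCENTER-IP",
    "datastore": "DATASTORE-NAME",
    "kubernetes_base_url": "http://CONTROLPLANE-IP:30008/files/kubernetes/",
})
values = {"VCENTER-IP": "10.97.1.196", "CONTROLPLANE-IP": "192.168.10.5"}

config = fill_placeholders(template, values)
print(json.dumps(config, indent=1))
print("still unfilled:", unfilled_placeholders(config, ["VCENTER-IP", "DATASTORE-NAME", "CONTROLPLANE-IP"]))
```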

Add the XML file that contains the Windows settings by following these steps:

  • Go to the autounattend.xml file on VMware {code} Sample Exchange.
  • Select Download.
  • If you are using the Windows Server 2019 evaluation version, remove <ProductKey>...</ProductKey>.
  • Name the file autounattend.xml.
  • Save the file in the same folder as the windows.json file and change the file’s permissions to 777.

NOTE: Before you run the command below, make sure your workstation is running Docker Desktop as well as Ansible.

From your client VM, run the following command from the folder containing your windows.json and autounattend.xml files:

#docker run -it --rm --mount type=bind,source=$(pwd)/windows.json,target=/windows.json --mount type=bind,source=$(pwd)/autounattend.xml,target=/home/imagebuilder/packer/ova/windows/windows-2019/autounattend.xml -e PACKER_VAR_FILES="/windows.json" -e IB_OVFTOOL=1 -e IB_OVFTOOL_ARGS='--skipManifestCheck' -e PACKER_FLAGS='-force -on-error=ask' -t projects.registry.vmware.com/tkg/image-builder:v0.1.11_vmware.3 build-node-ova-vsphere-windows-2019

To ensure the Windows image is ready to use, select your host or cluster in vCenter, select the VMs tab, then select VM Templates to see the Windows image listed.

Use a Windows Image for a Workload Cluster

The YAML below shows how to deploy a workload cluster that uses your Windows image as a template. (This Windows cluster uses NSX Advanced Load Balancer.)

#! ---------------------------------------------------------------------
#! non proxy env configs
#! ---------------------------------------------------------------------
CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: tkg-workload02
CLUSTER_PLAN: dev
ENABLE_CEIP_PARTICIPATION: 'true'
IS_WINDOWS_WORKLOAD_CLUSTER: "true"
VSPHERE_WINDOWS_TEMPLATE: windows-2019-kube-v1.22.5
ENABLE_MHC: "false"

IDENTITY_MANAGEMENT_TYPE: oidc

INFRASTRUCTURE_PROVIDER: vsphere
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: false
DEPLOY_TKG_ON_VSPHERE7: 'true'
VSPHERE_DATACENTER: /SDDC-Datacenter
VSPHERE_DATASTORE: WorkloadDatastore
VSPHERE_FOLDER: /SDDC-Datacenter/vm/tkg-vmc-workload
VSPHERE_NETWORK: /SDDC-Datacenter/network/tkgvmc-workload-segment01
VSPHERE_PASSWORD: <encoded:T1V3WXpkbStlLUlDOTBG>
VSPHERE_RESOURCE_POOL: /SDDC-Datacenter/host/Cluster-1/Resources/Compute-ResourcePool/Tanzu/tkg-vmc-workload
VSPHERE_SERVER: 10.97.1.196
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa....loudadmin@vmc.local

VSPHERE_USERNAME: cloudadmin@vmc.local
WORKER_MACHINE_COUNT: 3
VSPHERE_INSECURE: 'true'
ENABLE_AUDIT_LOGGING: 'true'
ENABLE_DEFAULT_STORAGE_CLASS: 'true'
ENABLE_AUTOSCALER: false
AVI_CONTROL_PLANE_HA_PROVIDER: 'true'
OS_ARCH: amd64
OS_NAME: photon
OS_VERSION: 3

WORKER_SIZE: small
CONTROLPLANE_SIZE: large
REMOVE_CP_TAINT: "true"

If your cluster YAML file is correct, you should see that deployment of the new Windows cluster has started.

After some time, the cluster should deploy successfully.

If you are using NSX-ALB AKO or Pinniped and see that those pods are not running, please refer here.

NOTE: If you see this error during the image build process: Permission denied: ‘./packer/ova/windows/windows-2019/autounattend.xml’, check the permissions of the autounattend.xml file.

Cloud Director OIDC Configuration using OKTA IDP

OpenID Connect (OIDC) is an industry-standard authentication layer built on top of the OAuth 2.0 authorization protocol. The OAuth 2.0 protocol provides security through scoped access tokens, and OIDC provides user authentication and single sign-on (SSO) functionality. For more, refer to RFC 6749 (https://datatracker.ietf.org/doc/html/rfc6749). There are two main types of authentication that you can perform with Okta:

  • The OAuth 2.0 protocol controls authorization to access a protected resource, like your web app, native app, or API service.
  • The OpenID Connect (OIDC) protocol is built on the OAuth 2.0 protocol and helps authenticate users and convey information about them. It’s also more opinionated than plain OAuth 2.0, for example in its scope definitions.

If you want to import users and groups from an OpenID Connect (OIDC) identity provider to your Cloud Director system (provider) or tenant organization, you must configure the provider or tenant organization with this OIDC identity provider. Imported users can log in to the system or tenant organization with the credentials established in the OIDC identity provider.

We can use VMware Workspace ONE Access (vIDM) or any public identity provider, but make sure the OAuth authentication endpoint is reachable from the VMware Cloud Director cells. In this blog post we will use OKTA OIDC and configure VMware Cloud Director to use it for authentication.

Step:1 – Configure OKTA OIDC

For this blog post, I created a developer account on OKTA at https://developer.okta.com/signup. Once the account is ready, follow the steps below to add Cloud Director as an application in the OKTA console:

  • In the Admin Console, go to Applications > Applications.
  • Click Create App Integration.
  • To create an OIDC app integration, select OIDC – OpenID Connect as the Sign-in method.
  • Choose what type of application you plan to integrate with Okta. For Cloud Director, select Web Application.
  • App integration name: Specify a name for Cloud Director
  • Logo (Optional): Add a logo to accompany your app integration in the Okta org
  • Grant type: Select from the different grant type options
  • Sign-in redirect URIs: The sign-in redirect URI is where Okta sends the authentication response and ID token for the sign-in request. In our case, for the provider, use https://<vcd url>/login/oauth?service=provider. If you are doing this for a tenant, use https://<vcd url>/login/oauth?service=tenant:<org name>
  • Sign-out redirect URIs: After your application contacts Okta to close the user session, Okta redirects the user to this URI.
  • Assignments – Controlled access: The default access option assigns and grants login access to this new app integration for everyone in your Okta org, or you can choose to limit access to selected groups.
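
The two sign-in redirect URI forms above differ only in the service parameter, so they are easy to generate for a provider and for each tenant. A minimal sketch; the function name, the host name and the organization name are hypothetical.

```python
def sign_in_redirect_uri(vcd_host, org_name=None):
    """Build the Cloud Director sign-in redirect URI for the provider or a tenant."""
    service = "provider" if org_name is None else f"tenant:{org_name}"
    return f"https://{vcd_host}/login/oauth?service={service}"


# Hypothetical host and organization names, for illustration only.
print(sign_in_redirect_uri("vcd.example.com"))
print(sign_in_redirect_uri("vcd.example.com", org_name="acme"))
```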

Click Save. This action creates the app integration and opens the settings page to configure additional options.

The Client Credentials section has the Client ID and Client secret values for the Cloud Director integration. Copy both values, as we will enter them in Cloud Director.

The General Settings section has the Okta Domain for the Cloud Director integration. Copy this value, as we will enter it in Cloud Director.

Step:2 – Cloud Director OIDC Configuration

Now I am going to configure OIDC authentication for the provider side of Cloud Director. With very minor changes (the tenant URL), it can be configured for tenants too.

In Cloud Director, from the top navigation bar, select Administration. On the left panel, under Identity Providers, click OIDC and then click CONFIGURE.

General: Make sure that OpenID Connect Status is active, and enter the client ID and client secret from the OKTA app registration that we captured above.

To use the information from a well-known endpoint to automatically fill in the configuration information, turn on the Configuration Discovery toggle and enter a URL. For OKTA the URL looks like this: https://<domain.okta.com>/.well-known/openid-configuration. Then click NEXT.
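
It can help to look at the discovery document yourself before trusting the auto-filled values. The sketch below builds the well-known URL from an Okta domain and pulls out the standard OpenID Connect Discovery fields from a document. The domain and the sample document are hypothetical stand-ins; in practice you would download the real document from the URL, for example with `urllib.request.urlopen`.

```python
import json

# Field names defined by OpenID Connect Discovery.
STANDARD_FIELDS = (
    "issuer",
    "authorization_endpoint",
    "token_endpoint",
    "userinfo_endpoint",
    "jwks_uri",
)


def discovery_url(okta_domain):
    """Build the well-known configuration URL from a domain, with or without scheme."""
    domain = okta_domain.strip()
    for prefix in ("https://", "http://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
    return f"https://{domain.rstrip('/')}/.well-known/openid-configuration"


def summarize_endpoints(document_text):
    """Return (found, missing) for the standard fields of a discovery document."""
    document = json.loads(document_text)
    found = {name: document[name] for name in STANDARD_FIELDS if name in document}
    missing = [name for name in STANDARD_FIELDS if name not in document]
    return found, missing


# Hypothetical sample document for a made-up domain.
sample = json.dumps({
    "issuer": "https://dev-example.okta.com",
    "authorization_endpoint": "https://dev-example.okta.com/oauth2/v1/authorize",
    "token_endpoint": "https://dev-example.okta.com/oauth2/v1/token",
    "userinfo_endpoint": "https://dev-example.okta.com/oauth2/v1/userinfo",
    "jwks_uri": "https://dev-example.okta.com/oauth2/v1/keys",
})

print(discovery_url("dev-example.okta.com"))
found, missing = summarize_endpoints(sample)
for name, value in found.items():
    print(f"{name}: {value}")
```

Comparing these values with what Cloud Director shows on the Endpoint page is a quick way to do the review that the next step asks for.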

Endpoint: Clicking on NEXT will populate “Endpoint” information automatically, it is however, essential that the information is reviewed and confirmed. 

Scopes: VMware Cloud Director uses the scopes to authorize access to user details. When a client requests an access token, the scopes define the permissions that this token has to access user information. Enter the scope information, and click Next.

Claims: You can use this section to map the information VMware Cloud Director gets from the user info endpoint to specific claims. The claims are strings for the field names in the VMware Cloud Director response.

This is the most critical piece of configuration. Mapping of this information is essential for VCD to interpret the token/user information correctly during the login process.

For the OKTA developer account, the user name is the email ID, so I am mapping Subject to email.

Key Configuration:

OIDC uses a public key cryptography mechanism. A private key is used by the OIDC provider to sign the JWT token, and the signature can be verified by a third party using the public keys published at the OIDC provider’s well-known URL. These keys form the basis of security between the parties. To maintain security, the private keys must be protected from cyber-attacks. One of the best practices for limiting the impact of a compromised key is known as key rollover or key refresh.

From VMware Cloud Director 10.3.2 and above, if you want VMware Cloud Director to automatically refresh the OIDC key configurations, turn on the Automatic Key Refresh toggle.

  • The Key Refresh Endpoint should be populated automatically because we chose auto discovery.
  • Select a Key Refresh Strategy.
    • Add – Preferred option. Adds the incoming set of keys to the existing set of keys. All keys in the merged set are valid and usable.
    • Replace – Replaces the existing set of keys with the incoming set of keys.
    • Expire After – Configures an overlap period between the existing and incoming sets of keys. You set the overlap using the Expire Key After Period, in hourly increments from 1 hour up to 1 day.
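
The three strategies can be pictured with a small model of a key set. This is only a sketch of the behaviour described above, not Cloud Director's implementation: keys are represented as a mapping from a key ID to an optional expiry time, and the function and parameter names are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def refresh_keys(existing, incoming, strategy, now=None, expire_after_hours=1):
    """Return the key set after one refresh.

    existing / incoming map a key id to its expiry (None means no expiry).
    strategy is "add", "replace", or "expire_after".
    """
    now = now or datetime.now(timezone.utc)
    if strategy == "add":
        # Merge: every key in either set stays valid.
        return {**existing, **incoming}
    if strategy == "replace":
        # Only the incoming keys survive.
        return dict(incoming)
    if strategy == "expire_after":
        if not 1 <= expire_after_hours <= 24:
            raise ValueError("overlap must be between 1 hour and 1 day")
        deadline = now + timedelta(hours=expire_after_hours)
        # Old keys stay usable until the deadline; incoming keys take over.
        merged = {kid: deadline for kid in existing}
        merged.update(incoming)
        return merged
    raise ValueError(f"unknown strategy: {strategy}")


def usable_keys(keys, now):
    """List the key ids that have not expired at the given time."""
    return sorted(kid for kid, expiry in keys.items() if expiry is None or expiry > now)
```

For example, with one existing key and one incoming key, Expire After with a two-hour period keeps both keys usable for two hours and only the incoming key afterwards.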

If you did not use Configuration Discovery in Step 6, upload the private key that the identity provider uses to sign its tokens, then click SAVE.

Now, in Cloud Director, go to Users, click IMPORT USERS, choose “OIDC” as the Source, add a user that exists in Okta, and assign a role to that user.

Now log out of the VCD console and try to log in again. Cloud Director automatically redirects to Okta and asks for credentials.

Once the user is authenticated by Okta, they will be redirected back to VCD and granted access per rights associated with the role that was assigned when the user was provisioned.

Verify that the Last Run and the Last Successful Run are identical. The runs start at the beginning of the hour. The Last Run is the time stamp of the last key refresh attempt. The Last Successful Run is the time stamp of the last successful key refresh. If the time stamps are different, the automatic key refresh is failing and you can diagnose the problem by reviewing the audit events. (This is only applicable if Automatic Key Refresh is enabled. Otherwise, these values are meaningless)

Bring Your Own OIDC – Tenant Configuration

For tenant configuration, I have created a video; please take a look. Tenants can bring their own OIDC provider and configure it themselves in the Cloud Director tenant portal.

This concludes the OIDC configuration with VMware Cloud Director. I would like to thank my colleague Ankit Shah for his guidance and review of this document.

Tanzu Service on VMware Cloud on AWS – Installing Tanzu Application Platform

VMware Tanzu Application Platform is a modular, application-aware platform that provides a rich set of developer tools and a paved path to production to build and deploy software quickly and securely on any compliant public cloud or on-premises Kubernetes cluster.

Tanzu Application Platform delivers a superior developer experience for enterprises building and deploying cloud-native applications on Kubernetes. It enables application teams to get to production faster by automating source-to-production pipelines. It clearly defines the roles of developers and operators so they can work collaboratively and integrate their efforts.

Operations teams can create application scaffolding templates with built-in security and compliance guardrails, making those considerations mostly invisible to developers. Starting with the templates, developers turn source code into a container and get a URL to test their app in minutes.

Prerequisites

  1. An account on Tanzu Network to download Tanzu Application Platform packages.
  2. Network access from your servers to https://registry.tanzu.vmware.com
  3. A container image registry that is accessible from the Kubernetes cluster. In my case I have installed Harbor with a Let’s Encrypt certificate.
  4. Registry credentials with read and write access, made available to Tanzu Application Platform to store images.
  5. A Git repository for the Tanzu Application Platform GUI’s software catalogs, along with a token allowing read access.

Kubernetes cluster requirements

Installation requires a Kubernetes cluster at v1.20, v1.21, or v1.22 on Tanzu Kubernetes Grid Service on VMware Cloud on AWS. In addition, pod security policies must be configured so that Tanzu Application Platform controller pods can run as root. To set the pod security policies, run:

#kubectl create clusterrolebinding default-tkg-admin-privileged-binding --clusterrole=psp:vmware-system-privileged --group=system:authenticated
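
Before going further it can help to confirm that the cluster version is one of the three supported minor releases. The helper below is a small sketch of that check; it assumes you paste in a server version string such as the one kubectl reports, and the function name is made up for this example.

```python
import re

# Minor releases listed as supported in the requirement above.
SUPPORTED_MINORS = {(1, 20), (1, 21), (1, 22)}


def is_supported(version: str) -> bool:
    """Check a version string such as 'v1.21.2+vmware.1' against the supported minors."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) in SUPPORTED_MINORS


if __name__ == "__main__":
    for candidate in ("v1.21.2+vmware.1", "v1.23.8+vmware.1"):
        print(candidate, "supported" if is_supported(candidate) else "not supported")
```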

Install Cluster Essentials for VMware Tanzu

The Cluster Essentials for VMware Tanzu package simplifies the process of installing the open-source Carvel tools on your cluster. It includes a script that uses the Carvel CLI tools to download and install the server-side components kapp-controller and secretgen-controller on the targeted cluster. Currently, only macOS and Linux are supported for Cluster Essentials for VMware Tanzu.

  • Sign in to Tanzu Network.
  • Navigate to Cluster Essentials for VMware Tanzu on VMware Tanzu Network.
  • On Linux, download tanzu-cluster-essentials-linux-amd64-1.0.0.tgz.
  • Unpack the TAR file into the tanzu-cluster-essentials directory by running:
#mkdir $HOME/tanzu-cluster-essentials
#tar -xvf tanzu-cluster-essentials-linux-amd64-1.0.0.tgz -C $HOME/tanzu-cluster-essentials
  • Configure and run install.sh using below commands:
#export INSTALL_BUNDLE=registry.tanzu.vmware.com/tanzu-cluster-essentials/cluster-essentials-bundle@sha256:82dfaf70656b54dcba0d4def85ccae1578ff27054e7533d08320244af7fb0343
#export INSTALL_REGISTRY_HOSTNAME=registry.tanzu.vmware.com
#export INSTALL_REGISTRY_USERNAME=TANZU-NET-USERNAME
#export INSTALL_REGISTRY_PASSWORD=TANZU-NET-PASSWORD
#cd $HOME/tanzu-cluster-essentials
#./install.sh

Now install the kapp and imgpkg CLIs onto your $PATH using the commands below:

sudo cp $HOME/tanzu-cluster-essentials/kapp /usr/local/bin/kapp
sudo cp $HOME/tanzu-cluster-essentials/imgpkg /usr/local/bin/imgpkg

For Linux Client VM: Install the Tanzu CLI and Plugins

To install the Tanzu command line interface (CLI) on a Linux operating system, create a directory named tanzu, download tanzu-framework-bundle-linux from Tanzu Network, unpack the TAR file into that directory, and install the CLI using the commands below:

#mkdir $HOME/tanzu 
#tar -xvf tanzu-framework-linux-amd64.tar -C $HOME/tanzu
#export TANZU_CLI_NO_INIT=true
#cd $HOME/tanzu 
#sudo install cli/core/v0.11.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu
#tanzu version
#cd $HOME/tanzu
#tanzu plugin install --local cli all
#tanzu plugin list

Ensure that you have the accelerator, apps, package, secret, and services plug-ins installed. You need these plug-ins to install and interact with the Tanzu Application Platform.

Installing the Tanzu Application Platform Package and Profiles

VMware recommends installing Tanzu Application Platform packages by first relocating the images from the VMware Tanzu Network registry to your own registry, which eases the deployment process. To do so, log in to the Tanzu Network registry, set some environment variables, and relocate the images.

#docker login registry.tanzu.vmware.com
#export INSTALL_REGISTRY_USERNAME=MY-REGISTRY-USER
#export INSTALL_REGISTRY_PASSWORD=MY-REGISTRY-PASSWORD
#export INSTALL_REGISTRY_HOSTNAME=MY-REGISTRY
#export TAP_VERSION=VERSION-NUMBER
#imgpkg copy -b registry.tanzu.vmware.com/tanzu-application-platform/tap-packages:${TAP_VERSION} --to-repo ${INSTALL_REGISTRY_HOSTNAME}/TARGET-REPOSITORY/tap-packages

This completes the download of the images and their upload to the local registry.

Create a registry secret by running the command below:

#tanzu secret registry add tap-registry \
  --username ${INSTALL_REGISTRY_USERNAME} --password ${INSTALL_REGISTRY_PASSWORD} \
  --server ${INSTALL_REGISTRY_HOSTNAME} \
  --export-to-all-namespaces --yes --namespace tap-install

Add the Tanzu Application Platform package repository to the cluster by running:

#tanzu package repository add tanzu-tap-repository \
  --url ${INSTALL_REGISTRY_HOSTNAME}/TARGET-REPOSITORY/tap-packages:$TAP_VERSION \
  --namespace tap-install

Get the status of the Tanzu Application Platform package repository, and ensure the status updates to Reconcile succeeded by running:

#tanzu package repository get tanzu-tap-repository --namespace tap-install

Tanzu Application Platform profile

The tap.tanzu.vmware.com package installs predefined sets of packages based on your profile settings. This is done by using the package manager you installed using Tanzu Cluster Essentials. Here is my full profile sample file:

buildservice:
  descriptor_name: full
  enable_automatic_dependency_updates: true
  kp_default_repository: harbor.tkgsvmc.net/tbs/build-service
  kp_default_repository_password: <password>
  kp_default_repository_username: admin
  tanzunet_password: <password>
  tanzunet_username: tripathiavni@vmware.com
ceip_policy_disclosed: true
cnrs:
  domain_name: tap01.tkgsvmc.net
grype:
  namespace: default
  targetImagePullSecret: tap-registry
learningcenter:
  ingressDomain: learningcenter.tkgsvmc.net
metadata_store:
  app_service_type: LoadBalancer
ootb_supply_chain_basic:
  gitops:
    ssh_secret: ""
  registry:
    repository: tap
    server: harbor.tkgsvmc.net/tap
profile: full
supply_chain: basic
tap_gui:
  app_config:
    app:
      baseUrl: http://tap-gui.tap01.tkgsvmc.net
    backend:
      baseUrl: http://tap-gui.tap01.tkgsvmc.net
      cors:
        origin: http://tap-gui.tap01.tkgsvmc.net
    catalog:
      locations:
        - target: https://github.com/avnish80/tap/blob/main/catalog-info.yaml
          type: url
  ingressDomain: tap01.tkgsvmc.net
  ingressEnabled: "true"
  service_type: LoadBalancer

Save this file as tap-values.yml with the values modified for your environment. For more details about these settings, see the Tanzu Application Platform documentation.

Install Tanzu Application Platform

Finally, let’s install TAP. To install the Tanzu Application Platform package, run the command below:

#tanzu package install tap -p tap.tanzu.vmware.com -v $TAP_VERSION --values-file tap-values.yml -n tap-install

To verify the installed packages, you can check in TMC, or you can run the command below:

#tanzu package installed get tap -n tap-install

This completes the installation of Tanzu Application Platform. Developers can now develop and promote an application, create an application accelerator, add testing and security scanning to an application, and administer, set up, and manage supply chains.

Tanzu Service on VMware Cloud on AWS – Kubernetes Cluster Operations

Tanzu Kubernetes Grid is a managed service offered by VMware Cloud on AWS. Activate Tanzu Kubernetes Grid in one or more SDDC clusters to configure Tanzu support in the SDDC vCenter Server. In my previous post (Getting Started with Tanzu Service on VMware Cloud on AWS), I walked you through how to enable Tanzu Service on VMware Cloud on AWS.

In this post I will deploy a Tanzu Kubernetes cluster using the GUI (from Tanzu Mission Control) as well as the CLI, using the updated v1alpha2 API. Let’s get started.

Deploy Tanzu Kubernetes Cluster using Tanzu Mission Control

In Tanzu Mission Control, validate that the VMC Supervisor Cluster is registered and healthy: click Administration, go to “management cluster”, and check the status.

Now on Tanzu Mission Control, click on “Clusters” and then click on “CREATE CLUSTER”

Select your VMC Tanzu Management Cluster and click on “CONTINUE TO CREATE CLUSTER”

On the next screen choose the “Provisioner” (namespace name). You add a provisioner by creating a vSphere namespace in the Supervisor Cluster, which you can do in the VMC vCenter.

Next, select the Kubernetes version (the latest supported version is preselected for you), the Pod CIDR, and the Service CIDR. You can also optionally select the default storage class for the cluster and the allowed storage classes. The list of storage classes that you can choose from is taken from your vSphere namespace.

Select the type of cluster you want to create. The primary difference between the two is that the highly available cluster is deployed with multiple control plane nodes.

You can optionally select a different instance type for the cluster’s control plane node and its storage class, and you can optionally add storage volumes for your control plane.

To configure additional volumes, click Add Volume and then specify the name, mount path, and capacity for the volume. To add another, click Add Volume again.

Next, you can define the default node pool and create additional node pools for your cluster. Specify the number of worker nodes to provision, and select the instance type and the storage class for the worker nodes.

When you are ready to provision the new cluster, click Create Cluster and wait a few minutes.

You can also view the vCenter activities for the creation of the Tanzu Kubernetes cluster.

Once the cluster is fully created and the TMC agent has reported back, you should see the status below on the TMC console, which shows that the cluster has been created successfully.

This completes the Tanzu Kubernetes cluster deployment using the GUI.

Deploy Tanzu Kubernetes Grid Service using v1alpha2 API yaml

The Tanzu Kubernetes Grid Service v1alpha2 API provides a robust set of enhancements for provisioning Tanzu Kubernetes clusters. Below is the YAML specification I am using to provision a Tanzu Kubernetes cluster with the v1alpha2 API:

apiVersion: run.tanzu.vmware.com/v1alpha2
kind: TanzuKubernetesCluster
metadata:
  name: tkgsv2
  namespace: wwmca
spec:
  topology:
    controlPlane:
      replicas: 1
      vmClass: guaranteed-medium
      storageClass: vmc-workload-storage-policy-cluster-1
      volumes:
        - name: etcd
          mountPath: /var/lib/etcd
          capacity:
            storage: 4Gi
      tkr:  
        reference:
          name: v1.21.2---vmware.1-tkg.1.ee25d55
    nodePools:
    - name: worker-nodepool-a1
      replicas: 2
      vmClass: best-effort-large
      storageClass: vmc-workload-storage-policy-cluster-1
      tkr:  
        reference:
          name: v1.21.2---vmware.1-tkg.1.ee25d55
  settings:
    storage:
      defaultClass: vmc-workload-storage-policy-cluster-1
    network:
      cni:
        name: antrea
      services:
        cidrBlocks: ["198.53.100.0/16"]
      pods:
        cidrBlocks: ["192.0.5.0/16"]
      serviceDomain: managedcluster.local
      trust:
        additionalTrustedCAs:
          - name: CompanyInternalCA-1
            data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tDQpNSUlG

Two key parameters that I am using for cluster provisioning:

  • tkr.reference.name is the TKR name to be used by the control plane nodes; the supported format is “v1.21.2---vmware.1-tkg.1.ee25d55”
  • trust configures additional certificates for the cluster; if omitted, no additional certificates are configured
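
The TKR name is easy to mistype, because the three hyphens between the Kubernetes version and the VMware build are sometimes turned into a single dash by word processors or web pages. The sketch below splits a name into its two parts and rejects anything that does not have that shape; it assumes every TKR name follows the `v<version>---<build>` pattern seen in this example, and the function name is hypothetical.

```python
def parse_tkr_name(name: str):
    """Split a TKR name of the form 'v<k8s version>---<vmware build>'."""
    version, separator, build = name.partition("---")
    if not separator or not version.startswith("v") or not build:
        raise ValueError(f"unexpected TKR name format: {name!r}")
    # Drop the leading 'v' so the version can be compared numerically if needed.
    return version[1:], build


if __name__ == "__main__":
    print(parse_tkr_name("v1.21.2---vmware.1-tkg.1.ee25d55"))
```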

You can run the command below to check the status of cluster provisioning:

#kubectl get tkc

Scale a Tanzu Kubernetes cluster

Publish the service Internally/Externally

Before we can make our service available over the Internet, it should be accessible from within the VMware Cloud on AWS instance. Platform operators can publish applications through a Kubernetes Service of type LoadBalancer. This ability is made possible through the NSX-T Container Plugin (NCP) functionality built into Tanzu Kubernetes Grid. Let’s deploy a basic container and expose it as type “LoadBalancer”:

#kubectl run nginx1 --image=nginx
#kubectl expose pod nginx1 --type=LoadBalancer --port=80

Now you can access the application internally by browsing to the internal IP address assigned to the LoadBalancer service.

Access application from Internet

To make the application publicly available, we must assign a public IP address and configure a destination NAT. Request a public IP in the VMC console and create a NAT rule on the Internet tab to access the application from the Internet.

Now access the application from the Internet; you should be able to reach it using the provided public IP.

Exposing a Kubernetes service to the Internet takes a couple more steps than exposing it to your internal networks, but the VMware Cloud Console makes those steps simple enough. After exposing the Kubernetes service using an NSX-T Load Balancer, you can request a new Public IP Address and then configure a NAT rule to send that traffic to the virtual IP address of the load balancer.

Getting Started with Tanzu Service on VMware Cloud on AWS

VMware Tanzu Kubernetes Grid (TKG) is a multi-cloud Kubernetes footprint that customers and partners can run on-premises in vSphere, in VMware Cloud on AWS, and in the public cloud on Amazon EC2 and Microsoft Azure VMs.

Container orchestration through Kubernetes is now built into the vSphere 7 platform. As a VMware Cloud on AWS customer, you can take advantage of this new functionality to build Kubernetes clusters in the same platform you’ve grown accustomed to using to manage your virtual infrastructure.

Take control of Cloud Resources and give freedom to Developers based on Personas

Virtualization Administrator: They will be able to define resource allocations and permissions for users to create their own Kubernetes clusters according to their own specifications, and to define access policies, storage policies, and memory and CPU restrictions for teams needing Kubernetes access.

Developer or Platform Administrator: They can create new Kubernetes clusters within the defined access policies, upgrade those clusters and scale clusters within the approved resource allocations.

VMware recognizes that not all environments are running on top of vSphere. Tanzu Kubernetes Grid (TKG) leverages the same ClusterAPI engine as VMware Tanzu to manage cluster lifecycles, and can run on any infrastructure. VMware provides three variants of TKG:

  • Tanzu Kubernetes Grid Multi-Cloud (TKGm): Installer-driven wizard to set up a Kubernetes environment that runs across multiple clouds, for example on AWS EC2 or Azure native VMs
  • Tanzu Kubernetes Grid Service (TKGS), aka vSphere with Tanzu: Natively integrated with vSphere 7+ and available to customers at no extra cost for the basic version, on VCF on-premises as well as on VMware Cloud on AWS
  • Tanzu Kubernetes Grid Integrated Edition: VMware Tanzu Kubernetes Grid Integrated Edition (formerly known as VMware Enterprise PKS) is a Kubernetes-based container solution with advanced networking, a private container registry, and life cycle management.

Enable Tanzu Service on VMware Cloud on AWS

Prerequisites:

  • Make sure an SDDC with at least three nodes is deployed and running with enough available resources (at least 112 GB of available memory and sufficient free resources to support 16 vCPUs)
  • Get three CIDR blocks for the deployment. These three must be ranges that do not overlap with the management CIDR or any other networks used on-premises or in the VMware Cloud on AWS SDDC.
  • You can activate Tanzu Kubernetes Grid in any SDDC at version 1.16 or later.
  • If the Edge cluster has been configured with the Medium configuration, the SDDC cluster requires a minimum of three hosts for activation.
  • If the Edge cluster has been configured with the Large configuration, the SDDC cluster requires a minimum of four hosts for activation.
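
The prerequisites above can be read as a simple checklist. The sketch below encodes them as a function that returns the list of reasons activation would be blocked. The `ClusterFacts` structure and its field names are made up for this example; you would fill it in by hand from what you see in the VMC console, since the code does not query any VMware API.

```python
from dataclasses import dataclass

MIN_SDDC_VERSION = (1, 16)
MIN_FREE_MEMORY_GB = 112
MIN_FREE_VCPUS = 16
MIN_HOSTS_BY_EDGE_SIZE = {"medium": 3, "large": 4}


@dataclass
class ClusterFacts:
    """Hand-collected facts about the SDDC cluster (hypothetical structure)."""
    sddc_version: tuple
    hosts: int
    edge_size: str
    free_memory_gb: int
    free_vcpus: int


def activation_blockers(facts: ClusterFacts) -> list:
    """Return human-readable reasons why activation would not proceed."""
    blockers = []
    if facts.sddc_version < MIN_SDDC_VERSION:
        blockers.append("SDDC version must be 1.16 or later")
    min_hosts = MIN_HOSTS_BY_EDGE_SIZE.get(facts.edge_size.lower())
    if min_hosts is None:
        blockers.append(f"unknown Edge size: {facts.edge_size}")
    elif facts.hosts < min_hosts:
        blockers.append(f"{facts.edge_size} Edge needs at least {min_hosts} hosts")
    if facts.free_memory_gb < MIN_FREE_MEMORY_GB:
        blockers.append(f"at least {MIN_FREE_MEMORY_GB} GB of free memory is required")
    if facts.free_vcpus < MIN_FREE_VCPUS:
        blockers.append(f"free capacity for {MIN_FREE_VCPUS} vCPUs is required")
    return blockers
```

An empty list means all of the listed prerequisites are met; the activation wizard still performs its own checks.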

Once pre-requisites are ready, go to VMware Cloud on AWS SDDC and click on “Activate the Tanzu Kubernetes Service”

The activation process checks the required resources and only moves ahead if the prerequisites are met.

On the next screen:

  • Leave the Service CIDR as default, or pick one of your choice. It must be non-overlapping and is used for Tanzu supervisor services for the cluster
  • Enter the “Namespace Network CIDR”, non-overlapping
  • Enter an “Ingress CIDR”, non-overlapping
  • Enter an “Egress CIDR”, non-overlapping
  • Click “Validate and Proceed”

NOTE: CIDR blocks of size 16, 20, 23, or 26 are supported, and must be in one of the “private address space” blocks defined by RFC 1918 (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16). 
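
These rules are easy to check before you open the activation wizard. The following sketch uses Python's standard `ipaddress` module to test each candidate block for a supported prefix size, RFC 1918 membership, and overlap with the other blocks or with networks you already use. The function name and the example ranges are illustrative only, and the check is limited to IPv4.

```python
import ipaddress

SUPPORTED_PREFIXES = {16, 20, 23, 26}
RFC1918_BLOCKS = [
    ipaddress.IPv4Network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def check_cidrs(cidrs: dict, reserved=()) -> list:
    """Return a list of problems found in the candidate CIDR blocks.

    cidrs maps a label (for example "ingress") to a CIDR string.
    reserved lists networks already in use, such as the management CIDR.
    """
    problems = []
    seen = [(f"reserved {r}", ipaddress.IPv4Network(r)) for r in reserved]
    for name, cidr in cidrs.items():
        try:
            # Strict parsing rejects blocks with host bits set, e.g. 10.0.0.1/16.
            net = ipaddress.IPv4Network(cidr)
        except ValueError as err:
            problems.append(f"{name}: {err}")
            continue
        if net.prefixlen not in SUPPORTED_PREFIXES:
            problems.append(f"{name}: /{net.prefixlen} is not a supported size")
        if not any(net.subnet_of(block) for block in RFC1918_BLOCKS):
            problems.append(f"{name}: {net} is outside RFC 1918 space")
        for other_name, other in seen:
            if net.overlaps(other):
                problems.append(f"{name}: {net} overlaps {other_name}")
        seen.append((name, net))
    return problems


if __name__ == "__main__":
    candidates = {
        "namespace": "10.20.0.0/20",
        "ingress": "10.30.0.0/23",
        "egress": "10.30.2.0/23",
    }
    print(check_cidrs(candidates, reserved=["10.2.0.0/16"]) or "no problems found")
```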

Finally, once validation is done, click Activate Tanzu Kubernetes Grid.

This starts the activation process, and you should see “Activating Tanzu Kubernetes Grid” on your SDDC tile. The process should complete within 20-30 minutes.

It is an easy process to enable your SDDC to run VMs and containers together. When activation is completed, log in to the SDDC vCenter and click Workload Management.

Persona (Virtualization/vSphere Administrator) – The vSphere administrator creates a vSphere Namespace on the Supervisor Cluster, and sets resource limits and permissions on the namespace so that DevOps engineers can access it. The administrator then provides the URL of the Kubernetes control plane to the DevOps engineers, who can create their own Kubernetes clusters and run their workloads there.

Step-1: Set permissions so that DevOps engineers can access the namespace.

From the Permissions pane, select Add Permissions.

Select an identity source, a user or a group, and a role, and click OK.

Step-2: Set persistent storage to the namespace. Storage policies that you assign to the namespace control how persistent volumes and Tanzu Kubernetes cluster nodes are placed within datastores in the SDDC environment.

From the Storage pane, select Add Storage.

Select a storage policy to control datastore placement of persistent volumes and click OK

The VM class is a VM specification that can be used to request a set of resources for a VM. The VM class is controlled and managed by a vSphere administrator, and defines such parameters as the number of virtual CPUs, memory capacity, and reservation settings. The defined parameters are backed and guaranteed by the underlying infrastructure resources of a Supervisor Cluster.

Workload Management offers several default VM classes. Generally, each default class type comes in two editions: guaranteed and best effort. A guaranteed edition fully reserves resources that a VM specification requests. A best effort class edition does not and allows resources to be overcommitted. Typically, a guaranteed type is used in a production environment.

The vSphere administrator can set up additional limits based on use cases and requirements.

Copy the namespace URL by clicking “Copy link” and give it to your DevOps/Platform admin.

Persona (DevOps/Platform Administrator)

How to Access and Work?

The DevOps engineer can access the newly created namespace and create new Kubernetes clusters either from a new client VM (clientvm) or from their own desktop or laptop. If you use a new VM, power it on once it is provisioned, SSH to it, and download the command line tools from vCenter by running the command below. Make sure the host name in the URL is changed to the Supervisor Cluster address that you copied earlier:

#wget https://k8s.Cluster-1.vcenter.sddc-18-139-9-54.vmwarevmc.com/wcp/plugin/linux-amd64/vsphere-plugin.zip

Unzip the downloaded vsphere-plugin.zip archive (for example, with the unzip utility).

Now log in to the Supervisor Cluster by entering the following:

kubectl vsphere login --vsphere-username cloudadmin@vmc.local --server=https://k8s.Cluster-1.vcenter.sddc-18-139-9-54.vmwarevmc.com

Enter the password for cloudadmin, or for any other user, to complete the login.

From here onwards, DevOps engineers can create their own Kubernetes clusters and deploy applications. They can also utilize VMware’s multi-cloud management platform to spin up Kubernetes clusters using a GUI.

For DevOps to use the GUI, the vSphere administrator needs to register the VMware Cloud on AWS management cluster with Tanzu Mission Control. Let’s do that:

Register This Management Cluster with Tanzu Mission Control

Tanzu service ships with a namespace for Tanzu Mission Control. This namespace exists on the Supervisor Cluster where you install the Tanzu Mission Control agent.

The vSphere Namespace provided for Tanzu Mission Control is identified as svc-tmc-cXX

To integrate the Tanzu Kubernetes Grid Service with Tanzu Mission Control, install the agent on the Supervisor Cluster.

Register the Supervisor Cluster with Tanzu Mission Control and obtain the Registration URL. See Register a Management Cluster with Tanzu Mission Control.

On the client VM, create a YAML file (tmc.yaml in this example) with the content below:

apiVersion: installers.tmc.cloud.vmware.com/v1alpha1
kind: AgentInstall
metadata:
  name: tmc-agent-installer-config
  namespace: <NAMESPACE captured in above step>
spec:
  operation: INSTALL
  registrationLink: <TMC-REGISTRATION-URL captured from TMC console>

Apply this YAML file using:

#kubectl create -f tmc.yaml

You can also check the status of the TMC registration by running the command below:

#kubectl get pods -n <ns name>

Now go back to Tanzu Mission Control; after some time you should see your Supervisor Cluster ready.

DevOps/Platform admins are now ready to deploy TKC clusters as well as containers. This completes this part of the blog. In the next part I will cover how to create TKC clusters, run applications in containers, and expose them to the Internet.

Load Balancer as a Service with Cloud Director

NSX Advanced Load Balancer (AVI) is an intent-based software load balancer that provides scalable application delivery across any infrastructure. AVI provides 100% software load balancing to ensure a fast, scalable and secure application experience. It delivers elasticity and intelligence across any environment. It scales from 0 to 1 million SSL transactions per second in minutes. It achieves 90% faster provisioning and 50% lower TCO than the traditional appliance-based approach.

With the release of Cloud Director 10.2, NSX ALB is natively integrated with Cloud Director to provide self-service Load Balancing as a Service (LBaaS): providers release load balancing functionality to tenants, and tenants consume it based on their requirements. In this blog post we will cover how to configure LBaaS.

Here is the high-level workflow:

  1. Deploy NSX ALB Controller Cluster
  2. Configure NSX-T Cloud
  3. Discover NSX-T Inventory, Logical Segments, NSGroups (ALB does it automatically)
  4. Discover vCenter Inventory, Hosts, Clusters, Switches (ALB does it automatically)
  5. Upload SE OVA to content library (ALB does it automatically, you just need to specify name of content library)
  6. Register NSX ALB Controller, NSX-T Cloud and Service Engines to Cloud Director and Publish to tenants (Provider Controlled Configuration)
  7. Create Virtual Service, Pools and other settings (Tenant Self Service)
  8. Create/Delete SE VMs & connect to tenant network (ALB/VCD Automatically)

Deploy NSX ALB (AVI) Controller Cluster

The NSX ALB (AVI) Controller provides a single point of control and management for the cloud. The AVI Controller runs on a VM and can be managed using its web interface, CLI, or REST API, but in this case it is managed through Cloud Director. The AVI Controller stores and manages all policies related to services and management. To ensure high availability of the AVI Controllers, we need to deploy three AVI Controller nodes to create a 3-node AVI Controller cluster.

The deployment process and the cluster creation process are both covered in the NSX ALB documentation.

Create NSX-T Cloud inside NSX ALB (AVI) Controller

The NSX ALB (AVI) Controller uses APIs to interface with the NSX-T Manager and vCenter to discover the infrastructure. Here are the high-level activities to configure the NSX-T Cloud in the NSX ALB management console:

  1. Configure NSX-T manager IP/URL (One per Cloud)
  2. Provide admin credentials
  3. Select Transport zone (One to One Mapping – One TZ per Cloud)
  4. Select Logical Segment to use as SE Management Network
  5. Configure vCenter server IP/URL (One per Cloud)
  6. Provide Login username and password
  7. Select Content Library to push SE OVA into Content Library

Service Engine Groups & Configuration

Service Engines are created within a group, which contains the definition of how the SEs should be sized, placed, and made highly available. Each cloud will have at least one SE group.

  1. SE Groups contain sizing, scaling, placement and HA properties
  2. A new SE will be created from the SE Group properties
  3. SE Group options will vary based upon the cloud type
  4. An SE is always a member of the group it was created within, in this case the NSX-T Cloud
  5. Each SE group is an isolation domain
  6. Apps may gracefully migrate, scale, or fail over across SEs in the group

Service Engine High Availability:

Active/Standby

  1. VS is active on one SE, standby on another
  2. No VS scaleout support
  3. Primarily for default gateway / non-SNAT app support
  4. Fastest failover, but half of the SE resources are idle

Elastic N + M

  1. All SEs are active
  2. N = number of SEs a new Virtual Service is scaled across
  3. M = the buffer, or number of failures the group can sustain
  4. SE failover decision determined at time of failure
  5. Session replication done after new SE is chosen
  6. Slower failover, lower SE resource requirement

Elastic Active / Active

  1. All SEs are active
  2. Virtual Services must be scaled across at least 2 Service Engines
  3. Session info proactively replicated to other scaled Service Engines
  4. Faster failover, requires more SE resources

Cloud Director Configuration

Cloud Director configuration is twofold: provider configuration and tenant configuration. Let’s cover the provider configuration first.

Provider Configuration

Register AVI Controller: The provider administrator logs in as an admin and registers the AVI Controller with Cloud Director. The provider has the option to add multiple AVI Controllers.

NOTE – If you are registering with NSX ALB’s default self-signed certificate and it throws an error during registration, regenerate the self-signed certificate in NSX ALB.

Register NSX-T cloud

Next, we need to register the NSX-T Cloud, which we configured in the ALB Controller, with Cloud Director:

  1. Select one of the registered AVI Controllers
  2. Provide a meaningful name to the controller
  3. Select the NSX-T Cloud that we registered in AVI
  4. Click ADD.

Assign Service Engine groups

Now register Service Engine groups as either “Dedicated” or “Shared” based on tenant requirements. A provider can also have both types of groups and assign them to tenants as needed.

  1. Select the NSX-T Cloud that we registered above
  2. Select the “Reservation Model”
    1. Dedicated Reservation Model: AVI creates two Service Engine nodes for each LB-enabled tenant Org VDC Edge Gateway.
    2. Shared Reservation Model: Shared is elastic and shared among all tenants. AVI creates a pool of Service Engines that are shared across tenants. Capacity allocation is managed in VCD, and AVI elastically deploys and undeploys Service Engines based on usage.

Provider Enables and Allocates resources to Tenant

The provider enables LB functionality in the context of the Org VDC Edge by following the steps below:

  1. Click Edges
  2. Choose the Edge on which you want to enable load balancing
  3. Go to “Load Balancer” and click “General Settings”
  4. Click “Edit”
  5. Turn on the Activate toggle to activate the load balancer
  6. Select the Service Specification

The next step is to assign Service Engines to the tenant based on requirements. Go to Service Engine Group, click “ADD”, and add one of the previously registered SE groups to one of the customer’s Edges.

Provider can restrict usage of Service Engines by configuring:

  1. Maximum Allowed: The maximum number of virtual services the Edge Gateway is allowed to use.
  2. Reserved: The number of guaranteed virtual services available to the Edge Gateway.
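
As a way to reason about these two settings, the sketch below models one Service Engine group assignment on an Edge Gateway. It is an illustration only: the class and its methods are hypothetical, and the rules that Reserved cannot exceed Maximum Allowed and that the reservations of all gateways should fit in the group's capacity are assumptions made for the example, not documented Cloud Director behaviour.

```python
from dataclasses import dataclass


@dataclass
class SeGroupAssignment:
    """One SE group assigned to an Edge Gateway (hypothetical model)."""
    max_allowed: int   # Maximum Allowed virtual services
    reserved: int      # Reserved (guaranteed) virtual services
    in_use: int = 0    # virtual services the tenant has already created

    def validate(self) -> None:
        if self.max_allowed < 0 or self.reserved < 0:
            raise ValueError("values must not be negative")
        # Assumption: a guarantee larger than the ceiling makes no sense.
        if self.reserved > self.max_allowed:
            raise ValueError("Reserved cannot exceed Maximum Allowed")

    def can_create_virtual_service(self) -> bool:
        """The tenant may add a virtual service only below the ceiling."""
        return self.in_use < self.max_allowed


def overcommitted(assignments, group_capacity: int) -> bool:
    """Assumption: the sum of guarantees should fit in the SE group's capacity."""
    return sum(a.reserved for a in assignments) > group_capacity
```

For instance, an assignment with Maximum Allowed 10 and Reserved 4 lets the tenant create a tenth virtual service but not an eleventh.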

Tenant User Self Service Configuration

Pools: Pools maintain the list of servers assigned to them and perform health monitoring, load balancing, persistence.

  1. Inside General Settings some of the key settings are:
    1. Provide Name of the Pool
    2. Load Balancing Algorithm
    3. Default Server Port
    4. Persistence
    5. Health Monitor
  2. Inside Members section:
    1. Add Virtual Machine IP addresses that need to be load balanced
    2. Define State, Port and Ratio
    3. SSL Settings allow SSL offload and Common Name Check
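
To show how the member State and Ratio settings influence traffic distribution, here is a minimal ratio-weighted round-robin sketch. It is a generic illustration of the idea, not how NSX ALB implements its load balancing algorithms, and the addresses are made up.

```python
from itertools import cycle


def ratio_round_robin(members):
    """Cycle through enabled members, repeating each one `ratio` times per round.

    members is a list of (address, ratio, enabled) tuples.
    """
    schedule = [
        address
        for address, ratio, enabled in members
        if enabled
        for _ in range(ratio)
    ]
    if not schedule:
        raise ValueError("no enabled pool members")
    return cycle(schedule)


if __name__ == "__main__":
    pool = [
        ("192.168.10.11", 2, True),   # receives twice the share of the next member
        ("192.168.10.12", 1, True),
        ("192.168.10.13", 1, False),  # disabled member receives no traffic
    ]
    picker = ratio_round_robin(pool)
    print([next(picker) for _ in range(6)])
```

A member with ratio 2 receives two of every three requests when paired with a ratio-1 member, and a disabled member is skipped entirely.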

Virtual Services: A virtual service advertises an IP address and ports to the external world and listens for client traffic. When a virtual service receives traffic, it may be configured to:

  1. Proxy the client’s network connection.
  2. Perform security, acceleration, load balancing, gather traffic statistics, and other tasks.
  3. Forward the client’s request data to the destination pool for load balancing.

The tenant chooses the Service Engine Group that the provider has assigned, then chooses the Load Balancer Pool created in the step above, and, most importantly, sets the Virtual IP. This IP address can be from the external IP range of the Org VDC, or, if you want an internal IP, you can use any IP.

In my example, I am running two virtual machines with Org VDC internal IP addresses, and the VIP is from the external public IP address range. If I browse to the VIP, I can reach the web servers successfully through the VCD/AVI integration.

This completes the basic integration and configuration of LBaaS using Cloud Director and NSX Advanced Load Balancer. Feel free to share feedback.