Dynamic Enforcement for Network Selection in vRA 7

December 22, 2016

I’m working on the design and deployment of a large Enterprise Cloud project where multi-tenancy is vital to the project’s success. In a private cloud project, a common shared infrastructure is the typical approach to make the business case feasible. Physical segregation is now usually driven by security requirements rather than by who purchased the hardware. Multi-tenancy segregation moves up to the logical level, making the infrastructure a shared commodity.

From a network standpoint, logical segregation is supported by technologies such as VLAN, PVLAN, GRE and VXLAN. Every time a virtual machine is provisioned through vRealize Automation, it needs network access to communicate with other workloads.

When vSphere is the endpoint in vRA, the NIC(s) of the virtual machine are connected to one of the port groups available within the reservation where the machine is being provisioned. The first level of logical segregation in vRA is the business group, which can have one or more reservations (the second level). A user who is not a member of the business group linked to the reservation(s) cannot provision any virtual machine on them.

Source: VMware.com

The reservation defines which network(s) users can connect their virtual machines to, but these networks are not populated dynamically in vRA. The easy approach is to create a custom property called “VirtualMachine.NetworkN.Name”, where N is the index of the virtual machine NIC (starting at 0). When you create this custom property, you can define a static list, or a dynamic one using any script action available in the vRealize Orchestrator platform that vRA consumes.
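
As a small illustration of the naming pattern, the sketch below builds the property name for a given NIC index. The helper name networkPropertyName is hypothetical and is not part of vRA or vRO; only the “VirtualMachine.NetworkN.Name” pattern comes from the text above.

```javascript
// Hypothetical helper (not a vRA/vRO API): build the custom property
// name for the NIC at a zero-based index.
function networkPropertyName(nicIndex) {
  if (!Number.isInteger(nicIndex) || nicIndex < 0) {
    throw new RangeError("NIC index must be a non-negative integer");
  }
  return "VirtualMachine.Network" + nicIndex + ".Name";
}

// Property names for a virtual machine with two NICs (indexes 0 and 1)
console.log([0, 1].map(networkPropertyName));
```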

The main challenge of a static list is maintaining the network list shown to the user; in addition, the same list is shared by all users. From a security standpoint this is not an issue, because the network must be enabled in the user’s reservation. If it is not, the user gets an error stating that the virtual machine cannot be provisioned because the reservation is not entitled to consume the network. While this does not involve a security breach, exposing the whole network list is not the best approach, since network names can contain sensitive information. For this reason, building the network list on the fly, based on the user’s business group and the reservations linked to it, is the best approach I’ve found so far. When you’re dealing with a multi-tenant environment, security requirements are always the most important piece of the design.

To achieve that, we are going to leverage a built-in vRO action available from version 7.1 called “getApplicableNetworks”. We will tweak this action a bit, since out of the box it shows all the networks the user is entitled to, regardless of which business group is requesting the virtual machine. To filter the networks by business group, we will create a custom property in the business group whose value is the business group name. We need this custom property because the business group is not exposed during the request wizard, so vRO cannot get its value using the ASD properties. To close this gap, the vRO action will include a string input populated with the value of the business group custom property.

vRO Configuration

Let’s start with the configuration of our dynamically enforced network list:

  • Copy the built-in vRO action to a new location. This step ensures that future releases of the vRA plug-in do not break anything. The copy is also the action we will modify to include the filter that retrieves only the reservation(s) of the business group the requester is entitled to.

Copy the vRO script action “getApplicableNetworks”

In vRealize Orchestrator version 7.2, VMware changed the action code: the dependency on the action called “getReservationsForUser” in 7.1 has been replaced by “getReservationsForUserAndComponent”. The issue is that “getApplicableNetworks” no longer works in 7.2, because the script was not updated to pass the additional inputs that “getReservationsForUserAndComponent” requires.

If you are using vRO 7.1, you can skip this step.

  • Duplicate the current “getReservationsForUserAndComponent” action

Duplicate the “getReservationsForUserAndComponent” action

  • Downgrade to version 1.0. The action name will be changed to “getReservationsForUser”.

Downgrade the copied action to version 1.0 (getReservationsForUser)

  • Edit the copied action and add a string input called “subtenant”

Add a string input

  • Go to the script tab and update the 4th line with the folder and name of the action you duplicated and downgraded above.

var reservations = System.getModule("com.joseluisgomez.vra.reservations").getReservationsForUser(user, tenant, host);

  • Between the 4th line (var reservations) and the 5th line (var applicableNetworks), add the following line to look up the business group using the custom property value passed from vRA.

var subtenants = vCACCAFEEntitiesFinder.findSubtenants(host, subtenant);

Code updated (lines 4th and 5th)

  • The last step with the action is to add a conditional. It checks, across all the gathered reservations, whether the reservation’s business group (subtenant) matches the input populated from vRA. Add the “if” line right after the “for each” line, and add a closing curly bracket to close the conditional. You can also indent the existing code one more level to keep it aligned. The new code is the “if” line and its closing bracket.

for each(var res in reservations) {
 // New: keep only reservations that belong to the requested business group
 if(res.getSubTenantId() == subtenants[0].id){
  var extensionData = res.getExtensionData();
  if(extensionData) {
   var networks = extensionData.get("reservationNetworks");
   if(networks) {
    for each(var network in networks.getValue()) {
     var path = network.getValue().get("networkPath");
     // Use the network label as both key and display value
     applicableNetworks.put(path.label, path.label);
    }
   }
  }
 } // New: closes the business group conditional
}
return applicableNetworks;

Conditional to add
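
To make the effect of the conditional easier to follow outside vRO, here is a minimal sketch of the same filtering idea in plain JavaScript. It uses mock objects instead of real vRA reservations: the function name networksForSubtenant, the fields subTenantId and networkLabels, and the ids "bg-1" and "bg-2" are all assumptions for illustration. Only the network names are taken from the Result section of this post.

```javascript
// Plain-JavaScript model of the filter. The objects below are mock data,
// not real vRA reservation objects, and the field names are hypothetical.
function networksForSubtenant(reservations, subtenantId) {
  const applicable = {};
  for (const res of reservations) {
    // Skip reservations that belong to other business groups
    if (res.subTenantId !== subtenantId) continue;
    for (const label of res.networkLabels || []) {
      // Key and display value are both the network label
      applicable[label] = label;
    }
  }
  return applicable;
}

const mockReservations = [
  { subTenantId: "bg-1", networkLabels: ["Main", "5008-Photon-Hosts"] },
  { subTenantId: "bg-2", networkLabels: ["5005-NTXCE-Mgmt", "5004-NTXCE-VMs"] },
];

// Only the networks of the first business group are returned:
// Main and 5008-Photon-Hosts
console.log(Object.keys(networksForSubtenant(mockReservations, "bg-1")));
```

With this shape, a requester entitled to both business groups still sees only the networks of the group the request is made for.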

We are done with vRO. Now it is time to consume this action from vRA and our blueprint.

vRA Configuration

  • Create a custom property for the business group with your preferred name. In my case, I’ve used the following one:
    • Property name. NGDC.Software.VMware.vRA.Subtenant.Name
    • Property value. The business group name (ex.: Tenant_Environment_Service_Index)

Custom Property with Business Group’s name as value

  • Create a custom property definition called “VirtualMachine.Network0.Name” with the following settings:
    • Name: VirtualMachine.Network0.Name
    • Label: Select a network
    • Visibility: All tenants
    • Display order: 1
    • Data type: string
    • Required: yes
    • Display as: Dropdown
    • Values: External values
    • Script action: Click the change button and select the script action called getApplicableNetworksBySubtenant, which is the copy we created in the previous steps.
    • Input parameters: Check the bind checkbox and type the business group custom property “NGDC.Software.VMware.vRA.Subtenant.Name” as the value. Note: Business group custom properties are not automatically discovered when you click the dropdown list, so you must type the property name.

  • The last step is to add the custom property definition you created in the step above to the blueprint. The blueprint doesn’t require any network object, just the vSphere virtual machine object with a virtual NIC added to it. The virtual NIC 0 must have the following custom property:
    • Name: VirtualMachine.Network0.Name
    • Value: Empty
    • Encrypted: No
    • Overridable: Yes
    • Show in Request: Yes

Blueprint custom property

Result

The best way to test our custom property definition is to create two business groups with one reservation each, make the same user the group manager of both, and create two service catalogue entitlements, one for each business group.

The next screenshot shows the first business group (Tenant1-Global). As you can see, this business group has two networks: a static dvPortGroup (Main) and an NSX VXLAN dvPortGroup (5008-Photon-Hosts).

Tenant1-Global business group dynamic enforced network list

The following screenshot shows the networks for the second business group (CORP_PROD_APP1_01). As you can see, this business group also has two networks, but different ones: two NSX VXLAN dvPortGroups (5005-NTXCE-Mgmt and 5004-NTXCE-VMs).

CORP_PROD_APP1_01 business group dynamic enforced network list

Conclusion

I hope you have found this useful and that it helps you with new and current deployments. This is the best approach I have found so far to support multi-tenancy based on business groups. It also works if multi-tenancy is implemented using the tenant functionality in vRA.

If you have liked it, don’t hesitate to share it with your contacts.

How to operate your Home Lab with a Raspberry Pi – Part 1

October 23, 2016

For a long time I have wanted to find a use for an old Raspberry Pi (Model B Revision 2.0 – 2011.12). Since I acquired my C6100 for home lab purposes, I have been aware that I couldn’t keep it powered on 24/7. From a noise and power consumption standpoint, it’s not the friendliest home lab you can buy. On the other hand, you get plenty of resources to run your workloads at a reduced cost.

With noise and power consumption as concerns, I knew I needed some way to remotely control the home lab and power it on or off whenever I needed to work on it, or run a demo for a customer from the customer’s premises.

With the requirements above, I found the Raspberry Pi to be the right device to support the following use cases:

  • VPN server
  • Dynamic DNS client
  • Control station to operate the remote controlled sockets
  • Control station to operate the home lab power state

The following diagram depicts how to operate your Home Lab with a Raspberry Pi using different components and software.

How to operate your home lab with a raspberry pi

VPN server

This use case will be covered in the second part of this post series. As a brief introduction, the VPN service will be deployed using an Ansible role I’ve created, pipoe2h.pivpn (https://galaxy.ansible.com/pipoe2h/pivpn/). This role installs and configures OpenVPN on your Raspberry Pi. You may be wondering why I don’t use pfSense: it does not support ARM.

Dynamic DNS client

This use case will be covered in the third part of this post series. As a brief introduction, this use case covers more than the configuration of a dynamic DNS client. The idea is to run your own Dynamic DNS service if your web hosting runs cPanel. If it does, you can create your own DynDNS service and keep access to your home lab alive wherever you are. The DynDNS service will be deployed using an Ansible role I’ve created, pipoe2h.piddns (https://galaxy.ansible.com/pipoe2h/piddns/). This role installs and configures a PHP page on your website as the entry point to dynamically update your home lab DNS record. The DynDNS client is modified to support the integration with your own DynDNS service.

Control station

The Raspberry Pi can be the only machine that stays powered on, which reduces power consumption. You can use the Raspi as the jump box to operate your entire home lab.

Operate the remote controlled sockets

This use case will be covered in the fourth part of this post series. As a brief introduction, since enterprise PDUs with a management interface to power your devices on/off are expensive, I found a cheaper way to get at least power on/off control. You can install a remote control board on your Raspi and, using remote controlled sockets, get an experience close to that of enterprise PDUs. I bought the Energenie kit ENER002-2PI for £22.

Operate the home lab power state

This use case will be covered in the fifth part of this post series. As a brief introduction, once you have switched the socket on, you can use IPMI or WOL to power on your server(s). I’ll share the PowerCLI script I’ve created to power on/off your ESXi hosts and the virtual machines within.
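
Until that part is published, here is a minimal sketch of the WOL side only, written in JavaScript for Node.js rather than PowerCLI. It is not the script mentioned above. It builds the standard Wake-on-LAN magic packet (6 bytes of 0xFF followed by the target MAC address repeated 16 times) and sends it as a UDP broadcast. The function names, the example MAC address, the broadcast address and UDP port 9 are assumptions for illustration.

```javascript
const dgram = require("dgram");

// Build a Wake-on-LAN magic packet: 6 bytes of 0xFF followed by the
// target MAC address repeated 16 times (102 bytes in total).
function buildMagicPacket(mac) {
  const hex = mac.replace(/[:-]/g, "");
  if (!/^[0-9a-fA-F]{12}$/.test(hex)) {
    throw new Error("Invalid MAC address: " + mac);
  }
  const macBytes = Buffer.from(hex, "hex");
  return Buffer.concat([Buffer.alloc(6, 0xff), ...Array(16).fill(macBytes)]);
}

// Send the packet as a UDP broadcast. Port 9 is a common choice for WOL;
// adjust the broadcast address to match your home lab subnet.
function sendWakeOnLan(mac, broadcastAddress = "255.255.255.255", port = 9) {
  const socket = dgram.createSocket("udp4");
  const packet = buildMagicPacket(mac);
  socket.bind(() => {
    socket.setBroadcast(true);
    socket.send(packet, port, broadcastAddress, () => socket.close());
  });
}

// Example with a hypothetical MAC address:
// sendWakeOnLan("00:11:22:33:44:55");
```

The server NIC must have Wake-on-LAN enabled, and the Raspberry Pi must sit in the same broadcast domain as the server for the packet to arrive.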

ESXi 6.0.x host doesn’t register Cisco ACI’s ARP responses with Mellanox 10/40 Gb NICs and nmlx4_en driver loaded

August 8, 2016

I’m currently working on a project designing and delivering a private cloud platform based on VMware vRealize, with Cisco ACI as the SDN solution.

For almost two days we weren’t able to ping from the ESXi host (Mellanox) to its default gateway, which is provided by a subnet within the Cisco ACI Bridge Domain (BD). However, a physical Windows box (Broadcom) in the same EPG as the ESXi hosts was able to ping the same default gateway. This behavior was odd, since ping between members of the same EPG worked fine, both between ESXi hosts and with the physical Windows machine.

ACI

The first thought that comes to mind is that you’re missing some setting in ACI. Why? Because with SDN solutions the philosophy and logic behind networking change radically. Now you must know about multi-tenancy, bridge domains, endpoint groups, contracts and so on, so it’s really easy to miss something during the configuration.

Environment

  • ESXi host.
    • HP DL360 Gen9
    • Mellanox 10/40 Gb – MT27520 Family (affected with ARP bug)
      • NIC Driver info:
        • Driver: nmlx4_en
        • Firmware Version: 2.35.5100
        • Version: 3.1.0.0
  • Cisco ACI version 2.0(1n)
  • VMware ESXi 6.0.x
    • Update 1
    • Update 2
    • VMware and HPE OEM ISOs tested

Symptom

  • ESXi host doesn’t reach its default gateway (ACI BD IP).
  • Any traffic routed through the gateway doesn’t reach its destination.
  • ACI replies to the ARP request from ESXi, but the ESXi host doesn’t register the reply.

Tcpdump-uw on the ESXi host didn’t show the ACI responses. When we ran Wireshark on the physical machine, we could see ACI replying to the ARP requests from ESXi.

Resolution

After installing the latest version of the Mellanox driver available on the VMware website, the ESXi host began to see the ARP responses. These responses were registered, and communication from the ESXi hosts to the default gateway and other networks worked properly.

Troubleshooting Commands

The following commands were used to perform the troubleshooting from the ESXi host side.

# Display physical network adapter information (counters, ring and driver)
/usr/lib/vmware/vm-support/bin/nicinfo.sh

# Display ARP table
esxcli network ip neighbor list

# Display VMkernel network interfaces
esxcli network ip interface list

# Display the virtual switches
esxcli network vswitch standard list

# Verify port connection
nc -z IP Port

# Capture traffic
tcpdump-uw -vv
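
If you save the output of the ARP table command to a file, a small script can check whether the default gateway ever shows up. The sketch below is only an illustration: it assumes the neighbor IP address is the first whitespace-separated column of each row, and the sample text is made up for the example rather than captured from a real host. The function name hasNeighborEntry and the IP addresses are hypothetical.

```javascript
// Return true if the given IP appears as the first column of any line
// in the text output of "esxcli network ip neighbor list".
// Assumption: the neighbor IP is the first whitespace-separated field.
function hasNeighborEntry(output, ip) {
  return output
    .split(/\r?\n/)
    .some((line) => line.trim().split(/\s+/)[0] === ip);
}

// Illustrative input only; not captured from a real host.
const sample = [
  "Neighbor     Mac Address        Vmknic  Expiry    State  Type",
  "-----------  -----------------  ------  --------  -----  -------",
  "192.0.2.10   00:50:56:aa:bb:cc  vmk0    1100 sec         Unknown",
].join("\n");

// false: the table has an entry for 192.0.2.10 but none for the
// gateway 192.0.2.1, which matches the symptom described in this post
console.log(hasNeighborEntry(sample, "192.0.2.1"));
```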

Nutanix .NEXT 2016 Conference Highlights

July 4, 2016

My opinion of the Nutanix .NEXT 2016 Conference is coming a bit late, but I had not had the chance until now to look at what Nutanix announced at the conference.

Let’s analyze what I consider the most exciting announced features coming in the next months. You can see the full list of announcements at Nutanix .NEXT 2016 Announcements: Innovation is Just a Click Away.

A Single Platform for All Workloads

Nutanix .NEXT 2016

Source: Nutanix.com – Single Nutanix Fabric

Nutanix goes a step further with the message it has carried since its beginnings, #NoSAN. Many workloads still run on physical servers because their resource requirements could jeopardize the performance of other virtual machines within the hyper-converged infrastructure. Those physical workloads require a SAN array, and until now Nutanix didn’t support block storage functionality out of the box. Even deploying VSA software on top of Nutanix to expose iSCSI targets was not feasible, as it could degrade the performance of the virtual workloads running on the platform.

Nowadays, with flash storage prices coming down and emerging technologies like NVMe being adopted by more and more vendors, it starts to make sense to leverage the unused IOPS and available space of the hyper-converged infrastructure and expose them to physical workloads. For this, Nutanix has developed a feature called Acropolis Block Services (ABS). This capability is planned to be available in the 4.7 release.

Acropolis Block Services

Based on the iSCSI protocol, customers can use it similarly to Amazon Elastic Block Store (EBS). I believe customers will look at this feature when they need to replace their SAN arrays. In addition, the distributed storage architecture is a plus from a reliability and performance standpoint. I love how easy it is to scale a distributed storage solution and how customers get more storage and performance in minutes.

Source: Nutanix.com – Acropolis Block Services

But this is not reason enough to replace a SAN array. Many SAN arrays are also NAS devices that provide file services like NFS and CIFS/SMB. What does Nutanix have to say about this? Nutanix already announced Acropolis File Services (AFS) in March 2016.

Source: Nutanix.com – Acropolis File Services

With both features, the new Acropolis Block Services and the recent Acropolis File Services, partners are now in a position to discuss with customers whether the replacement for their SAN array should be yet another array, or whether they can instead extend their current hyper-converged platform by deploying Nutanix storage nodes and using both features, ABS + AFS.

In my opinion, Nutanix still has one more step to take to close the storage cycle. I miss the capability to provide object storage. It’s funny, because the Nutanix Distributed File System (NDFS) is based on object storage, yet Nutanix doesn’t expose this feature. Developers could use the Nutanix platform like they use Amazon S3. That said, I don’t see many customers consuming object storage on premises.

All Flash on All Platforms

As I mentioned above, the price of flash storage is coming down, and this is an opportunity to include the technology across all platforms (we’re using all-flash home labs, why not customers?). The only all-flash appliance today is the NX-9000, but the new all-flash configurations for all platforms will be available this month.

I am not sure whether the all-flash option will also be available for the Nutanix Xpress platform.

Source: Nutanix.com – All-Flash Everywhere

Nutanix Self-Service

Many customers are looking to build their own private cloud using Cloud Management Platform (CMP) software, but for most of them it is enough, as a foundation, to provision virtual machines easily (IaaS). Customers who use the CMP just for virtual machine provisioning are wasting their investment, as the licensing model is usually CPU-based and the entire platform must be licensed.

Nutanix Self-Service will be a great feature and will help customers reduce their TCO, just as they are doing now with the adoption of Acropolis Hypervisor (AHV).

Source: Nutanix.com – Nutanix Self Service

Operational Tools

Operational teams love Nutanix for its simplicity. In my opinion it’s the Veeam or Rubrik of hyper-convergence. Nutanix is pushing its “Invisible Infrastructure” approach hard and I must say they’re doing a great job. The “One Click Everything” functionalities are brilliant, making life easy for operators.

I’m stunned by how powerful and friendly the analytics module is. It’s pretty fast at returning results in a readable format. You can also trigger operations from your search, which means you can remediate undesirable situations quickly and easily. Nutanix makes extensive use of machine learning to predict and anticipate operations.

The following functionalities around management and operations were announced:

  • The already mentioned Self-Service.
  • Capacity planning through scenario based modeling.
  • Network visualization.

Source: Nutanix.com – Nutanix Network Visualization

Acropolis Container Services

What differentiates Nutanix’s container offering from its competitors is the support for stateful applications. The Acropolis Distributed Storage Fabric provides persistent storage for containers through the Docker volume extension. The way Nutanix manages containers as virtual machines is not new; VMware already showed the same functionality almost a year ago.

Source: Nutanix.com – Acropolis Container Services

Conclusion of Nutanix .NEXT 2016 Conference

Exciting times are ahead for Nutanix’s customers, with all the new functionalities coming and more on the roadmap. Nutanix has a lot of room for improvement ahead, and if they keep going the way they are now, I’m sure they will be in the market for a long time and will provide solutions for customers that don’t want to move all their workloads to the public cloud.