Shared storage for clusters with vRealize Orchestrator – Part 2

Introduction

This is the last article in my two-part blog series on creating shared storage for clustered VMs using Dell VNX eNAS storage. Please check part 1 of this series before continuing with this article. https://fluffyclouds.blog/2020/07/26/shared-storage-for-clusters-with-vrealize-orchestrator-part-1/

Solution Design

In the first article of this series, I explained how to create the base storage service with methods such as create/delete NFS exports and create/delete Qtrees. In this article I will elaborate on the business logic of this solution. There are a few key challenges and requirements that we will address:

  1. The workflow should be called as an EBS subscription as a part of VM Build.
  2. The storage operation should create a property on the VM containing the share path and size, which will be used by the chef build to install the application.
  3. There should be a guard rail to ensure that this workflow only gets executed if a specific property is passed.
  4. The operation can only run on one node of the cluster; if it runs on all the nodes, the storage operation will fail with a duplicate share name.
  5. VM decommission should automatically delete the shared storage.

Solution

I will try to address these issues as a part of my business logic. The business logic will follow the below structure.

  • Validate the initial data
  • Create NFS export
  • Delete NFS export
  • Update VM Properties
  • Property Defined
function VNXStorageService(virtualMachineProperties) {
  //#region VALIDATION
  this.validateInit = function () {
  }
  //Create a Qtree & NFS Export
  this.createExport = function (virtualMachineEntity) {
  }
  //Delete the NFS Export & Qtree
  this.deleteExport = function () {
  }
  //Update the share path, size and primary-node flag on the VM
  this.updateVMProperties = function (isNodePrimary, virtualMachineEntity) {
  }
  //Check if the property is defined
  this.propertyDefined = function (state, virtualMachineName, virtualMachineEntity) {
  }
  //#region init
  this.validateInit();
}
return VNXStorageService;

High Level request flow

Check workflow trigger

In this section we will check the following

  1. Is the trigger property defined in the payload?
  2. Is the node primary or secondary?
  3. Workflow state

  • Is the trigger property defined in the payload?

It’s a check we do to control the workflow execution before any EBS subscription. The intent is to only run the vRO subscription workflow if a specific property is passed with a specific value.

With the code below, we ensure that propDefined returns false unless the property is passed in EBS with the value true. We read the propDefined value in vRO and end the workflow if it's false. This answers the question "Is trigger property found" in the flow chart above.

this.iseNASPropertyDefined = function (state, virtualMachineName, virtualMachineEntity) {
  var propDefined = false;
  //Values that come from machine properties are strings, not booleans
  if (typeof fileShareEnabled === "string" && fileShareEnabled.toLowerCase() === "true") {
    System.log(enableFeature + " exists, value is: " + fileShareEnabled);
    propDefined = true;
  } else {
    propDefined = false;
    System.log(enableFeature + " value is: " + fileShareEnabled);
    System.warn("eNAS Share disabled. Use property " + enableFeature + " to enable.");
  }
  return propDefined;
}
//#region init
var enableFeature = "fluffy.enable.eNAS";
var fileShareEnabled = virtualMachineProperties.get(enableFeature);
  • Is the node primary or secondary?

We need this answer because the storage operation must run only on the primary node; if the workflow executes on both nodes, the member nodes will fail with a duplicate data error. We will again leverage the "propertyDefined" variable to decide whether the workflow should proceed. This was the tricky bit; luckily the customer was using a timestamp in the VM name, so I decided to use its numeric suffix: the node with the bigger numeric value is declared the primary node.
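In isolation, the election is just a numeric comparison of name suffixes. Here is a minimal standalone sketch of that idea (the node names below are invented, not from the customer's environment):

```javascript
// Minimal sketch of the primary-node election: the node whose name ends in
// the highest 6-character numeric suffix wins. Node names are illustrative.
function getPrimaryNode(clusterNodes) {
    // Sort the numeric suffixes in descending order
    var suffixes = clusterNodes.map(function (name) {
        return name.substr(name.length - 6);
    }).sort(function (a, b) { return Number(b) - Number(a); });
    var winner = suffixes[0];
    // Map the winning suffix back to the full node name
    return clusterNodes.filter(function (name) {
        return name.indexOf(winner) !== -1;
    })[0];
}

console.log(getPrimaryNode(["appvm200101", "appvm200102"])); // appvm200102
```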

Before we can inspect any property on all the VMs under the same deployment, we need the property entities of each VM. To get the property entities we will use the method below, leveraging the "__asd_correlationId" property to find all the machines under the deployment.

In the script below I pass "__asd_correlationId" as the property name and its value as the property value. The action returns an array of entities.

var iaasHost = //Get IaaS Host
var propEntities = findVmEntitiesWithProperty(iaasHost.id, propertyName);
var filteredpropEntities = [];
for each (var propertyEntity in propEntities) {
    if (propertyEntity.getProperty("PropertyValue") == propertyValue) {
        filteredpropEntities.push(propertyEntity.getLink(iaasHost, "VirtualMachine")[0]);
    }
}
return filteredpropEntities;

function findVmEntitiesWithProperty(hostId, name) {
	var modelName = "ManagementModelEntities.svc";
	var entitySetName = "VirtualMachineProperties";	
	var filter = new Properties();
	filter.put("PropertyName", name);
	return vCACEntityManager.readModelEntitiesByCustomFilter(hostId, modelName, entitySetName, filter, null);
}

Once we get the entities, we can read the VM names and compare them to find the superior node. Action name: getPrimaryDeploymentNode

// Start of action
var iaasHost = //GET IaaS host
var clusterNodes = [];
var numericNodeNames = [];
var virtualMachineEntities = //Get an array of entities from the above action
if (virtualMachineEntities.length > 0) {
    //Loop through each entity and grab the VM names for all machines in a deployment
    for each (var virtualMachineEntity in virtualMachineEntities) {
        var vmProps = /*<Write an action to get properties from entity></Write>*/(iaasHost, virtualMachineEntity);
        var memberNode = vmProps["VirtualMachineName"];
        clusterNodes.push(memberNode);
        System.debug("Member Node Name Found: " + memberNode);
    }
}
//Get an array of numeric node names
if (clusterNodes.length > 0) {
    for each (var nodeName in clusterNodes) {
        var numericName = nodeName.substr(nodeName.length - 6);
        numericNodeNames.push(numericName);
    }
} else {
    throw "Cluster node not found with correlationID";
}
//Sort descending to find the superior (highest) numeric node
var higherNumericNode = numericNodeNames.sort(function (a, b) { return Number(b) - Number(a); })[0];
//Get the actual node name for the superior node
var superiorNode = clusterNodes.filter(function (e) {
    return e.indexOf(higherNumericNode) != -1;
});
System.log("Primary node for the cluster is: " + superiorNode[0]);
return superiorNode[0];

The above action returns the primary node name based on a simple numeric comparison. Now let's jump back to our first action, where we decide under "iseNASPropertyDefined" whether the workflow should proceed.

In the next section, we will do the following:

  1. Check the workflow state; if it is "MachineProvisioned", check whether the node is primary, and if so set propDefined to true.
  2. If it is not the primary node, update the VM properties with the share name and size, and set propDefined to false.
  3. If the state is "Disposing", check for the primary node; if it is primary, proceed, else set propDefined to false.
//Elect the superior node with the numerically higher hostname value to decide whether the storage operation will run
if (propertyDefined) {
    if (state == "VMPSMasterWorkflow32.MachineProvisioned") {
        var superiorNodeName = System.getModule("com.fluffy.cluster").getPrimaryDeploymentNode(correlationId, <Value of correlationId></Value>);
        if (virtualMachineName == superiorNodeName) {
            //If the node has the higher weight, set the property defined to true
            System.log("Virtual Machine " + virtualMachineName + " has higher weight....proceeding with the storage operation..");
            propertyDefined = true;
        } else {
            System.log("Virtual Machine " + virtualMachineName + " has lower weight, storage operation will not run on this node....skipping the workflow after property update..");
            this.updateVMProperties(false, virtualMachineEntity);
            propertyDefined = false;
        }
    }
    else if (state == "VMPSMasterWorkflow32.Disposing") {
        //The primary cluster node property is only set on the primary node; machine properties are strings
        var isPrimaryNode = virtualMachineProperties.get("fluffy.isPrimaryClusterNode");
        if (typeof isPrimaryNode === "string" && isPrimaryNode.toLowerCase() === "true") {
            System.log("Node is a primary cluster node, proceeding with the deprovisioning operation");
            propertyDefined = true;
        } else {
            System.log("Virtual Machine " + virtualMachineName + " has lower weight, storage operation will not run on this node....skipping the workflow");
            propertyDefined = false;
        }
    }
    else {
        throw "Invalid workflow state to run this operation";
    }
}

Create NFS export

The basic operation to create the NFS export is to collect an array of all the member nodes' IP addresses, then create a qtree and an NFS mount with permissions for all the member node IPs.

I observed that some of my requests were failing if the storage array was busy, so I added a retry mechanism as a band-aid. The action below creates the NFS export with export permissions for the node IPs.

this.createNFSExport = function (virtualMachineEntity) {
    // Get parameters (retryLimit & retryInterval come from configuration)
    var timesTried = 0;
    var errors = [];
    var storageService = System.getModule("com.fluffy.cluster").eNASService();
    var vnxObj = new storageService(site, env, path);
    //Get NFS export path names
    var qtreeCreated = false;
    var nfsExportCreated = false;
    //Get an array of IP addresses
    var propertyArray = System.getModule("com.fluffy.cluster").getPropertyForAllDeploymentNodes(correlationId, "VirtualMachine.Network0.Address");
    //Get the node IPs from the object
    var memberIPs = [];
    for each (var propValue in propertyArray) {
        memberIPs.push(propValue.propertyValue);
    }
    while (timesTried < retryLimit) {
        try {
            //Create two qtrees; the sizes (in KB) are hardcoded
            qtreeCreated = vnxObj.createQtree(20000, 500000);
            //Create NFS exports for the qtrees
            nfsExportCreated = vnxObj.createNFSExport(memberIPs);
            //Break the loop if the mount was created
            if (qtreeCreated && nfsExportCreated) {
                System.debug("qtreeCreated: " + qtreeCreated + "\n nfsExportCreated: " + nfsExportCreated);
                break;
            } else {
                System.debug("qtreeCreated: " + qtreeCreated + "\n nfsExportCreated: " + nfsExportCreated + "\n Retrying the operation");
                timesTried++;
            }
        } catch (ex) {
            timesTried++;
            System.error(ex);
            errors.push("timesTried : " + timesTried + "\n Error: " + ex);
            System.log("The operation will be retried. Attempts till now : " + timesTried);
            System.sleep(retryInterval);
        }
    }
    if (qtreeCreated && nfsExportCreated) {
        this.updateVMProperties(true, virtualMachineEntity);
    } else {
        throw ("Error creating QTree and NFS Exports: " + JSON.stringify(errors));
    }
}

Update VM Properties

I will not go into the details of updating VM properties in this article. I use this step to record the NFS share names and size on the VM, and I also add a new property that classifies the VM as the primary node, which helps with the decommission logic.
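As a rough illustration, the update could assemble a property map like the one below. The share-related property names are assumptions for this sketch (only "fluffy.isPrimaryClusterNode" appears earlier in this post), and the actual vRA IaaS entity write is omitted:

```javascript
// Hedged sketch: what updateVMProperties(isNodePrimary, vmEntity) might write.
// Share property names are illustrative; the vRA entity update is not shown.
// Machine properties are stored as strings, hence the String() conversions.
function buildVmPropertyUpdates(isNodePrimary, sharePath, shareSizeKB) {
    var updates = {
        "fluffy.eNAS.sharePath": sharePath,            // consumed by the chef build
        "fluffy.eNAS.shareSizeKB": String(shareSizeKB)
    };
    if (isNodePrimary) {
        //Lets the decommission flow identify the node that owns the export
        updates["fluffy.isPrimaryClusterNode"] = "true";
    }
    return updates;
}

console.log(JSON.stringify(buildVmPropertyUpdates(true, "/fs01/app_qtree", 500000)));
```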

Delete NFS Export

This function executes when the state is disposing; I leverage my core storage service to perform the delete operation. Just like create, the delete operation has a retry mechanism.

this.deleteNFSExport = function () {
    // Get parameters (retryLimit & retryInterval come from configuration)
    var timesTried = 0;
    var errors = [];
    var qtreeDeleted = false;
    var nfsExportDeleted = false;
    //Create the eNAS service object
    var storageService = System.getModule("com.fluffy.cluster").eNASService();
    var vnxObj = new storageService(site, env, path);
    //Retry loop in case the delete operation fails
    while (timesTried < retryLimit) {
        try {
            //Delete the two qtrees
            qtreeDeleted = vnxObj.deleteQtree();
            //Delete the NFS exports for the qtrees
            nfsExportDeleted = vnxObj.deleteNFSExport();
            //Break the loop if the mount was deleted
            if (qtreeDeleted && nfsExportDeleted) {
                System.debug("Q Tree Deleted: " + qtreeDeleted + "\n nfs Export Deleted: " + nfsExportDeleted);
                break;
            } else {
                timesTried++;
            }
        } catch (ex) {
            timesTried++;
            System.error(ex);
            errors.push("timesTried : " + timesTried + "\n Error: " + ex);
            System.log("The operation will be retried. Attempts till now : " + timesTried);
            System.sleep(retryInterval);
        }
    }
    //Throw unless both delete operations succeeded
    if (!(qtreeDeleted && nfsExportDeleted)) {
        throw ("Error deleting QTree and NFS Exports :" + JSON.stringify(errors));
    }
}
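Since create and delete share the same retry shape, the loop could be factored into a small reusable helper. Below is a sketch of that idea in plain JavaScript; the helper name is my own, and in vRO the sleep callback would be System.sleep:

```javascript
// Sketch of factoring the shared retry loop out of create/delete.
// "operation" returns true on success; sleepFn abstracts System.sleep
// so the helper stays testable outside vRO.
function withRetry(operation, retryLimit, retryIntervalMs, sleepFn) {
    var errors = [];
    for (var attempt = 1; attempt <= retryLimit; attempt++) {
        try {
            if (operation()) {
                return true; // success: stop retrying
            }
        } catch (ex) {
            errors.push("attempt " + attempt + ": " + ex);
            sleepFn(retryIntervalMs);
        }
    }
    throw "Operation failed after " + retryLimit + " attempts: " + JSON.stringify(errors);
}
```

With this in place, createNFSExport and deleteNFSExport would each pass their storage calls as the operation callback instead of carrying their own while loop.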

Conclusion

I believe this framework can apply to many other cluster workload scenarios as well. I thoroughly enjoyed working on this solution. Please feel free to contact me if you have any questions or comments.
