Custom Add Disk Day 2 in vRA Cloud / 8.x

Business Requirement

The customer wanted to add disks to existing workloads through a day 2 resource action that accepts custom user inputs.

Why can’t they use the OOB “Add Disk” resource action?

vRA Cloud provides an out-of-the-box (OOB) “Add Disk” resource action. However, there are several reasons a customer might prefer a custom day 2 resource action. A few factors are:

  • SCSI Selection: The customer doesn’t want to expose the SCSI selection to the end-users. They might want to do the SCSI allocation in the code.
  • Unit Number: End users are not expected to know these details, so they need to be hidden.
  • Constraint Tags: Do we really expect end users to know the constraint tags?
  • Custom Inputs: OOB can’t be modified to add custom user inputs.
OOB Add Disk Resource Action

Hence the need for writing a custom action.

Challenge with the custom day 2 action

When you trigger a day 2 action from a vRA resource (a vSphere resource in this case), the entire deployment goes into a locked state, preventing any further API operations on any of the resources in that deployment. When a custom vRO workflow then tries to make an API call to add the additional disk, the locking mechanism rejects the call and you get this error:

{"message":"Another request is already in progress.","statusCode":409,"errorCode":20009}
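
To make the failure mode concrete, here is a minimal sketch of how a workflow could recognise this specific error in a response body. The helper name `isDeploymentLocked` is hypothetical, and treating status code 409 together with error code 20009 as the signature of the lock is an assumption based only on the error shown above.

```javascript
// Hypothetical helper: returns true when a response body looks like the
// "Another request is already in progress" lock error shown above.
function isDeploymentLocked(responseBody) {
    var parsed;
    try {
        parsed = JSON.parse(responseBody);
    } catch (e) {
        return false; // Not JSON, so not the error we are looking for.
    }
    if (parsed === null || typeof parsed !== "object") {
        return false;
    }
    // Assumption: 409 plus errorCode 20009 identifies the deployment lock.
    return parsed.statusCode === 409 && parsed.errorCode === 20009;
}
```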

The below diagram summarises the issue.

Solution

I spent several hours troubleshooting this scenario (unaware that locking was the cause) and reached out to my peer group for suggestions. I was informed that my workflow would never run because of the above constraint. One of our Consulting Architects in the US, Sky Cooper, suggested a workaround he had built in his lab (thank you, Sky). I adapted his solution for the customer. Here are the details.

I created a workflow that executes as the day 2 action, and the actual disk add task runs as an asynchronous workflow started from within it. The main workflow therefore completes first, the request releases the deployment lock, and the Add Disk workflow then runs afterwards as an independent entity.

Request Data:

  1. Get the deployment, request and resource Id from the context information.
  2. Constraint tags can also be obtained from the metadata.
  3. Create the API Request Body.
  4. Create the custom user inputs JSON (to be used by Ansible later in the process).
  5. Trigger the ASYNC “Disk Request” workflow.
  6. Close the API call by completing the workflow.
//deploymentID, resourceID, requestId, content and diskProps are not declared here, so they are expected to be bound as outputs of this scriptable task.
//diskName, diskCapacity, mnt, lSize and fs are expected to be bound as inputs (the custom user inputs).

//Read the resource properties and the request ID from the day 2 request context.
var payload = System.getContext().getParameter("__metadata_resourceProperties");
deploymentID = payload.__deployment_id;
resourceID = payload.resourceId;
requestId = System.getContext().getParameter("__metadata_requestId");

//Calculate the SCSI controller ID with the disk number using a static JSON mapping.
var scsiController = System.getModule("com.fluffyclouds.vRACLOUD").getSCSIControllerByDeploymentId(deploymentID);

//Inputs required for the API call.
var inputsObject = {};
inputsObject.name = diskName;
inputsObject.capacityGb = diskCapacity;
inputsObject.encrypted = false;
inputsObject.persistent = false;
inputsObject.SCSIController = scsiController;
//Constraint tags come from the resource metadata, so the user never has to supply them.
inputsObject.constraints = payload["__vmw:provisioning:constraints.storage"];

//Update the API request body with inputs.
var requestObject = {};
requestObject.actionId = "Cloud.vSphere.Machine.Add.Disk";
requestObject.reason = "Day-2 Disk Add";
requestObject.inputs = inputsObject;
content = JSON.stringify(requestObject);

//Create JSON Object with all user inputs. It will be used by Ansible to format the disks.
var diskProperties = {};
diskProperties.mountpoint = mnt;
diskProperties.logicalVolumeSize = lSize;
diskProperties.fsType = fs;
diskProps = JSON.stringify(diskProperties);
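
Step 5, triggering the asynchronous workflow, is not part of the script above. A minimal sketch of that step follows, assuming the vRO scripting API in which `Server.getWorkflowWithId()` returns a workflow and `Workflow.execute()` starts it and returns a token without waiting for the run to finish. The function name, the workflow ID placeholder and the input names are hypothetical. `server` and `PropertiesClass` are passed in as parameters (inside vRO they would be the global `Server` and `Properties`) so that the logic can also be exercised outside vRO.

```javascript
// Sketch only: start the "Disk Request" workflow and return immediately.
// Inside vRO this would be called as:
//   startDiskRequestAsync(Server, Properties, "<workflow id>", { ... });
function startDiskRequestAsync(server, PropertiesClass, workflowId, inputs) {
    var workflow = server.getWorkflowWithId(workflowId);
    if (!workflow) {
        throw "Workflow not found: " + workflowId;
    }
    // Copy the plain object into a Properties instance, one key per workflow input.
    var props = new PropertiesClass();
    Object.keys(inputs).forEach(function (key) {
        props.put(key, inputs[key]);
    });
    // execute() hands back a token straight away; it does not block until the run ends.
    return workflow.execute(props);
}
```

Because the caller never waits on the returned token, the day 2 workflow can finish right after this call, which is what lets the request complete and release the deployment lock.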

Code for “getSCSIControllerByDeploymentId”

//Static mapping of disk number to SCSI controller.
var scsiAllocation = {
    "SCSI_Controller_0": [1, 2, 9, 10, 17, 18],
    "SCSI_Controller_1": [3, 4, 11, 12, 19, 20],
    "SCSI_Controller_2": [5, 6, 13, 14, 21, 22],
    "SCSI_Controller_3": [7, 8, 15, 16, 23, 24]
};
//List the disk resources that already exist in the deployment.
var extendedURL = "/deployment/api/deployments/" + deploymentId + "/resources?$filter=type%20eq%20'Cloud.vSphere.Disk'";
var diskInfo = System.getModule("com.fluffyclouds.vRACLOUD").executeRestOperationvRA("GET", "application/json", null, "application/json", extendedURL);
diskInfo = JSON.parse(diskInfo);
//Adding one to count for the OS disk
var diskLength = ((diskInfo.content).length + 1);
System.log("Number of existing disks: " + diskLength + ". Adding new disk to the disk count. Finding SCSI controller for disk # " + (diskLength + 1));
diskLength += 1;
//Find the controller whose list contains the new disk number.
var selectedSCSI = "";
var keys = Object.keys(scsiAllocation);
for (var i = 0; i < keys.length; i++) {
    if (scsiAllocation[keys[i]].indexOf(diskLength) > -1) {
        System.debug("Match found for: " + scsiAllocation[keys[i]] + " : " + keys[i]);
        selectedSCSI = keys[i];
        break;
    }
}
System.log("Selected SCSI Controller: " + selectedSCSI);
return selectedSCSI;
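
The static table above follows a regular pattern: disks are assigned in pairs, cycling through the four controllers. As a sketch of the same mapping without the table, the controller index can be computed directly. The function name `scsiControllerForDisk` is hypothetical, and the pair size of 2 and the controller count of 4 are read off the table rather than taken from any vRA requirement.

```javascript
// Hypothetical alternative to the static table. Disk numbers are 1-based.
function scsiControllerForDisk(diskNumber) {
    var disksPerTurn = 2;     // two consecutive disks per controller, as in the table
    var controllerCount = 4;  // SCSI_Controller_0 .. SCSI_Controller_3
    if (diskNumber < 1 || Math.floor(diskNumber) !== diskNumber) {
        throw "Disk number must be a positive integer: " + diskNumber;
    }
    var index = Math.floor((diskNumber - 1) / disksPerTurn) % controllerCount;
    return "SCSI_Controller_" + index;
}
```

For example, disk 11 gives floor(10 / 2) % 4 = 1, so `SCSI_Controller_1`, which matches the table. Unlike the table, the formula keeps cycling past disk 24, which may or may not be desirable.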

Disk Request (Asynchronous Workflow)

Custom Decision

In this task, I query the request ID to ensure the day 2 request has completed before proceeding to the disk add task. This ensures that the deployment is no longer locked.

var httpMethod = "GET";
var contentType = "application/json";
var acceptHeader = "application/json";
var extendedUrl = "/deployment/api/deployments/" + deploymentId + "/requests/" + requestId;
var completed = false;
var count = 0;
var maxRetries = 5;
while (!completed && (count < maxRetries)) {
    System.sleep(10000);
    var createOpResponse = System.getModule("com.fluffyClouds.vRACloud").executeRestOp(httpMethod, contentType, null, acceptHeader, extendedUrl);
    System.debug(createOpResponse);
    createOpResponse = JSON.parse(createOpResponse);
    if (createOpResponse.statusCode >= 400) {
        throw "Failed to retrieve request (" + createOpResponse.statusCode + " Error). Details: " + createOpResponse.responseString;
    }
    var status = createOpResponse.status;
    System.log("Disk Day2 vRA request status : " + status);
    if (status == "SUCCESSFUL") {
        completed = true;
    }
    count++;
}
return (completed);
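
The loop above only recognises `SUCCESSFUL`, so a request that fails still consumes every retry before the decision returns false. A sketch of a reusable polling helper that also stops on failure follows. The helper name `waitForRequest` is hypothetical, the status fetch and the sleep are passed in as functions so the logic can run outside vRO, and treating `FAILED` and `ABORTED` as terminal statuses is an assumption to verify against your vRA version.

```javascript
// Hypothetical helper: polls until the request reaches a terminal status or
// the retry limit is hit. Returns the last status seen (null if never polled).
function waitForRequest(getStatus, sleep, maxRetries, intervalMs) {
    // Assumption: a request in one of these statuses will not change again.
    var terminal = ["SUCCESSFUL", "FAILED", "ABORTED"];
    var status = null;
    for (var attempt = 0; attempt < maxRetries; attempt++) {
        sleep(intervalMs);
        status = getStatus();
        if (terminal.indexOf(status) > -1) {
            break;
        }
    }
    return status;
}
```

Inside vRO, `getStatus` could wrap the REST call and `JSON.parse` from the loop above, and `sleep` could be a small function that calls `System.sleep`. The decision would then return whether the result equals `"SUCCESSFUL"`.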

Add Disk

This scriptable task runs the main API call to add the disk. I wait for the operation to complete before moving on to the next task, which formats and mounts the disk using Ansible.

var httpMethod = "POST";
var contentType = "application/json";
var acceptHeader = "application/json";
var extendedUrl = "/deployment/api/deployments/" + deploymentId + "/resources/" + resourceId + "/requests";
var maxRetries = 50;
var count = 0;
var completed = false;

//Submit the Add Disk request.
var createOpResponse = System.getModule("com.fluffyClouds.vRA").executeRestOp(httpMethod, contentType, content, acceptHeader, extendedUrl);
System.debug("Rest Response: " + createOpResponse);
createOpResponse = JSON.parse(createOpResponse);
//Check for an error before reading the request ID from the response.
if (createOpResponse.statusCode >= 400) {
    throw "Failed to submit deployment action (" + createOpResponse.statusCode + " Error). Details: " + createOpResponse.responseString;
}
var diskRequestId = createOpResponse.id;

//Poll the new request every 10 seconds until it succeeds or the retries run out.
var requestURL = "/deployment/api/deployments/" + deploymentId + "/requests/" + diskRequestId;
while (!completed && (count < maxRetries)) {
    System.sleep(10000);
    var response = System.getModule("com.fluffyClouds.vRA").executeRestOp("GET", contentType, null, acceptHeader, requestURL);
    response = JSON.parse(response);
    var status = response.status;
    System.log("Add disk task status : " + status);
    if (status == "SUCCESSFUL") {
        completed = true;
    }
    count++;
}
//Stop here if the disk was not added, so the later tasks do not run against a missing disk.
if (!completed) {
    throw "Add disk request " + diskRequestId + " did not reach SUCCESSFUL after " + count + " checks.";
}

Conclusion

The above process is a simple workaround to beat resource locking. VMware might make the locking more granular in future releases, per resource instead of per deployment (no official confirmation yet, so don’t quote me on this). The only downside to this approach is that your day 2 request will always be reported as successful regardless of the disk add task status, but in my opinion, it’s a small price to pay to get the ball rolling.

This Post Has 9 Comments

  1. Jaspreet Singh Arora

    Very well explained 👏

  2. Ano Nymous

    Thank you for this very insightful article. VMware seems to take pleasure in making VRA customization more and more complex, maybe to push more pso engagements? The official documentation only touches on the basics, there is no mention of the locking mechanism, thanks to people like you clients do not have to spend days to search for a solution.
    This is getting ridiculous, when you compare how easy it was to configure resource actions in 7.x.

  3. craig

    Your code references a custom rest action. Can you post that as well?

    1. Barjinder Singh

      //Get token and rest host
      var refreshToken = //>
      var bearerToken = System.getModule("com.fluffyClouds.vRACloud").getBearerToken(refreshToken); //Get bearer token from refreshToken
      var restHost = System.getModule("com.fluffyClouds.vRACloud").getvRACloudRestHost(); //Get rest host with name or config element

      try {
          var request = restHost.createRequest(httpMethod, extendedUrl, content);
          request.contentType = contentType;
          request.setHeader("Accept", acceptHeader);
          request.setHeader("Authorization", "Bearer " + bearerToken);
          System.log("URL to execute: " + request.fullUrl);
          var response = request.execute();
          var statusCode = response.statusCode;
          System.debug("Status code: " + statusCode);
      } catch (e) {
          throw ("Error! executing rest operation. " + e);
      }
      return (response.contentAsString);

  4. Frank

    Hi,

    this is a really great help. Thank you very much for that article.
    Can you please also share the code for the action “getSCSIControllerByDeploymentId”?

    Best Regards,
    Frank

    1. Barjinder Singh

      Thank you for your comment, Frank. I have updated the article with “getSCSIControllerByDeploymentId” code. I use static mapping to allocate a SCSI controller based on the disk number. You could avoid static mapping and write your algo to dynamically assign a disk if required.

  5. aenagy

    What is the source for com.fluffyClouds.vRACloud/getBearerToken() ?

  6. RjW

    Thank you for this. Just starting on 8.6 coming from 7.6 and that OOTB Add Disk is a non-starter. The users won’t know half of what it means and our vSphere team won’t want them specifying most of it anyways.
