
AHV Administration Guide

AHV 5.20

Product Release Date: 2021-05-17

Last updated: 2022-12-06

AHV Overview

As the default option for Nutanix HCI, the native Nutanix hypervisor, AHV, represents a unique approach to virtualization that offers the powerful virtualization capabilities needed to deploy and manage enterprise applications. AHV complements the HCI value by integrating native virtualization along with networking, infrastructure, and operations management through a single intuitive interface, Nutanix Prism.

Virtualization teams find AHV easy to learn and easy to transition to from legacy virtualization solutions, with familiar workflows for VM operations, live migration, VM high availability, and virtual network management. AHV includes resiliency features, including high availability and dynamic scheduling, without the need for additional licensing, and security is integral to every aspect of the system from the ground up. AHV also incorporates the optional Flow Security and Networking, allowing easy access to hypervisor-based network microsegmentation and advanced software-defined networking.

See the Field Installation Guide for information about how to deploy and create a cluster. Once you create the cluster by using Foundation, you can use this guide to perform day-to-day management tasks.

AOS and AHV Compatibility

For information about the AOS and AHV compatibility with this release, see the Compatibility and Interoperability Matrix.

Limitations

For information about AHV configuration limitations, see the Nutanix Configuration Maximums webpage.

Nested Virtualization

Nutanix does not support nested virtualization (nested VMs) in an AHV cluster.

Storage Overview

AHV uses a Distributed Storage Fabric to deliver data services such as storage provisioning, snapshots, clones, and data protection to VMs directly.

In AHV clusters, AOS passes all disks to the VMs as raw SCSI block devices. This keeps the I/O path lightweight and optimized. Each AHV host runs an iSCSI redirector, which establishes a highly resilient storage path from each VM to storage across the Nutanix cluster.

QEMU is configured with the iSCSI redirector as the iSCSI target portal. Upon a login request, the redirector performs an iSCSI login redirect to a healthy Stargate (preferably the local one).

Figure. AHV Storage

AHV Turbo

AHV Turbo represents significant advances to the data path in AHV. AHV Turbo provides an I/O path that bypasses QEMU and services storage I/O requests, which lowers CPU usage and increases the amount of storage I/O available to VMs.

When you use QEMU, all I/O travels through a single queue that can impact system performance. AHV Turbo provides an I/O path that uses a multi-queue approach to bypass QEMU, allowing data to flow from a VM to storage more efficiently. The result is much higher I/O capacity and lower CPU usage. The storage queues automatically scale out to match the number of vCPUs configured for a given VM, resulting in higher performance as the workload scales up.

AHV Turbo is transparent to VMs and is enabled by default on VMs running in AHV clusters. For maximum VM performance, ensure that the following conditions are met:

  • The latest Nutanix VirtIO package is installed for Windows VMs. For information on how to download and install the latest VirtIO package, see Installing or Upgrading Nutanix VirtIO for Windows.
    Note: No additional configuration is required at this stage.
  • The VM has more than one vCPU.
  • The workloads are multi-threaded.
Note: Multi-queue is enabled by default in current Linux distributions. For details, refer to the documentation for your Linux distribution.
In addition to the multi-queue approach for storage I/O, you can achieve maximum network I/O performance by using the multi-queue approach for any vNICs in the system. For information about how to enable multi-queue and set an optimum number of queues, see Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues.
Note: Ensure that the guest operating system fully supports multi-queue before you enable it. For details, refer to the documentation for your Linux distribution.
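From inside a Linux guest, you can get a rough confirmation that the multi-queue storage path is in use by counting the I/O queues each virtual disk exposes in sysfs. This is an illustrative sketch, not a Nutanix tool; the sd* device names are examples, and the sysfs layout assumes a blk-mq kernel:

```shell
# Count the I/O queues behind each SCSI disk in a Linux guest. More than
# one queue per disk suggests the multi-queue path is active. The sd*
# device names are examples; adjust for your VM.
for mqdir in /sys/block/sd*/mq; do
  [ -d "$mqdir" ] || continue          # skip if multi-queue is not exposed
  disk=$(basename "$(dirname "$mqdir")")
  echo "$disk: $(ls "$mqdir" | wc -l) queue(s)"
done
```

A single queue for a multi-vCPU VM suggests the guest kernel or its configuration is not using blk-mq; check the vendor documentation for your distribution.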

Acropolis Dynamic Scheduling in AHV

Acropolis Dynamic Scheduling (ADS) proactively monitors your cluster for any compute and storage I/O contentions or hotspots over a period of time. If ADS detects a problem, ADS creates a migration plan that eliminates hotspots in the cluster by migrating VMs from one host to another.

You can monitor VM migration tasks from the Task dashboard of the Prism Element web console.

Following are the advantages of ADS:

  • ADS improves the initial placement of the VMs depending on the VM configuration.
  • Nutanix Volumes uses ADS for balancing sessions of the externally available iSCSI targets.
Note: ADS honors all the configured host affinities, VM-host affinities, VM-VM antiaffinity policies, and HA policies.

By default, ADS is enabled and Nutanix recommends you keep this feature enabled. However, see Disabling Acropolis Dynamic Scheduling for information about how to disable the ADS feature. See Enabling Acropolis Dynamic Scheduling for information about how to enable the ADS feature if you previously disabled the feature.

ADS monitors the following resources:

  • VM CPU Utilization: Total CPU usage of each guest VM.
  • Storage CPU Utilization: Storage controller (Stargate) CPU usage per VM or iSCSI target.

ADS does not monitor memory and networking usage.

How Acropolis Dynamic Scheduling Works

Lazan is the ADS service in an AHV cluster. AOS selects a Lazan manager and Lazan solver among the hosts in the cluster to effectively manage ADS operations.

ADS performs the following tasks to resolve compute and storage I/O contentions or hotspots:

  • The Lazan manager gathers statistics from the components it monitors.
  • The Lazan solver (runner) checks the statistics for potential anomalies and determines how to resolve them, if possible.
  • The Lazan manager invokes the tasks (for example, VM migrations) to resolve the situation.
Note:
  • During migration, a VM consumes resources on both the source and destination hosts as the High Availability (HA) reservation algorithm must protect the VM on both hosts. If a migration fails due to lack of free resources, turn off some VMs so that migration is possible.
  • If a problem is detected and ADS cannot solve the issue (for example, because of limited CPU or storage resources), the migration plan might fail. In these cases, an alert is generated. Monitor these alerts from the Alerts dashboard of the Prism Element web console and take necessary remedial actions.
  • If a host, firmware, or AOS upgrade is in progress and any resource contention occurs during the upgrade period, ADS does not perform any resource contention rebalancing.

When Is a Hotspot Detected?

Lazan runs every 15 minutes and analyzes the resource usage for at least that period of time. If the resource utilization of an AHV host remains >85% for the span of 15 minutes, Lazan triggers migration tasks to remove the hotspot.

Note: For a storage hotspot, ADS looks at the last 40 minutes of data and uses a smoothing algorithm to use the most recent data. For a CPU hotspot, ADS looks at the last 10 minutes of data only, that is, the average CPU usage over the last 10 minutes.
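As a toy illustration of the threshold rule described above (not the actual Lazan implementation, and with invented sample values), flagging a host whose average utilization over the sampling window exceeds 85% looks like this:

```shell
# Average a window of utilization samples and flag a hotspot when the
# average exceeds 85%. The sample values are invented for illustration.
samples="88 91 86 90 87"
avg=$(echo "$samples" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%d", s / NF }')
if [ "$avg" -gt 85 ]; then
  echo "hotspot: average utilization ${avg}% exceeds 85%"
fi
```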

Following are the possible reasons why VMs might not migrate even when there is an obvious hotspot:

  • Lazan cannot resolve a hotspot. For example:
    • A huge VM (16 vCPUs) runs at 100% usage and accounts for 75% of the usage of its AHV host (which is also at 100% usage).
    • The other hosts are loaded at about 40% usage.

    In these situations, the other hosts cannot accommodate the large VM without causing contention there as well. Lazan does not prioritize one host or VM over others for contention, so it leaves the VM where it is hosted.

  • The number of all-flash nodes in the cluster is less than the replication factor.

    If the cluster has an RF2 configuration, the cluster must have a minimum of two all-flash nodes for successful migration of VMs on all the all-flash nodes.

Migrations Audit

Prism Central displays the list of all the VM migration operations generated by ADS. In Prism Central, go to Menu -> Activity -> Audits to display the VM migrations list. You can filter the migrations by clicking Filters and selecting Migrate in the Operation Type tab. The list displays all the VM migration tasks created by ADS with details such as the source and target host, VM name, and time of migration.

Disabling Acropolis Dynamic Scheduling

Perform the procedure described in this topic to disable ADS. Nutanix recommends you keep ADS enabled.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Disable ADS.
    nutanix@cvm$ acli ads.update enable=false

    After you disable the ADS feature, ADS takes no action to resolve contentions. You must take remedial actions manually, or re-enable the feature.

Enabling Acropolis Dynamic Scheduling

If you have disabled the ADS feature and want to enable the feature, perform the following procedure.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Enable ADS.
    nutanix@cvm$ acli ads.update enable=true

Virtualization Management Web Console Interface

You can manage the virtualization management features by using the Prism GUI (Prism Element and Prism Central web consoles).

You can do the following by using the Prism web consoles:

  • Configure network connections
  • Create virtual machines
  • Manage virtual machines (launch console, start/shut down, take snapshots, migrate, clone, update, and delete)
  • Monitor virtual machines
  • Enable VM high availability

See Prism Web Console Guide and Prism Central Guide for more information.

Viewing the AHV Version on Prism Element

You can see the AHV version installed in the Prism Element web console.

About this task

To view the AHV version installed on the host, do the following.

Procedure

  1. Log on to the Prism Element web console.
  2. The Hypervisor Summary widget on the top left side of the Home page displays the AHV version.
    Figure. LCM Page Displays AHV Version

Viewing the AHV Version on Prism Central

You can see the AHV version installed in the Prism Central console.

About this task

To view the AHV version installed on any host in the clusters managed by Prism Central, do the following.

Procedure

  1. Log on to Prism Central.
  2. In the sidebar, select Hardware > Hosts > Summary tab.
  3. Click the host whose hypervisor version you want to view.
  4. The Host detail view page displays the Properties widget, which lists the Hypervisor Version.
    Figure. Hypervisor Version in Host Detail View

Node Management

Nonconfigurable AHV Components

The components listed here are configured by the Nutanix manufacturing and installation processes. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts ( admin or nutanix ), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the use of third-party storage on hosts that are part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

AHV Settings

Nutanix AHV is a cluster-optimized hypervisor appliance.

Alteration of the hypervisor appliance (unless advised by Nutanix Technical Support) is unsupported and may result in the hypervisor or VMs functioning incorrectly.

Unsupported alterations include (but are not limited to):

  • Hypervisor configuration, including installed packages
  • Controller VM virtual hardware configuration file (.xml file). Each AOS version and upgrade includes a specific Controller VM virtual hardware configuration. Therefore, do not edit or otherwise modify the Controller VM virtual hardware configuration file.
  • iSCSI settings
  • Open vSwitch settings

  • Installation of third-party software not approved by Nutanix
  • Installation or upgrade of software packages from non-Nutanix sources (using yum, rpm, or similar)
  • Taking snapshots of the Controller VM
  • Creating user accounts on AHV hosts
  • Changing the timezone of the AHV hosts. By default, the timezone of an AHV host is set to UTC.
  • Joining AHV hosts to Active Directory or OpenLDAP domains

Controller VM Access

Although each host in a Nutanix cluster runs a hypervisor independent of other hosts in the cluster, some operations affect the entire cluster.

Most administrative functions of a Nutanix cluster can be performed through the web console (Prism); however, some management tasks require access to the Controller VM (CVM) over SSH. Nutanix recommends restricting CVM SSH access with password or key authentication.

This topic provides information about how to access the Controller VM as an admin user and nutanix user.

admin User Access

Use admin user access for all tasks and operations that you must perform on the Controller VM. As an admin user with default credentials, you cannot access nCLI; you must change the default password before you can use nCLI. Nutanix recommends that you do not create additional CVM user accounts. Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

For more information about admin user access, see Admin User Access to Controller VM.

nutanix User Access

Nutanix strongly recommends that you do not use the nutanix user access unless the procedure (as provided in a Nutanix Knowledge Base article or user guide) specifically requires the use of the nutanix user access.

For more information about nutanix user access, see Nutanix User Access to Controller VM.

You can perform most administrative functions of a Nutanix cluster through the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible and disabling Controller VM SSH access with password or key authentication. Some functions, however, require logging on to a Controller VM with SSH. Exercise caution whenever connecting directly to a Controller VM as it increases the risk of causing cluster issues.

Warning: When you connect to a Controller VM with SSH, ensure that the SSH client does not import or change any locale settings. The Nutanix software is not localized, and running the commands with any locale other than en_US.UTF-8 can cause severe cluster issues.

To check the locale used in an SSH session, run /usr/bin/locale. If any environment variables are set to anything other than en_US.UTF-8, reconnect with an SSH configuration that does not import or change any locale settings.
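A quick pre-flight check along these lines (a sketch, not a Nutanix tool) prints any locale variables that differ from en_US.UTF-8; empty output means the session is safe:

```shell
# Filter locale output, printing any variables that are not set to
# en_US.UTF-8; unset or empty variables are ignored.
check_locale() {
  grep '=' | grep -v 'en_US\.UTF-8' | grep -v '=\(""\)\{0,1\}$'
}

# Inspect the current session; no output means the locale is safe.
locale | check_locale || true
```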

Admin User Access to Controller VM

You can access the Controller VM as the admin user (admin user name and password) with SSH. For security reasons, the password of the admin user must meet the Controller VM Password Complexity Requirements. When you log on to the Controller VM as the admin user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As an admin user, you cannot access nCLI by using the default credentials. If you are logging in as the admin user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the admin user through nCLI. To change the default password of the admin user, you must log on through the Prism web console or SSH to the Controller VM.
  • When you make an attempt to log in to the Prism web console for the first time after you upgrade to AOS 5.1 from an earlier AOS version, you can use your existing admin user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the admin user, you must use the default admin user password (Nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements .
  • You cannot delete the admin user account.
  • The default password expiration age for the admin user is 60 days. You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS admin
    • nutanix@cvm$ sudo chage -m MIN-DAYS admin

When you change the admin user password, you must update any applications and scripts using the admin user credentials for authentication. Nutanix recommends that you create a user assigned with the admin role instead of using the admin user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials

Interface           Target                  User Name   Password
SSH client          Nutanix Controller VM   admin       Nutanix/4u
                                            nutanix     nutanix/4u
Prism web console   Nutanix Controller VM   admin       Nutanix/4u

Accessing the Controller VM Using the Admin User Account

About this task

Perform the following procedure to log on to the Controller VM by using the admin user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: admin
    • Password: Nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new admin user password.
    Changing password for admin.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See the requirements listed in Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the admin user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide.

Nutanix User Access to Controller VM

You can access the Controller VM as the nutanix user (nutanix user name and password) with SSH. For security reasons, the password of the nutanix user must meet the Controller VM Password Complexity Requirements. When you log on to the Controller VM as the nutanix user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As a nutanix user, you cannot access nCLI by using the default credentials. If you are logging in as the nutanix user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the nutanix user through nCLI. To change the default password of the nutanix user, you must log on through the Prism web console or SSH to the Controller VM.

  • When you make an attempt to log in to the Prism web console for the first time after you upgrade the AOS from an earlier AOS version, you can use your existing nutanix user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the nutanix user, you must use the default nutanix user password (nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements.

  • You cannot delete the nutanix user account.
  • You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS nutanix
    • nutanix@cvm$ sudo chage -m MIN-DAYS nutanix

When you change the nutanix user password, you must update any applications and scripts using the nutanix user credentials for authentication. Nutanix recommends that you create a user assigned with the nutanix role instead of using the nutanix user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials

Interface           Target                  User Name   Password
SSH client          Nutanix Controller VM   admin       Nutanix/4u
                                            nutanix     nutanix/4u
Prism web console   Nutanix Controller VM   admin       Nutanix/4u

Accessing the Controller VM Using the Nutanix User Account

About this task

Perform the following procedure to log on to the Controller VM by using the nutanix user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: nutanix
    • Password: nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new nutanix user password.
    Changing password for nutanix.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the nutanix user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide.

Controller VM Password Complexity Requirements

The password must meet the following complexity requirements:

  • At least eight characters long.
  • At least one lowercase letter.
  • At least one uppercase letter.
  • At least one number.
  • At least one special character.
    Note: Ensure that the following conditions are met when you use special characters in the CVM password:
    • Use special characters carefully when setting the CVM password. In some cases, for example, an exclamation point (!) followed by a number can trigger bash history expansion, and the system may substitute a command from the bash history. As a result, the password string that is set can differ from the password that you intended.
    • Use only ASCII printable characters as special characters in the CVM password. For information about ASCII printable characters, refer to the ASCII printable characters (character code 32-127) article on the ASCII code website.
  • At least four characters difference from the old password.
  • Must not be among the last 5 passwords.
  • Must not have more than 2 consecutive occurrences of a character.
  • Must not be longer than 199 characters.
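As a convenience, the character-composition rules above can be pre-checked in the shell before you submit a new password. This is an illustrative sketch only; the CVM itself enforces the full policy, including the password-history and old-password-difference rules, which a local check cannot see:

```shell
# Check a candidate password against the CVM character rules: at least
# 8 characters, one lowercase letter, one uppercase letter, one digit,
# one special character, and no character repeated more than twice in
# a row.
check_password() {
  p=$1
  [ "${#p}" -ge 8 ] || { echo "too short"; return 1; }
  case $p in *[a-z]*) ;; *) echo "needs a lowercase letter"; return 1 ;; esac
  case $p in *[A-Z]*) ;; *) echo "needs an uppercase letter"; return 1 ;; esac
  case $p in *[0-9]*) ;; *) echo "needs a digit"; return 1 ;; esac
  case $p in *[!a-zA-Z0-9]*) ;; *) echo "needs a special character"; return 1 ;; esac
  if printf '%s' "$p" | grep -q '\(.\)\1\1'; then
    echo "has more than 2 consecutive occurrences of a character"
    return 1
  fi
  echo "ok"
}

check_password 'Secur3!pass'           # prints: ok
check_password 'weakpass' || true      # prints: needs an uppercase letter
```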

AHV Host Access

You can perform most of the administrative functions of a Nutanix cluster using the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible. Some functions, however, require logging on to an AHV host with SSH.

Note: From AOS 5.15.5 with AHV 20190916.410 onwards, AHV has two new user accounts: admin and nutanix.

Nutanix provides the following users to access the AHV host:

  • root: Used internally by AOS and for the initial access and configuration of the AHV host.
  • admin: Used to log on to an AHV host. The admin user is the recommended account for accessing the AHV host.
  • nutanix: Used internally by AOS; it must not be used for interactive logon.

Exercise caution whenever connecting directly to an AHV host as it increases the risk of causing cluster issues.

Following are the default credentials to access an AHV host:

Table 1. AHV Host Credentials

Interface    Target     User Name   Password
SSH client   AHV host   root        nutanix/4u
                        admin       No default password. You must set it during the initial configuration.
                        nutanix     nutanix/4u

Initial Configuration

About this task

The AHV host ships with default passwords for the root and nutanix users, which you must change over SSH when you log on to the AHV host for the first time. After changing the default passwords and setting the admin password, use the admin user for all subsequent logins to the AHV host.

Perform the following procedure to change the default passwords and set the admin user password for the first time:
Note: Perform this initial configuration on all the AHV hosts.

Procedure

  1. Use SSH and log on to the AHV host using the root account.
    $ ssh root@<AHV Host IP Address>
    Nutanix AHV
    root@<AHV Host IP Address> password: # default password nutanix/4u
    
  2. Change the default root user password.
    root@ahv# passwd root
    Changing password for user root.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  3. Change the default nutanix user password.
    root@ahv# passwd nutanix
    Changing password for user nutanix.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  4. Change the admin user password.
    root@ahv# passwd admin
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    

Accessing the AHV Host Using the Admin Account

About this task

After setting the admin password in the Initial Configuration, use the admin user for all subsequent logins.

Perform the following procedure to log on to the AHV host by using the admin user with SSH for the first time.

Procedure

  1. Log on to the AHV host with SSH using the admin account.
    $ ssh admin@<AHV Host IP Address>
    Nutanix AHV
    
  2. Enter the admin user password configured in the Initial Configuration.
    admin@<AHV Host IP Address> password:
  3. Append sudo to the commands if privileged access is required.
    $ sudo ls /var/log

Changing Admin User Password

About this task

Perform these steps to change the admin password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Enter the admin user password configured in the Initial Configuration.
  3. Run the sudo command to change the admin user password.
    $ sudo passwd admin
  4. Respond to the prompts and provide the new password.
    [sudo] password for admin: 
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing the Root User Password

About this task

Perform these steps to change the root password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to the root user.
  3. Change the root password.
    root@ahv# passwd root
  4. Respond to the prompts and provide the current and new root password.
    Changing password for root.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing Nutanix User Password

About this task

Perform these steps to change the nutanix password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to the root user.
  3. Change the nutanix password.
    root@ahv# passwd nutanix
  4. Respond to the prompts and provide the current and new nutanix password.
    Changing password for nutanix.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

AHV Host Password Complexity Requirements

The password you choose must meet the following complexity requirements:

  • In configurations with high-security requirements, the password must contain:
    • At least 15 characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least eight characters different from the previous password.
    • At most three consecutive occurrences of any given character.
    • At most four consecutive occurrences of any given character class.

The password cannot be the same as the last 5 passwords.

  • In configurations without high-security requirements, the password must contain:
    • At least eight characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least three characters different from the previous password.
    • At most three consecutive occurrences of any given character.

The password cannot be the same as the last 5 passwords.

In both types of configuration, if a password for an account is entered three times unsuccessfully within a 15-minute period, the account is locked for 15 minutes.

Verifying the Cluster Health

Before you perform operations such as restarting a CVM or AHV host and putting an AHV host into maintenance mode, check if the cluster can tolerate a single-node failure.

Before you begin

Ensure that you are running the most recent version of NCC.

About this task

Note: If you see any critical alerts, resolve the issues by referring to the indicated KB articles. If you are unable to resolve any issues, contact Nutanix Support.

Perform the following steps to avoid unexpected downtime or performance issues.

Procedure

  1. Review and resolve any critical alerts. Do one of the following:
    • In the Prism Element web console, go to the Alerts page.
    • Log on to a Controller VM (CVM) with SSH and display the alerts.
      nutanix@cvm$ ncli alert ls
    Note: If you receive alerts indicating expired encryption certificates or a key manager is not reachable, resolve these issues before you shut down the cluster. If you do not resolve these issues, data loss of the cluster might occur.
  2. Verify if the cluster can tolerate a single-node failure. Do one of the following:
    • In the Prism Element web console, on the Home page, check the status of the Data Resiliency Status dashboard.

      Verify that the status is OK. If the status is anything other than OK, resolve the indicated issues before you perform any maintenance activity.

    • Log on to a Controller VM (CVM) with SSH and check the fault tolerance status of the cluster.
      nutanix@cvm$ ncli cluster get-domain-fault-tolerance-status type=node
      

      An output similar to the following is displayed:

      Domain Type               : NODE
          Component Type            : STATIC_CONFIGURATION
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:22:09 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ERASURE_CODE_STRIP_SIZE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : METADATA
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Mon Sep 28 14:35:25 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ZOOKEEPER
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Thu Sep 17 11:09:39 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : EXTENT_GROUPS
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : OPLOG
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : FREE_SPACE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:20:57 GMT+05:00 2015
      

      The value of Current Fault Tolerance must be at least 1 for every component type listed in the output.
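This check can be scripted by extracting every Current Fault Tolerance value and confirming that the minimum is at least 1. A minimal sketch, in which a here-document stands in for the live ncli output:

```shell
# Find the minimum "Current Fault Tolerance" value across all component types.
min_ft() {
  awk -F': *' '/Current Fault Tolerance/ { v = $2 + 0
    if (min == "" || v < min) min = v } END { print min }'
}

# The here-document stands in for
# `ncli cluster get-domain-fault-tolerance-status type=node` output.
min=$(min_ft <<'EOF'
    Domain Type               : NODE
    Component Type            : STATIC_CONFIGURATION
    Current Fault Tolerance   : 1

    Domain Type               : NODE
    Component Type            : METADATA
    Current Fault Tolerance   : 1
EOF
)

if [ "$min" -ge 1 ]; then
  echo "cluster can tolerate a single-node failure"
else
  echo "resolve resiliency issues before any maintenance activity"
fi
```

In practice, pipe the live command output into the parser instead of the sample text.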

Putting a Node into Maintenance Mode

You may be required to put a node into maintenance mode in certain situations, such as making changes to the network configuration of a node or performing manual firmware upgrades.

Before you begin

Caution: Verify the data resiliency status of your cluster. If the cluster has replication factor 2 (RF2), you can shut down only one node at a time. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.

About this task

When a host is in maintenance mode, AOS marks the host as unschedulable so that no new VM instances are created on it. Next, an attempt is made to evacuate VMs from the host.

If the evacuation attempt fails, the host remains in the "entering maintenance mode" state, where it is marked unschedulable, waiting for user remediation. You can shut down VMs on the host or move them to other nodes. Once the host has no more running VMs, it is in maintenance mode.

After the host exits maintenance mode, the VMs that were migrated off it are automatically returned to the original host, eliminating the need to move them back manually.

VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated to other hosts in the cluster. You can choose to shut down such VMs while putting the node into maintenance mode.

Agent VMs are always shut down when you put a node into maintenance mode, and they are powered on again after the node exits maintenance mode.

Perform the following steps to put the node into maintenance mode.

Procedure

  1. Use SSH to log on to a Controller VM in the cluster.
  2. Determine the IP address of the node that you want to put into maintenance mode.
    nutanix@cvm$ acli host.list

    Note the value of Hypervisor IP for the node that you want to put in maintenance mode.

  3. Put the node into maintenance mode.
    nutanix@cvm$ acli host.enter_maintenance_mode host-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
    Note: Never put the Controller VM or the AHV host into maintenance mode on single-node clusters. Nutanix recommends shutting down guest VMs before proceeding with disruptive changes.

    Replace host-IP-address with either the IP address or host name of the AHV host that you want to put into maintenance mode.

    The following are optional parameters for running the acli host.enter_maintenance_mode command:

    • wait: Set the wait parameter to true to wait for the host evacuation attempt to finish.
    • non_migratable_vm_action: By default, the non_migratable_vm_action parameter is set to block, which means that VMs with GPU, CPU passthrough, PCI passthrough, or host affinity policies are neither migrated nor shut down when you put a node into maintenance mode.

      If you want to automatically shut down such VMs, set the non_migratable_vm_action parameter to acpi_shutdown .

  4. Verify if the host is in the maintenance mode.
    nutanix@cvm$ acli host.get host-ip

    In the output that is displayed, ensure that node_state equals EnteredMaintenanceMode and schedulable equals False.

    Do not continue if the host has failed to enter the maintenance mode.

  5. See Verifying the Cluster Health to once again check if the cluster can tolerate a single-node failure.
  6. Determine the ID of the host.
    nutanix@cvm$ ncli host list

    An output similar to the following is displayed:

    Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
    Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
    Name                      : XXXXXXXXXXX-X 
    IPMI Address              : X.X.Z.3 
    Controller VM Address     : X.X.X.1 
    Hypervisor Address        : X.X.Y.2
    

    In this example, the host ID is 1234.

  7. Put the CVM into maintenance mode.
    nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=true

    Replace host-ID with the ID of the host that you determined in the previous step.

    This step prevents the CVM services from being affected by any connectivity issues.

    Wait for a few minutes until the CVM is put into maintenance mode.

  8. Verify if the CVM is in the maintenance mode.

    Run the following command on the CVM that you put in the maintenance mode.

    nutanix@cvm$ genesis status | grep -v "\[\]"

    An output similar to the following is displayed:

    nutanix@cvm$ genesis status | grep -v "\[\]"
    2021-09-24 05:28:03.827628: Services running on this node:
      genesis: [11189, 11390, 11414, 11415, 15671, 15672, 15673, 15676]
      scavenger: [27241, 27525, 27526, 27527]
      xmount: [25915, 26055, 26056, 26074]
      zookeeper: [13053, 13101, 13102, 13103, 13113, 13130]
    nutanix@cvm$ 

    Only the Genesis, Scavenger, Xmount, and Zookeeper processes should be running (the process IDs are displayed next to each process name).

    Do not continue if the CVM has failed to enter maintenance mode, because continuing can cause a service interruption.
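The step 8 check can be scripted by listing the running services and flagging anything beyond the four expected ones. A sketch in which the sample variable stands in for the genesis status output:

```shell
# Flag any service other than genesis, scavenger, xmount, and zookeeper.
# The sample text stands in for `genesis status | grep -v "\[\]"` output.
sample='2021-09-24 05:28:03.827628: Services running on this node:
  genesis: [11189, 11390, 11414, 11415]
  scavenger: [27241, 27525, 27526, 27527]
  xmount: [25915, 26055, 26056, 26074]
  zookeeper: [13053, 13101, 13102, 13103]'

unexpected=$(printf '%s\n' "$sample" |
  sed -n 's/^  \([a-z_]*\):.*/\1/p' |
  grep -Ev '^(genesis|scavenger|xmount|zookeeper)$' || true)

if [ -z "$unexpected" ]; then
  echo "CVM is in maintenance mode"
else
  echo "unexpected services still running: $unexpected"
fi
```

In practice, pipe the live genesis status output into the same filter instead of the sample text.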

What to do next

Perform the maintenance activity. Once the maintenance activity is complete, remove the node from the maintenance mode. See Exiting a Node from the Maintenance Mode for more information.

Exiting a Node from the Maintenance Mode

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

Perform the following to exit the host from the maintenance mode.

Procedure

  1. Remove the CVM from the maintenance mode.
    1. Determine the ID of the host.
      nutanix@cvm$ ncli host list

      An output similar to the following is displayed:

      Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
      Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
      Name                      : XXXXXXXXXXX-X 
      IPMI Address              : X.X.Z.3 
      Controller VM Address     : X.X.X.1 
      Hypervisor Address        : X.X.Y.2
      

      In this example, the host ID is 1234.

    2. From any other CVM in the cluster, run the following command to exit the CVM from the maintenance mode.
      nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=false

      Replace host-ID with the ID of the host.

      Note: The command fails if you run the command from the CVM that is in the maintenance mode.
    3. Verify that all processes on all the CVMs are in the UP state.
      nutanix@cvm$ cluster status | grep -v UP
    Do not continue if the CVM has failed to exit the maintenance mode.
  2. Remove the AHV host from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode.
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip 
      

      Replace host-ip with the IP address of the AHV host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify if the host has exited the maintenance mode.
      nutanix@cvm$ acli host.get host-ip 

      In the output that is displayed, ensure that node_state equals kAcropolisNormal or AcropolisNormal and schedulable equals True.

    Contact Nutanix Support if any of the steps described in this document produce unexpected results.
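The node_state checks in this procedure and in Putting a Node into Maintenance Mode can be scripted by parsing the acli host.get output. A minimal sketch, with sample strings standing in for live output (the exact field layout is an assumption based on the values shown in these procedures):

```shell
# Extract node_state from `acli host.get`-style output.
host_state() {
  printf '%s\n' "$1" | awk -F'"' '/node_state/ {print $2}'
}

# Sample fragments; in practice capture: out=$(acli host.get host-ip)
entered='node_state: "EnteredMaintenanceMode"
schedulable: False'
normal='node_state: "AcropolisNormal"
schedulable: True'

[ "$(host_state "$entered")" = "EnteredMaintenanceMode" ] && echo "in maintenance mode"
[ "$(host_state "$normal")" = "AcropolisNormal" ] && echo "back to normal"
```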

Shutting Down a Node in a Cluster (AHV)

Before you begin

  • Caution: Verify the data resiliency status of your cluster. If the cluster has replication factor 2 (RF2), you can shut down only one node at a time. If more than one node in an RF2 cluster must be shut down, shut down the entire cluster instead.

    See Verifying the Cluster Health to check if the cluster can tolerate a single-node failure. Do not proceed if the cluster cannot tolerate a single-node failure.

  • Put the node you want to shut down into maintenance mode.

    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.

    You can list all the hosts in the cluster by running the acli host.list command from any CVM, and note the value of Hypervisor IP for the node you want to shut down.

About this task

Perform the following procedure to shut down a node.

Procedure

  1. Using SSH, log on to the Controller VM on the host you want to shut down.
  2. Shut down the Controller VM.
    nutanix@cvm$ cvm_shutdown -P now

    Note: After the cvm_shutdown command is issued, it might take a few minutes before the CVM is powered off completely. After the cvm_shutdown command completes successfully, Nutanix recommends that you wait up to 4 minutes before shutting down the AHV host.
  3. Log on to the AHV host with SSH.
  4. Shut down the host.
    root@ahv# shutdown -h now

What to do next

See Starting a Node in a Cluster (AHV) for instructions about how to start a node, including how to start a CVM and how to exit a node from maintenance mode.

Starting a Node in a Cluster (AHV)

About this task

Procedure

  1. On the hardware appliance, power on the node. The CVM starts automatically when you reboot the node.
  2. If the node is in maintenance mode, log on (SSH) to the Controller VM and remove the node from maintenance mode.
    See Exiting a Node from the Maintenance Mode for more information.
  3. Log on to another CVM in the Nutanix cluster with SSH.
  4. Verify that all services on all the CVMs are in the Up state.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM: <host IP-Address> Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]

Shutting Down an AHV Cluster

You might need to shut down an AHV cluster to perform a maintenance activity or tasks such as relocating the hardware.

Before you begin

Ensure the following before you shut down the cluster.

  1. Upgrade to the most recent version of NCC.
  2. Log on to a Controller VM (CVM) with SSH and run the complete NCC health check.
    nutanix@cvm$ ncc health_checks run_all

    If you receive any failure or error messages, resolve those issues by referring to the KB articles indicated in the output of the NCC check results. If you are unable to resolve these issues, contact Nutanix Support.

    Warning: If you receive alerts indicating expired encryption certificates or an unreachable key manager, resolve these issues before you shut down the cluster. If you do not resolve these issues, the cluster might suffer data loss.

About this task

Shut down an AHV cluster in the following sequence.

Procedure

  1. Shut down the services or VMs associated with AOS features or Nutanix products. For example, shut down all the Nutanix file server VMs (FSVMs). See the documentation of those features or products for more information.
  2. Shut down all the guest VMs in the cluster in one of the following ways.
    • Shut down the guest VMs from within the guest OS.
    • Shut down the guest VMs by using the Prism Element web console.
    • If you are running many VMs, shut down the VMs by using aCLI:
    1. Log on to a CVM in the cluster with SSH.
    2. Shut down all the guest VMs in the cluster.
      nutanix@cvm$ for i in `acli vm.list power_state=on | awk '{print $1}' | grep -v NTNX` ; do acli vm.shutdown $i ; done
      
    3. Verify if all the guest VMs are shut down.
      nutanix@CVM$ acli vm.list power_state=on
    4. If any VMs are still on, consider powering them off from within the guest OS. To force a shutdown through AHV, run the following command:
      nutanix@cvm$ acli vm.off vm-name

      Replace vm-name with the name of the VM you want to shut down.

  3. Stop the Nutanix cluster.
    1. Log on to any CVM in the cluster with SSH.
    2. Stop the cluster.
      nutanix@cvm$ cluster stop
    3. Verify if the cluster services have stopped.
      nutanix@CVM$ cluster status

      The output displays the message The state of the cluster: stop, which confirms that the cluster has stopped.

      Note: Some system services continue to run even if the cluster has stopped.
  4. Shut down all the CVMs in the cluster. Log on to each CVM in the cluster with SSH and shut down that CVM.
    nutanix@cvm$ sudo shutdown -P now
  5. Shut down each node in the cluster. Perform the following steps for each node in the cluster.
    1. Log on to the IPMI web console of each node.
    2. Under Remote Control > Power Control , select Power Off Server - Orderly Shutdown to gracefully shut down the node.
    3. Ping each host to verify that all AHV hosts are shut down.
  6. Complete the maintenance activity or any other tasks.
  7. Start all the nodes in the cluster.
    1. Press the power button on the front of the block for each node.
    2. Log on to the IPMI web console of each node.
    3. On the System tab, check the Power Control status to verify if the node is powered on.
  8. Start the cluster.
    1. Wait for approximately 5 minutes after you start the last node to allow the cluster services to start.
      All CVMs start automatically after you start all the nodes.
    2. Log on to any CVM in the cluster with SSH.
    3. Start the cluster.
      nutanix@cvm$ cluster start
    4. Verify that all the cluster services are in the UP state.
      nutanix@cvm$ cluster status
    5. Start the guest VMs from within the guest OS or use the Prism Element web console.

      If you are running many VMs, start the VMs by using aCLI:

      nutanix@cvm$ for i in `acli vm.list power_state=off | awk '{print $1}' | grep -v NTNX` ; do acli vm.on $i; done
    6. Start the services or VMs associated with AOS features or Nutanix products. For example, start all the FSVMs. See the documentation of those features or products for more information.
    7. Verify if all guest VMs are powered on by using the Prism Element web console.

Rebooting an AHV Node in a Nutanix Cluster

About this task

The Request Reboot operation in the Prism web console gracefully restarts the selected nodes, including each local CVM one after the other.


Before you begin

  • Ensure that the cluster Data Resiliency status is OK in the Prism web console before any restart activity.
  • For successful automated restarts of hosts, ensure that the cluster has sufficient HA and resource capacity.
  • Ensure that the guest VMs can migrate between hosts as the hosts are placed in maintenance mode. If they cannot, manual intervention may be required.

Perform the following procedure to restart the nodes in the cluster.

Procedure

  1. Click the gear icon in the main menu and then select Reboot in the Settings page.
  2. In the Request Reboot window, select the nodes you want to restart, and click Reboot .
    Figure. Request Reboot of AHV Node

    A progress bar is displayed that indicates the progress of the restart of each node.

Changing CVM Memory Configuration (AHV)

About this task

You can increase the memory reserved for each Controller VM in your cluster by using the 1-click Controller VM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. See the Increasing the Controller VM Memory Size topic in the Prism Web Console Guide for CVM memory sizing recommendations and instructions about how to increase the CVM memory.

Changing the AHV Hostname

To change the name of an AHV host, log on to any Controller VM (CVM) in the cluster as admin or nutanix user and run the change_ahv_hostname script.

About this task

Perform the following procedure to change the name of an AHV host:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Change the hostname of the AHV host.
    • If you are logged in as nutanix user, run the following command:
      nutanix@cvm$ change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    • If you are logged in as admin user, run the following command:
      admin@cvm$ sudo change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    Note: The system prompts you to enter the admin user password if you run the change_ahv_hostname command with sudo.

    Replace host-IP-address with the IP address of the host whose name you want to change and new-host-name with the new hostname for the AHV host.

    Note: The new hostname must conform to the following naming conventions:
    • The maximum length is 63 characters.
    • Allowed characters are uppercase and lowercase letters (A-Z and a-z), decimal digits (0-9), dots (.), and hyphens (-).
    • The hostname must start and end with a letter or a digit.

    If you want to update the hostname of multiple hosts in the cluster, run the script for one host at a time (sequentially).

    Note: The Prism Element web console displays the new hostname after a few minutes.
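The naming conventions above can be checked before running the script. A minimal sketch; valid_ahv_hostname is an illustrative helper, not a Nutanix tool:

```shell
# Validate a proposed AHV hostname against the documented rules:
# at most 63 characters; letters, digits, dots, and hyphens only;
# must start and end with a letter or a digit.
valid_ahv_hostname() {
  local name="$1"
  [ "${#name}" -le 63 ] || return 1
  printf '%s\n' "$name" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$'
}

valid_ahv_hostname "-bad-name" || echo "invalid"
valid_ahv_hostname "ahv-node-01" && echo "valid"
```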

Changing the Name of the CVM Displayed in the Prism Web Console

You can change the CVM name that is displayed in the Prism web console. The procedure described in this document does not change the CVM name that is displayed in the terminal or console of an SSH session.

About this task

You can change the CVM name by using the change_cvm_display_name script. Run this script from a CVM other than the CVM whose name you want to change. When you run the change_cvm_display_name script, AOS performs the following steps:

    1. Checks if the new name starts with NTNX- and ends with -CVM. The CVM name must have only letters, numbers, and dashes (-).
    2. Checks if the CVM has received a shutdown token.
    3. Powers off the CVM. The script does not put the CVM or host into maintenance mode. Therefore, the VMs are not migrated from the host and continue to run with the I/O operations redirected to another CVM while the current CVM is in a powered off state.
    4. Changes the CVM name, enables autostart, and powers on the CVM.

Perform the following to change the CVM name displayed in the Prism web console.

Procedure

  1. Use SSH to log on to a CVM other than the CVM whose name you want to change.
  2. Change the name of the CVM.
    nutanix@cvm$ change_cvm_display_name --cvm_ip=CVM-IP --cvm_name=new-name

    Replace CVM-IP with the IP address of the CVM whose name you want to change and new-name with the new name for the CVM.

    The CVM name must have only letters, numbers, and dashes (-), and must start with NTNX- and end with -CVM.

    Note: Do not run this command from the CVM whose name you want to change, because the script powers off the CVM. In this case, when the CVM is powered off, you lose connectivity to the CVM from the SSH console and the script abruptly ends.
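The name check that the script performs can be reproduced locally before running it. A minimal sketch; valid_cvm_name is an illustrative helper, not a Nutanix tool:

```shell
# Check a proposed CVM display name: NTNX- prefix, -CVM suffix,
# and only letters, digits, and dashes in between.
valid_cvm_name() {
  printf '%s\n' "$1" | grep -Eq '^NTNX-[A-Za-z0-9-]+-CVM$'
}

valid_cvm_name "block1-CVM" || echo "invalid"
valid_cvm_name "NTNX-block1-node2-CVM" && echo "valid"
```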

Compute-Only Node Configuration (AHV Only)

A compute-only (CO) node allows you to seamlessly and efficiently expand the computing capacity (CPU and memory) of your AHV cluster. The Nutanix cluster uses the resources (CPUs and memory) of a CO node exclusively for computing purposes.

Note: Clusters that have compute-only nodes do not support virtual switches. Instead, use bridge configurations for network connections. For more information, see Virtual Switch Limitations.

You can use a supported server or an existing hyperconverged (HC) node as a CO node. To use a node as CO, image the node as CO by using Foundation and then add that node to the cluster by using the Prism Element web console. For more information about how to image a node as a CO node, see the Field Installation Guide.

Note: If you want an existing HC node that is already a part of the cluster to work as a CO node, remove that node from the cluster, image that node as CO by using Foundation, and add that node back to the cluster. For more information about how to remove a node, see Modifying a Cluster.

Key Features of Compute-Only Node

Following are the key features of CO nodes.

  • CO nodes do not have a Controller VM (CVM) and local storage.
  • AOS sources the storage for vDisks associated with VMs running on CO nodes from the hyperconverged (HC) nodes in the cluster.
  • You can seamlessly manage your VMs (CRUD operations, ADS, and HA) by using the Prism Element web console.
  • AHV runs on the local storage media of the CO node.
  • To update AHV on a cluster that contains a compute-only node, use the Life Cycle Manager. For more information, see the LCM Updates topic in the Life Cycle Manager Guide.

Use Case of Compute-Only Node

CO nodes enable you to achieve more control and value from restrictive licenses such as Oracle. A CO node is part of a Nutanix HC cluster, and there is no CVM running on the CO node (VMs use CVMs running on the HC nodes to access disks). As a result, licensed cores on the CO node are used only for the application VMs.

Applications or databases that are licensed on a per CPU core basis require the entire node to be licensed and that also includes the cores on which the CVM runs. With CO nodes, you get a much higher ROI on the purchase of your database licenses (such as Oracle and Microsoft SQL Server) since the CVM does not consume any compute resources.

Minimum Cluster Requirements

Following are the minimum cluster requirements for compute-only nodes.

  • The Nutanix cluster must be at least a three-node cluster before you add a compute-only node.

    However, Nutanix recommends that the cluster has four nodes before you add a compute-only node.

  • The ratio of compute-only to hyperconverged nodes in a cluster must not exceed the following:

    1 compute-only : 2 hyperconverged

  • All the hyperconverged nodes in the cluster must be all-flash nodes.
  • The number of vCPUs assigned to CVMs on the hyperconverged nodes must be greater than or equal to the total number of available cores on all the compute-only nodes in the cluster. The CVM requires a minimum of 12 vCPUs. For more information about how Foundation allocates memory and vCPUs to your platform model, see CVM vCPU and vRAM Allocation in the Field Installation Guide .
  • The total amount of NIC bandwidth allocated to all the hyperconverged nodes must be twice the amount of the total NIC bandwidth allocated to all the compute-only nodes in the cluster.

    Nutanix recommends you use dual 25 GbE on CO nodes and quad 25 GbE on an HC node serving storage to a CO node.

  • The AHV version of the compute-only node must be the same as the other nodes in the cluster.

    When you are adding a CO node to the cluster, AOS checks if the AHV version of the node matches with the AHV version of the existing nodes in the cluster. If there is a mismatch, the add node operation fails.
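The ratio and vCPU requirements above can be sanity-checked with simple arithmetic. A sketch with illustrative cluster numbers (the variable values are assumptions, not measurements):

```shell
# Sanity-check the compute-only minimums with illustrative values.
hc_nodes=4        # hyperconverged nodes in the cluster
co_nodes=2        # compute-only nodes
cvm_vcpus=48      # total vCPUs assigned to CVMs on the HC nodes
co_cores=48       # total cores available on the CO nodes

ok=yes
# The cluster must have at least three HC nodes (four recommended).
[ "$hc_nodes" -ge 3 ] || ok=no
# At most 1 CO node for every 2 HC nodes.
[ $((co_nodes * 2)) -le "$hc_nodes" ] || ok=no
# CVM vCPUs on the HC nodes must cover the total CO cores.
[ "$cvm_vcpus" -ge "$co_cores" ] || ok=no
echo "minimum requirements met: $ok"
```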

For general requirements about adding a node to a Nutanix cluster, see Expanding a Cluster.

Restrictions

Nutanix does not support the following features or tasks on a CO node in this release:

  1. Host boot disk replacement
  2. Network segmentation
  3. Virtual Switch configuration: Use bridge configurations instead.

Supported AOS Versions

Nutanix supports compute-only nodes on AOS releases 5.11 or later.

Supported Hardware Platforms

Compute-only nodes are supported on the following hardware platforms.

  • All the NX series hardware
  • Dell XC Core
  • Cisco UCS

Networking Configuration

To perform network tasks on a compute-only node, such as creating or modifying bridges, uplink bonds, or uplink load balancing, use the manage_ovs commands with the --host flag, as shown in the following example:

Note: If you have storage-only AHV nodes in clusters where the compute-only nodes run ESXi or Hyper-V, deployment of the default virtual switch vs0 fails. In such cases, the Prism Element, Prism Central, and CLI workflows for virtual switch management are unavailable to manage the bridges and bonds. Use the manage_ovs command options to manage the bridges and bonds.
nutanix@cvm$ manage_ovs --host IP_address_of_co_node --bridge_name bridge_name create_single_bridge

Replace IP_address_of_co_node with the IP address of the CO node and bridge_name with the name of bridge you want to create.

Note: Run the manage_ovs commands for a CO node from any CVM running on a hyperconverged node.

Perform the networking tasks for each CO node in the cluster individually.
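Because each CO node is configured individually, a small loop can generate the per-node commands. A sketch with illustrative IP addresses and bridge name; the commands are printed rather than executed:

```shell
# Build one manage_ovs invocation per compute-only node.
co_nodes="10.1.0.31 10.1.0.32"   # illustrative CO node host IPs
bridge=br1                        # illustrative bridge name

cmds=$(for ip in $co_nodes; do
  printf 'manage_ovs --host %s --bridge_name %s create_single_bridge\n' "$ip" "$bridge"
done)
printf '%s\n' "$cmds"
```

Review the generated commands, then run them one at a time from a CVM on a hyperconverged node.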

For more information about networking configuration of the AHV hosts, see Host Network Management in the AHV Administration Guide .

Adding a Compute-Only Node to an AHV Cluster

About this task

Perform the following procedure to add a compute-only node to a Nutanix cluster.

Procedure

  1. Log on to the Prism Element web console.
  2. Do one of the following:
    • Click the gear icon in the main menu and select Expand Cluster in the Settings page.
    • Go to the hardware dashboard (see Hardware Dashboard) and click Expand Cluster .
  3. In the Select Host screen, scroll down and, under Manual Host Discovery, click Discover Hosts Manually.
    Figure. Discover Hosts Manually

  4. Click Add Host .
    Figure. Add Host

  5. Under Host or CVM IP , type the IP address of the AHV host and click Save .
    This node does not have a Controller VM and you must therefore provide the IP address of the AHV host.
  6. Click Discover and Add Hosts .
    Prism Element discovers this node and the node appears in the list of nodes in the Select Host screen.
  7. Select the node to display the details of the compute-only node.
  8. Click Next .
  9. In the Configure Host screen, click Expand Cluster .

    The add node process begins and Prism Element performs a set of checks before the node is added to the cluster.

    Check the progress of the operation in the Tasks menu of the Prism Element web console. The operation takes approximately five to seven minutes to complete.

  10. Check the Hardware Diagram view to verify if the node is added to the cluster.
    You can identify a node as a CO node if the Prism Element web console does not display an IP address for the CVM.

Host Network Management

Network management in an AHV cluster consists of the following tasks:

  • Configuring Layer 2 switching through virtual switches and Open vSwitch bridges. When configuring a virtual switch, you configure bridges, bonds, and VLANs.
  • Optionally changing the IP address, netmask, and default gateway that were specified for the hosts during the imaging process.

Virtual Networks (Layer 2)

Each VM network interface is bound to a virtual network. Each virtual network is bound to a single VLAN; trunking VLANs to a virtual network is not supported. Networks are designated by the L2 type ( vlan ) and the VLAN number.

By default, each virtual network maps to virtual switch br0 . However, you can change this setting to map a virtual network to a custom virtual switch. The user is responsible for ensuring that the specified virtual switch exists on all hosts, and that the physical switch ports for the virtual switch uplinks are properly configured to receive VLAN-tagged traffic.

A VM NIC must be associated with a virtual network. You can change the virtual network of a vNIC without deleting and recreating the vNIC.

You can configure VM NICs in trunk mode to support applications that use trunk mode. For information about configuring virtual NICs in trunk mode, see Configuring a Virtual NIC to Operate in Access or Trunk Mode.

Managed Networks (Layer 3)

A virtual network can have an IPv4 configuration, but it is not required. A virtual network with an IPv4 configuration is a managed network ; one without an IPv4 configuration is an unmanaged network . A VLAN can have at most one managed network defined. If a virtual network is managed, every NIC is assigned an IPv4 address at creation time.

A managed network can optionally have one or more non-overlapping DHCP pools. Each pool must be entirely contained within the network's managed subnet.

If the managed network has a DHCP pool, the NIC is automatically assigned an IPv4 address from one of the pools at creation time, provided at least one address is available. Addresses in the DHCP pool are not reserved; that is, you can manually specify an address belonging to the pool when creating a virtual adapter. If the network has no DHCP pool, you must specify the IPv4 address manually.

All DHCP traffic on the network is rerouted to an internal DHCP server, which allocates IPv4 addresses. DHCP traffic on the virtual network (that is, between the guest VMs and the Controller VM) does not reach the physical network, and vice versa.
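The following sketch shows how a managed network with a DHCP pool might be created from a CVM by using aCLI. The network name, VLAN ID, gateway, and address ranges are hypothetical, and option names can vary between AOS releases; see the Command Reference for the authoritative syntax.

```shell
# Create a managed network on VLAN 100 with an IPv4 configuration
# (hypothetical values: ip_config carries the gateway and prefix length).
nutanix@cvm$ acli net.create vlan100-managed vlan=100 ip_config=10.10.100.1/24

# Add a non-overlapping DHCP pool contained entirely within the managed subnet.
nutanix@cvm$ acli net.add_dhcp_pool vlan100-managed start=10.10.100.50 end=10.10.100.200
```

Because the pool sits inside the managed subnet, NICs created on this network are automatically assigned addresses from 10.10.100.50-10.10.100.200 by the internal DHCP server.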

A network must be configured as managed or unmanaged when it is created. It is not possible to convert one to the other.

Figure. AHV Networking Architecture

Prerequisites for Configuring Networking

Change the configuration from the factory default to the recommended configuration. See AHV Networking Recommendations.

AHV Networking Recommendations

Nutanix recommends that you perform the following OVS configuration tasks from the Controller VM, as described in this documentation:

  • Viewing the network configuration
  • Configuring uplink bonds with desired interfaces using the Virtual Switch (VS) configurations.
  • Assigning the Controller VM to a VLAN

For performing other network configuration tasks such as adding an interface to a bridge and configuring LACP for the interfaces in a bond, follow the procedures described in the AHV Networking best practices documentation.

Nutanix recommends that you configure the network as follows:

Table 1. Recommended Network Configuration
Network Component Best Practice
Virtual Switch

Do not modify the OpenFlow tables of any bridges configured in any VS configurations in the AHV hosts.

Do not delete or rename OVS bridge br0.

Do not modify the native Linux bridge virbr0.

Switch Hops Nutanix nodes send storage replication traffic to each other in a distributed fashion over the top-of-rack network. One Nutanix node can, therefore, send replication traffic to any other Nutanix node in the cluster. The network should provide low and predictable latency for this traffic. Ensure that there are no more than three switches between any two Nutanix nodes in the same cluster.
Switch Fabric

A switch fabric is a single leaf-spine topology or all switches connected to the same switch aggregation layer. The Nutanix VLAN shares a common broadcast domain within the fabric. Connect all Nutanix nodes that form a cluster to the same switch fabric. Do not stretch a single Nutanix cluster across multiple, disconnected switch fabrics.

Every Nutanix node in a cluster should therefore be in the same L2 broadcast domain and share the same IP subnet.

WAN Links A WAN (wide area network) or metro link connects different physical sites over a distance. As an extension of the switch fabric requirement, do not place Nutanix nodes in the same cluster if they are separated by a WAN.
VLANs

Add the Controller VM and the AHV host to the same VLAN. Place all CVMs and AHV hosts in a cluster in the same VLAN. By default, the CVM and AHV host are untagged (shown as VLAN 0), which effectively places them on the native VLAN configured on the upstream physical switch.

Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.

Nutanix recommends configuring the CVM and hypervisor host VLAN as the native, or untagged, VLAN on the connected switch ports. This native VLAN configuration allows for easy node addition and cluster expansion. By default, new Nutanix nodes send and receive untagged traffic. If you use a tagged VLAN for the CVM and hypervisor hosts instead, you must configure that VLAN while provisioning the new node, before adding that node to the Nutanix cluster.

Use tagged VLANs for all guest VM traffic and add the required guest VM VLANs to all connected switch ports for hosts in the Nutanix cluster. Limit guest VLANs for guest VM traffic to the smallest number of physical switches and switch ports possible to reduce broadcast network traffic load. If a VLAN is no longer needed, remove it.

Default VS bonded port (br0-up)

Aggregate the fastest links of the same speed on the physical host to a VS bond on the default vs0 and provision VLAN trunking for these interfaces on the physical switch.

By default, interfaces in the bond in the virtual switch operate in the recommended active-backup mode.
Note: Mixing bond modes across AHV hosts in the same cluster is neither recommended nor supported.
1 GbE and 10 GbE interfaces (physical host)

If 10 GbE or faster uplinks are available, Nutanix recommends that you use them instead of 1 GbE uplinks.

Recommendations for 1 GbE uplinks are as follows:

  • If you plan to use 1 GbE uplinks, do not include them in the same bond as the 10 GbE interfaces.

    Nutanix recommends that you do not use uplinks of different speeds in the same bond.

  • If you choose to configure only 1 GbE uplinks, power off memory-intensive VMs and power them on on a new host instead of using live migration when migration becomes necessary. In this context, memory-intensive VMs are VMs whose memory changes at a rate that exceeds the bandwidth offered by the 1 GbE uplinks.

    Nutanix recommends the manual procedure for memory-intensive VMs because live migration, which you initiate either manually or by placing the host in maintenance mode, might appear prolonged or unresponsive and might eventually fail.

    Use the aCLI on any CVM in the cluster to start the VMs on another AHV host:

    nutanix@cvm$ acli vm.on vm_list host=host

    Replace vm_list with a comma-delimited list of VM names and replace host with the IP address or UUID of the target host.

  • If you must use only 1 GbE uplinks, add them into a bond to increase bandwidth and use the balance-tcp (LACP) or balance-slb bond mode.
IPMI port on the hypervisor host Do not use VLAN trunking on switch ports that connect to the IPMI interface. Configure the switch ports as access ports for management simplicity.
Upstream physical switch

Nutanix does not recommend the use of Fabric Extenders (FEX) or similar technologies for production use cases. While initial, low-load implementations might run smoothly with such technologies, poor performance, VM lockups, and other issues might occur as implementations scale upward (see Knowledge Base article KB1612). Nutanix recommends the use of 10Gbps, line-rate, non-blocking switches with larger buffers for production workloads.

Cut-through versus store-and-forward selection depends on network design. In designs with no oversubscription and no speed mismatches you can use low-latency cut-through switches. If you have any oversubscription or any speed mismatch in the network design, then use a switch with larger buffers. Port-to-port latency should be no higher than 2 microseconds.

Use fast-convergence technologies (such as Cisco PortFast) on switch ports that are connected to the hypervisor host.

Physical Network Layout Use redundant top-of-rack switches in a traditional leaf-spine architecture. This simple, flat network design is well suited for a highly distributed, shared-nothing compute and storage architecture.

Add all the nodes that belong to a given cluster to the same Layer-2 network segment.

Other network layouts are supported as long as all other Nutanix recommendations are followed.

Jumbo Frames

The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500 byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on network interfaces of a CVM to higher values.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV hosts and guest VMs if the applications on your guest VMs require them. If you choose to use jumbo frames on hypervisor hosts, be sure to enable them end to end in the desired network and consider both the physical and virtual network infrastructure impacted by the change.

Controller VM Do not remove the Controller VM from either the OVS bridge br0 or the native Linux bridge virbr0.
Rack Awareness and Block Awareness Block awareness and rack awareness provide smart placement of Nutanix cluster services, metadata, and VM data to help maintain data availability, even when you lose an entire block or rack. The same network requirements for low latency and high throughput between servers in the same cluster still apply when using block and rack awareness.
Note: Do not use features like block or rack awareness to stretch a Nutanix cluster between different physical sites.
Oversubscription

Oversubscription occurs when an intermediate network device or link does not have enough capacity to allow line rate communication between the systems connected to it. For example, if a 10 Gbps link connects two switches and four hosts connect to each switch at 10 Gbps, the connecting link is oversubscribed. Oversubscription is often expressed as a ratio—in this case 4:1, as the environment could potentially attempt to transmit 40 Gbps between the switches with only 10 Gbps available. Achieving a ratio of 1:1 is not always feasible. However, you should keep the ratio as small as possible based on budget and available capacity. If there is any oversubscription, choose a switch with larger buffers.

In a typical deployment where Nutanix nodes connect to redundant top-of-rack switches, storage replication traffic between CVMs traverses multiple devices. To avoid packet loss due to link oversubscription, ensure that the switch uplinks consist of multiple interfaces operating at a faster speed than the Nutanix host interfaces. For example, for nodes connected at 10 Gbps, the inter-switch connection should consist of multiple 10 Gbps or 40 Gbps links.
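The 4:1 figure from the example above can be reproduced with simple arithmetic; the host count and link speeds below are the hypothetical values from that example:

```shell
# 4 hosts per switch, each connected at 10 Gbps, with a single 10 Gbps
# link between the two switches (the example from the text above).
hosts_per_switch=4
host_speed_gbps=10
interswitch_gbps=10

# Worst case: every host on one switch transmits to the other switch at line rate.
offered_gbps=$(( hosts_per_switch * host_speed_gbps ))
ratio=$(( offered_gbps / interswitch_gbps ))
echo "${offered_gbps} Gbps offered over a ${interswitch_gbps} Gbps link: ${ratio}:1 oversubscription"
```

Doubling the inter-switch capacity (for example, two 10 Gbps uplinks or one 40 Gbps link) reduces the ratio accordingly.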

The following diagrams show sample network configurations using Open vSwitch and Virtual Switch.

Figure. Virtual Switch

Figure. AHV Bridge Chain

Figure. Default factory configuration of Open vSwitch in AHV

Figure. Open vSwitch Configuration

IP Address Management

IP Address Management (IPAM) is a feature of AHV that allows it to assign IP addresses automatically to VMs by using DHCP. You can configure each virtual network with a specific IP address subnet, associated domain settings, and IP address pools available for assignment to VMs.

An AHV network is defined as a managed network or an unmanaged network based on the IPAM setting.

Managed Network

Managed network refers to an AHV network in which IPAM is enabled.

Unmanaged Network

Unmanaged network refers to an AHV network in which IPAM is not enabled or is disabled.

You enable or disable IPAM in the Create Network dialog box when you create a virtual network for guest VMs. See the Configuring a Virtual Network for Guest VM Interfaces topic in the Prism Web Console Guide.
Note: You can enable IPAM only when you are creating a virtual network. You cannot enable or disable IPAM for an existing virtual network.

The enabled or disabled status of IPAM has implications. For example, when you want to reconfigure the IP address of a Prism Central VM, the procedure may involve additional steps for managed networks (that is, networks with IPAM enabled) where the new IP address belongs to a different IP address range than the previous one. See Reconfiguring the IP Address and Gateway of Prism Central VMs in the Prism Central Guide.

Layer 2 Network Management

AHV uses virtual switch (VS) to connect the Controller VM, the hypervisor, and the guest VMs to each other and to the physical network. Virtual switch is configured by default on each AHV node and the VS services start automatically when you start a node.

To configure virtual networking in an AHV cluster, you need to be familiar with virtual switch. This documentation gives you a brief overview of virtual switch and the networking components that you need to configure to enable the hypervisor, Controller VM, and guest VMs to connect to each other and to the physical network.

About Virtual Switch

Virtual switches or VS are used to manage multiple bridges and uplinks.

The VS configuration is designed to provide flexibility in configuring virtual bridge connections. A virtual switch (VS) defines a collection of AHV nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, vs0, the default virtual switch, is an aggregation of the br0 bridge and br0-up uplinks of all the nodes.

After you configure a VS, you can use the VS as reference for physical network management instead of using the bridge names as reference.

For overview about Virtual Switch, see Virtual Switch Considerations.

For information about OVS, see About Open vSwitch.

Virtual Switch Workflow

A virtual switch (VS) defines a collection of AHV compute nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, vs0, the default virtual switch, is an aggregation of the br0 bridge of all the nodes.

The system creates the default virtual switch vs0 connecting the default bridge br0 on all the hosts in the cluster during installation of or upgrade to the compatible versions of AOS and AHV. Default virtual switch vs0 has the following characteristics:

  • The default virtual switch cannot be deleted.

  • The default bridges br0 on all the nodes in the cluster map to vs0; thus, vs0 is not empty and has at least one uplink configured.

  • The default management connectivity to a node is mapped to default bridge br0 that is mapped to vs0.

  • The default parameter values of vs0 - Name, Description, MTU, and Bond Type - can be modified, subject to the preceding characteristics.

  • The default virtual switch is configured with the Active-Backup uplink bond type.

    For more information about bond types, see the Bond Type table.

The virtual switch aggregates the same bridges on all nodes in the cluster. The bridges (for example, br1) connect to the physical port such as eth3 (Ethernet port) via the corresponding uplink (for example, br1-up). The uplink ports of the bridges are connected to the same physical network. For example, the following illustration shows that vs0 is mapped to the br0 bridge, in turn connected via uplink br0-up to various (physical) Ethernet ports on different nodes.

Figure. Virtual Switch

Uplink configuration uses bonds to improve traffic management. The bond types are defined for the aggregated OVS bridges. A new bond type, No uplink bond, provides a no-bonding option. A virtual switch configured with the No uplink bond type has 0 or 1 uplinks.

When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.

If you change the uplink configuration of vs0, AOS applies the updated settings to all the nodes in the cluster one after the other (a rolling update). To update the settings in a cluster, AOS performs the following tasks when the Standard configuration method is applied:

  1. Puts the node in maintenance mode (migrates VMs out of the node)
  2. Applies the updated settings
  3. Checks connectivity with the default gateway
  4. Exits maintenance mode
  5. Proceeds to apply the updated settings to the next node

AOS does not put the nodes in maintenance mode when the Quick configuration method is applied.

Table 1. Bond Types

Active-Backup

Use case: Recommended. The default configuration, which transmits all traffic over a single active adapter.

Maximum VM NIC throughput: 10 Gb. Maximum host throughput: 10 Gb.

Active-Active with MAC pinning (also known as balance-slb)

Use case: Works with caveats for multicast traffic. Increases host bandwidth utilization beyond a single 10 Gb adapter. Places each VM NIC on a single adapter at a time. Do not use this bond type with link aggregation protocols such as LACP.

Maximum VM NIC throughput: 10 Gb. Maximum host throughput: 20 Gb.

Active-Active (also known as LACP with balance-tcp)

Use case: LACP and link aggregation required. Increases host and VM bandwidth utilization beyond a single 10 Gb adapter by balancing VM NIC TCP and UDP sessions among adapters. Also used when network switches require LACP negotiation.

The default LACP settings are:

  • Speed—Fast (1s)
  • Mode—Active fallback-active-backup
  • Priority—Default. This is not configurable.

Maximum VM NIC throughput: 20 Gb. Maximum host throughput: 20 Gb.

No Uplink Bond

Use case: No uplink or a single uplink on each host. A virtual switch configured with the No uplink bond type has 0 or 1 uplinks. When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.

Maximum VM NIC throughput and maximum host throughput: not applicable.

Note the following points about the uplink configuration.

  • Virtual switches are not enabled in a cluster that has one or more compute-only nodes. See Virtual Switch Limitations and Virtual Switch Requirements.
  • If you select the Active-Active policy, you must manually enable LAG and LACP on the corresponding ToR switch for each node in the cluster.
  • If you reimage a cluster with the Active-Active policy enabled, the default virtual switch (vs0) on the reimaged cluster reverts to the Active-Backup policy. The other virtual switches are removed during reimage.
  • Nutanix recommends configuring LACP with fallback to active-backup or individual mode on the ToR switches. The configuration and behavior varies based on the switch vendor. Use a switch configuration that allows both switch interfaces to pass traffic after LACP negotiation fails.

Virtual Switch Considerations

Virtual Switch Deployment

A VS configuration is deployed using a rolling update of the cluster nodes. After the VS configuration (creation or update) is received and execution starts, every node is first put into maintenance mode before the VS configuration is made or modified on the node. This is the Standard method, which is the recommended default for configuring a VS.

You can also select the Quick method of configuration, in which the rolling update does not put the nodes in maintenance mode. The VS configuration task is marked as successful when the configuration succeeds on the first node. Any configuration failure on successive nodes triggers corresponding NCC alerts; there is no change to the task status.

Note:

If you are modifying an existing bond, AHV removes the bond and then re-creates the bond with the specified interfaces.

Ensure that the interfaces you want to include in the bond are physically connected to the Nutanix appliance before you run the command described in this topic. If the interfaces are not physically connected to the Nutanix appliance, the interfaces are not added to the bond.

Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584 pass for Virtual Switch deployments.

The VS configuration is stored and reapplied at system reboot.

The VM NIC configuration also displays the VS details. When you Update VM configuration or Create NIC for a VM, the NIC details show the virtual switches that can be associated. This view allows you to change the virtual network and the associated virtual switch.

To change the virtual network, select it in the Subnet Name dropdown list in the Create NIC or Update NIC dialog box.

Figure. Create VM - VS Details

Figure. VM NIC - VS Details

Impact of Installation of or Upgrade to Compatible AOS and AHV Versions

See Virtual Switch Requirements for information about minimum and compatible AOS and AHV versions.

When you upgrade the AOS to a compatible version from an older version, the upgrade process:

  • Triggers the creation of the default virtual switch vs0, which is mapped to bridge br0 on all the nodes.

  • Validates bridge br0 and its uplinks for consistency in terms of MTU and bond-type on every node.

    If valid, it adds the bridge br0 of each node to the virtual switch vs0.

    If br0 configuration is not consistent, the system generates an NCC alert which provides the failure reason and necessary details about it.

    The system migrates only the bridge br0 on each node to the default virtual switch vs0 because the connectivity of bridge br0 is guaranteed.

  • Does not migrate any other bridges to any other virtual switches during upgrade. You need to manually migrate the other bridges after install or upgrade is complete.

Bridge Migration

After upgrading to a compatible version of AOS, you can migrate bridges other than br0 that existed on the nodes. When you migrate the bridges, the system converts the bridges to virtual switches.

See Virtual Switch Migration Requirements in Virtual Switch Requirements.

Note: You can migrate only those bridges that are present on every compute node in the cluster. See the Migrating Bridges after Upgrade topic in the Prism Web Console Guide.

Cluster Scaling Impact

VS management for cluster scaling (addition or removal of nodes) is seamless.

Node Removal

When you remove a node, the system detects the removal, automatically removes the node from all the VS configurations that include it, and generates an internal system update. For example, suppose a node has two virtual switches, vs1 and vs2, configured apart from the default vs0. When you remove the node from the cluster, the system automatically removes the node from the vs1 and vs2 configurations with an internal system update.

Node Addition

When you add a new node or host to a cluster, the bridges or virtual switches on the new node are treated in the following manner:

Note: If a host already included in a cluster is removed and then added back, it is treated as a new host.
  • The system validates the default bridge br0 and uplink bond br0-up to check if it conforms to the default virtual switch vs0 already present on the cluster.

    If br0 and br0-up conform, the system includes the new host and its uplinks in vs0.

    If br0 and br0-up do not conform, the system generates an NCC alert.

  • The system does not automatically add any other bridge configured on the new host to any other virtual switch in the cluster.

    It generates NCC alerts for all the other non-default virtual switches.

  • You can manually include the host in the required non-default virtual switches by updating each virtual switch to include the host.

    For information about updating a virtual switch in the Prism Element web console, see the Configuring a Virtual Network for Guest VM Interfaces section in the Prism Web Console Guide.

    For information about updating a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide.

VS Management

You can manage virtual switches from Prism Central or Prism Web Console. You can also use aCLI or REST APIs to manage them. See the Acropolis API Reference and Command Reference guides for more information.

You can also use the appropriate aCLI commands for virtual switches from the following list:

  • net.create_virtual_switch

  • net.list_virtual_switch

  • net.get_virtual_switch

  • net.update_virtual_switch

  • net.delete_virtual_switch

  • net.migrate_br_to_virtual_switch

  • net.disable_virtual_switch
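For example, the list and get commands above might be run from any CVM as follows. The bridge name, virtual switch names, and the vs_name parameter shown for the migrate command are illustrative assumptions; check the Command Reference for the exact syntax of your release.

```shell
# List all virtual switches configured on the cluster.
nutanix@cvm$ acli net.list_virtual_switch

# Show the detailed configuration of the default virtual switch vs0.
nutanix@cvm$ acli net.get_virtual_switch vs0

# Hypothetical example: migrate an existing bridge br1 to a new virtual switch vs1.
nutanix@cvm$ acli net.migrate_br_to_virtual_switch br1 vs_name=vs1
```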

About Open vSwitch

Open vSwitch (OVS) is an open-source software switch implemented in the Linux kernel and designed to work in a multiserver virtualization environment. By default, OVS behaves like a Layer 2 learning switch that maintains a MAC address learning table. The hypervisor host and VMs connect to virtual ports on the switch.

Each hypervisor hosts an OVS instance, and all OVS instances combine to form a single switch. As an example, the following diagram shows OVS instances running on two hypervisor hosts.

Figure. Open vSwitch

Default Factory Configuration

The factory configuration of an AHV host includes a default OVS bridge named br0 (configured with the default virtual switch vs0) and a native Linux bridge called virbr0.

Bridge br0 includes the following ports by default:

  • An internal port with the same name as the default bridge; that is, an internal port named br0. This is the access port for the hypervisor host.
  • A bonded port named br0-up. The bonded port aggregates all the physical interfaces available on the node. For example, if the node has two 10 GbE interfaces and two 1 GbE interfaces, all four interfaces are aggregated on br0-up. This configuration is necessary for Foundation to successfully image the node regardless of which interfaces are connected to the network.
    Note:

    Before you begin configuring a virtual network on a node, you must disassociate the 1 GbE interfaces from the br0-up port. This disassociation occurs when you modify the default virtual switch (vs0) and create new virtual switches. Nutanix recommends that you aggregate only the 10 GbE or faster interfaces on br0-up and use the 1 GbE interfaces on a separate OVS bridge deployed in a separate virtual switch.

    See Virtual Switch Management for information about virtual switch management.
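On clusters where the virtual switch workflow is not in use, this separation can be sketched with manage_ovs from a CVM. The bridge names and interface-speed arguments below are illustrative; verify the syntax for your AOS release before running the commands.

```shell
# Keep only the 10 GbE interfaces in the br0-up bond on the default bridge (illustrative).
nutanix@cvm$ manage_ovs --bridge_name br0 --bond_name br0-up --interfaces 10g update_uplinks

# Create a separate bridge for the 1 GbE interfaces and aggregate them there.
nutanix@cvm$ manage_ovs --bridge_name br1 create_single_bridge
nutanix@cvm$ manage_ovs --bridge_name br1 --bond_name br1-up --interfaces 1g update_uplinks
```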

The following diagram illustrates the default factory configuration of OVS on an AHV node:

Figure. Default factory configuration of Open vSwitch in AHV

The Controller VM has two network interfaces by default. As shown in the diagram, one network interface connects to bridge br0. The other network interface connects to a port on virbr0. The Controller VM uses this bridge to communicate with the hypervisor host.

Virtual Switch Requirements

The requirements to deploy virtual switches are as follows:

  1. Virtual switches are supported on AOS 5.19 or later with AHV 20201105.12 or later. Therefore, you must install or upgrade to AOS 5.19 or later, with AHV 20201105.12 or later, to use virtual switches in your deployments.

  2. Virtual bridges used for a VS on all the nodes must have the same specification such as name, MTU and uplink bond type. For example, if vs1 is mapped to br1 (virtual or OVS bridge 1) on a node, it must be mapped to br1 on all the other nodes of the same cluster.

Virtual Switch Migration Requirements

The AOS upgrade process initiates the virtual switch migration. The virtual switch migration is successful only when the following requirements are fulfilled:

  • Before migrating to Virtual Switch, all bridge br0 bond interfaces must have the same bond type on all hosts in the cluster. For example, all hosts must use the Active-Backup bond type or balance-tcp. If some hosts use Active-Backup and other hosts use balance-tcp, virtual switch migration fails.
  • Before migrating to Virtual Switch, if using LACP:
    • Confirm that the bridge br0 lacp-fallback parameter on all hosts is set to the case-sensitive value True by running manage_ovs show_uplinks | grep lacp-fallback: on each host. Any host with lowercase true causes virtual switch migration failure.
    • Confirm that the LACP speed on the physical switch is set to fast or 1 second. Also ensure that the switch ports are ready to fall back to individual mode if LACP negotiation fails, for example with a configuration such as no lacp suspend-individual.
  • Before migrating to Virtual Switch, confirm that the upstream physical switch is set to spanning-tree portfast or spanning-tree port type edge trunk. Failure to do so may lead to a 30-second network timeout, and the virtual switch migration may fail because it uses a non-modifiable 20-second timer.
  • Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584 pass for Virtual Switch deployments.

  • For the default virtual switch vs0,
    • All configured uplink ports must be available to connect to the network. With the Active-Backup bond type, the active port is selected from any configured uplink port that is linked. Therefore, the virtual switch vs0 can use all the linked ports for communication with other CVMs and hosts.
    • All the host IP addresses in the virtual switch vs0 must be resolvable to the configured gateway using ARP.
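Before starting the upgrade, the bond-type and lacp-fallback checks above can be run across all hosts from a single CVM; a sketch (output formats vary by release):

```shell
# Show the bond configuration of every host in the cluster.
nutanix@cvm$ allssh "manage_ovs show_uplinks"

# Confirm that lacp-fallback reports the case-sensitive value True on every host.
nutanix@cvm$ allssh "manage_ovs show_uplinks | grep lacp-fallback:"
```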

Virtual Switch Limitations

MTU Restriction

The Nutanix Controller VM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring higher values of MTU on the network interfaces of a Controller VM.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV, ESXi, or Hyper-V hosts and guest VMs if the applications on your guest VMs require such higher MTU values. If you choose to use jumbo frames on the hypervisor hosts, enable the jumbo frames end to end in the specified network, considering both the physical and virtual network infrastructure impacted by the change.
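One common way to verify that jumbo frames work end to end is a do-not-fragment ping between hosts. The sketch below assumes a Linux host and a hypothetical peer address: 8,972 bytes of ICMP payload plus a 20-byte IP header and an 8-byte ICMP header add up to a 9,000-byte packet.

```shell
# Send non-fragmentable 9000-byte packets to a peer host (hypothetical address).
# If any device in the path still uses a 1500-byte MTU, the ping fails.
root@host# ping -M do -s 8972 -c 3 10.10.10.20
```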

Single-node and two-node cluster configuration.

Virtual switch cannot be deployed if your single-node or two-node cluster has any instantiated user VMs. The virtual switch creation or update process involves a rolling restart, which checks for maintenance mode and whether the VMs can be migrated. On a single-node or two-node cluster, instantiated user VMs cannot be migrated, and the virtual switch operation fails.

Therefore, power down all user VMs for virtual switch operations in a single-node or two-node cluster.

Compute-only node is not supported.

Virtual switch is not compatible with Compute-only (CO) nodes. If a CO node is present in the cluster, then the virtual switches are not deployed (including the default virtual switch). You need to use the net.disable_virtual_switch aCLI command to disable the virtual switch workflow if you want to expand a cluster which has virtual switches and includes a CO node.

The net.disable_virtual_switch aCLI command cleans up all the virtual switch entries from the IDF. All the bridges mapped to the virtual switch or switches are retained as they are.

See Compute-Only Node Configuration (AHV Only).

Including a storage-only node in a VS is not necessary.

Virtual switch is compatible with Storage-only (SO) nodes but you do not need to include an SO node in any virtual switch, including the default virtual switch.

Mixed-mode Clusters with AHV Storage-only Nodes
Consider that you have deployed a mixed-node cluster where the compute-only nodes are ESXi or Hyper-V nodes and the storage-only nodes are AHV nodes. In such a case, the default virtual switch deployment fails.

Without the default VS, the Prism Element, Prism Central, and CLI virtual switch workflows required to manage the bridges and bonds are not available. You need to use the manage_ovs command options to update the bridge and bond configurations on the AHV hosts.

Virtual Switch Management

A virtual switch can be viewed, created, updated, or deleted from both the Prism web console and Prism Central.

Virtual Switch Views and Visualization

For information on the virtual switch network visualization in Prism Element Web Console, see the Network Visualization topic in the Prism Web Console Guide .

Virtual Switch Create, Update and Delete Operations

For information about the procedures to create, update and delete a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in the Prism Web Console Guide .

For information about the procedures to create, update and delete a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide .

Re-Configuring Bonds Across Hosts Manually

If you are upgrading AOS to 5.20, 6.0 or later, you need to migrate the existing bridges to virtual switches. If bond configurations are inconsistent across hosts before the bridge migration, the virtual switches may not be properly deployed after the migration. To resolve such issues, you must manually configure the bonds to make them consistent.

About this task

Important: Use this procedure only when you need to modify inconsistent bonds in a migrated bridge across the hosts in a cluster that are preventing Acropolis (AOS) from deploying the virtual switch for the migrated bridge.

Do not use ovs-vsctl commands to make the bridge level changes. Use the manage_ovs commands, instead.

The manage_ovs command allows you to update the cluster configuration. The changes are applied and retained across host restarts. The ovs-vsctl command allows you to update the live running host configuration but does not update the AOS cluster configuration and the changes are lost at host restart. This behavior of ovs-vsctl introduces connectivity issues during maintenance, such as upgrades or hardware replacements.

ovs-vsctl is usually used during a break/fix situation where a host may be isolated on the network and requires a workaround to gain connectivity before the cluster configuration can actually be updated using manage_ovs .

Note: Disable the virtual switch before you attempt to change the bonds or bridge.

If you hit an issue where the virtual switch is automatically re-created after it is disabled (with AOS versions 5.20.0 or 5.20.1), follow steps 1 and 2 below to disable such an automatically re-created virtual switch again before migrating the bridges. For more information, see KB-3263.

Be cautious when using the disable_virtual_switch command because it deletes all the configurations from the IDF, not only for the default virtual switch vs0 but also for any virtual switches that you may have created (such as vs1 or vs2). Therefore, before you use the disable_virtual_switch command, check the list of existing virtual switches, which you can get by using the acli net.list_virtual_switch command.

Complete this procedure on each host Controller VM that is sharing the bridge that needs to be migrated to a virtual switch.

Procedure

  1. To list the virtual switches, use the following command.
    nutanix@cvm$ acli net.list_virtual_switch
  2. Disable all the virtual switches.
    nutanix@cvm$ acli net.disable_virtual_switch 

    This disables all the virtual switches.

    Note: You can use the nutanix@cvm$ acli net.delete_virtual_switch vs_name command to delete a specific VS and re-create it with the appropriate bond type.
  3. Change the bond type so that the same bond type is set on all the hosts for the specified virtual switch.
    nutanix@cvm$ manage_ovs --bridge_name bridge-name --bond_name bond-name --bond_mode bond-type update_uplinks

    Where:

    • bridge-name : Provide the name of the bridge, such as br0 for the virtual switch on which you want to set the uplink bond mode.
    • bond-name : Provide the name of the uplink port such as br0-up for which you want to set the bond mode.
    • bond-type : Provide the bond mode that you require to be used uniformly across the hosts on the named bridge.

    Use the manage_ovs --help command for help on this command.

    Note: To disable LACP, change the bond type from LACP Active-Active (balance-tcp) to Active-Backup (active-backup) or Active-Active with MAC pinning (balance-slb) by setting bond_mode in this command to active-backup or balance-slb .

    Ensure that you turn off LACP on the connected ToR switch port as well. To avoid the bond uplinks being blocked during the bond type change on the host, follow the ToR switch best practices to enable LACP fallback or passive mode.

    To enable LACP, configure bond-type as balance-tcp (Active-Active) with additional variables --lacp_mode fast and --lacp_fallback true .
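As a concrete sketch, assuming the default bridge br0 and uplink bond br0-up (substitute your own names), the two bond modes above map to the following manage_ovs invocations. The commands are echoed here rather than executed so you can review them first:

```shell
BRIDGE=br0      # assumed bridge name; check with manage_ovs show_uplinks
BOND=br0-up     # assumed uplink bond name
# Disable LACP: fall back to active-backup (or balance-slb for MAC pinning).
echo "manage_ovs --bridge_name ${BRIDGE} --bond_name ${BOND} --bond_mode active-backup update_uplinks"
# Enable LACP: balance-tcp with fast LACP mode and LACP fallback enabled.
echo "manage_ovs --bridge_name ${BRIDGE} --bond_name ${BOND} --bond_mode balance-tcp --lacp_mode fast --lacp_fallback true update_uplinks"
```

Run the chosen command on each Controller VM whose host needs the change, and keep the bond mode identical across all hosts on the bridge.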

  4. (If migrating to an AOS version earlier than 5.20.2) Check whether the issue described in the note above occurs, and if it does, disable the virtual switch again.

What to do next

After making the bonds consistent across all the hosts configured in the bridge, migrate the bridge or enable the virtual switch.

To check whether LACP is enabled or disabled, use the following command.

nutanix@cvm$ manage_ovs show_uplinks

Enabling LACP and LAG (AHV Only)

If you select the Active-Active bond type, you must enable LACP and LAG on the corresponding ToR switch for each node in the cluster one after the other. This section describes the procedure to enable LAG and LACP in AHV nodes and the connected ToR switch.

About this task

Procedure

  1. Change the uplink Bond Type for the virtual switch.
    1. Open the Edit Virtual Switch window.
      • In Prism Central, open Network & Security > Subnets > Network Configuration > Virtual Switch .
      • In the Prism Element web console, open Settings > Network Configuration > Virtual Switch .
    2. Click the Edit icon of the virtual switch for which you want to configure LAG and LACP.
    3. On the Edit Virtual Switch page, in the General tab, ensure that the Standard option is selected for the Select Configuration Method parameter. Click Next .
      The Standard configuration method puts each node in maintenance mode before applying the updated settings. After applying the updated settings, the node exits from maintenance mode. See Virtual Switch Workflow .
    4. On the Uplink Configuration tab, in Bond Type , select Active-Active .
    5. Click Save .
    The Active-Active bond type configures all AHV hosts with the fast setting for LACP speed, causing the AHV host to request LACP control packets at the rate of one per second from the physical switch. In addition, the Active-Active bond type configuration sets LACP fallback to Active-Backup on all AHV hosts. You cannot modify these default settings after you have configured them in Prism, even by using the CLI.

    This completes the LAG and LACP configuration on the cluster.

Perform the following steps on each node, one at a time.
  1. Put the node and the Controller VM into maintenance mode.
    Before you put a node in maintenance mode, see Verifying the Cluster Health and carry out the necessary checks.

    See Putting a Node into Maintenance Mode . Step 6 in this procedure puts the Controller VM in maintenance mode.

  2. Change the settings for the interface on the ToR switch that the node connects to so that they match the LACP and LAG settings made on the cluster in step 1 above.
    This is an important step. See the documentation provided by the ToR switch vendor for more information about changing the LACP settings of the switch interface that the node is physically connected to.
    • Nutanix recommends that you enable LACP fallback.

    • Consider the LACP time options ( slow and fast ). If the switch has a fast configuration, set the LACP time to fast . This prevents an outage due to a mismatch of LACP speeds between the cluster and the ToR switch. Keep in mind that the Active-Active bond type configuration sets the LACP speed of the cluster to fast .

    Verify that LACP negotiation status is negotiated.

  3. Remove the node and Controller VM from maintenance mode.
    See Exiting a Node from the Maintenance Mode . The Controller VM exits maintenance mode during the same process.

What to do next

Do the following after completing the procedure to enable LAG and LACP on all the AHV nodes and the connected ToR switches:
  • Verify that the status of all services on all the CVMs is Up. Run the following command and check whether the status of the services is displayed as Up in the output:
    nutanix@cvm$ cluster status
  • Log on to the Prism Element of the node and check that the Data Resiliency Status widget displays OK .
    Figure. Data Resiliency Status

VLAN Configuration

You can set up a VLAN-based segmented virtual network on an AHV node by assigning the ports on virtual bridges managed by virtual switches to different VLANs. VLAN port assignments are configured from the Controller VM that runs on each node.

For best practices associated with VLAN assignments, see AHV Networking Recommendations. For information about assigning guest VMs to a virtual switch and VLAN, see Network Connections in the Prism Central Guide .

Assigning an AHV Host to a VLAN

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.

To assign an AHV host to a VLAN, do the following on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the CVM in maintenance mode.
    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.
  3. Assign port br0 (the internal port on the default OVS bridge, br0) to the VLAN that you want the host to be on.
    root@ahv# ovs-vsctl set port br0 tag=host_vlan_tag

    Replace host_vlan_tag with the VLAN tag for hosts.
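For example, assuming the hosts belong on VLAN 10 (an illustrative tag; use your own), the command would be as follows. The sketch only prints the command so you can confirm the tag before applying it:

```shell
# Illustrative VLAN tag for the AHV hosts; substitute your actual host VLAN.
HOST_VLAN_TAG=10
echo "ovs-vsctl set port br0 tag=${HOST_VLAN_TAG}"
```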

  4. Confirm VLAN tagging on port br0.
    root@ahv# ovs-vsctl list port br0
  5. Check the value of the tag parameter that is shown.
  6. Verify connectivity to the IP address of the AHV host by performing a ping test.
  7. Exit the AHV host and the CVM from the maintenance mode.
    See Exiting a Node from the Maintenance Mode for more information.

Assigning the Controller VM to a VLAN

By default, the public interface of a Controller VM is assigned to VLAN 0. To assign the Controller VM to a different VLAN, change the VLAN ID of its public interface. After the change, you can access the public interface from a device that is on the new VLAN.

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.
Note: To avoid losing connectivity to the Controller VM, do not change the VLAN ID when you are logged on to the Controller VM through its public interface. To change the VLAN ID, log on to the internal interface that has IP address 192.168.5.254.

Perform these steps on every Controller VM in the cluster. To assign the Controller VM to a VLAN, do the following:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the Controller VM in maintenance mode.
    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.
  3. Check the Controller VM status on the host.
    root@host# virsh list

    An output similar to the following is displayed:

    root@host# virsh list
     Id    Name                           State
    ----------------------------------------------------
     1     NTNX-CLUSTER_NAME-3-CVM            running
     3     3197bf4a-5e9c-4d87-915e-59d4aff3096a running
     4     c624da77-945e-41fd-a6be-80abf06527b9 running
    
    root@host# logout
  4. Log on to the Controller VM.
    root@host# ssh nutanix@192.168.5.254

    Accept the host authenticity warning if prompted, and enter the Controller VM nutanix password.

  5. Assign the public interface of the Controller VM to a VLAN.
    nutanix@cvm$ change_cvm_vlan vlan_id

    Replace vlan_id with the ID of the VLAN to which you want to assign the Controller VM.

    For example, add the Controller VM to VLAN 201.

    nutanix@cvm$ change_cvm_vlan 201
  6. Confirm VLAN tagging on the Controller VM.
    root@host# virsh dumpxml cvm_name

    Replace cvm_name with the CVM name or CVM ID to view the VLAN tagging information.

    Note: Refer to step 3 for Controller VM name and Controller VM ID.

    An output similar to the following is displayed:

    root@host# virsh dumpxml 1 | grep "tag id" -C10 --color
          <target dev='vnet2'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
        </interface>
        <interface type='bridge'>
          <mac address='50:6b:8d:b9:0a:18'/>
          <source bridge='br0'/>
          <vlan>
               <tag id='201'/> 
          </vlan>
          <virtualport type='openvswitch'>
            <parameters interfaceid='c46374e4-c5b3-4e6b-86c6-bfd6408178b5'/>
          </virtualport>
          <target dev='vnet0'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </interface>
    root@host#
  7. Check the value of the tag parameter that is shown.
  8. Restart the network service.
    nutanix@cvm$ sudo service network restart
  9. Verify connectivity to the Controller VM's external IP address by performing a ping test from the same subnet. For example, perform a ping from another Controller VM or directly from the host itself.
  10. Exit the AHV host and the Controller VM from the maintenance mode.
    See Exiting a Node from the Maintenance Mode for more information.

Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues

Multi-Queue in VirtIO-net enables you to improve network performance for network I/O-intensive guest VMs or applications running on AHV hosts.

About this task

You can enable VirtIO-net multi-queue by increasing the number of VNIC queues. If an application uses many distinct streams of traffic, Receive Side Scaling (RSS) can distribute the streams across multiple VNIC DMA rings. This increases the amount of RX buffer space by the number of VNIC queues (N). Also, most guest operating systems pin each ring to a particular vCPU, handling the interrupts and ring-walking on that vCPU, thereby achieving N-way parallelism in RX processing. However, if you increase the number of queues beyond the number of vCPUs, you cannot achieve extra parallelism.

The following workloads get the greatest performance benefit from VirtIO-net multi-queue:

  • VMs where traffic packets are relatively large
  • VMs with many concurrent connections
  • VMs with network traffic moving:
    • Among VMs on the same host
    • Among VMs across hosts
    • From VMs to the hosts
    • From VMs to an external system
  • VMs with high VNIC RX packet drop rate if CPU contention is not the cause

You can increase the number of queues of the AHV VM VNIC to allow the guest OS to use multi-queue VirtIO-net on guest VMs with intensive network I/O. Multi-Queue VirtIO-net scales the network performance by transferring packets through more than one Tx/Rx queue pair at a time as the number of vCPUs increases.

Nutanix recommends that you be conservative when increasing the number of queues. Do not set the number of queues larger than the total number of vCPUs assigned to a VM. Packet reordering and TCP retransmissions increase if the number of queues is larger than the number of vCPUs assigned to a VM. For this reason, start by increasing the queue size to 2. The default queue size is 1. After making this change, monitor the guest VM and network performance. Before you increase the queue size further, verify that the vCPU usage has not dramatically or unreasonably increased.
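The sizing rule above — never exceed the VM's vCPU count — can be sketched as a small shell calculation. The vCPU and requested-queue values are illustrative only:

```shell
# Clamp the requested VNIC queue count to the number of vCPUs.
VCPUS=4          # illustrative vCPU count of the guest VM
REQUESTED=8      # illustrative desired queue count
if [ "$REQUESTED" -gt "$VCPUS" ]; then
  QUEUES=$VCPUS  # extra queues beyond vCPUs add reordering, not parallelism
else
  QUEUES=$REQUESTED
fi
echo "queues=${QUEUES}"
```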

Perform the following steps to make more VNIC queues available to a guest VM. See your guest OS documentation to verify if you must perform extra steps on the guest OS to apply the additional VNIC queues.

Note: You must shut down the guest VM to change the number of queues. Therefore, make this change during a planned maintenance window. The VNIC status might change from Up->Down->Up or a restart of the guest OS might be required to finalize the settings depending on the guest OS implementation requirements.

Procedure

  1. (Optional) Nutanix recommends that you ensure the following:
    1. AHV and AOS are running the latest version.
    2. AHV guest VMs are running the latest version of the Nutanix VirtIO driver package.
      For RSS support, ensure you are running Nutanix VirtIO 1.1.6 or later. See Nutanix VirtIO for Windows for more information about Nutanix VirtIO.
  2. Determine the exact name of the guest VM for which you want to change the number of VNIC queues.
    nutanix@cvm$ acli vm.list

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.list
    VM name          VM UUID
    ExampleVM1       a91a683a-4440-45d9-8dbe-xxxxxxxxxxxx
    ExampleVM2       fda89db5-4695-4055-a3d4-xxxxxxxxxxxx
    ...
  3. Determine the MAC address of the VNIC and confirm the current number of VNIC queues.
    nutanix@cvm$ acli vm.nic_get VM-name

    Replace VM-name with the name of the VM.

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.nic_get VM-name
    ...
    mac_addr: "50:6b:8d:2f:zz:zz"
    ...
    (queues: 2)    <- If there is no output of 'queues', the setting is default (1 queue).
    Note: AOS defines queues as the maximum number of Tx/Rx queue pairs (default is 1).
  4. Check the number of vCPUs assigned to the VM.
    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus
    num_vcpus: 1
  5. Shut down the guest VM.
    nutanix@cvm$ acli vm.shutdown VM-name

    Replace VM-name with the name of the VM.

  6. Increase the number of VNIC queues.
    nutanix@cvm$ acli vm.nic_update VM-name vNIC-MAC-address queues=N

    Replace VM-name with the name of the guest VM, vNIC-MAC-address with the MAC address of the VNIC, and N with the number of queues.

    Note: N must be less than or equal to the number of vCPUs assigned to the guest VM.
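For example, using the VM name and the partially masked MAC address from the earlier sample outputs, setting two queues would look like the sketch below. The command is only echoed here; the MAC remains masked as in the outputs above, so substitute the full address from your own acli vm.nic_get output:

```shell
VM="ExampleVM1"            # VM name from the acli vm.list output above
MAC="50:6b:8d:2f:zz:zz"    # MAC as shown (partially masked) by acli vm.nic_get
echo "acli vm.nic_update ${VM} ${MAC} queues=2"
```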
  7. Start the guest VM.
    nutanix@cvm$ acli vm.on VM-name

    Replace VM-name with the name of the VM.

  8. Check the guest OS documentation to confirm whether any additional steps are required to enable multi-queue in VirtIO-net.
    Note: Microsoft Windows has RSS enabled by default.

    For example, for RHEL and CentOS VMs, do the following:

    1. Log on to the guest VM.
    2. Confirm if irqbalance.service is active or not.
      uservm# systemctl status irqbalance.service

      An output similar to the following is displayed:

      irqbalance.service - irqbalance daemon
         Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; enabled; vendor preset: enabled)
         Active: active (running) since Tue 2020-04-07 10:28:29 AEST; Ns ago
    3. Start irqbalance.service if it is not active.
      Note: It is active by default on CentOS VMs. You might have to start it on RHEL VMs.
      uservm# systemctl start irqbalance.service
    4. Run the following command:
      uservm$ ethtool -L ethX combined M

      Replace M with the number of VNIC queues.

    Note the following caveat from section 5.4, Network Tuning Techniques, of the RHEL 7 Virtualization Tuning and Optimization Guide :

    "Currently, setting up a multi-queue virtio-net connection can have a negative effect on the performance of outgoing traffic. Specifically, this may occur when sending packets under 1,500 bytes over the Transmission Control Protocol (TCP) stream."

  9. Monitor the VM performance to make sure that the expected network performance increase is observed and that the guest VM vCPU usage does not increase so dramatically that it impacts the application on the guest VM.
    For assistance with the steps described in this document, or if these steps do not resolve your guest VM network performance issues, contact Nutanix Support.

Changing the IP Address of an AHV Host

Change the IP address, netmask, or gateway of an AHV host.

Before you begin

Perform the following tasks before you change the IP address, netmask, or gateway of an AHV host:
Caution: All Controller VMs and hypervisor hosts must be on the same subnet.
Warning: Ensure that you perform the steps in the exact order as indicated in this document.
  1. Verify the cluster health by following the instructions in KB-2852.

    Do not proceed if the cluster cannot tolerate failure of at least one node.

  2. Put the AHV host into the maintenance mode.

    See Putting a Node into Maintenance Mode for instructions about how to put a node into maintenance mode.

About this task

Perform the following procedure to change the IP address, netmask, or gateway of an AHV host.

Procedure

  1. Edit the settings of port br0, which is the internal port on the default bridge br0.
    1. Log on to the host console as root.

      You can access the hypervisor host console either through IPMI or by attaching a keyboard and monitor to the node.

    2. Open the network interface configuration file for port br0 in a text editor.
      root@ahv# vi /etc/sysconfig/network-scripts/ifcfg-br0
    3. Update entries for host IP address, netmask, and gateway.

      The block of configuration information that includes these entries is similar to the following:

      ONBOOT="yes" 
      NM_CONTROLLED="no" 
      PERSISTENT_DHCLIENT=1
      NETMASK="subnet_mask" 
      IPADDR="host_ip_addr" 
      DEVICE="br0" 
      TYPE="ethernet" 
      GATEWAY="gateway_ip_addr"
      BOOTPROTO="none"
      • Replace host_ip_addr with the IP address for the hypervisor host.
      • Replace subnet_mask with the subnet mask for host_ip_addr.
      • Replace gateway_ip_addr with the gateway address for host_ip_addr.
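As an illustration, with assumed sample addresses (host 10.10.10.21, netmask 255.255.255.0, gateway 10.10.10.1 — substitute your own values), the completed block would read:

```shell
# Sample /etc/sysconfig/network-scripts/ifcfg-br0 (illustrative addresses only).
ONBOOT="yes"
NM_CONTROLLED="no"
PERSISTENT_DHCLIENT=1
NETMASK="255.255.255.0"
IPADDR="10.10.10.21"
DEVICE="br0"
TYPE="ethernet"
GATEWAY="10.10.10.1"
BOOTPROTO="none"
```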
    4. Save your changes.
    5. Restart network services.

      root@ahv# systemctl restart network.service
    6. Assign the host to a VLAN. For information about how to add a host to a VLAN, see Assigning an AHV Host to a VLAN.
    7. Verify network connectivity by pinging the gateway, other CVMs, and AHV hosts.
  2. Log on to the Controller VM that is running on the AHV host whose IP address you changed and restart genesis.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed:

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

    See Controller VM Access for information about how to log on to a Controller VM.

    Genesis takes a few minutes to restart.

  3. Verify if the IP address of the hypervisor host has changed. Run the following nCLI command from any CVM other than the one in the maintenance mode.
    nutanix@cvm$ ncli host list 

    An output similar to the following is displayed:

    nutanix@cvm$ ncli host list 
        Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
        Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
        Name                      : XXXXXXXXXXX-X 
        IPMI Address              : X.X.Z.3 
        Controller VM Address     : X.X.X.1 
        Hypervisor Address        : X.X.Y.4 <- New IP Address 
    ... 
  4. Stop the Acropolis service on all the CVMs.
    1. Stop the Acropolis service on all the CVMs in the cluster.
      nutanix@cvm$ allssh genesis stop acropolis
      Note: You cannot manage your guest VMs after the Acropolis service is stopped.
    2. Verify if the Acropolis service is DOWN on all the CVMs, except the one in the maintenance mode.
      nutanix@cvm$ cluster status | grep -v UP 

      An output similar to the following is displayed:

      nutanix@cvm$ cluster status | grep -v UP 
      
      2019-09-04 14:43:18 INFO zookeeper_session.py:143 cluster is attempting to connect to Zookeeper 
      
      2019-09-04 14:43:18 INFO cluster:2774 Executing action status on SVMs X.X.X.1, X.X.X.2, X.X.X.3 
      
      The state of the cluster: start 
      
      Lockdown mode: Disabled 
              CVM: X.X.X.1 Up 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.2 Up, ZeusLeader 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.3 Maintenance
  5. From any CVM in the cluster, start the Acropolis service.
    nutanix@cvm$ cluster start 
  6. Verify if all processes on all the CVMs, except the one in the maintenance mode, are in the UP state.
    nutanix@cvm$ cluster status | grep -v UP 
  7. Exit the AHV host and the CVM from the maintenance mode.
    See Exiting a Node from the Maintenance Mode for more information.

Virtual Machine Management

The following topics describe various aspects of virtual machine management in an AHV cluster.

Supported Guest VM Types for AHV

The compatibility matrix available on the Nutanix Support portal includes the latest supported AHV guest VM OSes.

AHV Configuration Maximums

The Nutanix configuration maximums available on the Nutanix support portal include all the latest configuration limits applicable to AHV. Select the appropriate AHV version to view version-specific information.

Creating a VM (AHV)

In AHV clusters, you can create a new virtual machine (VM) through the Prism Element web console.

About this task

When creating a VM, you can configure all of its components, such as number of vCPUs and memory, but you cannot attach a volume group to the VM. Attaching a volume group is possible only when you are modifying a VM.

To create a VM, do the following:

Procedure

  1. In the VM dashboard, click the Create VM button.
    Note: This option does not appear in clusters that do not support this feature.
    The Create VM dialog box appears.
    Figure. Create VM Dialog Box

  2. Do the following in the indicated fields:
    1. Name : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone : Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC .
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Use this VM as an agent VM : Select this option to make this VM an agent VM.

      You can use this option for the VMs that must be powered on before the rest of the VMs (for example, to provide network functions before the rest of the VMs are powered on on the host) and must be powered off after the rest of the VMs are powered off (for example, during maintenance mode operations). Agent VMs are never migrated to any other host in the cluster. If an HA event occurs or the host is put in maintenance mode, agent VMs are powered off and are powered on on the same host once that host comes back to a normal state.

      If an agent VM is powered off, you can manually start that agent VM on another host and the agent VM now permanently resides on the new host. The agent VM is never migrated back to the original host. Note that you cannot migrate an agent VM to another host while the agent VM is powered on.

    5. vCPU(s) : Enter the number of virtual CPUs to allocate to this VM.
    6. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    7. Memory : Enter the amount of memory (in GiB) to allocate to this VM.
  3. (For GPU-enabled AHV clusters only) To configure GPU access, click Add GPU in the Graphics section, and then do the following in the Add GPU dialog box:
    Figure. Add GPU Dialog Box

    For more information, see GPU and vGPU Support .

    1. To configure GPU pass-through, in GPU Mode , click Passthrough , select the GPU that you want to allocate, and then click Add .
      If you want to allocate additional GPUs to the VM, repeat the procedure as many times as you need to. Make sure that all the allocated pass-through GPUs are on the same host. If all specified GPUs of the type that you want to allocate are in use, you can proceed to allocate the GPU to the VM, but you cannot power on the VM until a VM that is using the specified GPU type is powered off.

      For more information, see GPU and vGPU Support .

    2. To configure virtual GPU access, in GPU Mode , click virtual GPU , select a GRID license, and then select a virtual GPU profile from the list.
      Note: This option is available only if you have installed the GRID host driver on the GPU hosts in the cluster.

      For more information about the NVIDIA GRID host driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide .

      You can assign multiple virtual GPUs to a VM. A vGPU is assigned to the VM only if a vGPU is available when the VM is starting up.

      Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support .

      Note:

      Multiple vGPUs are supported on the same VM only if you select the highest vGPU profile type.

      After you add the first vGPU, to add multiple vGPUs, see Adding Multiple vGPUs to the Same VM .

  4. To attach a disk to the VM, click the Add New Disk button.
    The Add Disks dialog box appears.
    Figure. Add Disk Dialog Box

    Do the following in the indicated fields:
    1. Type : Select the type of storage device, DISK or CD-ROM , from the pull-down list.
      The following fields and options vary depending on whether you choose DISK or CD-ROM .
    2. Operation : Specify the device contents from the pull-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Empty CD-ROM to create a blank CD-ROM device. (This option appears only when CD-ROM is selected in the previous field.) A CD-ROM device is needed when you intend to provide a system image from CD-ROM.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
      • Select Clone from Image Service to copy an image that you have imported by using image service feature onto the disk. For more information about the Image Service feature, see Configuring Images and Image Management in the Prism Self Service Administration Guide .
    3. Bus Type : Select the bus type from the pull-down list. The choices are IDE , SCSI , or SATA .

      The options displayed in the Bus Type drop-down list vary based on the storage device Type selected in Step a.

      • For the DISK device type, select the SCSI , SATA , PCI , or IDE bus type.
      • For the CD-ROM device type, select either the IDE or SATA bus type.
      Note: SCSI is the preferred bus type and is used in most cases. Ensure that you have installed the VirtIO drivers in the guest OS.
      Caution: Use SATA, PCI, or IDE for compatibility purposes when the guest OS does not have VirtIO drivers to support SCSI devices. This may have performance implications.
      Note: For AHV 5.16 and later, you cannot use an IDE device if Secure Boot is enabled for the UEFI Mode boot configuration.
    4. ADSF Path : Enter the path to the desired system image.
      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /storage_container_name/iso_name.iso . For example, to clone an image from myos.iso in a storage container named crt1 , enter /crt1/myos.iso . When you type the storage container name ( /storage_container_name/ ), a list of the ISO files in that storage container appears (assuming one or more ISO files have previously been copied to that storage container).
    5. Image : Select the image that you have created by using the image service feature.
      This field appears only when Clone from Image Service is selected. It specifies the image to copy.
    6. Storage Container : Select the storage container to use from the pull-down list.
      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.
    7. Size : Enter the disk size in GiB.
    8. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    9. Repeat this step to attach additional devices to the VM.
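The disk-attach workflow above can also be scripted with aCLI from a Controller VM. A minimal sketch, assuming a VM named myvm and a storage container named default (both placeholders), with a dry-run guard so the commands only execute where acli is available:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Allocate a new 40 GiB SCSI disk on a storage container (names are placeholders).
run acli vm.disk_create myvm create_size=40G container=default bus=scsi

# Attach an empty CD-ROM device to the same VM.
run acli vm.disk_create myvm cdrom=true empty=true
```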
  5. Select one of the following firmware options to boot the VM.
    • Legacy BIOS : Select legacy BIOS to boot the VM with legacy BIOS firmware.
    • UEFI : Select UEFI to boot the VM with UEFI firmware. UEFI firmware supports larger hard drives, faster boot time, and provides more security features. For more information about UEFI firmware, see UEFI Support for VM .
    • Secure Boot : Supported with AOS 5.16 and later; current support for Secure Boot is limited to aCLI. For more information about Secure Boot, see Secure Boot Support for VMs . To enable Secure Boot, do the following:
    • Select UEFI.
    • Power-off the VM.
    • Log on to the aCLI and update the VM to enable Secure Boot. For more information, see Updating a VM to Enable Secure Boot in the AHV Administration Guide .
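As a sketch, the aCLI update described above can look like the following; the VM name myvm is a placeholder, and enabling Secure Boot also requires the q35 machine type:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Power off the VM first; Secure Boot requires UEFI and the q35 machine type.
run acli vm.off myvm
run acli vm.update myvm secure_boot=true machine_type=q35
```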
  6. To create a network interface for the VM, click the Add New NIC button.
    The Create NIC dialog box appears.
    Figure. Create NIC Dialog Box Click to enlarge configure a NIC screen

    Do the following in the indicated fields:
    1. Subnet Name : Select the target virtual LAN from the drop-down list.
      The list includes all defined networks (see Network Configuration for VM Interfaces ).
      Note: Selecting an IPAM-enabled subnet from the drop-down list displays the Private IP Assignment information, which shows the number of free IP addresses available in the subnet and in the IP pool.
    2. Network Connection State : Select the state in which you want the network to operate after VM creation. The options are Connected or Disconnected .
    3. Private IP Assignment : This is a read-only field and displays the following:
      • Network Address/Prefix : The network IP address and prefix.
      • Free IPs (Subnet) : The number of free IP addresses in the subnet.
      • Free IPs (Pool) : The number of free IP addresses available in the IP pools for the subnet.
    4. Assignment Type : This field applies to IPAM-enabled networks. Select Assign with DHCP to assign an IP address to the VM automatically by using DHCP. For more information, see IP Address Management .
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create additional network interfaces for the VM.
    Note: Nutanix guarantees a unique VM MAC address within a cluster. However, two VMs in different clusters can have the same MAC address.
    Note: The Acropolis leader generates the MAC address for a VM on AHV. The first 24 bits of the MAC address are set to 50-6b-8d ( 0101 0000 0110 1011 1000 1101 ), which is reserved by Nutanix; the 25th bit is set to 1 (reserved by the Acropolis leader); bits 26 through 48 are randomly generated.
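The MAC address layout described in the note can be illustrated with a short shell sketch (illustrative only; the Acropolis leader performs the actual assignment):

```shell
# Illustrative only: fixed Nutanix OUI 50:6b:8d, the 25th bit forced to 1,
# and the remaining 23 bits chosen at random.
b4=$(( 0x80 | RANDOM % 0x80 ))   # 25th bit of the address set to 1
b5=$(( RANDOM % 256 ))
b6=$(( RANDOM % 256 ))
mac=$(printf '50:6b:8d:%02x:%02x:%02x' "$b4" "$b5" "$b6")
echo "$mac"
```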
  7. To configure affinity policy for this VM, click Set Affinity .
    1. Select the host or hosts on which you want to configure the affinity for this VM.
    2. Click Save .
      The selected host or hosts are listed. This configuration is permanent: the VM is not moved from this host or hosts even in the case of an HA event. The configuration takes effect when the VM starts.
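VM-host affinity can also be set from aCLI. A hedged sketch with placeholder VM and host names:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Pin myvm to the listed hosts (VM and host names are placeholders).
run acli vm.affinity_set myvm host_list=host-1,host-2
```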
  8. To customize the VM by using Cloud-init (for Linux VMs) or Sysprep (for Windows VMs), select the Custom Script check box.
    Fields required for configuring Cloud-init and Sysprep, such as options for specifying a configuration script or answer file and text boxes for specifying paths to required files, appear below the check box.
    Figure. Create VM Dialog Box (custom script fields) Click to enlarge custom script fields in the create VM screen

  9. To specify a user data file (Linux VMs) or answer file (Windows VMs) for unattended provisioning, do one of the following:
    • If you uploaded the file to a storage container on the cluster, click ADSF path , and then enter the path to the file.

      Enter the ADSF prefix ( adsf:// ) followed by the absolute path to the file. For example, if the user data is in /home/my_dir/cloud.cfg , enter adsf:///home/my_dir/cloud.cfg . Note the use of three slashes.

    • If the file is available on your local computer, click Upload a file , click Choose File , and then upload the file.
    • If you want to create or paste the contents of the file, click Type or paste script , and then use the text box that is provided.
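For Linux VMs, the user data file is a standard cloud-init document. A minimal illustrative example, written to a local file; every value below (hostname, message) is a placeholder rather than a Nutanix requirement:

```shell
# Illustrative cloud-init user data for a Linux VM; values are placeholders.
cat > user-data <<'EOF'
#cloud-config
hostname: demo-vm
runcmd:
  - echo "provisioned by cloud-init" > /etc/motd
EOF
head -1 user-data
```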
  10. To copy one or more files to a location on the VM (Linux VMs) or to a location in the ISO file (Windows VMs) during initialization, do the following:
    1. In Source File ADSF Path , enter the absolute path to the file.
    2. In Destination Path in VM , enter the absolute path to the target directory and the file name.
      For example, if the source file entry is /home/my_dir/myfile.txt , then the entry for Destination Path in VM should be in the form /<directory_name>/<file_name> , for example, /mnt/myfile.txt .
    3. To add another file or directory, click the button beside the destination path field. In the new row that appears, specify the source and target details.
  11. When all the field entries are correct, click the Save button to create the VM and close the Create VM dialog box.
    The new VM appears in the VM table view.

Managing a VM (AHV)

You can use the web console to manage virtual machines (VMs) in Acropolis managed clusters.

About this task

After creating a VM, you can use the web console to start or shut down the VM, launch a console window, update the VM configuration, take a snapshot, attach a volume group, migrate the VM, clone the VM, or delete the VM.

Note: Your available options depend on the VM status, type, and permissions. Unavailable options are grayed out.

To accomplish one or more of these tasks, do the following:

Procedure

  1. In the VM dashboard, click the Table view.
  2. Select the target VM in the table (top section of screen).
    The Summary line (middle of screen) displays the VM name with a set of relevant action links on the right. You can also right-click on a VM to select a relevant action.

    The possible actions are Manage Guest Tools , Launch Console , Power on (or Power off ), Take Snapshot , Migrate , Clone , Update , and Delete .

    Note: The VM pause and resume feature is not supported on AHV.
    The following steps describe how to perform each action.
    Figure. VM Action Links Click to enlarge

  3. To manage guest tools as follows, click Manage Guest Tools .
    You can also enable NGT applications (self-service restore, Volume Snapshot Service, and application-consistent snapshots) as part of managing guest tools.
    1. Select the Enable Nutanix Guest Tools check box to enable NGT on the selected VM.
    2. Select Mount Nutanix Guest Tools to mount NGT on the selected VM.
      Ensure that the VM has at least one empty IDE CD-ROM slot to attach the ISO.
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    3. To enable the self-service restore feature for Windows VMs, select the Self Service Restore (SSR) check box.
      The Self-Service Restore feature is enabled on the VM. The guest VM administrator can restore the desired file or files from the VM. For more information about the self-service restore feature, see Self-Service Restore in the Data Protection and Recovery with Prism Element guide.

    4. After you select the Enable Nutanix Guest Tools check box, the VSS snapshot feature is enabled by default.
      After this feature is enabled, Nutanix native in-guest VmQuiesced Snapshot Service (VSS) agent takes snapshots for VMs that support VSS.
      Note:

      The AHV VM snapshots are not application consistent. The AHV snapshots are taken from the VM entity menu by selecting a VM and clicking Take Snapshot .

      The application consistent snapshots feature is available with Protection Domain based snapshots and Recovery Points in Prism Central. For more information, see Conditions for Application-consistent Snapshots in the Data Protection and Recovery with Prism Element guide.

    5. Click Submit .
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
      Note:
      If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.
      nutanix@cvm$ ncli ngt mount vm-id=virtual_machine_id

      For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

      nutanix@cvm$ ncli ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
  4. To launch a console window, click the Launch Console action link.
    This opens a Virtual Network Computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The console window includes four menu options (top right):
    • Clicking the Mount ISO button displays the following window that allows you to mount an ISO image to the VM. To mount an image, select the desired image and CD-ROM drive from the pull-down lists and then click the Mount button.
      Figure. Mount Disk Image Window Click to enlarge mount ISO image window from VNC console

      Note: For information about how to select CD-ROM as the storage device when you intend to provide a system image from CD-ROM, see Add New Disk in Creating a VM (AHV).
    • Clicking the C-A-D icon button sends a Ctrl+Alt+Del command to the VM.
    • Clicking the camera icon button takes a screenshot of the console window.
    • Clicking the power icon button allows you to power on/off the VM. These are the same options that you can access from the Power On Actions or Power Off Actions action link below the VM table (see next step).
    Figure. Virtual Network Computing (VNC) Window Click to enlarge

  5. To start or shut down the VM, click the Power on (or Power off ) action link.

    Power on begins immediately. If you want to power off the VM, you are prompted to select one of the following options:

    • Power Off . Hypervisor performs a hard power off action on the VM.
    • Power Cycle . Hypervisor performs a hard restart action on the VM.
    • Reset . Hypervisor performs an ACPI reset action through the BIOS on the VM.
    • Guest Shutdown . Operating system of the VM performs a graceful shutdown.
    • Guest Reboot . Operating system of the VM performs a graceful restart.
    Note: If you perform power operations such as Guest Reboot or Guest Shutdown by using the Prism Element web console or API on Windows VMs, these operations might silently fail without any error messages if at that time a screen saver is running in the Windows VM. Perform the same power operations again immediately, so that they succeed.
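The power options above correspond to aCLI operations that you can run from a Controller VM. A sketch with a placeholder VM name; pick only the operation you need:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

run acli vm.on myvm        # power on
run acli vm.off myvm       # hard power off
run acli vm.shutdown myvm  # graceful guest shutdown
run acli vm.reboot myvm    # graceful guest restart
```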
  6. To make a snapshot of the VM, click the Take Snapshot action link.

    For more information, see Virtual Machine Snapshots .
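A snapshot can also be taken from aCLI. A hedged sketch; the VM and snapshot names are placeholders, and the snapshot_name_list parameter name is an assumption based on common acli usage:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Take a crash-consistent snapshot (parameter name is an assumption).
run acli vm.snapshot_create myvm snapshot_name_list=pre-upgrade
```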

  7. To migrate the VM to another host, click the Migrate action link.
    This displays the Migrate VM dialog box. Select the target host from the pull-down list (or select the System will automatically select a host option to let the system choose the host) and then click the Migrate button to start the migration.
    Figure. Migrate VM Dialog Box Click to enlarge

    Note: Nutanix recommends live migrating VMs while they are under light load. If VMs are migrated while heavily utilized, the migration may fail because of limited bandwidth.
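The same migration can be started from aCLI. A sketch with placeholder VM and host names; omitting host= lets the system choose the target:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Live migrate myvm to a specific host; omit host= to let the system choose.
run acli vm.migrate myvm host=host-2
```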
  8. To clone the VM, click the Clone action link.
    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box, but with all fields (except the name and the number of clones) filled in with the current VM settings. Enter a name for the clone and the number of clones required, and then click the Save button to create the clones. You can create a modified clone by changing some of the settings. You can also customize the VM during initialization by providing a custom script and specifying files needed during the customization process.
    Figure. Clone VM Dialog Box Click to enlarge
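Cloning can also be done from aCLI. A hedged sketch with placeholder names:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Create a clone named myvm-clone from the existing VM myvm.
run acli vm.clone myvm-clone clone_from_vm=myvm
```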

  9. To modify the VM configuration, click the Update action link.

    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Modify the configuration as needed, and then save the configuration. In addition to modifying the configuration, you can attach a volume group to the VM and enable flash mode on the VM. If you attach a volume group to a VM that is part of a protection domain, the VM is not protected automatically. Add the VM to the same Consistency Group manually.

    (For GPU-enabled AHV clusters only) You can add pass-through GPUs if a VM is already using GPU pass-through. You can also change the GPU configuration from pass-through to vGPU or vGPU to pass-through, change the vGPU profile, add more vGPUs, and change the specified vGPU license. However, you need to power off the VM before you perform these operations.

    You can add new network adapters or NICs using the Add New NIC option. You can also modify the network used by an existing NIC. See Limitation for vNIC Hot-Unplugging and Creating a VM (AHV) before you modify the NIC network or create a new NIC for a VM.

    Figure. VM Update Dialog Box Click to enlarge

    Note: If you delete a vDisk attached to a VM and snapshots associated with this VM exist, space associated with that vDisk is not reclaimed unless you also delete the VM snapshots.
    To increase the memory allocation and the number of vCPUs on your VMs while the VMs are powered on (hot-pluggable), do the following:
    1. In the vCPUs field, you can increase the number of vCPUs on your VMs while the VMs are powered on.
    2. In the Number of Cores Per vCPU field, you can change the number of cores per vCPU only if the VMs are powered off.
      Note: This is not a hot-pluggable feature.
    3. In the Memory field, you can increase the memory allocation on your VMs while the VMs are powered on.
    For more information about hot-pluggable vCPUs and memory, see Virtual Machine Memory and CPU Hot-Plug Configurations in the AHV Administration Guide .
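The hot-plug updates above can also be applied from aCLI while the VM is powered on. A sketch with placeholder values:

```shell
# Dry-run guard: execute acli on a CVM, otherwise just print the command.
run() { if command -v acli >/dev/null 2>&1; then "$@"; else echo "DRY-RUN: $*"; fi; }

# Hot-add vCPUs and memory; cores per vCPU cannot be changed while powered on.
run acli vm.update myvm num_vcpus=4 memory=8G
```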
    To attach a volume group to the VM, do the following:
    1. In the Volume Groups section, click Add volume group , and then do one of the following:
      • From the Available Volume Groups list, select the volume group that you want to attach to the VM.
      • Click Create new volume group , and then, in the Create Volume Group dialog box, create a volume group (see Creating a Volume Group). After you create a volume group, select it from the Available Volume Groups list.
      Repeat these steps until you have added all the volume groups that you want to attach to the VM.
    2. Click Add .
    1. To enable flash mode on the VM, click the Enable Flash Mode check box.
      • After you enable this feature on the VM, the status is updated in the VM table view. To view the status of individual virtual disks (disks that are flashed to the SSD), go to the Virtual Disks tab in the VM table view.
      • You can disable the flash mode feature for individual virtual disks. To update the flash mode for individual virtual disks, click the update disk icon in the Disks pane and deselect the Enable Flash Mode check box.
  10. To delete the VM, click the Delete action link. A window prompt appears; click the OK button to delete the VM.
    The deleted VM disappears from the list of VMs in the table.

Limitation for vNIC Hot-Unplugging

If you detach (hot-unplug) a vNIC from a VM with a guest OS installed, AOS displays the detach result as successful, but whether the vNIC is actually detached depends on whether the ACPI mechanism in the guest OS responds to the request.

The following table describes the vNIC detach observations and the applicable workaround, based on the guest OS response to the ACPI request:

Table 1. vNIC Detach - Observations and Workaround

Detach procedure (vNIC hot-unplug):
  • Using Prism Central: See the Managing a VM (AHV) topic in the Prism Central Guide.
  • Using the Prism Element web console: See Managing a VM (AHV).
  • Using acli: Log on to the CVM with SSH and run one of the following commands:

    nutanix@cvm$ acli vm.nic_delete <vm_name> <nic mac address>

    or,

    nutanix@cvm$ acli vm.nic_update <vm_name> <nic mac address> connected=false

    Replace the following attributes in the above commands:

    • <vm_name> with the name of the guest VM for which the vNIC is to be detached.
    • <nic mac address> with the MAC address of the vNIC that needs to be detached.

Observations and workaround:
  • Guest OS responds to the ACPI request: AOS reports the detach as successful and logs "Device detached successfully". The vNIC detach is actually successful. No action is needed.
  • Guest OS does not respond to the ACPI request: AOS still reports the detach as successful, but the vNIC detach is not actually successful. Power cycle the VM for a successful vNIC detach.

Note: In most cases, the ACPI mechanism failure occurs when no guest OS is installed on the VM.

Virtual Machine Snapshots

You can generate snapshots of virtual machines (VMs), either manually or automatically. Some of the purposes that VM snapshots serve are as follows:

  • Disaster recovery
  • Testing, as a safe restoration point in case something goes wrong during testing
  • Migrating VMs
  • Creating multiple instances of a VM

A snapshot is a point-in-time state of entities such as VMs and volume groups, and is used for restoration and replication of data. You can generate snapshots and store them locally or remotely. Snapshots are a mechanism to capture the delta changes that have occurred over time. Snapshots are primarily used for data protection and disaster recovery. Unlike backups, snapshots are not autonomous, in the sense that they depend on the underlying VM infrastructure and other snapshots to restore the VM. Snapshots consume fewer resources than a full autonomous backup. Typically, a VM snapshot captures the following:

  • The state of the VM, including the power state (for example, powered on, powered off, or suspended).
  • The data, including all the files that make up the VM. This data also includes the data from disks, configurations, and devices, such as virtual network interface cards.

For more information about creating VM snapshots, see Creating a VM Snapshot Manually section in the Prism Web Console Guide .

VM Snapshots and Snapshots for Disaster Recovery

The VM Dashboard only allows you to generate VM snapshots manually. You cannot select VMs and schedule snapshots of the VMs using the VM dashboard. The snapshots generated manually have very limited utility.

Note: These snapshots (stored locally) cannot be replicated to other sites.

You can schedule and generate snapshots as a part of the disaster recovery process using Nutanix DR solutions. AOS generates snapshots when you protect a VM with a protection domain using the Data Protection dashboard in the Prism Web Console (see the Data Protection and Recovery with Prism Element guide). Similarly, AOS generates Recovery Points (snapshots are called Recovery Points in Prism Central) when you protect a VM with a protection policy using the Data Protection dashboard in Prism Central (see the Leap Administration Guide).

For example, in the Data Protection dashboard in the Prism Web Console, you can create schedules to generate snapshots using various RPO schemes, such as asynchronous replication with frequency intervals of 60 minutes or more, or NearSync replication with frequency intervals from as little as 20 seconds up to 15 minutes. These schemes create snapshots in addition to the ones generated by the configured schedules: asynchronous replication schedules generate an extra snapshot every 6 hours, and NearSync generates an extra snapshot every hour.

Similarly, you can use the options in the Data Protection section of Prism Central to generate Recovery Points using the same RPO schemes.

Windows VM Provisioning

Nutanix VirtIO for Windows

Nutanix VirtIO is a collection of drivers for paravirtual devices that enhance the stability and performance of virtual machines on AHV.

Nutanix VirtIO is available in two formats:

  • To install Windows in a VM on AHV, use the VirtIO ISO.
  • To update VirtIO for Windows, use the VirtIO MSI installer file.

Use Nutanix Guest Tools (NGT) to install the Nutanix VirtIO package. For more information about installing the Nutanix VirtIO package by using NGT, see NGT Installation in the Prism Web Console Guide .

VirtIO Requirements

Requirements for Nutanix VirtIO for Windows.

VirtIO supports the following operating systems:

  • Microsoft Windows server version: Windows 2008 R2 or later
  • Microsoft Windows client version: Windows 7 or later
Note: On Windows 7 and Windows Server 2008 R2, install Microsoft KB3033929 or update the operating system with the latest Windows Update to enable support for SHA2 certificates.
Caution: The VirtIO installation or upgrade may fail if multiple Windows VSS snapshots are present in the guest VM. The installation or upgrade failure is due to the timeout that occurs during installation of Nutanix VirtIO SCSI pass-through controller driver.

Nutanix recommends that you clean up the VSS snapshots or temporarily disconnect the drive that contains the snapshots. Ensure that you delete only the snapshots that are no longer needed. For more information about how to observe the VirtIO installation or upgrade failure that occurs due to the presence of multiple Windows VSS snapshots, see KB-12374.

Installing or Upgrading Nutanix VirtIO for Windows

Download Nutanix VirtIO and the Nutanix VirtIO Microsoft installer (MSI). The MSI installs and upgrades the Nutanix VirtIO drivers.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install it.

Procedure

  1. Go to the Nutanix Support portal and select Downloads > AHV and click VirtIO .
  2. Select the appropriate VirtIO package.
    • If you are creating a new Windows VM, download the ISO file. The installer is available on the ISO if your VM does not have Internet access.
    • If you are updating drivers in a Windows VM, download the MSI installer file.
    Figure. Search filter and VirtIO options Click to enlarge Use filter to search for the latest VirtIO package, ISO or MSI.

  3. Run the selected package.
    • For the ISO: Upload the ISO to the cluster, as described in the Configuring Images topic in Prism Web Console Guide .
    • For the MSI: Open the downloaded file to run the MSI installer.
  4. Read and accept the Nutanix VirtIO license agreement. Click Install .
    Figure. Nutanix VirtIO Windows Setup Wizard Click to enlarge Accept the License Agreement for Nutanix VirtIO Windows Installer

    The Nutanix VirtIO setup wizard shows a status bar and completes installation.

Manually Installing or Upgrading Nutanix VirtIO

Manually install or upgrade Nutanix VirtIO.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

Note: To automatically install Nutanix VirtIO, see Installing or Upgrading Nutanix VirtIO for Windows.

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install it.

Procedure

  1. Go to the Nutanix Support portal, select Downloads > AHV , and click VirtIO .
  2. Do one of the following:
    • Download the VirtIO ISO directly into the VM where you want to install Nutanix VirtIO, for easier installation.

      If you choose this option, proceed directly to step 7.

    • Download the VirtIO ISO for Windows to your local machine.

      If you choose this option, proceed to step 3.

  3. Upload the ISO to the cluster, as described in the Configuring Images topic of Prism Web Console Guide .
  4. Locate the VM where you want to install the Nutanix VirtIO ISO and update the VM.
  5. Add the Nutanix VirtIO ISO by clicking Add New Disk and complete the indicated fields.
    • TYPE : CD-ROM
    • OPERATION : CLONE FROM IMAGE SERVICE
    • BUS TYPE : IDE
    • IMAGE : Select the Nutanix VirtIO ISO
  6. Click Add .
  7. Log on to the VM and browse to Control Panel > Device Manager .
  8. Note: Select the x86 subdirectory for 32-bit Windows, or the amd64 subdirectory for 64-bit Windows.
    Open the devices and select the specific Nutanix drivers to install. For each device, right-click, select Update Driver Software , and browse to the drive containing the VirtIO ISO. For each device, follow the wizard instructions until you receive installation confirmation.
    1. System Devices > Nutanix VirtIO Balloon Drivers
    2. Network Adapter > Nutanix VirtIO Ethernet Adapter .
    3. Storage Controllers > Nutanix VirtIO SCSI pass-through Controller
      The Nutanix VirtIO SCSI pass-through controller prompts you to restart your system. Restart at any time to install the controller.
      Figure. List of Nutanix VirtIO downloads Click to enlarge This image lists the Nutanix VirtIO downloads required for 32-bit Windows.

Creating a Windows VM on AHV with Nutanix VirtIO

Create a Windows VM in AHV, or migrate a Windows VM from a non-Nutanix source to AHV, with the Nutanix VirtIO drivers.

Before you begin

  • Upload the Windows installer ISO to your cluster as described in the Configuring Images topic in Web Console Guide .
  • Upload the Nutanix VirtIO ISO to your cluster as described in the Configuring Images topic in Web Console Guide .

About this task

To install a new or migrated Windows VM with Nutanix VirtIO, complete the following.

Procedure

  1. Log on to the Prism web console using your Nutanix credentials.
  2. At the top-left corner, click Home > VM .
    The VM page appears.
  3. Click + Create VM in the corner of the page.
    The Create VM dialog box appears.
    Figure. Create VM dialog box Click to enlarge Create VM dialog box

  4. Complete the indicated fields.
    1. NAME : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone : Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC .
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    5. MEMORY : Enter the amount of memory for the VM (in GiB).
  5. If you are creating a Windows VM, add a Windows CD-ROM to the VM.
    1. Click the pencil icon next to the CD-ROM that is already present and fill out the indicated fields.
      • OPERATION : CLONE FROM IMAGE SERVICE
      • BUS TYPE : IDE
      • IMAGE : Select the Windows OS install ISO.
    2. Click Update .
      The current CD-ROM opens in a new window.
  6. Add the Nutanix VirtIO ISO.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE : CD-ROM
      • OPERATION : CLONE FROM IMAGE SERVICE
      • BUS TYPE : IDE
      • IMAGE : Select the Nutanix VirtIO ISO.
    2. Click Add .
  7. Add a new disk for the hard drive.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE : DISK
      • OPERATION : ALLOCATE ON STORAGE CONTAINER
      • BUS TYPE : SCSI
      • STORAGE CONTAINER : Select the appropriate storage container.
      • SIZE : Enter the number for the size of the hard drive (in GiB).
    2. Click Add to add the disk driver.
  8. If you are migrating a VM, create a disk from the disk image.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE : DISK
      • OPERATION : CLONE FROM IMAGE
      • BUS TYPE : SCSI
      • CLONE FROM IMAGE SERVICE : Click the drop-down menu and choose the image you created previously.
    2. Click Add to add the disk driver.
  9. Optionally, after you have migrated or created a VM, add a network interface card (NIC).
    1. Click Add New NIC .
    2. In the VLAN ID field, choose the VLAN ID according to network requirements and enter the IP address, if necessary.
    3. Click Add .
  10. Click Save .

What to do next

Install Windows by following Installing Windows on a VM.

Installing Windows on a VM

Install a Windows virtual machine.

Before you begin

Create a Windows VM.

Procedure

  1. Log on to the web console.
  2. Click Home > VM to open the VM dashboard.
  3. Select the Windows VM.
  4. In the center of the VM page, click Power On .
  5. Click Launch Console .
    The Windows console opens in a new window.
  6. Select the desired language, time and currency format, and keyboard information.
  7. Click Next > Install Now .
    The Windows setup dialog box shows the operating systems to install.
  8. Select the Windows OS you want to install.
  9. Click Next and accept the license terms.
  10. Click Next > Custom: Install Windows only (advanced) > Load Driver > OK > Browse .
  11. Choose the Nutanix VirtIO driver.
    1. Select the Nutanix VirtIO CD drive.
    2. Expand the Windows OS folder and click OK .
    Figure. Select the Nutanix VirtIO drivers for your OS

    The Select the driver to install window appears.
  12. Select the VirtIO SCSI driver ( vioscsi.inf ) and click Next .
    Figure. Select the Driver for Installing Windows on a VM

    The amd64 folder contains drivers for 64-bit operating systems. The x86 folder contains drivers for 32-bit operating systems.
    Note: From Nutanix VirtIO driver version 1.1.5, the driver package contains Windows Hardware Quality Lab (WHQL) certified drivers for Windows.
  13. Select the allocated disk space for the VM and click Next .
    Windows shows the installation progress, which can take several minutes.
  14. Enter your user name and password information and click Finish .
    Installation can take several minutes.
    After you provide the logon information, Windows setup completes the installation.
  15. Follow the instructions in Installing or Upgrading Nutanix VirtIO for Windows to install other drivers which are part of Nutanix VirtIO package.

Windows Defender Credential Guard Support in AHV

AHV enables you to use the Windows Defender Credential Guard security feature on Windows guest VMs.

The Windows Defender Credential Guard feature of Microsoft Windows operating systems allows you to securely isolate user credentials from the rest of the operating system. This isolation protects guest VMs from credential theft attacks such as Pass-the-Hash and Pass-the-Ticket.

See the Microsoft documentation for more information about the Windows Defender Credential Guard security feature.

Windows Defender Credential Guard Architecture in AHV

Figure. Architecture

Windows Defender Credential Guard uses Microsoft virtualization-based security (VBS) to isolate user credentials in the VBS module in AHV. When you enable Windows Defender Credential Guard on an AHV guest VM, the guest VM runs both the Windows OS and the VBS module on top of AHV. Each Windows guest VM with Credential Guard enabled has its own VBS module to securely store credentials.

Windows Defender Credential Guard Requirements

Ensure the following to enable Windows Defender Credential Guard:

  1. AOS, AHV, and Windows versions that support Windows Defender Credential Guard:
    • AOS version must be 5.19 or later
    • AHV version must be AHV 20201007.1 or later
    • Windows version must be Windows Server 2016 or later, or Windows 10 Enterprise or later
  2. UEFI, Secure Boot, and machine type q35 are enabled in the Windows VM from AOS.

    The Prism Element workflow to enable Windows Defender Credential Guard includes the workflow to enable these features.

Limitations

  • Windows Defender Credential Guard is not supported on hosts with AMD CPUs.
  • If you enable Windows Defender Credential Guard for your AHV guest VMs, the following optional configurations are not supported:

    • vTPM (Virtual Trusted Platform Modules) to store MS policies.
    • DMA protection (vIOMMU).
    • Nutanix Live Migration.
    • Cross hypervisor DR of Credential Guard VMs.
Caution: Use of Windows Defender Credential Guard in your AHV clusters impacts VM performance. If you enable Windows Defender Credential Guard on AHV guest VMs, VM density drops by ~15–20%. This expected performance impact is due to nested virtualization overhead added as a result of enabling credential guard.

Enabling Windows Defender Credential Guard Support in AHV Guest VMs

You can enable Windows Defender Credential Guard when you are either creating a VM or updating a VM.

About this task

Perform the following procedure to enable Windows Defender Credential Guard:

Procedure

  1. Enable Windows Defender Credential Guard when you are either creating a VM or updating a VM. Do one of the following:
    • If you are creating a VM, see step 2.
    • If you are updating a VM, see step 3.
  2. If you are creating a Windows VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click Create VM .
    3. Fill in the mandatory fields to configure a VM.
    4. Under Boot Configuration , select UEFI , and then select the Secure Boot and Windows Defender Credential Guard options.
      Figure. Enable Windows Defender Credential Guard

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    5. Proceed to configure other attributes for your Windows VM.
      See Creating a Windows VM on AHV with Nutanix VirtIO for more information.
    6. Click Save .
    7. Turn on the VM.
  3. If you are updating an existing VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click the Table view, select the VM, and click Update .
    3. Under Boot Configuration , select UEFI , and then select the Secure Boot and Windows Defender Credential Guard options.
      Note:

      If the VM is configured to use BIOS, install the guest OS again.

      If the VM is already configured to use UEFI, skip the step to select Secure Boot.

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    4. Click Save .
    5. Turn on the VM.
  4. Enable Windows Defender Credential Guard in the Windows VM by using group policy.
    See the Enable Windows Defender Credential Guard by using the Group Policy procedure of the Manage Windows Defender Credential Guard topic in the Microsoft documentation to enable VBS, Secure Boot, and Windows Defender Credential Guard for the Windows VM.
  5. Open command prompt in the Windows VM and apply the Group Policy settings:
    > gpupdate /force

    If you have not enabled Windows Defender Credential Guard (step 4) and perform this step (step 5), a warning similar to the following is displayed:

    Updating policy...
     
    Computer Policy update has completed successfully.
     
    The following warnings were encountered during computer policy processing:
     
    Windows failed to apply the {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings. {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings might have its own log file. Please click on the "More information" link.
    User Policy update has completed successfully.
     
    For more detailed information, review the event log or run GPRESULT /H GPReport.html from the command line to access information about Group Policy results.
    

    Event Viewer displays a warning for the group policy with an error message that indicates Secure Boot is not enabled on the VM.

    To view the warning message in Event Viewer, do the following:

    • In the Windows VM, open Event Viewer .
    • Go to Windows Logs > System and click the warning with the Source as GroupPolicy (Microsoft-Windows-GroupPolicy) and Event ID as 1085 .
    Figure. Warning in Event Viewer

    Note: Ensure that you follow the steps in the order that is stated in this document to successfully enable Windows Defender Credential Guard.
  6. Restart the VM.
  7. Verify if Windows Defender Credential Guard is enabled in your Windows VM.
    1. Start a Windows PowerShell terminal.
    2. Run the following command.
      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'

      An output similar to the following is displayed.

      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'
      AvailableSecurityProperties              	: {1, 2, 3, 5}
      CodeIntegrityPolicyEnforcementStatus     	: 0
      InstanceIdentifier                       	: 4ff40742-2649-41b8-bdd1-e80fad1cce80
      RequiredSecurityProperties               	: {1, 2}
      SecurityServicesConfigured               	: {1}
      SecurityServicesRunning                  	: {1}
      UsermodeCodeIntegrityPolicyEnforcementStatus : 0
      Version                                  	: 1.0
      VirtualizationBasedSecurityStatus        	: 2
      PSComputerName 
      

      Confirm that both SecurityServicesConfigured and SecurityServicesRunning have the value { 1 } .

    Alternatively, you can verify if Windows Defender Credential Guard is enabled by using System Information (msinfo32):

    1. In the Windows VM, open System Information by typing msinfo32 in the search field next to the Start menu.
    2. Verify that the values of the parameters are as indicated in the following screenshot:
      Figure. Verify Windows Defender Credential Guard

Affinity Policies for AHV

As an administrator of an AHV cluster, you can specify scheduling policies for virtual machines on an AHV cluster. By defining these policies, you can control the placement of the virtual machines on the hosts within a cluster.

You can define two types of affinity policies.

VM-Host Affinity Policy

The VM-host affinity policy controls the placement of the VMs. You can use this policy to specify that a selected VM can only run on the members of the affinity host list. This policy checks and enforces where a VM can be hosted when you restart or migrate the VM.
Note:
  • If you apply the VM-host affinity policy, it limits Acropolis HA and Acropolis Dynamic Scheduling (ADS): because the policy is mandatorily enforced, a virtual machine cannot be restarted on or migrated to a host that does not conform to the requirements of the affinity policy.
  • The VM-host anti-affinity policy is not supported.
  • VMs configured with host affinity settings retain these settings if the VM is migrated to a new cluster. Remove the VM-host affinity policies applied to a VM before you migrate it to another cluster: the VM retains the UUID of the affined host, which prevents the VM from restarting on the destination cluster. Protecting such VMs succeeds, but some disaster recovery operations, such as migration, fail, and attempts to power on these VMs also fail.
  • VMs with host affinity policies can only be migrated to the hosts specified in the affinity policy. If only one host is specified, the VM cannot be migrated or started on another host during an HA event. For more information, see Non-Migratable Hosts.

You can define the VM-host affinity policies by using Prism Element during the VM create or update operation. For more information, see Creating a VM (AHV).

VM-VM Anti-Affinity Policy

You can use this policy to specify anti-affinity between virtual machines. The VM-VM anti-affinity policy keeps the specified virtual machines apart so that when a problem occurs with one host, you do not lose both virtual machines. However, this is a preferential policy. It does not prevent the Acropolis Dynamic Scheduling (ADS) feature from taking necessary action in case of resource constraints.
Note:
  • Currently, you can only define VM-VM anti-affinity policy by using aCLI. For more information, see Configuring VM-VM Anti-Affinity Policy.
  • The VM-VM affinity policy is not supported.
Note: If you clone a VM that has affinity policies configured, the policies are not automatically applied to the cloned VM. However, if a VM is restored from a DR snapshot, the policies are automatically applied to the VM.

Limitations of Affinity Rules

Even if a host is removed from a cluster, the host UUID is not removed from the host-affinity list for a VM.

Configuring VM-VM Anti-Affinity Policy

To configure VM-VM anti-affinity policies, you must first define a group and then add all the VMs on which you want to define VM-VM anti-affinity policy.

About this task

Note: Currently, the VM-VM affinity policy is not supported.

Perform the following procedure to configure the VM-VM anti-affinity policy.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Create a group.
    nutanix@cvm$ acli vm_group.create group_name

    Replace group_name with the name of the group.

  3. Add the VMs on which you want to define anti-affinity to the group.
    nutanix@cvm$ acli vm_group.add_vms group_name vm_list=vm_name

    Replace group_name with the name of the group. Replace vm_name with the name of the VMs on which you want to define anti-affinity. For multiple VMs, specify a comma-separated list of VM names.

  4. Configure VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_set group_name

    Replace group_name with the name of the group.

    After you configure the group, the new anti-affinity rule is applied the next time ADS runs. ADS runs every 15 minutes.
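    The three steps above can be sketched as a single dry-run script. The group and VM names (db-group, db-vm-1, db-vm-2) are illustrative; the echo prefix only prints each aCLI call, so remove it to run the commands for real from a CVM.

    ```shell
    # Dry-run sketch of the VM-VM anti-affinity workflow (illustrative names).
    GROUP=db-group
    echo acli vm_group.create "$GROUP"                          # step 2: create the group
    echo acli vm_group.add_vms "$GROUP" vm_list=db-vm-1,db-vm-2 # step 3: add the VMs
    echo acli vm_group.antiaffinity_set "$GROUP"                # step 4: set anti-affinity
    ```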

Removing VM-VM Anti-Affinity Policy

Perform the following procedure to remove the VM-VM anti-affinity policy.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Remove the VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_unset group_name

    Replace group_name with the name of the group.

    The VM-VM anti-affinity policy is removed for the VMs that are present in the group, and they can start on any host during the next power on operation (as necessitated by the ADS feature).

Non-Migratable Hosts

VMs with GPU, CPU passthrough, PCI passthrough, or host affinity policies are not migrated to other hosts in the cluster. Such VMs are treated differently in scenarios where VMs must migrate to other hosts in the cluster.

Table 1. Scenarios Where VMs Are Required to Migrate to Other Hosts
Scenario Behavior
One-click upgrade VM is powered off.
Life-cycle management (LCM) Pre-check for LCM fails and the VMs are not migrated.
Rolling restart VM is powered off.
AHV host maintenance mode Use the tunable option to shut down the VMs while putting the node in maintenance mode. For more information, see Putting a Node into Maintenance Mode.

Performing Power Operations on VMs by Using Nutanix Guest Tools (aCLI)

You can initiate safe and graceful power operations such as soft shutdown and restart of the VMs running on the AHV hosts by using the aCLI. Nutanix Guest Tools (NGT) initiates and performs the soft shutdown and restart operations within the VM. This workflow ensures a safe and graceful shutdown or restart of the VM. You can create a pre-shutdown script that you can choose to run before a shutdown or restart of the VM. In the pre-shutdown script, include any tasks or checks that you want to run before a VM is shut down or restarted. You can choose to cancel the power operation if the pre-shutdown script fails. If the script fails, an alert (guest_agent_alert) is generated in the Prism web console.

Before you begin

Ensure that you have met the following prerequisites before you initiate the power operations:
  1. NGT is enabled on the VM. All operating systems that NGT supports are supported for this feature.
  2. The same NGT version is running on the Controller VM and the guest VM.
  3. (Optional) If you want to run a pre-shutdown script, place the script in the following locations depending on your VMs:
    • Windows VMs: installed_dir\scripts\power_off.bat

      The file name of the script must be power_off.bat .

    • Linux VMs: installed_dir/scripts/power_off

      The file name of the script must be power_off .
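      For illustration, a minimal Linux pre-shutdown script might look like the following. The log message is a placeholder; the sketch assumes a non-zero exit status is what marks the script as failed when fail_on_script_failure=true.

      ```shell
      #!/bin/sh
      # Hypothetical NGT pre-shutdown script for a Linux VM. Place it at
      # installed_dir/scripts/power_off (the file name must be power_off).
      # NGT runs it before the guest shutdown or restart; a non-zero exit
      # status is assumed to mark the script as failed, which cancels the
      # power operation when fail_on_script_failure=true.
      echo "NGT pre-shutdown: flushing filesystem buffers"
      sync    # flush pending writes to disk before the guest shuts down
      ```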

About this task

Note: You can also perform these power operations by using the V3 API calls. For more information, see developer.nutanix.com.

Perform the following steps to initiate the power operations:

Procedure

  1. Log on to a Controller VM with SSH.
  2. Do one of the following:
    • Soft shut down the VM.
      nutanix@cvm$ acli vm.guest_shutdown vm_name enable_script_exec=[true or false] fail_on_script_failure=[true or false]

      Replace vm_name with the name of the VM.

    • Restart the VM.
      nutanix@cvm$ acli vm.guest_reboot vm_name enable_script_exec=[true or false] fail_on_script_failure=[true or false]

      Replace vm_name with the name of the VM.

    Set the value of enable_script_exec to true to run your pre-shutdown script and set the value of fail_on_script_failure to true to cancel the power operation if the pre-shutdown script fails.
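    As an example, a soft shutdown that runs the pre-shutdown script and cancels on script failure could be invoked as follows. This is shown as a dry run with an illustrative VM name (app-vm-01); the echo prefix only prints the aCLI call, so remove it to run the command from a CVM.

    ```shell
    # Dry-run sketch: soft shutdown with the pre-shutdown script enabled.
    VM=app-vm-01
    echo acli vm.guest_shutdown "$VM" enable_script_exec=true fail_on_script_failure=true
    ```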

UEFI Support for VM

UEFI firmware is a successor to legacy BIOS firmware that supports larger hard drives, faster boot time, and provides more security features.

VMs with UEFI firmware have the following advantages:

  • Boot faster
  • Avoid legacy option ROM address constraints
  • Include robust reliability and fault management
  • Use UEFI drivers
Note:
  • Nutanix supports starting VMs with UEFI firmware in an AHV cluster. However, if a VM is added to a protection domain and later restored on a different cluster, the VM loses its boot configuration. To restore the lost boot configuration, see Setting up Boot Device.
  • Nutanix also provides limited support for VMs migrated from a Hyper-V cluster.

You can create or update VMs with UEFI firmware by using acli commands, Prism Element web console, or Prism Central web console. For more information about creating a VM by using the Prism Element web console or Prism Central web console, see Creating a VM (AHV). For information about creating a VM by using aCLI, see Creating UEFI VMs by Using aCLI.

Note: If you are creating a VM by using aCLI commands, you can define the location of the storage container for UEFI firmware and variables. Prism Element web console or Prism Central web console does not provide the option to define the storage container to store UEFI firmware and variables.

For more information about the supported OSes for the guest VMs, see the AHV Guest OS section in the Compatibility and Interoperability Matrix document.

Creating UEFI VMs by Using aCLI

In AHV clusters, you can create a virtual machine (VM) to start with UEFI firmware by using Acropolis CLI (aCLI). This topic describes the procedure to create a VM by using aCLI. See the "Creating a VM (AHV)" topic for information about how to create a VM by using the Prism Element web console.

Before you begin

Ensure that the VM has an empty vDisk.

About this task

Perform the following procedure to create a UEFI VM by using aCLI:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. Create a UEFI VM.
    
    nutanix@cvm$ acli vm.create vm-name uefi_boot=true
    A VM with UEFI firmware is created. Replace vm-name with a name of your choice for the VM. By default, the UEFI firmware and variables are stored in an NVRAM container. To specify the storage container in which to store the UEFI firmware and variables, run the following command instead.
    nutanix@cvm$ acli vm.create vm-name uefi_boot=true nvram_container=NutanixManagementShare
    Replace NutanixManagementShare with the storage container in which you want to store the UEFI variables.
    Nutanix recommends that you choose a storage container with at least an RF2 storage policy to ensure VM high availability in node-failure scenarios. For more information about the RF2 storage policy, see Failure and Recovery Scenarios in the Prism Web Console Guide document.
    Note: When you update the location of the storage container, clear the UEFI configuration and update the location of nvram_container to a container of your choice.

What to do next

Go to the UEFI BIOS menu and configure the UEFI firmware settings. For more information about accessing and setting the UEFI firmware, see Getting Familiar with UEFI Firmware Menu.

Getting Familiar with UEFI Firmware Menu

After you launch a VM console from the Prism Element web console, the UEFI firmware menu allows you to do the following tasks for the VM.

  • Changing the default boot resolution
  • Setting up the boot device
  • Changing the boot time-out value

Changing Boot Resolution

You can change the default boot resolution of your Windows VM from the UEFI firmware menu.

Before you begin

Ensure that the VM is powered on.

About this task

Perform the following procedure to change the default boot resolution of your Windows VM by using the UEFI firmware menu.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide .
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes a downtime. We suggest that you reset the VM only during off-production hours or during a maintenance period.
    Figure. UEFI Firmware Menu

  4. Use the up or down arrow key to go to Device Manager and press Enter .
    The Device Manager page appears.
  5. In the Device Manager screen, use the up or down arrow key to go to OVMF Platform Configuration and press Enter .
    Figure. OVMF Settings

    The OVMF Settings page appears.
  6. In the OVMF Settings page, use the up or down arrow key to go to the Change Preferred field and use the right or left arrow key to increase or decrease the boot resolution.
    The default boot resolution is 1280x1024.
  7. Do one of the following.
    • To save the changed resolution, press the F10 key.
    • To go back to the previous screen, press the Esc key.
  8. Select Reset and click Submit in the Power off/Reset dialog box to restart the VM.
    After you restart the VM, the OS displays the changed resolution.

Setting up Boot Device

You cannot set the boot order for UEFI VMs by using the aCLI, Prism Central web console, or Prism Element web console. You can change the boot device for a UEFI VM by using the UEFI firmware menu.

Before you begin

Ensure that the VM is powered on.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide .
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes a downtime. We suggest that you reset the VM only during off-production hours or during a maintenance period.
  4. Use the up or down arrow key to go to Boot Manager and press Enter .
    The Boot Manager screen displays the list of available boot devices in the cluster.
    Figure. Boot Manager

  5. In the Boot Manager screen, use the up or down arrow key to select the boot device and press Enter .
    The boot device is saved. After you select and save the boot device, the VM boots up with the new boot device.
  6. To go back to the previous screen, press Esc .

Changing Boot Time-Out Value

The boot time-out value determines how long (in seconds) the boot menu is displayed before the default boot entry is loaded. This topic describes the procedure to change the default time-out value of 0 seconds.

About this task

Ensure that the VM is powered on.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide .
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes a downtime. We suggest that you reset the VM only during off-production hours or during a maintenance period.
  4. Use the up or down arrow key to go to Boot Maintenance Manager and press Enter .
    Figure. Boot Maintenance Manager

  5. In the Boot Maintenance Manager screen, use the up or down arrow key to go to the Auto Boot Time-out field.
    The default time-out value is 0 seconds.
  6. In the Auto Boot Time-out field, enter the time-out value and press Enter .
    Note: The valid time-out value ranges from 1 second to 9 seconds.
    The time-out value is changed. The VM starts after the defined time-out elapses.
  7. To go back to the previous screen, press Esc .

Secure Boot Support for VMs

The pre-operating system environment is vulnerable to attacks by malicious loaders. Secure Boot addresses this vulnerability by using policies and certificates present in the UEFI firmware to ensure that only properly signed and authenticated components are allowed to execute.

Supported Operating Systems

For more information about the supported OSes for the guest VMs, see the AHV Guest OS section in the Compatibility and Interoperability Matrix document.

Secure Boot Considerations

This section provides the limitations and requirements to use Secure Boot.

Limitations

Secure Boot for guest VMs has the following limitations:

  • Nutanix does not support converting a VM that uses IDE disks or legacy BIOS to a VM that uses Secure Boot.
  • The minimum supported Nutanix VirtIO package version for Secure Boot-enabled VMs is 1.1.6.
  • Secure Boot VMs do not permit CPU, memory, or PCI disk hot plug.

Requirements

Following are the requirements for Secure Boot:

  • Secure Boot is supported only on the Q35 machine type.

Creating/Updating a VM with Secure Boot Enabled

You can enable Secure Boot with UEFI firmware, either while creating a VM or while updating a VM by using aCLI commands or Prism Element web console.

See Creating a VM (AHV) for instructions about how to enable Secure Boot by using the Prism Element web console.

Creating a VM with Secure Boot Enabled

About this task

To create a VM with Secure Boot enabled:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. To create a VM with Secure Boot enabled:
    nutanix@cvm$ acli vm.create <vm_name> secure_boot=true machine_type=q35
    Note: Specifying the machine type is required to enable the secure boot feature. UEFI is enabled by default when the Secure Boot feature is enabled.

Updating a VM to Enable Secure Boot

About this task

To update a VM to enable Secure Boot:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. To update a VM to enable Secure Boot, ensure that the VM is powered off.
    nutanix@cvm$ acli vm.update <vm_name> secure_boot=true machine_type=q35
    Note:
    • Disabling the Secure Boot flag alone does not revert the machine type; the machine type remains q35 unless you change it explicitly.
    • UEFI is enabled by default when the Secure Boot feature is enabled. Disabling Secure Boot does not revert the UEFI flags.

Virtual Machine Network Management

Virtual machine network management involves configuring connectivity for guest VMs through virtual switches and VPCs.

For information about creating or updating a virtual switch and other VM network options, see Network and Security Management in Prism Central Guide . Virtual switch creation and updates are also covered in Network Management in Prism Web Console Guide .

Configuring a Virtual NIC to Operate in Access or Trunk Mode

By default, a virtual NIC on a guest VM operates in access mode. In this mode, the virtual NIC can send and receive traffic only over its own VLAN, which is the VLAN of the virtual network to which it is connected. If restricted to using access mode interfaces, a VM running an application on multiple VLANs (such as a firewall application) must use multiple virtual NICs, one for each VLAN.

Instead of configuring multiple virtual NICs in access mode, you can configure a single virtual NIC on the VM to operate in trunk mode. A virtual NIC in trunk mode can send and receive traffic over any number of VLANs in addition to its own VLAN. You can trunk specific VLANs or trunk all VLANs. You can also convert a virtual NIC from trunk mode back to access mode, in which case the virtual NIC reverts to sending and receiving traffic only over its own VLAN.

About this task

To configure a virtual NIC as an access port or trunk port, do the following:

Procedure

  1. Log on to the CVM with SSH.
  2. Do one of the following:
    1. Create a virtual NIC on the VM and configure the NIC to operate in the required mode.
      nutanix@cvm$ acli vm.nic_create <vm_name> network=network [vlan_mode={kAccess | kTrunked}] [trunked_networks=networks]

      Specify appropriate values for the following parameters:

      • <vm_name> . Name of the VM.
      • network . Name of the virtual network to which you want to connect the virtual NIC.
      • trunked_networks . Comma-separated list of the VLAN IDs that you want to trunk. The parameter is processed only if vlan_mode is set to kTrunked and is ignored if vlan_mode is set to kAccess . To include the default VLAN, VLAN 0, include it in the list of trunked networks. To trunk all VLANs, set vlan_mode to kTrunked and skip this parameter.
      • vlan_mode . Mode in which the virtual NIC must operate. Set the parameter to kAccess for access mode and to kTrunked for trunk mode. Default: kAccess .
    2. Configure an existing virtual NIC to operate in the required mode.
      nutanix@cvm$ acli vm.nic_update <vm_name> mac_addr update_vlan_trunk_info=true [vlan_mode={kAccess | kTrunked}] [trunked_networks=networks]

      Specify appropriate values for the following parameters:

      • <vm_name> . Name of the VM.
      • mac_addr . MAC address of the virtual NIC to update (the MAC address is used to identify the virtual NIC). Required to update a virtual NIC.
      • update_vlan_trunk_info . Update the VLAN type and list of trunked VLANs. Set update_vlan_trunk_info=true to enable trunked mode. If not specified, the parameter defaults to false and the vlan_mode and trunked_networks parameters are ignored.
        Note: You must set the update_vlan_trunk_info to true . If you do not set this parameter to true , "trunked_networks" are not changed.
      • vlan_mode . Mode in which the virtual NIC must operate. Set the parameter to kAccess for access mode and to kTrunked for trunk mode.
      • trunked_networks . Comma-separated list of the VLAN IDs that you want to trunk. The parameter is processed only if vlan_mode is set to kTrunked and is ignored if vlan_mode is set to kAccess . To include the default VLAN, VLAN 0, include it in the list of trunked networks. To trunk all VLANs, set vlan_mode to kTrunked and skip this parameter.
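      For example, creating a NIC that trunks VLANs 10, 20, and 30 in addition to its own VLAN could look like the following. This is a dry run with illustrative VM and network names (fw-vm, vlan10-net); the echo prefix only prints the aCLI call, so remove it to run the command from a CVM.

      ```shell
      # Dry-run sketch: create a trunk-mode NIC carrying VLANs 10, 20, and 30.
      echo acli vm.nic_create fw-vm network=vlan10-net vlan_mode=kTrunked trunked_networks=10,20,30
      ```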

Virtual Machine Memory and CPU Hot-Plug Configurations

Memory and CPUs are hot-pluggable on guest VMs running on AHV. You can increase the memory allocation and the number of CPUs on your VMs while the VMs are powered on. You can change the number of vCPUs (sockets) while the VMs are powered on. However, you cannot change the number of cores per socket while the VMs are powered on.

Note: You cannot decrease the memory allocation and the number of CPUs on your VMs while the VMs are powered on.

You can change the memory and CPU configuration of your VMs by using the Acropolis CLI (aCLI) (see Managing a VM (AHV) in the Prism Web Console Guide or see Managing a VM (AHV) and Managing a VM (Self Service) in the Prism Central Guide).

See the AHV Guest OS Compatibility Matrix for information about operating systems on which you can hot plug memory and CPUs.

Memory OS Limitations

  1. On Linux operating systems, the Linux kernel might not bring the hot-plugged memory online. If the memory is not online, you cannot use the new memory. Perform the following procedure to bring the memory online.
    1. Identify the memory block that is offline.

      Display the status of all of the memory.

      $ cat /sys/devices/system/memory/memoryXXX/state 
      

      Display the state of a specific memory block.

      $ grep line /sys/devices/system/memory/*/state 
      
    2. Make the memory online.
      $ echo online > /sys/devices/system/memory/memoryXXX/state 
      
  2. If your VM has CentOS 7.2 as the guest OS and less than 3 GB of memory, hot plugging more memory so that the final memory size exceeds 3 GB results in a memory-overflow condition. To resolve the issue, restart the guest OS (CentOS 7.2) with the following kernel command-line setting:
    swiotlb=force 
    
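The memory-online steps in item 1 can be combined into a small loop that brings every offline memory block online. This is a sketch, not part of the official procedure; run it as root inside the guest:

```shell
# For each memory block that reports "offline", write "online" to its state
# file. Blocks that are already online are left untouched.
$ for f in /sys/devices/system/memory/memory*/state; do
    grep -q offline "$f" && echo online > "$f"
  done
```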

CPU OS Limitation

On CentOS operating systems, if the hot-plugged CPUs are not displayed in /proc/cpuinfo , you might have to bring the CPUs online. For each hot-plugged CPU, run the following command to bring the CPU online.

$ echo 1 > /sys/devices/system/cpu/cpu<n>/online  

Replace <n> with the number of the hot plugged CPU.
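Rather than repeating the command for each CPU, a sketch that applies this to every CPU at once (run as root in the guest; writing 1 to an already-online CPU is harmless) is:

```shell
# Bring every hot-plugged CPU online. cpu0 typically has no "online" file
# and is skipped by the glob on most kernels.
$ for f in /sys/devices/system/cpu/cpu[0-9]*/online; do echo 1 > "$f"; done
```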

Hot-Plugging the Memory and CPUs on Virtual Machines (AHV)

About this task

Perform the following procedure to hot plug the memory and CPUs on the AHV VMs.

Procedure

  1. Log on to the Controller VM with SSH.
  2. Update the memory allocation for the VM.
    nutanix@cvm$ acli vm.update vm-name memory=new_memory_size 
    

    Replace vm-name with the name of the VM and new_memory_size with the memory size.

  3. Update the number of CPUs on the VM.
    nutanix@cvm$ acli vm.update vm-name num_vcpus=n 
    

    Replace vm-name with the name of the VM and n with the number of CPUs.

    Note: After you upgrade from an AHV version that does not support hot plug to a version that does, you must power cycle each VM that was created and powered on before the upgrade so that it becomes compatible with the memory and CPU hot-plug feature. This power cycle is required only once after the upgrade. VMs created on the supported version are hot-plug compatible by default.
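For example, to grow a running VM to 16 GB of memory and 8 vCPUs, the two updates above might look like the following (the VM name app-vm and the sizes are hypothetical):

```shell
# Hot plug memory and vCPUs on a powered-on VM (app-vm is a placeholder name).
nutanix@cvm$ acli vm.update app-vm memory=16G
nutanix@cvm$ acli vm.update app-vm num_vcpus=8
```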

Virtual Machine Memory Management (vNUMA)

AHV hosts support Virtual Non-uniform Memory Access (vNUMA) on virtual machines. You can enable vNUMA on VMs when you create or modify the VMs to optimize memory performance.

Non-uniform Memory Access (NUMA)

In a NUMA topology, the memory access times of a VM depend on the memory location relative to a processor. A VM accesses memory local to a processor faster than the non-local memory. If the VM uses both CPU and memory from the same physical NUMA node, you can achieve optimal resource utilization. If you are running the CPU on one NUMA node (for example, node 0) and the VM accesses the memory from another node (node 1) then memory latency occurs. Ensure that the virtual topology of VMs matches the physical hardware topology to achieve minimum memory latency.

Virtual Non-uniform Memory Access (vNUMA)

vNUMA optimizes the memory performance of virtual machines that require more vCPUs or memory than the capacity of a single physical NUMA node. In a vNUMA topology, you can create multiple vNUMA nodes where each vNUMA node includes vCPUs and virtual RAM. When you assign a vNUMA node to a physical NUMA node, the vCPUs can intelligently determine the memory latency (high or low). Low memory latency within a vNUMA node results in low latency in the physical NUMA node as well.

vNUMA vCPU hard-pinning

When you configure NUMA and hyper-threading, you ensure that the VM can schedule on virtual peers. You also expose the NUMA topology to the VM. While this configuration helps you limit the amount of memory that is available to each virtual NUMA node, the distribution underneath, in the hardware, still occurs randomly.

Enable virtual CPU (vCPU) hard-pinning in the topology to define which NUMA node the vCPUs (and hyper-threads or peers) are located on and how much memory that NUMA node has. vCPU hard-pinning also allows you to see a proper mapping of vCPU to CPU set (virtual CPU to physical core or hyper-thread). It ensures that a VM is never scheduled on a core or peer that is not defined in the hard-pin configuration. It also ensures that memory is allocated and distributed correctly across the configured mapping.

While vCPU hard-pinning gives a benefit to scheduling operations and memory operations, it also has a couple of caveats.

  • Acropolis Dynamic Scheduling (ADS) is not NUMA aware, so the high availability (HA) process is not NUMA aware. This lack of awareness can lead to potential issues when a host fails.

  • When you start a VM, a background process zeroes out the memory pages for the VM. The more memory a VM has, the longer this process takes. Consider a deployment with 10 VMs: nine have 4 GB of RAM and one has 4.5 TB. The process finishes in a couple of seconds for the small VMs but can take a couple of minutes for the large-memory VM. This time lag can cause a problem: by the time the large-memory VM tries to power on, the smaller VMs are already running on a socket, so that socket or its cores are unavailable. The unavailability can result in a boot failure and an error message when starting the VM.

    The workaround is to use affinity rules and ensure that large VMs that have vCPU hard-pinning configured have a failover node available to them, with a different affinity rule for the non-pinned VMs.

For information about configuring vCPU hard-pinning, see Enabling vNUMA on Virtual Machines.

Enabling vNUMA on Virtual Machines

Before you begin

Before you enable vNUMA, see the AHV Best Practices Guide under Solutions Documentation.

About this task

Perform the following procedure to enable vNUMA on your VMs running on the AHV hosts.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Check how many NUMA nodes are available on each AHV host in the cluster.
    nutanix@cvm$ hostssh "numactl --hardware"

    The console displays an output similar to the following:

    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
    node 0 size: 128837 MB
    node 0 free: 862 MB
    node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
    node 1 size: 129021 MB
    node 1 free: 352 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128859 MB
    node 0 free: 1076 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129000 MB
    node 1 free: 436 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128859 MB
    node 0 free: 701 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129000 MB
    node 1 free: 357 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128838 MB
    node 0 free: 1274 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129021 MB
    node 1 free: 424 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128837 MB
    node 0 free: 577 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129021 MB
    node 1 free: 612 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10

    The example output shows that each AHV host has two NUMA nodes.

  3. Do one of the following:
    • Enable vNUMA if you are creating a VM.
      nutanix@cvm$ acli vm.create <vm_name> num_vcpus=x \
      num_cores_per_vcpu=x memory=xG \
      num_vnuma_nodes=x
    • Enable vNUMA if you are modifying an existing VM.
      nutanix@cvm$ acli vm.update <vm_name> \
      num_vnuma_nodes=x
    Replace <vm_name> with the name of the VM on which you want to enable vNUMA. Replace x with the values for the following parameters:
    • num_vcpus : Type the number of vCPUs for the VM.
    • num_cores_per_vcpu : Type the number of cores per vCPU.
    • memory : Type the memory in GB for the VM.
    • num_vnuma_nodes : Type the number of vNUMA nodes for the VM.

    For example:

    nutanix@cvm$ acli vm.create test_vm num_vcpus=20 memory=150G num_vnuma_nodes=2

    This command creates a VM with 2 vNUMA nodes, 10 vCPUs and 75 GB memory for each vNUMA node.

What to do next

To configure vCPU hard-pinning on existing VMs, do the following:
nutanix@cvm$ acli vm.update <vm_name> num_vcpus=x
nutanix@cvm$ acli vm.update <vm_name> num_cores_per_vcpu=x
nutanix@cvm$ acli vm.update <vm_name> num_threads_per_core=x
nutanix@cvm$ acli vm.update <vm_name> num_vnuma_nodes=x
nutanix@cvm$ acli vm.update <vm_name> vcpu_hard_pin=true

For example:

nutanix@cvm$ acli vm.update <vm_name> num_vcpus=3
nutanix@cvm$ acli vm.update <vm_name> num_cores_per_vcpu=28
nutanix@cvm$ acli vm.update <vm_name> num_threads_per_core=2
nutanix@cvm$ acli vm.update <vm_name> num_vnuma_nodes=3
nutanix@cvm$ acli vm.update <vm_name> vcpu_hard_pin=true

GPU and vGPU Support

AHV supports GPU-accelerated computing for guest VMs. You can configure either GPU pass-through or a virtual GPU.
Note: You can configure either pass-through or a vGPU for a guest VM but not both.

This guide describes the concepts related to the GPU and vGPU support in AHV. For the configuration procedures, see the Prism Web Console Guide.

For driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide .

Note: VMs with a GPU configuration are not migrated to other hosts in the cluster. For more information, see Non-Migratable Hosts.

Supported GPUs

The following GPUs are supported:
Note: These GPUs are supported only by the AHV version that is bundled with the AOS release.
  • NVIDIA® Ampere® A16
  • NVIDIA® Ampere® A30
  • NVIDIA® Ampere® A40
  • NVIDIA® Ampere® A100
  • NVIDIA® Quadro® RTX 6000
  • NVIDIA® Quadro® RTX 8000
  • NVIDIA® Tesla® M10
  • NVIDIA® Tesla® M60
  • NVIDIA® Tesla® P4
  • NVIDIA® Tesla® P40
  • NVIDIA® Tesla® P100
  • NVIDIA® Tesla® T4 16 GB
  • NVIDIA® Tesla® V100 16 GB
  • NVIDIA® Tesla® V100 32 GB
  • NVIDIA® Tesla® V100S 32 GB

GPU Pass-Through for Guest VMs

AHV hosts support GPU pass-through for guest VMs, allowing applications on VMs direct access to GPU resources. The Nutanix user interfaces provide a cluster-wide view of GPUs, allowing you to allocate any available GPU to a VM. You can also allocate multiple GPUs to a VM. However, in a pass-through configuration, only one VM can use a GPU at any given time.

Host Selection Criteria for VMs with GPU Pass-Through

When you power on a VM with GPU pass-through, the VM is started on the host that has the specified GPU, provided that the Acropolis Dynamic Scheduler determines that the host has sufficient resources to run the VM. If the specified GPU is available on more than one host, the Acropolis Dynamic Scheduler ensures that a host with sufficient resources is selected. If sufficient resources are not available on any host with the specified GPU, the VM is not powered on.

If you allocate multiple GPUs to a VM, the VM is started on a host if, in addition to satisfying Acropolis Dynamic Scheduler requirements, the host has all of the GPUs that are specified for the VM.

If you want a VM to always use a GPU on a specific host, configure host affinity for the VM.
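Host affinity can also be set from aCLI. The following is a sketch only: the VM name gpu-vm and the host address are hypothetical, and the exact command syntax may vary by AOS release.

```shell
# Hypothetical example: pin gpu-vm to the host that has the required GPU
# (10.0.0.5 is a placeholder AHV host address).
nutanix@cvm$ acli vm.affinity_set gpu-vm host_list=10.0.0.5
```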

Support for Graphics and Compute Modes

AHV supports running GPU cards in either graphics mode or compute mode. If a GPU is running in compute mode, Nutanix user interfaces indicate the mode by appending the string compute to the model name. No string is appended if a GPU is running in the default graphics mode.

Switching Between Graphics and Compute Modes

If you want to change the mode of the firmware on a GPU, put the host in maintenance mode, and then flash the GPU manually by logging on to the AHV host and performing standard procedures as documented for Linux VMs by the vendor of the GPU card.

Typically, you restart the host immediately after you flash the GPU. After restarting the host, redo the GPU configuration on the affected VM, and then start the VM. For example, consider that you want to re-flash an NVIDIA Tesla® M60 GPU that is running in graphics mode. The Prism web console identifies the card as an NVIDIA Tesla M60 GPU. After you re-flash the GPU to run in compute mode and restart the host, redo the GPU configuration on the affected VMs by adding back the GPU, which is now identified as an NVIDIA Tesla M60.compute GPU, and then start the VM.

Supported GPU Cards

For a list of supported GPUs, see Supported GPUs.

Limitations

GPU pass-through support has the following limitations:

  • Live migration of VMs with a GPU configuration is not supported. Live migration of VMs is necessary when the BIOS, BMC, and the hypervisor on the host are being upgraded. During these upgrades, VMs that have a GPU configuration are powered off and then powered on automatically when the node is back up.
  • VM pause and resume are not supported.
  • You cannot hot add VM memory if the VM is using a GPU.
  • Hot add and hot remove support is not available for GPUs.
  • You can change the GPU configuration of a VM only when the VM is turned off.
  • The Prism web console does not support console access for VMs that are configured with GPU pass-through. Before you configure GPU pass-through for a VM, set up an alternative means to access the VM. For example, enable remote access over RDP.

    Removing GPU pass-through from a VM restores console access to the VM through the Prism web console.

Configuring GPU Pass-Through

For information about configuring GPU pass-through for guest VMs, see Creating a VM (AHV) in the "Virtual Machine Management" chapter of the Prism Web Console Guide .

NVIDIA GRID Virtual GPU Support on AHV

AHV supports NVIDIA GRID technology, which enables multiple guest VMs to use the same physical GPU concurrently. Concurrent use is possible by dividing a physical GPU into discrete virtual GPUs (vGPUs) and allocating those vGPUs to guest VMs. Each vGPU has a fixed range of frame buffer and uses all the GPU processing cores in a time-sliced manner.

Virtual GPUs are of different types (vGPU types are also called vGPU profiles). vGPUs differ by the amount of physical GPU resources allocated to them and the class of workload that they target. The number of vGPUs into which a single physical GPU can be divided therefore depends on the vGPU profile that is used on a physical GPU.

Each physical GPU supports more than one vGPU profile, but a physical GPU cannot run multiple vGPU profiles concurrently. After a vGPU of a given profile is created on a physical GPU (that is, after a vGPU is allocated to a VM that is powered on), the GPU is restricted to that vGPU profile until it is freed up completely. To understand this behavior, consider that you configure a VM to use an M60-1Q vGPU. When the VM is powering on, it is allocated an M60-1Q vGPU instance only if a physical GPU that supports M60-1Q is either unused or already running the M60-1Q profile and can accommodate the requested vGPU.

If an entire physical GPU that supports M60-1Q is free at the time the VM is powering on, an M60-1Q vGPU instance is created for the VM on the GPU, and that profile is locked on the GPU. In other words, until the physical GPU is completely freed up again, only M60-1Q vGPU instances can be created on that physical GPU (that is, only VMs configured with M60-1Q vGPUs can use that physical GPU).
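To check which vGPU instances are currently active on each physical GPU (and therefore which profile each GPU is locked to), you can query the NVIDIA host driver directly on the AHV host. This is a troubleshooting sketch, not part of the official workflow; the output format depends on the installed NVIDIA GRID host driver version.

```shell
# List the vGPU instances running on each physical GPU on this AHV host.
root@ahv# nvidia-smi vgpu
```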

Note: NVIDIA does not support Windows Guest VMs on the C-series NVIDIA vGPU types. See the NVIDIA documentation on Virtual GPU software for more information.

NVIDIA Grid Host Drivers and License Installation

To enable guest VMs to use vGPUs on AHV, you must install NVIDIA drivers on the guest VMs, install the NVIDIA GRID host driver on the hypervisor, and set up an NVIDIA GRID License Server.

See the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide for details about the workflow to enable guest VMs to use vGPUs on AHV and the NVIDIA GRID host driver installation instructions.

vGPU Profile Licensing

vGPU profiles are licensed through an NVIDIA GRID license server. The choice of license depends on the type of vGPU that the applications running on the VM require. Licenses are available in various editions, and the vGPU profile that you want might be supported by more than one license edition.

Note: If the specified license is not available on the licensing server, the VM starts up and functions normally, but the vGPU runs with reduced capability.

You must determine the vGPU profile that the VM requires, install an appropriate license on the licensing server, and configure the VM to use that license and vGPU type. For information about licensing for different vGPU types, see the NVIDIA GRID licensing documentation.

Guest VMs check out a license over the network from the licensing server when starting up and return the license when shutting down. When a license is checked back in, the vGPU is returned to the vGPU resource pool.

When powered on, guest VMs use a vGPU in the same way that they use a physical GPU that is passed through.
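From inside a guest VM with the NVIDIA guest driver installed, you can verify whether the license was checked out successfully by querying the driver. This is a sketch; the exact field names in the query output vary by driver version.

```shell
# Show license-related fields (for example, license status and licensed
# product) from the NVIDIA driver's full query output.
$ nvidia-smi -q | grep -i -A 2 license
```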

Supported GPU Cards

For a list of supported GPUs, see Supported GPUs.

High Availability Support for VMs with vGPUs

Nutanix conditionally supports high availability (HA) of VMs that have NVIDIA GRID vGPUs configured. The cluster does not reserve any specific resources to guarantee HA for VMs with vGPUs; such VMs are restarted on a best-effort basis in the event of a node failure. A VM with vGPUs can restart on another (failover) host only if that host has compatible or identical vGPU resources available: the vGPU profile available on the failover host must be identical to the vGPU profile configured on the VM that needs HA. The system attempts to restart the VM after an event. If the failover host has insufficient memory or vGPU resources for the VM to start, the VM fails to start after failover.

The following conditions are applicable to HA of VMs with vGPUs:

  • Memory is not reserved for the VM on the failover host by the HA process. When the VM fails over, if sufficient memory is not available, the VM cannot power on.
  • vGPU resources are not reserved on the failover host. When the VM fails over, if the required vGPU resources are not available on the failover host, the VM cannot power on.

Limitations for vGPU Support

vGPU support on AHV has the following limitations:

  • You cannot hot-add memory to VMs that have a vGPU.
  • The Prism web console does not support console access for a VM that is configured with multiple vGPUs. The Prism web console supports console access for a VM that is configured with a single vGPU only.

    Before you add multiple vGPUs to a VM, set up an alternative means to access the VM. For example, enable remote access over RDP. For Linux VMs, instead of RDP, use Virtual Network Computing (VNC) or equivalent.

Console Support for VMs with vGPU

Like other VMs, you can access a VM with a single vGPU using the console. You can enable or disable console support only for a VM with a single vGPU configured; enabling console support for a VM with multiple vGPUs is not supported. By default, console support for a vGPU VM is disabled.

See Enabling or Disabling Console Support for vGPU VMs for more information about configuring the support.

Recovery of vGPU Console-enabled VMs

With AHV, you can recover vGPU console-enabled guest VMs efficiently. When you perform DR of vGPU console-enabled guest VMs, the VMs recover with the vGPU console. The guest VMs fail to recover when you perform cross-hypervisor disaster recovery (CHDR).

For AHV with minimum AOS versions 6.1, 6.0.2.4 and 5.20:

  • vGPU-enabled VMs can be recovered when protected by protection domains in PD-based DR or protection policies in Leap based solutions using asynchronous, NearSync, or Synchronous (Leap only) replications.
    Note: GPU Passthrough is not supported.
  • If both site A and site B have the same GPU boards (and the same assignable vGPU profiles), failovers work seamlessly. With protection domains, no additional steps are required: GPU profiles are restored correctly and vGPU console settings persist after recovery. With Leap DR, vGPU console settings do not persist after recovery.
  • If site A and site B have different GPU boards and vGPU profiles, you must manually remove the vGPU profile before you power on the VM in site B.

The vGPU console settings are persistent after recovery and all failovers are supported for the following:

Table 1. Persistent vGPU Console Settings with Failover Support
Recovery using              | For vGPU-enabled AHV VMs
Protection domain based DR  | Yes
VMware SRM with Nutanix SRA | Not applicable

For information about this behavior, see the Recovery of vGPU-enabled VMs topic in the Data Protection and Recovery with Prism Element guide.

See Enabling or Disabling Console Support for vGPU VMs for more information about configuring the support.

For SRA and SRM support, see the Nutanix SRA documentation.

ADS support for VMs with vGPUs

AHV supports Acropolis Dynamic Scheduling (ADS) for VMs with vGPUs.

Note: ADS support requires that live migration of VMs with vGPUs is operational in the cluster. See Live Migration of VMs with Virtual GPUs for the minimum NVIDIA and AOS versions that support live migration of VMs with vGPUs.

When a number of VMs with vGPUs are running on a host and you enable ADS support for the cluster, the Lazan manager invokes VM migration tasks to resolve resource hotspots or fragmentation in the cluster and to free resources for powering on incoming vGPU VMs. The Lazan manager can migrate vGPU-enabled VMs to other hosts in the cluster only if:

  • The other hosts have vGPU resources compatible with or identical to those of the source host (the host running the vGPU-enabled VMs).

  • The host affinity is not set for the vGPU-enabled VM.

For more information about limitations, see Live Migration of VMs with Virtual GPUs and Limitations of Live Migration Support.

For more information about ADS, see Acropolis Dynamic Scheduling in AHV.

Multiple Virtual GPU Support

Prism Central and Prism Element Web Console can deploy VMs with multiple virtual GPU instances. This support harnesses the capabilities of NVIDIA GRID virtual GPU (vGPU) support for multiple vGPU instances for a single VM.

Note: Multiple vGPUs on the same VM are supported on NVIDIA Virtual GPU software version 10.1 (440.53) or later.

You can deploy virtual GPUs of different types. A single physical GPU can be divided into the number of vGPUs depending on the type of vGPU profile that is used on the physical GPU. Each physical GPU on a GPU board supports more than one type of vGPU profile. For example, a Tesla® M60 GPU device provides different types of vGPU profiles like M60-0Q, M60-1Q, M60-2Q, M60-4Q, and M60-8Q.

You can only add multiple vGPUs of the same type of vGPU profile to a single VM. For example, consider that you configure a VM on a node that has one NVIDIA Tesla® M60 GPU board. Tesla® M60 provides two physical GPUs, each supporting one M60-8Q (profile) vGPU, thus supporting a total of two M60-8Q vGPUs for the entire host.

For restrictions on configuring multiple vGPUs on the same VM, see Restrictions for Multiple vGPU Support.

For steps to add multiple vGPUs to the same VM, see Creating a VM (AHV) and Adding Multiple vGPUs to a VM in Prism Web Console Guide or Creating a VM through Prism Central (AHV) and Adding Multiple vGPUs to a VM in Prism Central Guide .

Restrictions for Multiple vGPU Support

You can configure multiple vGPUs subject to the following restrictions:

  • All the vGPUs that you assign to one VM must be of the same type. In the aforesaid example, with the Tesla® M60 GPU device, you can assign multiple M60-8Q vGPU profiles. You cannot assign one vGPU of the M60-1Q type and another vGPU of the M60-8Q type.

    Note: You can configure any number of vGPUs of the same type on a VM. However, the cluster calculates a maximum number of vGPUs of the same type per VM. This number is defined as max_instances_per_vm. This number is variable and changes based on the GPU resources available in the cluster and the number of VMs deployed. If the number of vGPUs of a specific type that you configured on a VM exceeds the max_instances_per_vm number, then the VM fails to power on and the following error message is displayed:
    Operation failed: NoHostResources: No host has enough available GPU for VM <name of VM>(UUID of VM).
    You could try reducing the GPU allotment...

    When you configure multiple vGPUs on a VM, after you select the appropriate vGPU type for the first vGPU assignment, Prism (Prism Central and Prism Element Web Console) automatically restricts the selection of vGPU type for subsequent vGPU assignments to the same VM.

    Figure. vGPU Type Restriction Message Click to enlarge vGPU Type Restriction Message

    Note:

    You can use CLI (acli) to configure multiple vGPUs of multiple types to the same VM. See Acropolis Command-Line Interface (aCLI) for information about aCLI. Use the vm.gpu_assign <vm.name> gpu=<gpu-type> command multiple times, once for each vGPU, to add multiple vGPUs of multiple types to the same VM.

    See the GPU board and software documentation for information about the combinations of the number and types of vGPUs profiles supported by the GPU resources installed in the cluster. For example, see the NVIDIA Virtual GPU Software Documentation for the vGPU type and number combinations on the Tesla® M60 board.

  • Configure multiple vGPUs only of the highest type using Prism. The highest type of vGPU profile is based on the driver deployed in the cluster. In the aforesaid example, on a Tesla® M60 device, you can only configure multiple vGPUs of the M60-8Q type. Prism prevents you from configuring multiple vGPUs of any other type such as M60-2Q.

    Figure. vGPU Type Restriction Message Click to enlarge Message showing the restriction of number of vGPUs of specified type.

    Note:

    You can use CLI (acli) to configure multiple vGPUs of other available types. See Acropolis Command-Line Interface (aCLI) for the aCLI information. Use the vm.gpu_assign <vm.name> gpu=<gpu-type> command multiple times, once for each vGPU, to configure multiple vGPUs of other available types.

    See the GPU board and software documentation for more information.

  • Configure either a passthrough GPU or vGPUs on the same VM. You cannot configure both passthrough GPU and vGPUs. Prism automatically disallows such configurations after the first GPU is configured.

  • The VM powers on only if the requested type and number of vGPUs are available in the node.

    In the aforesaid example, the VM, which is configured with two M60-8Q vGPUs, fails to power on if another VM sharing the same GPU board is already using one M60-8Q vGPU. This is because the Tesla® M60 GPU board allows only two M60-8Q vGPUs. Of these, one is already used by another VM. Thus, the VM configured with two M60-8Q vGPUs fails to power on due to unavailability of required vGPUs.

  • Multiple vGPUs on the same VM are supported on NVIDIA Virtual GPU software version 10.1 (440.53) or later. Ensure that the relevant GRID version license is installed and select it when you configure multiple vGPUs.
Adding Multiple vGPUs to the Same VM

About this task

You can add multiple vGPUs of the same vGPU type to:

  • A new VM when you create it.

  • An existing VM when you update it.

Important:

Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support.

After you add the first vGPU, do the following on the Create VM or Update VM dialog box (the main dialog box) to add more vGPUs:

Procedure

  1. Click Add GPU .
  2. In the Add GPU dialog box, click Add .

    The License field is grayed out because you cannot select a different license when you add a vGPU for the same VM.

    The VGPU Profile is also auto-selected because you can only select the additional vGPU of the same vGPU type as indicated by the message at the top of the dialog box.

    Figure. Add GPU for multiple vGPUs Click to enlarge Adding multiple vGPU

  3. In the main dialog box, you see the newly added vGPU.
    Figure. New vGPUs Added Click to enlarge Multiple vGPUs Added

  4. Repeat the steps for each vGPU addition you want to make.

Live Migration of VMs with Virtual GPUs

You can perform live migration of VMs enabled with virtual GPUs (vGPU-enabled VMs). The primary advantage of live migration support is that unproductive downtime is avoided: your vGPU workloads continue to run while the VMs running them are seamlessly migrated in the background. Stun times are very low, so as a graphics user you barely notice the migration.

Note: Live migration of VMs with vGPUs is supported for vGPUs created with minimum NVIDIA Virtual GPU software version 10.1 (440.53).
Table 1. Minimum Versions
Component | Supports                               | With Minimum Version
AOS       | Live migration within the same cluster | 5.18.1
AHV       | Live migration within the same cluster | 20190916.294
AOS       | Live migration across clusters         | 6.1
AHV       | Live migration across clusters         | 20201105.30142
Important: In an HA event involving any GPU node, the node locality of the affected vGPU VMs is not restored after GPU node recovery. The affected vGPU VMs are not migrated back to their original GPU host intentionally to avoid extended VM stun time expected while migrating vGPU frame buffer. If vGPU VM node locality is required, migrate the affected vGPU VMs to the desired host manually. For information about the steps to migrate a live VM with vGPUs, see Migrating Live a VM with Virtual GPUs in the Prism Central Guide and the Prism Web Console Guide .
Note:

Important frame buffer and VM stun time considerations are:

  • The GPU board (for example, NVIDIA Tesla M60) vendor provides the information for maximum frame buffer size of vGPU types (for example, M60-8Q type) that can be configured on VMs. However, the actual frame buffer usage may be lower than the maximum sizes.

  • The VM stun time depends on the number of vGPUs configured on the VM being migrated. Stun time may be longer in case of multiple vGPUs operating on the VM.

    The stun time also depends on network factors, such as the bandwidth available for use during the migration.

For information about the limitations applicable to the live migration support, see Limitations of Live Migration Support and Restrictions for Multiple vGPU Support.

For information about the steps to migrate live a VM with vGPUs, see Migrating Live a VM with Virtual GPUs in the Prism Central Guide and Migrating Live a VM with Virtual GPUs in the Prism Web Console Guide .

Limitations of Live Migration Support
  • Live migration is supported for VMs configured with single or multiple virtual GPUs. It is not supported for VMs configured with passthrough GPUs.

  • The target cluster for the migration must have adequate and available GPU resources, with the same vGPU types as configured for the VMs to be migrated, to support the vGPUs on the VMs that need to be migrated.

    See Restrictions for Multiple vGPU Support for more details.

  • The VMs with vGPUs that need to be migrated live cannot be protected with high availability.
  • Ensure that the VM is powered on; live migration applies only to running VMs.
  • Ensure that you have the right GPU software license that supports live migration of vGPUs. The source and target clusters must have the same license type, and you require an appropriate NVIDIA GRID software license. See Live Migration of VMs with Virtual GPUs for minimum license requirements.

Enabling or Disabling Console Support for vGPU VMs

About this task

Enable or disable console support for a VM with only one vGPU configured. Enabling console support for a VM with multiple vGPUs is not supported. By default, console support for a vGPU VM is disabled.

To enable or disable console support for each VM with vGPUs, do the following:

Procedure

  1. Run the following aCLI command to check if console support is enabled or disabled for the VM with vGPUs.
    acli> vm.get vm-name

    Where vm-name is the name of the VM for which you want to check the console support status.

    The step result includes the following parameter for the specified VM:
    gpu_console=False

    Where False indicates that console support is not enabled for the VM. This parameter is displayed as True when you enable console support for the VM. The default value for gpu_console= is False since console support is disabled by default.

    Note: The vm.get command output may not display the gpu_console parameter if console support was never previously enabled.
  2. Run the following aCLI command to enable or disable console support for the VM with vGPU:
    acli> vm.update vm-name gpu_console=true | false

    Where:

    • true —indicates that you are enabling console support for the VM with vGPU.
    • false —indicates that you are disabling console support for the VM with vGPU.
  3. Run the vm.get command to verify that the gpu_console value is true (console support enabled) or false (console support disabled), as you configured it.

    If the value indicated in the vm.get command output is not what you expect, perform a guest shutdown of the VM with vGPU, run the vm.on vm-name aCLI command to turn the VM on again, and then run the vm.get command to recheck the gpu_console value.

  4. Click a VM name in the VM table view to open the VM details page. Click Launch Console .
    The Console opens but only a black screen is displayed.
  5. Click the console screen. Press one of the following key combinations based on the operating system from which you are accessing the cluster.
    • For Apple Mac OS: Control+Command+2
    • For MS Windows: Ctrl+Alt+2
    The console is fully enabled and displays the content.
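The gpu_console check in steps 1 and 3 can be scripted when auditing many VMs; a minimal sketch, assuming the acli output formats shown above (the function name and parsing logic are illustrative, not part of aCLI):

```python
import re

def gpu_console_enabled(vm_get_output: str):
    """Return True or False if gpu_console appears in acli vm.get output,
    or None when the parameter is absent (the output may omit it when
    console support was never enabled)."""
    match = re.search(r"gpu_console[=:]\s*(true|false)", vm_get_output,
                      re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "true"
```

A None result should be treated the same as False, per the note in step 1.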

PXE Configuration for AHV VMs

You can configure a VM to boot over the network in a Preboot eXecution Environment (PXE). Booting over the network is called PXE booting and does not require the use of installation media. When starting up, a PXE-enabled VM communicates with a DHCP server to obtain information about the boot file it requires.

Configuring PXE boot for an AHV VM involves performing the following steps:

  • Configuring the VM to boot over the network.
  • Configuring the PXE environment.

The procedure for configuring a VM to boot over the network is the same for managed and unmanaged networks. The procedure for configuring the PXE environment differs for the two network types, as follows:

  • An unmanaged network does not perform IPAM functions and gives VMs direct access to an external Ethernet network. Therefore, the procedure for configuring the PXE environment for AHV VMs is the same as for a physical machine or a VM that is running on any other hypervisor. VMs obtain boot file information from the DHCP or PXE server on the external network.
  • A managed network intercepts DHCP requests from AHV VMs and performs IP address management (IPAM) functions for the VMs. Therefore, you must add a TFTP server and the required boot file information to the configuration of the managed network. VMs obtain boot file information from this configuration.

A VM that is configured to use PXE boot boots over the network on subsequent restarts until the boot order of the VM is changed.

Configuring the PXE Environment for AHV VMs

The procedure for configuring the PXE environment for a VM on an unmanaged network is similar to the procedure for configuring a PXE environment for a physical machine on the external network and is beyond the scope of this document. This procedure configures a PXE environment for a VM in a managed network on an AHV host.

About this task

To configure a PXE environment for a VM on a managed network on an AHV host, do the following:

Procedure

  1. Log on to the Prism web console, click the gear icon, and then click Network Configuration in the menu.
  2. On Network Configuration > Subnets tab, click the Edit action link of the network for which you want to configure a PXE environment.
    The VMs that require the PXE boot information must be on this network.
  3. In the Update Subnet dialog box:
    1. Select the Enable IP address management check box and complete the following configurations:
      • In the Network IP Prefix field, enter the network IP address, with prefix, of the subnet that you are updating.
      • In the Gateway IP Address field, enter the gateway IP address of the subnet that you are updating.
      • To provide DHCP settings for the VM, select the DHCP Settings check box and provide the following information.
Fields Description and Values
Domain Name Servers

Provide a comma-separated list of DNS IP addresses.

Example: 8.8.8.8,9.9.9.9

Domain Search

Enter the VLAN domain name. Use only the domain name format.

Example: nutanix.com

TFTP Server Name

Enter a valid host name of the TFTP server that hosts the boot file. The TFTP server must be reachable by the virtual machines so that they can download the boot file.

Example: tftp_vlan103

Boot File Name

The name of the boot file that the VMs need to download from the TFTP host server.

Example: boot_ahv202010

  1. Under IP Address Pools , click Create Pool to add IP address pools for the subnet.

    (Mandatory for Overlay type subnets and optional for VLAN type subnets) This section provides the Network IP Prefix and Gateway IP fields for configuring the IP address details of the subnet.

  2. (Optional and for VLAN networks only) Select the Override DHCP Server check box and enter an IP address in the DHCP Server IP Address field.

    You can configure a DHCP server using the Override DHCP Server option only in case of VLAN networks.

    The DHCP Server IP address (reserved IP address for the Acropolis DHCP server) is visible only to VMs on this network and responds only to DHCP requests. If this box is not checked, the DHCP Server IP Address field is not displayed and the DHCP server IP address is generated automatically. The automatically generated address is network_IP_address_subnet.254 , or if the default gateway is using that address, network_IP_address_subnet.253 .

    Usually, the default DHCP server IP is the last usable IP in the subnet (for example, 10.0.0.254 for the 10.0.0.0/24 subnet). If you want to use a different IP address in the subnet as the DHCP server IP, use the override option.

  3. Click Close .
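The address-generation rule described above (subnet.254, falling back to subnet.253 when the gateway occupies the last address) can be illustrated with Python's ipaddress module; a toy sketch of the rule, not how AOS computes it internally:

```python
import ipaddress

def default_dhcp_server_ip(network_cidr: str, gateway_ip: str) -> str:
    """Illustrate the auto-generation rule: the DHCP server address is the
    last usable host of the subnet (e.g. 10.0.0.254 in 10.0.0.0/24), or the
    second-to-last if the default gateway already uses that address."""
    net = ipaddress.ip_network(network_cidr, strict=True)
    hosts = list(net.hosts())
    candidate = hosts[-1]          # last usable IP in the subnet
    if str(candidate) == gateway_ip:
        candidate = hosts[-2]      # gateway holds .254, fall back to .253
    return str(candidate)
```

For example, `default_dhcp_server_ip("10.0.0.0/24", "10.0.0.1")` yields the .254 address, matching the behavior described above.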

Configuring a VM to Boot over a Network

To enable a VM to boot over the network, update the VM's boot device setting. Currently, the only user interface that enables you to perform this task is the Acropolis CLI (aCLI).

About this task

To configure a VM to boot from the network, do the following:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Create a VM.
    
    nutanix@cvm$ acli vm.create vm num_vcpus=num_vcpus memory=memory

    Replace vm with a name for the VM, and replace num_vcpus and memory with the number of vCPUs and amount of memory that you want to assign to the VM, respectively.

    For example, create a VM named nw-boot-vm.

    nutanix@cvm$ acli vm.create nw-boot-vm num_vcpus=1 memory=512
  3. Create a virtual interface for the VM and place it on a network.
    nutanix@cvm$ acli vm.nic_create vm network=network

    Replace vm with the name of the VM and replace network with the name of the network. If the network is an unmanaged network, make sure that a DHCP server and the boot file that the VM requires are available on the network. If the network is a managed network, configure the DHCP server to provide TFTP server and boot file information to the VM. See Configuring the PXE Environment for AHV VMs.

    For example, create a virtual interface for VM nw-boot-vm and place it on a network named network1.

    nutanix@cvm$ acli vm.nic_create nw-boot-vm network=network1
  4. Obtain the MAC address of the virtual interface.
    nutanix@cvm$ acli vm.nic_list vm

    Replace vm with the name of the VM.

    For example, obtain the MAC address of VM nw-boot-vm.

    nutanix@cvm$ acli vm.nic_list nw-boot-vm
    00-00-5E-00-53-FF
  5. Update the boot device setting so that the VM boots over the network.
    nutanix@cvm$ acli vm.update_boot_device vm mac_addr=mac_addr

    Replace vm with the name of the VM and mac_addr with the MAC address of the virtual interface that the VM must use to boot over the network.

    For example, update the boot device setting of the VM named nw-boot-vm so that the VM uses the virtual interface with MAC address 00-00-5E-00-53-FF.

    nutanix@cvm$ acli vm.update_boot_device nw-boot-vm mac_addr=00-00-5E-00-53-FF
  6. Power on the VM.
    nutanix@cvm$ acli vm.on vm_list [host="host"]

    Replace vm_list with the name of the VM. Replace host with the name of the host on which you want to start the VM.

    For example, start the VM named nw-boot-vm on a host named host-1.

    nutanix@cvm$ acli vm.on nw-boot-vm host="host-1"
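Before passing a MAC address to vm.update_boot_device in step 5, you may want to validate its format; a minimal sketch, assuming the hyphen-separated format shown in the example output (accepting colon-separated form as well is an assumption):

```python
import re

# The example output in this guide shows hyphen-separated MACs
# (00-00-5E-00-53-FF); colons are also accepted here as an assumption.
MAC_RE = re.compile(r"^([0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2}$")

def is_valid_mac(mac: str) -> bool:
    """Check that a string looks like a MAC address before using it
    with vm.update_boot_device."""
    return MAC_RE.match(mac) is not None
```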

Uploading Files to DSF for Microsoft Windows Users

If you are a Microsoft Windows user, you can securely upload files to DSF by using the following procedure.

Procedure

  1. Use WinSCP, with SFTP selected, to connect to a Controller VM through port 2222 and start browsing the DSF datastore.
    Note: The root directory displays storage containers and you cannot change it. You can only upload files to one of the storage containers and not directly to the root directory. To create or delete storage containers, you can use the Prism user interface.
  2. Authenticate by using your Prism user name and password or, for advanced users, the public key that is managed through the Prism cluster lockdown user interface.

Enabling Load Balancing of vDisks in a Volume Group

AHV hosts support load balancing of vDisks in a volume group for guest VMs. Load balancing of vDisks in a volume group enables IO-intensive VMs to use the storage capabilities of multiple Controller VMs (CVMs).

About this task

If you enable load balancing on a volume group, the guest VM communicates directly with each CVM hosting a vDisk. Each vDisk is served by a single CVM. Therefore, to use the storage capabilities of multiple CVMs, create more than one vDisk for a file system and use the OS-level striped volumes to spread the workload. This configuration improves performance and prevents storage bottlenecks.

Note:
  • vDisk load balancing is disabled by default for volume groups that are directly attached to VMs.

    However, vDisk load balancing is enabled by default for volume groups that are attached to VMs by using a data services IP address.

  • If you use the web console to clone a volume group that has load balancing enabled, the volume group clone does not have load balancing enabled by default. To enable load balancing on the volume group clone, set the load_balance_vm_attachments parameter to true by using aCLI or the REST API.
  • You can attach a maximum of 10 load-balanced volume groups per guest VM.
  • For Linux VMs, ensure that the SCSI device timeout is 60 seconds. For information about how to check and modify the SCSI device timeout, see the Red Hat documentation at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/task_controlling-scsi-command-timer-onlining-devices.
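The SCSI device timeout check for Linux guest VMs can be sketched against the standard /sys/block layout; this runs inside the guest, not on the CVM. The sysfs parameter exists only so the sketch can be exercised against a test directory; on a real system, use the default /sys with root privileges:

```python
from pathlib import Path

def get_scsi_timeout(device: str, sysfs: str = "/sys") -> int:
    """Read the SCSI command timer (in seconds) for a block device,
    e.g. 'sda', from /sys/block/<dev>/device/timeout."""
    return int(Path(sysfs, "block", device, "device", "timeout").read_text())

def set_scsi_timeout(device: str, seconds: int, sysfs: str = "/sys") -> None:
    """Set the SCSI command timer; requires root on a real system.
    60 seconds is the value recommended above for Linux VMs."""
    Path(sysfs, "block", device, "device", "timeout").write_text(str(seconds))
```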

Perform the following procedure to enable load balancing of vDisks by using aCLI.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Do one of the following:
    • Enable vDisk load balancing if you are creating a volume group.
      nutanix@cvm$ acli vg.create vg_name load_balance_vm_attachments=true

      Replace vg_name with the name of the volume group.

    • Enable vDisk load balancing if you are updating an existing volume group.
      nutanix@cvm$ acli vg.update vg_name load_balance_vm_attachments=true

      Replace vg_name with the name of the volume group.

      Note: To modify an existing volume group, you must first detach all the VMs that are attached to that volume group before you enable vDisk load balancing.
  3. Verify if vDisk load balancing is enabled.
    nutanix@cvm$ acli vg.get vg_name

    An output similar to the following is displayed:

    nutanix@cvm$ acli vg.get ERA_DB_VG_xxxxxxxx
    ERA_DB_VG_xxxxxxxx {
      attachment_list {
        vm_uuid: "xxxxx"
    .
    .
    .
    .
    iscsi_target_name: "xxxxx"
    load_balance_vm_attachments: True
    logical_timestamp: 4
    name: "ERA_DB_VG_xxxxxxxx"
    shared: True
    uuid: "xxxxxx"
    }

    If vDisk load balancing is enabled on a volume group, load_balance_vm_attachments: True is displayed in the output. The output does not display the load_balance_vm_attachments: parameter at all if vDisk load balancing is disabled.

  4. (Optional) Disable vDisk load balancing.
    nutanix@cvm$ acli vg.update vg_name load_balance_vm_attachments=false

    Replace vg_name with the name of the volume group.
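The verification in step 3 can be scripted; a minimal sketch that applies the rule noted above, namely that vg.get omits the load_balance_vm_attachments parameter entirely when vDisk load balancing is disabled (the parsing is an assumption based on the sample output shown):

```python
def vdisk_load_balancing_enabled(vg_get_output: str) -> bool:
    """Return True only when acli vg.get output explicitly shows
    load_balance_vm_attachments: True; absence of the parameter
    means load balancing is disabled."""
    for line in vg_get_output.splitlines():
        line = line.strip()
        if line.startswith("load_balance_vm_attachments:"):
            return line.split(":", 1)[1].strip() == "True"
    return False
```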

Viewing list of restarted VMs after an HA event

This section describes how to view the list of VMs that are restarted after an HA event in the AHV cluster.

About this task

If an AHV host becomes inaccessible or fails due to an unplanned event, AOS restarts the VMs across the remaining hosts in the cluster.

To view the list of restarted VMs after an HA event:

Procedure

  1. Log in to Prism Central or Prism web console.
  2. View the list of restarted VMs on either of the following pages:
    • Events page:
      1. Navigate to Activity > Events from the entities menu to access the Events page in Prism Central .

        Navigate to Alerts > Events from the main menu to access the Events page in the Prism web console .

      2. Locate or search for the following string, and hover over or click the string:

        VMs restarted due to HA failover

        The system displays the list of restarted VMs in the Summary page and as a hover text for the selected event.

        For example:

        VMs restarted due to HA failover: <VM_Name1>, <VM_Name2>, <VM_Name3>, <VM_Name4>. VMs were running on host X.X.X.1 prior to HA.

      Here, <VM_Name1>, <VM_Name2>, <VM_Name3>, and <VM_Name4> represent the actual VM names in your cluster.

    • Tasks page:
      1. Navigate to Activity > Tasks from the entities menu to access the Tasks page in Prism Central .

        Navigate to Tasks from the main menu to access the Tasks page in the Prism web console .

      2. Locate or search for the following task, and click Details :

        HA failover

        The system displays a list of related tasks for the HA failover event.

      3. Locate or search for the following related task, and click Details :

        Host restart all VMs

        The system displays Restart VM group task for the HA failover event.

      4. In the Entity Affected column, click Details , or hover over the VMs text for Restart VM group task:

      The system displays the list of restarted VMs:

      Figure. List of restarted VMs
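The event string shown in the Events page example can be parsed to recover the restarted VM names and the failed host; a minimal sketch, assuming the exact message format from the example (VM names containing ". " would break this simple pattern):

```python
import re

def parse_ha_restarted_vms(event: str):
    """Extract restarted VM names and the failed host from the HA event
    message format shown above. Returns (vm_names, host) or None if the
    message does not match."""
    m = re.match(
        r"VMs restarted due to HA failover: (?P<vms>.+?)\. "
        r"VMs were running on host (?P<host>\S+) prior to HA\.",
        event,
    )
    if m is None:
        return None
    vms = [name.strip() for name in m.group("vms").split(",")]
    return vms, m.group("host")
```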

Live vDisk Migration Across Storage Containers

vDisk migration allows you to change the container of a vDisk. You can migrate vDisks across storage containers while they are attached to guest VMs without the need to shut down or delete VMs (live migration). You can either migrate all vDisks attached to a VM or migrate specific vDisks to another container.

In a Nutanix solution, you group vDisks into storage containers and attach vDisks to guest VMs. AOS applies storage policies such as replication factor, encryption, compression, deduplication, and erasure coding at the storage container level. If you apply a storage policy to a storage container, AOS enables that policy on all the vDisks of the container. If you want to change the policies of the vDisks (for example, from RF2 to RF3), create another container with a different policy and move the vDisks to that container. With live migration, you can migrate vDisks across containers even if those vDisks are attached to a live VM. Thus, live migration of vDisks across storage containers enables you to efficiently manage storage policies for guest VMs.

General Considerations

You cannot migrate images or volume groups.

You cannot perform the following operations during an ongoing vDisk migration:

  • Clone the VM
  • Resize the VM
  • Take a snapshot
Note: During vDisk migration, the logical usage of a vDisk can exceed the total capacity of the vDisk because the reported usage includes the space occupied in both the source and destination containers. After the migration is complete, the logical usage of the vDisk returns to its normal value.

Migration of vDisks stalls if sufficient storage space is not available in the target storage container. Ensure that the target container has sufficient storage space before you begin migration.
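The storage-space requirement above can be illustrated with a toy pre-check; the 5% headroom is an arbitrary safety margin for illustration, not a Nutanix-documented value:

```python
def can_migrate(vdisk_sizes_bytes, target_free_bytes, headroom=0.05):
    """Toy pre-check: the target container needs room for all vDisks
    being migrated, plus a small safety margin (5% here, an arbitrary
    illustrative choice)."""
    required = sum(vdisk_sizes_bytes) * (1 + headroom)
    return target_free_bytes >= required
```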

Disaster Recovery Considerations

Consider the following points if you have a disaster recovery and backup setup:

  • You cannot migrate vDisks of a VM that is protected by a protection domain or protection policy. When you start the migration, ensure that the VM is not protected by a protection domain or protection policy. If you want to migrate vDisks of such a VM, do the following:
    • Remove the VM from the protection domain or protection policy.
    • Migrate the vDisks to the target container.
    • Add the VM back to the protection domain or protection policy.
    • Configure the remote site with the details of the new container.

    vDisk migration fails if the VM is protected by a protection domain or protection policy.

  • If you are using a third-party backup solution, AOS temporarily blocks snapshot operations for a VM if vDisk migration is in progress for that VM.

Migrating a vDisk to Another Container

You can either migrate all vDisks attached to a VM or migrate specific vDisks to another container.

About this task

Perform the following procedure to migrate vDisks across storage containers:

Procedure

  1. Log on to a CVM in the cluster with SSH.
  2. Do one of the following:
    • Migrate all vDisks of a VM to the target storage container.
      nutanix@cvm$ acli vm.update_container vm-name container=target-container wait=false

      Replace vm-name with the name of the VM whose vDisks you want to migrate and target-container with the name of the target container.

    • Migrate specific vDisks by using either the UUID of the vDisk or address of the vDisk.

      Migrate specific vDisks by using the UUID of the vDisk.

      nutanix@cvm$ acli vm.update_container vm-name device_uuid_list=device_uuid container=target-container wait=false

      Replace vm-name with the name of VM, device_uuid with the device UUID of the vDisk, and target-container with the name of the target storage container.

      Run the acli vm.get vm-name command to determine the device UUID of the vDisk.

      You can migrate multiple vDisks at a time by specifying a comma-separated list of device UUIDs of the vDisks.

      Alternatively, you can migrate vDisks by using the address of the vDisk.

      nutanix@cvm$ acli vm.update_container vm-name disk_addr_list=disk-address container=target-container wait=false

      Replace vm-name with the name of VM, disk-address with the address of the disk, and target-container with the name of the target storage container.

      Run the acli vm.get vm-name command to determine the address of the vDisk.

      Following is the format of the vDisk address:

      bus.index

      Following is a section of the output of the acli vm.get vm-name command:

      disk_list {
           addr {
             bus: "scsi"
             index: 0
           }

      Combine the values of bus and index as shown in the following example:

      nutanix@cvm$ acli vm.update_container TestUVM_1 disk_addr_list=scsi.0 container=test-container-17475

      You can migrate multiple vDisks at a time by specifying a comma-separated list of vDisk addresses.

  3. Check the status of the migration in the Tasks menu of the Prism Element web console.
  4. (Optional) Cancel the migration if you no longer want to proceed with it.
    nutanix@cvm$ ecli task.cancel task_list=task-ID

    Replace task-ID with the ID of the migration task.

    Determine the task ID as follows:

    nutanix@cvm$ ecli task.list

    In the Type column of the tasks list, look for VmChangeDiskContainer .

    VmChangeDiskContainer indicates that it is a vDisk migration task. Note the ID of such a task.

    Note: Consider the following points about canceling a migration:
    • If you cancel an ongoing migration, AOS retains the vDisks that have not yet been migrated in the source container. AOS does not migrate vDisks that have already been migrated to the target container back to the source container.
    • If sufficient storage space is not available in the original storage container, migration of vDisks back to the original container stalls. To resolve the issue, ensure that the source container has sufficient storage space.
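The bus.index addresses used in disk_addr_list can be derived from the vm.get output programmatically; a minimal sketch based on the output fragment shown in step 2 (the parsing is an assumption about the output format):

```python
import re

def disk_addresses(vm_get_output: str):
    """Build bus.index addresses for disk_addr_list from the addr blocks
    in acli vm.get output, e.g. bus "scsi" + index 0 -> "scsi.0"."""
    return [
        f"{bus}.{index}"
        for bus, index in re.findall(
            r'bus:\s*"(\w+)"\s*index:\s*(\d+)', vm_get_output)
    ]
```

Joining the result with commas produces the comma-separated list accepted by vm.update_container.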

OVAs

An Open Virtual Appliance (OVA) file is a tar archive file created by converting a virtual machine (VM) into an Open Virtualization Format (OVF) package for easy distribution and deployment. OVA helps you to quickly create, move or deploy VMs on different hypervisors.

Prism Central helps you perform the following operations with OVAs:

  • Export an AHV VM as an OVA file.

  • Upload OVAs of VMs or virtual appliances (vApps). You can import (upload) an OVA file with the QCOW2 or VMDK disk formats from a URL or the local machine.

  • Deploy an OVA file as a VM.

  • Download an OVA file to your local machine.

  • Rename an OVA file.

  • Delete an OVA file.

  • Track or monitor the tasks associated with OVA operations in Tasks .

The access to OVA operations is based on your role. See Role Details View in the Prism Central Guide to check if your role allows you to perform the OVA operations.

For information about:

  • Restrictions applicable to OVA operations, see OVA Restrictions.

  • The OVAs dashboard, see OVAs View in the Prism Central Guide .

  • Exporting a VM as an OVA, see Exporting a VM as an OVA in the Prism Central Guide .

  • Other OVA operations, see OVA Management in the Prism Central Guide .

OVA Restrictions

You can perform the OVA operations subject to the following restrictions:

  • Export to or upload OVAs with one of the following disk formats:
    • QCOW2: Default disk format auto-selected in the Export as OVA dialog box.
    • VMDK: Deselect QCOW2 and select VMDK, if required, before you submit the VM export request when you export a VM.
    • When you export a VM or upload an OVA and the VM or OVA does not have any disks, the disk format is irrelevant.
  • Upload an OVA to multiple clusters using a URL as the source for the OVA. You can upload an OVA only to a single cluster when you use the local OVA File source.
  • Perform the OVA operations only with appropriate permissions. You can run the OVA operations that you have permissions for, based on your assigned user role.
  • The OVA that results from exporting a VM on AHV is compatible with any AHV version 5.18 or later.
  • The minimum supported versions for performing OVA operations are AOS 5.18, Prism Central 2020.8, and AHV-20190916.253.

AHV Administration Guide

AHV 6.5

Product Release Date: 2022-07-25

Last updated: 2022-12-15

AHV Overview

As the default option for Nutanix HCI, the native Nutanix hypervisor, AHV, represents a unique approach to virtualization that offers the powerful virtualization capabilities needed to deploy and manage enterprise applications. AHV complements the HCI value by integrating native virtualization along with networking, infrastructure, and operations management with a single intuitive interface - Nutanix Prism.

Virtualization teams find AHV easy to learn and transition to from legacy virtualization solutions with familiar workflows for VM operations, live migration, VM high availability, and virtual network management. AHV includes resiliency features, including high availability and dynamic scheduling without the need for additional licensing, and security is integral to every aspect of the system from the ground up. AHV also incorporates the optional Flow Security and Networking, allowing easy access to hypervisor-based network microsegmentation and advanced software-defined networking.

See the Field Installation Guide for information about how to deploy and create a cluster. Once you create the cluster by using Foundation, you can use this guide to perform day-to-day management tasks.

AOS and AHV Compatibility

For information about the AOS and AHV compatibility with this release, see Compatibility and Interoperability Matrix.

Minimum Field Requirements for Nutanix Cloud Infrastructure (NCI)

For information about minimum field requirements for NCI, see the Minimum Field Requirements for Nutanix Cloud Infrastructure (NCI) topic in the Acropolis Advanced Administration Guide.

Limitations

For information about AHV configuration limitations, see Nutanix Configuration Maximums webpage.

Nested Virtualization

Nutanix does not support nested virtualization (nested VMs) in an AHV cluster.

Storage Overview

AHV uses a Distributed Storage Fabric to deliver data services such as storage provisioning, snapshots, clones, and data protection to VMs directly.

In AHV clusters, AOS passes all disks to the VMs as raw SCSI block devices. This keeps the I/O path lightweight and optimized. Each AHV host runs an iSCSI redirector, which establishes a highly resilient storage path from each VM to storage across the Nutanix cluster.

QEMU is configured with the iSCSI redirector as the iSCSI target portal. Upon a login request, the redirector performs an iSCSI login redirect to a healthy Stargate (preferably the local one).

Figure. AHV Storage Click to enlarge

AHV Turbo

AHV Turbo represents significant advances to the data path in AHV. AHV Turbo provides an I/O path that bypasses QEMU and services storage I/O requests, which lowers CPU usage and increases the amount of storage I/O available to VMs.

When you use QEMU, all I/O travels through a single queue that can impact system performance. AHV Turbo provides an I/O path that uses a multi-queue approach to bypass QEMU, allowing data to flow from a VM to storage more efficiently. This results in much higher I/O capacity and lower CPU usage. The storage queues automatically scale out to match the number of vCPUs configured for a given VM, resulting in higher performance as the workload scales up.

AHV Turbo is transparent to VMs and is enabled by default on VMs that run in AHV clusters. For maximum VM performance, ensure that the following conditions are met:

  • The latest Nutanix VirtIO package is installed for Windows VMs. For information on how to download and install the latest VirtIO package, see Installing or Upgrading Nutanix VirtIO for Windows.
    Note: No additional configuration is required at this stage.
  • The VM has more than one vCPU.
  • The workloads are multi-threaded.
Note: Multi-queue is enabled by default in current Linux distributions. For details, refer to the documentation for your Linux distribution.
In addition to the multi-queue approach for storage I/O, you can achieve maximum network I/O performance by using the multi-queue approach for any vNICs in the system. For information about how to enable multi-queue and set an optimum number of queues, see Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues.
Note: Ensure that the guest operating system fully supports multi-queue before you enable it. For details, refer to the documentation for your Linux distribution.

Acropolis Dynamic Scheduling in AHV

Acropolis Dynamic Scheduling (ADS) proactively monitors your cluster for any compute and storage I/O contentions or hotspots over a period of time. If ADS detects a problem, ADS creates a migration plan that eliminates hotspots in the cluster by migrating VMs from one host to another.

You can monitor VM migration tasks from the Task dashboard of the Prism Element web console.

Following are the advantages of ADS:

  • ADS improves the initial placement of the VMs depending on the VM configuration.
  • Nutanix Volumes uses ADS for balancing sessions of the externally available iSCSI targets.
Note: ADS honors all the configured host affinities, VM-host affinities, VM-VM antiaffinity policies, and HA policies.

By default, ADS is enabled and Nutanix recommends you keep this feature enabled. However, see Disabling Acropolis Dynamic Scheduling for information about how to disable the ADS feature. See Enabling Acropolis Dynamic Scheduling for information about how to enable the ADS feature if you previously disabled the feature.

ADS monitors the following resources:

  • VM CPU Utilization: Total CPU usage of each guest VM.
  • Storage CPU Utilization: Storage controller (Stargate) CPU usage per VM or iSCSI target

ADS does not monitor memory and networking usage.

How Acropolis Dynamic Scheduling Works

Lazan is the ADS service in an AHV cluster. AOS selects a Lazan manager and Lazan solver among the hosts in the cluster to effectively manage ADS operations.

ADS performs the following tasks to resolve compute and storage I/O contentions or hotspots:

  • The Lazan manager gathers statistics from the components it monitors.
  • The Lazan solver (runner) checks the statistics for potential anomalies and determines how to resolve them, if possible.
  • The Lazan manager invokes the tasks (for example, VM migrations) to resolve the situation.
Note:
  • During migration, a VM consumes resources on both the source and destination hosts as the High Availability (HA) reservation algorithm must protect the VM on both hosts. If a migration fails due to lack of free resources, turn off some VMs so that migration is possible.
  • If a problem is detected and ADS cannot solve the issue (for example, because of limited CPU or storage resources), the migration plan might fail. In these cases, an alert is generated. Monitor these alerts from the Alerts dashboard of the Prism Element web console and take necessary remedial actions.
  • If the host, firmware, or AOS upgrade is in progress and if any resource contention occurs during the upgrade period, ADS does not perform any resource contention rebalancing.

When Is a Hotspot Detected?

Lazan runs every 15 minutes and analyzes the resource usage for at least that period of time. If the resource utilization of an AHV host remains >85% for the span of 15 minutes, Lazan triggers migration tasks to remove the hotspot.

Note: For a storage hotspot, ADS looks at the last 40 minutes of data and uses a smoothing algorithm to use the most recent data. For a CPU hotspot, ADS looks at the last 10 minutes of data only, that is, the average CPU usage over the last 10 minutes.

Following are the possible reasons if there is an obvious hotspot, but the VMs did not migrate:

  • Lazan cannot resolve a hotspot. For example:
    • A large VM (16 vCPUs) runs at 100% usage and accounts for 75% of the usage of its AHV host (which is also at 100% usage).
    • The other hosts are loaded at approximately 40% usage.

    In this situation, no other host can accommodate the large VM without causing contention there as well. Lazan does not prioritize one host or VM over others for contention, so it leaves the VM where it is hosted.

  • The number of all-flash nodes in the cluster is less than the replication factor.

    If the cluster has an RF2 configuration, the cluster must have a minimum of two all-flash nodes for successful migration of VMs on all the all-flash nodes.

Migrations Audit

Prism Central displays the list of all the VM migration operations generated by ADS. In Prism Central, go to Menu -> Activity -> Audits to display the VM migrations list. You can filter the migrations by clicking Filters and selecting Migrate in the Operation Type tab. The list displays all the VM migration tasks created by ADS with details such as the source and target host, VM name, and time of migration.

Disabling Acropolis Dynamic Scheduling

Perform the procedure described in this topic to disable ADS. Nutanix recommends that you keep ADS enabled.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Disable ADS.
    nutanix@cvm$ acli ads.update enable=false

    After you disable the ADS feature, ADS takes no action to resolve contentions. You must take remedial actions manually, or re-enable the feature.

Enabling Acropolis Dynamic Scheduling

If you have disabled the ADS feature and want to enable the feature, perform the following procedure.

Procedure

  1. Log on to a Controller VM in your cluster with SSH.
  2. Enable ADS.
    nutanix@cvm$ acli ads.update enable=true

Virtualization Management Web Console Interface

You can manage the virtualization management features by using the Prism GUI (Prism Element and Prism Central web consoles).

You can do the following by using the Prism web consoles:

  • Configure network connections
  • Create virtual machines
  • Manage virtual machines (launch console, start/shut down, take snapshots, migrate, clone, update, and delete)
  • Monitor virtual machines
  • Enable VM high availability

See Prism Web Console Guide and Prism Central Guide for more information.

Finding the AHV Version on Prism Element

You can see the installed AHV version in the Prism Element web console.

About this task

To view the AHV version installed on the host, do the following.

Procedure

  1. Log on to the Prism Element web console.
  2. The Hypervisor Summary widget on the top left side of the Home page displays the AHV version.
    Figure. Hypervisor Summary Widget Showing the AHV Version Installed

Finding the AHV Version on Prism Central

You can see the installed AHV version in the Prism Central console.

About this task

To view the AHV version installed on any host in the clusters managed by Prism Central, do the following.

Procedure

  1. Log on to Prism Central.
  2. In the sidebar, select Hardware > Hosts > Summary tab.
  3. Click the host for which you want to see the hypervisor version.
  4. The Host detail view page displays the Properties widget, which lists the Hypervisor Version.
    Figure. Hypervisor Version in Host Detail View

Node Management

Nonconfigurable AHV Components

The components listed here are configured by the Nutanix manufacturing and installation processes. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts ( admin or nutanix ), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the usage of third-party storage on hosts that are part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

AHV Settings

Nutanix AHV is a cluster-optimized hypervisor appliance.

Alteration of the hypervisor appliance (unless advised by Nutanix Technical Support) is unsupported and may result in the hypervisor or VMs functioning incorrectly.

Unsupported alterations include (but are not limited to):

  • Hypervisor configuration, including installed packages
  • Controller VM virtual hardware configuration file (.xml file). Each AOS version and upgrade includes a specific Controller VM virtual hardware configuration. Therefore, do not edit or otherwise modify the Controller VM virtual hardware configuration file.
  • iSCSI settings
  • Open vSwitch settings

  • Installation of third-party software not approved by Nutanix
  • Installation or upgrade of software packages from non-Nutanix sources (using yum, rpm, or similar)
  • Taking snapshots of the Controller VM
  • Creating user accounts on AHV hosts
  • Changing the timezone of the AHV hosts. By default, the timezone of an AHV host is set to UTC.
  • Joining AHV hosts to Active Directory or OpenLDAP domains

Controller VM Access

Although each host in a Nutanix cluster runs a hypervisor independent of other hosts in the cluster, some operations affect the entire cluster.

Most administrative functions of a Nutanix cluster can be performed through the web console (Prism); however, some management tasks require access to the Controller VM (CVM) over SSH. Nutanix recommends restricting CVM SSH access with password or key authentication.

This topic provides information about how to access the Controller VM as an admin user and nutanix user.

admin User Access

Use the admin user access for all tasks and operations that you must perform on the Controller VM. As an admin user with default credentials, you cannot access nCLI. You must change the default password before you can use nCLI. Nutanix recommends that you do not create additional CVM user accounts. Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

For more information about admin user access, see Admin User Access to Controller VM.

nutanix User Access

Nutanix strongly recommends that you do not use the nutanix user access unless the procedure (as provided in a Nutanix Knowledge Base article or user guide) specifically requires the use of the nutanix user access.

For more information about nutanix user access, see Nutanix User Access to Controller VM.

You can perform most administrative functions of a Nutanix cluster through the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible and restricting Controller VM SSH access with password or key authentication. Some functions, however, require logging on to a Controller VM with SSH. Exercise caution whenever connecting directly to a Controller VM as it increases the risk of causing cluster issues.

Warning: When you connect to a Controller VM with SSH, ensure that the SSH client does not import or change any locale settings. The Nutanix software is not localized, and running the commands with any locale other than en_US.UTF-8 can cause severe cluster issues.

To check the locale used in an SSH session, run /usr/bin/locale . If any environment variables are set to anything other than en_US.UTF-8 , reconnect with an SSH configuration that does not import or change any locale settings.
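As a pre-flight sketch, you can filter the locale output for any variable set to something other than en_US.UTF-8. The helper name below is illustrative, not a Nutanix-provided tool.

```shell
# Sketch: print any locale variable that is not en_US.UTF-8 (or unset).
# check_locale is an illustrative name, not a Nutanix-provided tool.
check_locale() {
  # Reads `locale` output on stdin; prints offending variables, if any.
  grep -E '=' | grep -Ev '="?(en_US\.UTF-8)?"?$'
}

# Example with hypothetical session output; in a real session, pipe in
# the output of /usr/bin/locale instead.
printf 'LANG=en_US.UTF-8\nLC_ALL=de_DE.UTF-8\n' | check_locale
```

If the helper prints any variables, reconnect with an SSH configuration that does not import or change locale settings before proceeding.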

Admin User Access to Controller VM

You can access the Controller VM as the admin user ( admin user name and password) with SSH. For security reasons, the password of the admin user must meet Controller VM Password Complexity Requirements . When you log on to the Controller VM as the admin user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As an admin user, you cannot access nCLI by using the default credentials. If you are logging in as the admin user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the admin user through nCLI. To change the default password of the admin user, you must log on through the Prism web console or SSH to the Controller VM.
  • When you make an attempt to log in to the Prism web console for the first time after you upgrade to AOS 5.1 from an earlier AOS version, you can use your existing admin user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the admin user, you must use the default admin user password (Nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements .
  • You cannot delete the admin user account.
  • The default password expiration age for the admin user is 60 days. You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS admin
    • nutanix@cvm$ sudo chage -m MIN-DAYS admin

When you change the admin user password, you must update any applications and scripts using the admin user credentials for authentication. Nutanix recommends that you create a user assigned with the admin role instead of using the admin user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials
Interface            Target                  User Name    Password
SSH client           Nutanix Controller VM   admin        Nutanix/4u
                                             nutanix      nutanix/4u
Prism web console    Nutanix Controller VM   admin        Nutanix/4u

Accessing the Controller VM Using the Admin User Account

About this task

Perform the following procedure to log on to the Controller VM by using the admin user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: admin
    • Password: Nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new admin user password.
    Changing password for admin.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See the requirements listed in Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the admin user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide .

Nutanix User Access to Controller VM

You can access the Controller VM as the nutanix user ( nutanix user name and password) with SSH. For security reasons, the password of the nutanix user must meet the Controller VM Password Complexity Requirements. When you log on to the Controller VM as the nutanix user for the first time, you are prompted to change the default password.

See Controller VM Password Complexity Requirements to set a secure password.

After you have successfully changed the password, the new password is synchronized across all Controller VMs and interfaces (Prism web console, nCLI, and SSH).

Note:
  • As a nutanix user, you cannot access nCLI by using the default credentials. If you are logging in as the nutanix user for the first time, you must log on through the Prism web console or SSH to the Controller VM. Also, you cannot change the default password of the nutanix user through nCLI. To change the default password of the nutanix user, you must log on through the Prism web console or SSH to the Controller VM.

  • When you make an attempt to log in to the Prism web console for the first time after you upgrade the AOS from an earlier AOS version, you can use your existing nutanix user password to log in and then change the existing password (you are prompted) to adhere to the password complexity requirements. However, if you are logging in to the Controller VM with SSH for the first time after the upgrade as the nutanix user, you must use the default nutanix user password (nutanix/4u) and then change the default password (you are prompted) to adhere to the Controller VM Password Complexity Requirements.

  • You cannot delete the nutanix user account.
  • You can configure the minimum and maximum password expiration days based on your security requirement.
    • nutanix@cvm$ sudo chage -M MAX-DAYS nutanix
    • nutanix@cvm$ sudo chage -m MIN-DAYS nutanix

When you change the nutanix user password, you must update any applications and scripts using the nutanix user credentials for authentication. Nutanix recommends that you create a user assigned with the nutanix role instead of using the nutanix user for authentication. The Prism Web Console Guide describes authentication and roles.

Following are the default credentials to access a Controller VM.

Table 1. Controller VM Credentials
Interface            Target                  User Name    Password
SSH client           Nutanix Controller VM   admin        Nutanix/4u
                                             nutanix      nutanix/4u
Prism web console    Nutanix Controller VM   admin        Nutanix/4u

Accessing the Controller VM Using the Nutanix User Account

About this task

Perform the following procedure to log on to the Controller VM by using the nutanix user with SSH for the first time.

Procedure

  1. Log on to the Controller VM with SSH by using the management IP address of the Controller VM and the following credentials.
    • User name: nutanix
    • Password: nutanix/4u
    You are now prompted to change the default password.
  2. Respond to the prompts, providing the current and new nutanix user password.
    Changing password for nutanix.
    Old Password:
    New password:
    Retype new password:
    Password changed.
    

    See Controller VM Password Complexity Requirements to set a secure password.

    For information about logging on to a Controller VM by using the nutanix user account through the Prism web console, see Logging Into The Web Console in the Prism Web Console Guide .

Controller VM Password Complexity Requirements

The password must meet the following complexity requirements:

  • At least eight characters long.
  • At least one lowercase letter.
  • At least one uppercase letter.
  • At least one number.
  • At least one special character.
    Note: Ensure that the following conditions are met when using special characters in the CVM password:
    • Use special characters carefully. For example, if you use ! followed by a number in the CVM password, the shell can interpret it as a history expansion and replace it with a command from the bash history. In this case, you may set a password string different from the one you intended.
    • Use only ASCII printable characters as special characters in the CVM password. For information about ASCII printable characters, refer to the ASCII printable characters (character code 32-127) article on the ASCII code website.
  • At least four characters difference from the old password.
  • Must not be among the last 5 passwords.
  • Must not have more than 2 consecutive occurrences of a character.
  • Must not be longer than 199 characters.
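The character-composition rules above can be pre-checked locally before you run passwd. The sketch below covers only the length and character-class rules; the history, similarity, and consecutive-character rules require knowledge of previous passwords and are not modeled.

```shell
# Sketch: pre-check only the length and character-class rules. The
# history, similarity, and consecutive-character rules are not modeled.
meets_basic_complexity() {
  p="$1"
  [ "${#p}" -ge 8 ] || return 1                        # eight characters
  case "$p" in *[a-z]*) ;; *) return 1 ;; esac         # a lowercase letter
  case "$p" in *[A-Z]*) ;; *) return 1 ;; esac         # an uppercase letter
  case "$p" in *[0-9]*) ;; *) return 1 ;; esac         # a number
  case "$p" in *[!a-zA-Z0-9]*) ;; *) return 1 ;; esac  # a special character
}

# Hypothetical example value; never hard-code a real password:
meets_basic_complexity 'Xy7.pass' && echo "basic rules met"
```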

AHV Host Access

You can perform most of the administrative functions of a Nutanix cluster using the Prism web consoles or REST API. Nutanix recommends using these interfaces whenever possible. Some functions, however, require logging on to an AHV host with SSH.

Note: From AOS 5.15.5 with AHV 20190916.410 onwards, AHV has two new user accounts: admin and nutanix.

Nutanix provides the following users to access the AHV host:

  • root : Used internally by AOS and for the initial access and configuration of the AHV host.
  • admin : Used to log on to an AHV host. The admin user is the recommended account for accessing the AHV host.
  • nutanix : Used internally by AOS; do not use it for interactive logon.

Exercise caution whenever connecting directly to an AHV host as it increases the risk of causing cluster issues.

Following are the default credentials to access an AHV host:

Table 1. AHV Host Credentials
Interface     Target      User Name   Password
SSH client    AHV host    root        nutanix/4u
                          admin       No default password. You must set it during the initial configuration.
                          nutanix     nutanix/4u

Initial Configuration

About this task

The AHV host is shipped with default passwords for the root and nutanix users, which you must change over SSH when you log on to the AHV host for the first time. You must also set the admin user password. After changing these passwords, make all subsequent logins to the AHV host with the admin user.

Perform the following procedure to change the root, nutanix, and admin user passwords for the first time:
Note: Perform this initial configuration on all the AHV hosts.

Procedure

  1. Use SSH and log on to the AHV host using the root account.
    $ ssh root@<AHV Host IP Address>
    Nutanix AHV
    root@<AHV Host IP Address> password: # default password nutanix/4u
    
  2. Change the default root user password.
    root@ahv# passwd root
    Changing password for user root.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  3. Change the default nutanix user password.
    root@ahv# passwd nutanix
    Changing password for user nutanix.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
  4. Set the admin user password.
    root@ahv# passwd admin
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    

Accessing the AHV Host Using the Admin Account

About this task

After setting the admin password in the Initial Configuration, use the admin user for all subsequent logins.

Perform the following procedure to log on to the AHV host by using the admin user with SSH for the first time.

Procedure

  1. Log on to the AHV host with SSH using the admin account.
    $ ssh admin@<AHV Host IP Address>
    Nutanix AHV
    
  2. Enter the admin user password configured in the Initial Configuration.
    admin@<AHV Host IP Address> password:
  3. Append sudo to the commands if privileged access is required.
    $ sudo ls /var/log

Changing Admin User Password

About this task

Perform these steps to change the admin password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Enter the admin user password configured in the Initial Configuration.
  3. Run the sudo command to change the admin user password.
    $ sudo passwd admin
  4. Respond to the prompts and provide the new password.
    [sudo] password for admin: 
    Changing password for user admin.
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing the Root User Password

About this task

Perform these steps to change the root password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to the root user.
  3. Change the root password.
    root@ahv# passwd root
  4. Respond to the prompts and provide the current and new root password.
    Changing password for root.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

Changing Nutanix User Password

About this task

Perform these steps to change the nutanix password on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host using the admin account with SSH.
  2. Run the sudo command to change to the root user.
  3. Change the nutanix password.
    root@ahv# passwd nutanix
  4. Respond to the prompts and provide the current and new nutanix password.
    Changing password for nutanix.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    
    Note: Repeat this step for each AHV host.

    See AHV Host Password Complexity Requirements to set a secure password.

AHV Host Password Complexity Requirements

The password you choose must meet the following complexity requirements:

  • In configurations with high-security requirements, the password must contain:
    • At least 15 characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least eight characters different from the previous password.
    • At most three consecutive occurrences of any given character.
    • At most four consecutive occurrences of any given character class.

The password cannot be the same as the last 5 passwords.

  • In configurations without high-security requirements, the password must contain:
    • At least eight characters.
    • At least one upper case letter (A–Z).
    • At least one lower case letter (a–z).
    • At least one digit (0–9).
    • At least one printable ASCII special (non-alphanumeric) character. For example, a tilde (~), exclamation point (!), at sign (@), number sign (#), or dollar sign ($).
    • At least three characters different from the previous password.
    • At most three consecutive occurrences of any given character.

The password cannot be the same as the last 5 passwords.

In both types of configuration, if a password for an account is entered three times unsuccessfully within a 15-minute period, the account is locked for 15 minutes.

Verifying the Cluster Health

Before you perform operations such as restarting a CVM or AHV host and putting an AHV host into maintenance mode, check if the cluster can tolerate a single-node failure.

Before you begin

Ensure that you are running the most recent version of NCC.

About this task

Note: If you see any critical alerts, resolve the issues by referring to the indicated KB articles. If you are unable to resolve any issues, contact Nutanix Support.

Perform the following steps to avoid unexpected downtime or performance issues.

Procedure

  1. Review and resolve any critical alerts. Do one of the following:
    • In the Prism Element web console, go to the Alerts page.
    • Log on to a Controller VM (CVM) with SSH and display the alerts.
      nutanix@cvm$ ncli alert ls
    Note: If you receive alerts indicating expired encryption certificates or a key manager is not reachable, resolve these issues before you shut down the cluster. If you do not resolve these issues, data loss of the cluster might occur.
  2. Verify if the cluster can tolerate a single-node failure. Do one of the following:
    • In the Prism Element web console, in the Home page, check the status of the Data Resiliency Status dashboard.

      Verify that the status is OK . If the status is anything other than OK , resolve the indicated issues before you perform any maintenance activity.

    • Log on to a Controller VM (CVM) with SSH and check the fault tolerance status of the cluster.
      nutanix@cvm$ ncli cluster get-domain-fault-tolerance-status type=node
      

      An output similar to the following is displayed:

      Important:
      Domain Type               : NODE
          Component Type            : STATIC_CONFIGURATION
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:22:09 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ERASURE_CODE_STRIP_SIZE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : METADATA
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Mon Sep 28 14:35:25 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : ZOOKEEPER
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Thu Sep 17 11:09:39 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : EXTENT_GROUPS
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : OPLOG
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 13:19:58 GMT+05:00 2015
      
          Domain Type               : NODE
          Component Type            : FREE_SPACE
          Current Fault Tolerance   : 1
          Fault Tolerance Details   : 
          Last Update Time          : Wed Nov 18 14:20:57 GMT+05:00 2015
      

      The value of the Current Fault Tolerance column must be at least 1 for all the nodes in the cluster.
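This check can be scripted against the command output. The helper below is a sketch (the function name is illustrative, not a Nutanix tool); it exits non-zero if any Current Fault Tolerance value in the ncli output is below 1.

```shell
# Sketch: exit non-zero if any "Current Fault Tolerance" value in the
# output (read on stdin) is below 1. The function name is illustrative.
can_tolerate_node_failure() {
  awk -F: '/Current Fault Tolerance/ { if ($2 + 0 < 1) bad = 1 }
           END { exit bad }'
}

# Typical use on a CVM (not executed here):
#   ncli cluster get-domain-fault-tolerance-status type=node | \
#     can_tolerate_node_failure && echo "cluster can tolerate a node failure"
```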

Node Maintenance Mode

Gracefully place a node into maintenance mode (a non-operational state) when you need to change the network configuration of a node, perform manual firmware upgrades or replacements, perform CVM maintenance, or carry out any other maintenance operations.

Entering and Exiting Maintenance Mode

You can place only one node at a time in maintenance mode for each cluster. When a host is in maintenance mode, the CVM is placed in maintenance mode as part of the node maintenance operation, and any associated RF1 VMs are powered off. The cluster marks the host as unschedulable so that no new VM instances are created on it. When a node is placed in maintenance mode from the Prism web console, an attempt is made to evacuate VMs from the host. If the evacuation attempt fails, the host remains in the entering-maintenance-mode state, where it is marked unschedulable, waiting for user remediation.

When a host is placed in the maintenance mode, the non-migratable VMs (for example, pinned or RF1 VMs which have affinity towards a specific node) are powered-off while live migratable or high availability (HA) VMs are moved from the original host to other hosts in the cluster. After exiting the maintenance mode, all non-migratable guest VMs are powered on again and the live migrated VMs are automatically restored on the original host.
Note: VMs with CPU passthrough or PCI passthrough, pinned VMs (with host affinity policies), and RF1 VMs are not migrated to other hosts in the cluster when a node undergoes maintenance. Click the View these VMs link to view the list of VMs that cannot be live-migrated.

For information about how to place a node under maintenance, see Putting a Node into Maintenance Mode using Web Console.

You can also place an AHV host under maintenance mode or exit an AHV host from maintenance mode through the CLI.
Note: Using the CLI method to place an AHV host under maintenance places only the hypervisor under maintenance mode; the CVM remains up and running. To place the entire node under maintenance, Nutanix recommends using the UI method (through the web console).
  • For information about how to use the CLI method to place an AHV host in maintenance mode, see Putting a Node into Maintenance Mode using CLI .
  • For information about how to use the CLI method to exit a node from the maintenance mode, see Exiting a Node from the Maintenance Mode Using CLI .

Exiting a Node from Maintenance Mode

For information about how to remove a node from the maintenance mode, see Exiting a Node from the Maintenance Mode using Web Console.

Viewing a Node under Maintenance Mode

For information about how to view the node under maintenance mode, see Viewing a Node that is in Maintenance Mode.

UVM Status When Node under Maintenance Mode

For information about how to view the status of UVMs when a node is undergoing maintenance operations, see Guest VM Status when Node is in Maintenance Mode.

Best Practices and Recommendations

Nutanix strongly recommends using the Enter Maintenance Mode option on the Prism web console to place a node under maintenance.

Known Issues and Limitations

  • The Prism web console enabled maintenance operations (enter and exit node maintenance) are currently supported only on AHV.
  • Entering or exiting a node under maintenance from the CLI is not equivalent to doing so from the Prism Element web console. For example, placing a node under maintenance from the CLI places only the AHV host under maintenance; the CVM continues to remain powered on.
  • You must exit the node from maintenance mode using the same method that you have used to put the node into maintenance mode. For example, if you used CLI to put the node into maintenance mode, you must use CLI to exit the node from maintenance mode. Similarly, if you used web console to put the node into maintenance mode, you must use the web console to exit the node from maintenance mode.

Putting a Node into Maintenance Mode using Web Console

Before you begin

Check the cluster status and resiliency before putting a node under maintenance. You can also verify the status of the UVMs. See Guest VM Status when Node is in Maintenance Mode for more information.

About this task

As the node enters maintenance mode, the following high-level tasks are performed internally.
  • The AHV host initiates entering the maintenance mode.
  • The HA VMs are live migrated.
  • The pinned and RF1 VMs are powered-off.
  • The AHV host completes entering the maintenance mode.
    Note: At this stage, the AHV host is not shut down. For information about how to shut down the AHV host, see Shutting Down a Node in a Cluster (AHV). You can list all the hosts in the cluster by running the acli host.list command on a CVM, and note the value of Hypervisor IP for the node that you want to shut down.
  • The CVM enters the maintenance mode.
  • The CVM is shut down.

For more information, see Guest VM Status when Node is in Maintenance Mode to view the status of the UVMs.

Perform the following steps to put the node into maintenance mode.

Procedure

  1. Log on to the Prism Element web console.
  2. On the home page, select Hardware from the drop-down menu.
  3. Go to the Table > Host view.
  4. Select the node which you intend to put under maintenance.
  5. Click the Enter Maintenance Mode option.
    Figure. Enter Maintenance Mode Option Click to enlarge

    The Host Maintenance window appears with a prompt to power-off all VMs that cannot be live migrated.
    Figure. Host Maintenance Window (Enter Maintenance Mode Enabled) Click to enlarge

    Note: VMs with CPU passthrough, PCI passthrough, pinned VMs (with host affinity policies), and RF1 are not migrated to other hosts in the cluster when a node undergoes maintenance. Click View these VMs link to view the list of VMs that cannot be live-migrated.
  6. Select the Power-off VMs that cannot migrate check box to enable the Enter Maintenance Mode button.
  7. Click the Enter Maintenance Mode button.
    • A revolving icon appears as a tool tip beside the selected node and also in the Host Details view. This indicates that the host is entering the maintenance mode.
    • The revolving icon disappears and the Exit Maintenance Mode option is enabled after the node completely enters the maintenance mode.
      Figure. Enter Node Maintenance (On-going) Click to enlarge

    • You can also monitor the progress of the node maintenance operation through the newly created Host enter maintenance and Enter maintenance mode tasks which appear in the task tray.
    Note: In case of a node maintenance failure, certain operations are rolled back. For example, the CVM is rebooted. However, the live-migrated VMs are not restored to the original host.

What to do next

Once the maintenance activity is complete, you can perform any of the following.
  • View the nodes under maintenance. See Viewing a Node that is in Maintenance Mode.
  • View the status of the UVMs. See Guest VM Status when Node is in Maintenance Mode.
  • Remove the node from maintenance mode. See Exiting a Node from the Maintenance Mode using Web Console.

Viewing a Node that is in Maintenance Mode

About this task

Note: This procedure is the same for AHV and ESXi nodes.

Perform the following steps to view a node under maintenance.

Procedure

  1. Log on to the Prism Element web console.
  2. On the home page, select Hardware from the drop-down menu.
  3. Go to the Table > Host view.
  4. Observe the icon along with a tool tip that appears beside the node which is under maintenance. You can also view this icon in the host details view.
    Figure. Example: Node under Maintenance (Table and Host Details View) in AHV Click to enlarge

  5. Alternatively, view the node under maintenance from the Hardware > Diagram view.
    Figure. Example: Node under Maintenance (Diagram and Host Details View) in AHV Click to enlarge

What to do next

You can:
  • View the status of the guest VMs. See Guest VM Status when Node is in Maintenance Mode.
  • Remove the node from maintenance mode. See Exiting a Node from the Maintenance Mode using Web Console.

Exiting a Node from the Maintenance Mode using Web Console

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

As the node exits the maintenance mode, the following high-level tasks are performed internally.
  • The CVM is powered on.
  • The CVM is taken out of maintenance.
  • The host is taken out of maintenance.
    Note: If the AHV host was shut down as part of the maintenance activity, power on the AHV host before you exit the node from maintenance mode. For information about how to power on the AHV host, see Starting a Node in a Cluster (AHV).
After the host exits the maintenance mode, the RF1 VMs are powered on and the live-migrated VMs return to the host to restore locality.

To view the status of the UVMs, see Guest VM Status when Node is in Maintenance Mode.

Perform the following steps to remove the node from maintenance mode.

Procedure

  1. On the Prism web console home page, select Hardware from the drop-down menu.
  2. Go to the Table > Host view.
  3. Select the node which you intend to remove from the maintenance mode.
  4. Click the Exit Maintenance Mode option.
    Figure. Exit Maintenance Mode Option Click to enlarge

    The Host Maintenance window appears.
  5. Click the Exit Maintenance Mode button.
    Figure. Host Maintenance Window (Exit Maintenance Mode) Click to enlarge

    • A revolving icon appears as a tool tip beside the selected node and also in the Host Details view. This indicates that the host is exiting the maintenance mode.
    • The revolving icon disappears and the Enter Maintenance Mode option is enabled after the node completely exits the maintenance mode.
      Figure. Exit Node Maintenance (On-going) Click to enlarge

    • You can also monitor the progress of the exit node maintenance operation through the newly created Host exit maintenance and Exit maintenance mode tasks which appear in the task tray.

What to do next

Once a node exits the maintenance mode, you can perform any of the following.
  • View the node under maintenance. See Viewing a Node that is in Maintenance Mode.
  • View the status of the UVMs. See Guest VM Status when Node is in Maintenance Mode.

Guest VM Status when Node is in Maintenance Mode

The following scenarios demonstrate the behavior of three guest VM types - high availability (HA) VMs, pinned VMs, and RF1 VMs, when a node enters and exits a maintenance operation. The HA VMs are live VMs that can migrate across nodes if the host server goes down or reboots. The pinned VMs have the host affinity set to a specific node. The RF1 VMs have affinity towards a specific node or a CVM. To view the status of the guest VMs, go to VM > Table .

Note: The following scenarios are the same for AHV and ESXi nodes.

Scenario 1: Guest VMs before Node Entering Maintenance Mode

In this example, you can observe the status of the guest VMs on the node prior to the node entering the maintenance mode. All the guest VMs are powered-on and reside on the same host.

Figure. Example: Original State of VM and Hosts in AHV Click to enlarge

Scenario 2: Guest VMs during Node Maintenance Mode

  • As the node enters the maintenance mode, the following high-level tasks are performed internally.
    1. The host initiates entering the maintenance mode.
    2. The HA VMs are live migrated.
    3. The pinned and RF1 VMs are powered-off.
    4. The host completes entering the maintenance mode.
    5. The CVM enters the maintenance mode.
    6. The CVM is shut down.
Figure. Example: VM and Hosts during Maintenance Mode Click to enlarge

Scenario 3: Guest VMs after Node Exiting Maintenance Mode

  • As the node exits the maintenance mode, the following high-level tasks are performed internally.
    1. The CVM is powered on.
    2. The CVM is taken out of maintenance.
    3. The host is taken out of maintenance.
    After the host exits the maintenance mode, the RF1 VMs are powered on and the live-migrated VMs return to the host to restore locality.
Figure. Example: Original State of VM and Hosts in AHV Click to enlarge

Putting a Node into Maintenance Mode using CLI

You might need to put a node into maintenance mode for tasks such as changing the network configuration of a node or performing manual firmware upgrades.

Before you begin

Caution: Verify the data resiliency status of your cluster. If the cluster has replication factor 2 (RF2), you can shut down only one node at a time. If you must shut down more than one node in an RF2 cluster, shut down the entire cluster instead.

About this task

When a host is in maintenance mode, AOS marks the host as unschedulable so that no new VM instances are created on it. Next, an attempt is made to evacuate VMs from the host.

If the evacuation attempt fails, the host remains in the "entering maintenance mode" state, where it is marked unschedulable, waiting for user remediation. You can shut down VMs on the host or move them to other nodes. Once the host has no more running VMs, it is in maintenance mode.

When a host is in maintenance mode, VMs are moved from that host to other hosts in the cluster. After exiting maintenance mode, those VMs are automatically returned to the original host, eliminating the need to manually move them.

VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated to other hosts in the cluster. You can choose to shut down such VMs while putting the node into maintenance mode.

Agent VMs are always shut down if you put a node in maintenance mode and are powered on again after exiting maintenance mode.

Perform the following steps to put the node into maintenance mode.

Procedure

  1. Use SSH to log on to a Controller VM in the cluster.
  2. Determine the IP address of the node you want to put into maintenance mode.
    nutanix@cvm$ acli host.list

    Note the value of Hypervisor IP for the node you want to put in maintenance mode.

  3. Put the node into maintenance mode.
    nutanix@cvm$ acli host.enter_maintenance_mode hypervisor-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
    Note: Never put the Controller VM and AHV host into maintenance mode on single-node clusters. Shut down user VMs before proceeding with disruptive changes.

    Replace hypervisor-IP-address with either the IP address or host name of the AHV host you want to put into maintenance mode.

    The following are optional parameters for running the acli host.enter_maintenance_mode command:

    • wait : Set the wait parameter to true to wait for the host evacuation attempt to finish.
    • non_migratable_vm_action : By default the non_migratable_vm_action parameter is set to block , which means VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated or shut down when you put a node into maintenance mode.

      If you want to automatically shut down such VMs, set the non_migratable_vm_action parameter to acpi_shutdown .

  4. Verify if the host is in the maintenance mode.
    nutanix@cvm$ acli host.get host-ip

    In the output that is displayed, ensure that node_state equals EnteredMaintenanceMode and schedulable equals False.

    Do not continue if the host has failed to enter the maintenance mode.
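This verification can be scripted by parsing the two fields from saved acli host.get output instead of reading them by eye. The following sketch runs against an illustrative fragment; the quoting and field layout are assumptions to confirm against the actual output on your cluster.

```shell
# Illustrative `acli host.get` output fragment (assumed layout, not
# captured from a live cluster).
sample_output='node_state: "EnteredMaintenanceMode"
schedulable: False'

# Extract the two fields checked in this step.
node_state=$(printf '%s\n' "$sample_output" | awk -F'"' '/node_state/ {print $2}')
schedulable=$(printf '%s\n' "$sample_output" | awk '/schedulable/ {print $2}')

if [ "$node_state" = "EnteredMaintenanceMode" ] && [ "$schedulable" = "False" ]; then
  echo "host is in maintenance mode"
else
  echo "host did not enter maintenance mode; do not continue" >&2
fi
```

On a live cluster, replace the sample text with the actual output of acli host.get host-ip.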

  5. See Verifying the Cluster Health to once again check if the cluster can tolerate a single-node failure.
  6. Determine the ID of the host.
    nutanix@cvm$ ncli host list

    An output similar to the following is displayed:

    Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
    Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
    Name                      : XXXXXXXXXXX-X 
    IPMI Address              : X.X.Z.3 
    Controller VM Address     : X.X.X.1 
    Hypervisor Address        : X.X.Y.2
    

    In this example, the host ID is 1234.

  7. Put the CVM into the maintenance mode.
    nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=true

    Replace host-ID with the ID of the host that you determined in the previous step.

    This step prevents the CVM services from being affected by any connectivity issues.

    Wait for a few minutes until the CVM is put into the maintenance mode.
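Because the host ID is the numeric suffix after :: in the Id field, it can be extracted with a short pipeline instead of being copied by hand. A sketch, run here against an illustrative sample of the ncli host list output format:

```shell
# Illustrative `ncli host list` output fragment; the host ID is the part
# of the Id field after "::".
sample_output='Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234
Controller VM Address     : X.X.X.1
Hypervisor Address        : X.X.Y.2'

# Take the text after "::" on the Id line.
host_id=$(printf '%s\n' "$sample_output" | awk -F'::' '/^Id/ {print $2; exit}')
echo "$host_id"    # prints: 1234
```

On a live cluster, pipe the real command output into the same awk filter.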

  8. Verify if the CVM is in the maintenance mode.

    Run the following command on the CVM that you put in the maintenance mode.

    nutanix@cvm$ genesis status | grep -v "\[\]"

    An output similar to the following is displayed:

    nutanix@cvm$ genesis status | grep -v "\[\]"
    2021-09-24 05:28:03.827628: Services running on this node:
      genesis: [11189, 11390, 11414, 11415, 15671, 15672, 15673, 15676]
      scavenger: [27241, 27525, 27526, 27527]
      xmount: [25915, 26055, 26056, 26074]
      zookeeper: [13053, 13101, 13102, 13103, 13113, 13130]
    nutanix@cvm$ 

    Only the Genesis, Scavenger, Xmount, and Zookeeper processes must be running (process ID is displayed next to the process name).

    Do not continue if the CVM has failed to enter the maintenance mode, because it can cause a service interruption.
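The service check can be automated by comparing the running services against the expected set of four. The following sketch runs against an illustrative genesis status fragment; the exact output format is an assumption based on the sample above.

```shell
# Illustrative `genesis status` output while the CVM is in maintenance mode.
sample_output='2021-09-24 05:28:03.827628: Services running on this node:
  genesis: [11189, 11390, 11414]
  scavenger: [27241, 27525]
  xmount: [25915, 26055]
  zookeeper: [13053, 13101]'

# Service lines are indented "name: [pids]"; anything other than the four
# expected services means the CVM is not fully in maintenance mode.
unexpected=$(printf '%s\n' "$sample_output" \
  | awk -F: '/^  [a-z_]+:/ {gsub(/ /, "", $1); print $1}' \
  | grep -Ev '^(genesis|scavenger|xmount|zookeeper)$' || true)

if [ -z "$unexpected" ]; then
  echo "only expected services are running"
else
  echo "unexpected services still running: $unexpected" >&2
fi
```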

What to do next

Perform the maintenance activity. Once the maintenance activity is complete, remove the node from the maintenance mode. See Exiting a Node from the Maintenance Mode Using CLI for more information.

Exiting a Node from the Maintenance Mode Using CLI

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

Perform the following to exit the host from the maintenance mode.

Procedure

  1. Remove the CVM from the maintenance mode.
    1. Determine the ID of the host.
      nutanix@cvm$ ncli host list

      An output similar to the following is displayed:

      Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
      Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
      Name                      : XXXXXXXXXXX-X 
      IPMI Address              : X.X.Z.3 
      Controller VM Address     : X.X.X.1 
      Hypervisor Address        : X.X.Y.2
      

      In this example, the host ID is 1234.

    2. From any other CVM in the cluster, run the following command to exit the CVM from the maintenance mode.
      nutanix@cvm$ ncli host edit id=host-ID enable-maintenance-mode=false

      Replace host-ID with the ID of the host.

      Note: The command fails if you run the command from the CVM that is in the maintenance mode.
    3. Verify if all the processes on all the CVMs are in the UP state.
      nutanix@cvm$ cluster status | grep -v UP
    Do not continue if the CVM has failed to exit the maintenance mode.
  2. Remove the AHV host from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode.
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip 
      

      Replace host-ip with the IP address of the host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify if the host has exited the maintenance mode.
      nutanix@cvm$ acli host.get host-ip 

      In the output that is displayed, ensure that node_state equals kAcropolisNormal or AcropolisNormal and schedulable equals True.

    Contact Nutanix Support if any of the steps described in this document produce unexpected results.
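The cluster status | grep -v UP check used in this procedure can be rehearsed against saved output before relying on it. A sketch with an illustrative fragment in which one service is deliberately not UP; the column layout is an assumption:

```shell
# Illustrative `cluster status` fragment with one service deliberately DOWN.
sample_output='CVM: 10.0.0.1 Up
                            Zeus   UP       [9935, 9980, 9981]
                        Stargate   DOWN     []'

# Keep lines that do not say UP, dropping the per-CVM header line
# (which reads "Up", not "UP").
not_up=$(printf '%s\n' "$sample_output" | grep -v 'UP' | grep -v '^CVM:')

if [ -n "$not_up" ]; then
  echo "services not UP:"
  echo "$not_up"
fi
```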

Shutting Down a Node in a Cluster (AHV)

Before you begin

Caution: Verify the data resiliency status of your cluster. If the cluster has replication factor 2 (RF2), you can shut down only one node at a time. If you must shut down more than one node in an RF2 cluster, shut down the entire cluster instead.

See Verifying the Cluster Health to check if the cluster can tolerate a single-node failure. Do not proceed if the cluster cannot tolerate a single-node failure.

About this task

Perform the following procedure to shut down a node.

Procedure

  1. Put the node into maintenance mode as described in Putting a Node into Maintenance Mode using Web Console.
  2. Log on to the AHV host with SSH.
  3. Shut down the host.
    root@ahv# shutdown -h now

What to do next

See Starting a Node in a Cluster (AHV) for instructions about how to start a node, including how to start a CVM and how to exit a node from maintenance mode.

Starting a Node in a Cluster (AHV)

About this task

Procedure

  1. On the hardware appliance, power on the node. The CVM starts automatically when the node powers on.
  2. If the node is in maintenance mode, log on to Prism Web Console and remove the node from the maintenance mode.
    See Exiting a Node from the Maintenance Mode using Web Console for more information.
  3. Log on to another CVM in the Nutanix cluster with SSH.
  4. Verify that the status of all services on all the CVMs are Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM:host IP-Address Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]

Rebooting an AHV Node in a Nutanix Cluster

About this task

The Request Reboot operation in the Prism web console gracefully restarts the selected nodes one after the other.

Perform the following procedure to restart the nodes in the cluster.

Procedure

  1. Click the gear icon in the main menu and then select Reboot in the Settings page.
  2. In the Request Reboot window, select the nodes you want to restart, and click Reboot .
    Figure. Request Reboot of AHV Node Click to enlarge

    A progress bar is displayed that indicates the progress of the restart of each node.

Shutting Down an AHV Cluster

You might need to shut down an AHV cluster to perform a maintenance activity or tasks such as relocating the hardware.

Before you begin

Ensure the following before you shut down the cluster.

  1. Upgrade to the most recent version of NCC.
  2. Log on to a Controller VM (CVM) with SSH and run the complete NCC health check.
    nutanix@cvm$ ncc health_checks run_all

    If you receive any failure or error messages, resolve those issues by referring to the KB articles indicated in the output of the NCC check results. If you are unable to resolve these issues, contact Nutanix Support.

    Warning: If you receive alerts indicating expired encryption certificates or a key manager is not reachable, resolve these issues before you shut down the cluster. If you do not resolve these issues, data loss of the cluster might occur.

About this task

Shut down an AHV cluster in the following sequence.

Procedure

  1. Shut down the services or VMs associated with AOS features or Nutanix products. For example, shut down all the Nutanix file server VMs (FSVMs). See the documentation of those features or products for more information.
  2. Shut down all the guest VMs in the cluster in one of the following ways.
    • Shut down the guest VMs from within the guest OS.
    • Shut down the guest VMs by using the Prism Element web console.
    • If you are running many VMs, shut down the VMs by using aCLI:
    1. Log on to a CVM in the cluster with SSH.
    2. Shut down all the guest VMs in the cluster.
      nutanix@cvm$ for i in `acli vm.list power_state=on | awk '{print $1}' | grep -v NTNX` ; do acli vm.shutdown $i ; done
      
    3. Verify if all the guest VMs are shut down.
      nutanix@CVM$ acli vm.list power_state=on
    4. If any VMs are on, consider powering off the VMs from within the guest OS. To force shut down through AHV, run the following command:
      nutanix@cvm$ acli vm.off vm-name

      Replace vm-name with the name of the VM you want to shut down.

  3. Stop the Nutanix cluster.
    1. Log on to any CVM in the cluster with SSH.
    2. Stop the cluster.
      nutanix@cvm$ cluster stop
    3. Verify if the cluster services have stopped.
      nutanix@CVM$ cluster status

      The output displays the message The state of the cluster: stop , which confirms that the cluster has stopped.

      Note: Some system services continue to run even if the cluster has stopped.
  4. Shut down all the CVMs in the cluster. Log on to each CVM in the cluster with SSH and shut down that CVM.
    nutanix@cvm$ sudo shutdown -P now
  5. Shut down each node in the cluster. Perform the following steps for each node in the cluster.
    1. Log on to the IPMI web console of each node.
    2. Under Remote Control > Power Control , select Power Off Server - Orderly Shutdown to gracefully shut down the node.
    3. Ping each host to verify that all AHV hosts are shut down.
  6. Complete the maintenance activity or any other tasks.
  7. Start all the nodes in the cluster.
    1. Press the power button on the front of the block for each node.
    2. Log on to the IPMI web console of each node.
    3. On the System tab, check the Power Control status to verify if the node is powered on.
  8. Start the cluster.
    1. Wait for approximately 5 minutes after you start the last node to allow the cluster services to start.
      All CVMs start automatically after you start all the nodes.
    2. Log on to any CVM in the cluster with SSH.
    3. Start the cluster.
      nutanix@cvm$ cluster start
    4. Verify that all the cluster services are in the UP state.
      nutanix@cvm$ cluster status
    5. Start the guest VMs from within the guest OS or use the Prism Element web console.

      If you are running many VMs, start the VMs by using aCLI:

      nutanix@cvm$ for i in `acli vm.list power_state=off | awk '{print $1}' | grep -v NTNX` ; do acli vm.on $i; done
    6. Start the services or VMs associated with AOS features or Nutanix products. For example, start all the FSVMs. See the documentation of those features or products for more information.
    7. Verify if all guest VMs are powered on by using the Prism Element web console.
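The one-line loops in steps 2 and 8 both derive the guest VM list from acli vm.list, excluding CVMs by the NTNX name prefix. The filtering stage can be sketched on its own against an illustrative output sample; the column layout and VM names here are assumptions:

```shell
# Illustrative `acli vm.list power_state=on` output. CVM names start with
# NTNX, which is why the loops filter them with `grep -v NTNX`.
sample_output='VM name      VM UUID
app-vm-01    11111111-1111-1111-1111-111111111111
NTNX-A-CVM   22222222-2222-2222-2222-222222222222
db-vm-02     33333333-3333-3333-3333-333333333333'

# Skip the header row, take the first column, and drop CVMs.
guest_vms=$(printf '%s\n' "$sample_output" | tail -n +2 | awk '{print $1}' | grep -v NTNX)
echo "$guest_vms"
```

On a live cluster, the resulting names are the values passed to acli vm.shutdown or acli vm.on.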

Changing CVM Memory Configuration (AHV)

About this task

You can increase the memory reserved for each Controller VM in your cluster by using the 1-click Controller VM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. See the Increasing the Controller VM Memory Size topic in the Prism Web Console Guide for CVM memory sizing recommendations and instructions about how to increase the CVM memory.

Changing the AHV Hostname

To change the name of an AHV host, log on to any Controller VM (CVM) in the cluster as admin or nutanix user and run the change_ahv_hostname script.

About this task

Perform the following procedure to change the name of an AHV host:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Change the hostname of the AHV host.
    • If you are logged in as nutanix user, run the following command:
      nutanix@cvm$ change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    • If you are logged in as admin user, run the following command:
      admin@cvm$ sudo change_ahv_hostname --host_ip=host-IP-address --host_name=new-host-name
    Note: The system prompts you to enter the admin user password if you run the change_ahv_hostname command with sudo.

    Replace host-IP-address with the IP address of the host whose name you want to change and new-host-name with the new hostname for the AHV host.

    Note: This entity must fulfill the following naming conventions:
    • The maximum length is 63 characters.
    • Allowed characters are uppercase and lowercase letters (A-Z and a-z), decimal digits (0-9), dots (.), and hyphens (-).
    • The entity name must start and end with a number or letter.

    If you want to update the hostname of multiple hosts in the cluster, run the script for one host at a time (sequentially).

    Note: The Prism Element web console displays the new hostname after a few minutes.
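Before running change_ahv_hostname, a proposed name can be validated locally against the naming conventions above. The following function is a sketch; the regular expression is one rendering of the documented rules (63-character maximum, letters, digits, dots, and hyphens only, starting and ending with a letter or digit), not part of the script itself.

```shell
# Return success only if the candidate hostname satisfies the documented
# rules: <= 63 chars; A-Z, a-z, 0-9, dots, hyphens; alphanumeric at both ends.
valid_ahv_hostname() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9.-]{0,61}[A-Za-z0-9])?$'
}

valid_ahv_hostname "ahv-node-01" && echo "ok"
valid_ahv_hostname "-bad-name-" || echo "rejected: must start and end with a letter or digit"
```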

Changing the Name of the CVM Displayed in the Prism Web Console

You can change the CVM name that is displayed in the Prism web console. The procedure described in this document does not change the CVM name that is displayed in the terminal or console of an SSH session.

About this task

You can change the CVM name by using the change_cvm_display_name script. Run this script from a CVM other than the CVM whose name you want to change. When you run the change_cvm_display_name script, AOS performs the following steps:

    1. Checks if the new name starts with NTNX- and ends with -CVM . The CVM name must have only letters, numbers, and dashes (-).
    2. Checks if the CVM has received a shutdown token.
    3. Powers off the CVM. The script does not put the CVM or host into maintenance mode. Therefore, the VMs are not migrated from the host and continue to run with the I/O operations redirected to another CVM while the current CVM is in a powered off state.
    4. Changes the CVM name, enables autostart, and powers on the CVM.

Perform the following to change the CVM name displayed in the Prism web console.

Procedure

  1. Use SSH to log on to a CVM other than the CVM whose name you want to change.
  2. Change the name of the CVM.
    nutanix@cvm$ change_cvm_display_name --cvm_ip=CVM-IP --cvm_name=new-name

    Replace CVM-IP with the IP address of the CVM whose name you want to change and new-name with the new name for the CVM.

    The CVM name must have only letters, numbers, and dashes (-), and must start with NTNX- and end with -CVM .

    Note: Do not run this command from the CVM whose name you want to change, because the script powers off the CVM. In this case, when the CVM is powered off, you lose connectivity to the CVM from the SSH console and the script abruptly ends.
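The CVM naming rules can also be checked locally before running the script. The following function is a sketch; the regular expression is one rendering of the documented rules (letters, numbers, and dashes only, starting with NTNX- and ending with -CVM).

```shell
# Return success only if the candidate CVM display name satisfies the
# documented rules: NTNX- prefix, -CVM suffix, letters/digits/dashes only.
valid_cvm_name() {
  printf '%s' "$1" | grep -Eq '^NTNX-[A-Za-z0-9-]+-CVM$'
}

valid_cvm_name "NTNX-Block1-A-CVM" && echo "ok"
valid_cvm_name "Block1-A-CVM" || echo "rejected: must start with NTNX-"
```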

Adding a Never-Schedulable Node (AHV Only)

Add a never-schedulable node if you want to add a node to increase data storage on your Nutanix cluster, but do not want any AHV VMs to run on that node.

About this task

AOS never schedules any VMs on a never-schedulable node. Therefore, a never-schedulable node configuration ensures that no additional compute resources such as CPUs are consumed from the Nutanix cluster. In this way, you can meet the compliance and licensing requirements of your virtual applications.

Note the following points about a never-schedulable node configuration.

Note:
  • Ensure that at any given time, the cluster has a minimum of three nodes (never-schedulable or otherwise) in function. To add your first never-schedulable node to your Nutanix cluster, the cluster must comprise at least three schedulable nodes.
  • You can add any number of never-schedulable nodes to your Nutanix cluster.
  • If you want a node that is already a part of the cluster to work as a never-schedulable node, remove that node from the cluster and then add that node as a never-schedulable node.
  • If you no longer need a node to work as a never-schedulable node, remove the node from the cluster.

Procedure

You can add a never-schedulable node (storage-only node) to a cluster using the Expand Cluster operation from Prism Web Console.
For information about how to add a never-schedulable node to a cluster, see the Expanding a Cluster topic in Prism Web Console Guide .

Compute-Only Node Configuration (AHV Only)

A compute-only (CO) node allows you to seamlessly and efficiently expand the computing capacity (CPU and memory) of your AHV cluster. The Nutanix cluster uses the resources (CPUs and memory) of a CO node exclusively for computing purposes.

Note: Clusters that have compute-only nodes do not support virtual switches. Instead, use bridge configurations for network connections. For more information, see Virtual Switch Limitations.

You can use a supported server or an existing hyperconverged (HC) node as a CO node. To use a node as CO, image the node as CO by using Foundation and then add that node to the cluster by using the Prism Element web console. For more information about how to image a node as a CO node, see the Field Installation Guide.

Note: If you want an existing HC node that is already a part of the cluster to work as a CO node, remove that node from the cluster, image that node as CO by using Foundation, and add that node back to the cluster. For more information about how to remove a node, see Modifying a Cluster.

Key Features of Compute-Only Node

Following are the key features of CO nodes.

  • CO nodes do not have a Controller VM (CVM) and local storage.
  • AOS sources the storage for vDisks associated with VMs running on CO nodes from the hyperconverged (HC) nodes in the cluster.
  • You can seamlessly manage your VMs (CRUD operations, ADS, and HA) by using the Prism Element web console.
  • AHV runs on the local storage media of the CO node.
  • To update AHV on a cluster that contains a compute-only node, use the Life Cycle Manager. For more information, see the LCM Updates topic in the Life Cycle Manager Guide.

Use Case of Compute-Only Node

CO nodes enable you to achieve more control and value from restrictive licenses such as Oracle. A CO node is part of a Nutanix HC cluster, and there is no CVM running on the CO node (VMs use CVMs running on the HC nodes to access disks). As a result, licensed cores on the CO node are used only for the application VMs.

Applications or databases that are licensed on a per CPU core basis require the entire node to be licensed and that also includes the cores on which the CVM runs. With CO nodes, you get a much higher ROI on the purchase of your database licenses (such as Oracle and Microsoft SQL Server) since the CVM does not consume any compute resources.

Minimum Cluster Requirements

Following are the minimum cluster requirements for compute-only nodes.

  • The Nutanix cluster must be at least a three-node cluster before you add a compute-only node.

    However, Nutanix recommends that the cluster has four nodes before you add a compute-only node.

  • The ratio of compute-only to hyperconverged nodes in a cluster must not exceed the following:

    1 compute-only : 2 hyperconverged

  • All the hyperconverged nodes in the cluster must be all-flash nodes.
  • The number of vCPUs assigned to CVMs on the hyperconverged nodes must be greater than or equal to the total number of available cores on all the compute-only nodes in the cluster. The CVM requires a minimum of 12 vCPUs. For more information about how Foundation allocates memory and vCPUs to your platform model, see CVM vCPU and vRAM Allocation in the Field Installation Guide.
  • The total amount of NIC bandwidth allocated to all the hyperconverged nodes must be twice the amount of the total NIC bandwidth allocated to all the compute-only nodes in the cluster.

    Nutanix recommends you use dual 25 GbE on CO nodes and quad 25 GbE on an HC node serving storage to a CO node.

  • The AHV version of the compute-only node must be the same as the other nodes in the cluster.

    When you are adding a CO node to the cluster, AOS checks if the AHV version of the node matches with the AHV version of the existing nodes in the cluster. If there is a mismatch, the add node operation fails.

For general requirements about adding a node to a Nutanix cluster, see Expanding a Cluster.
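The sizing rules above can be sanity-checked before you attempt to expand the cluster. The following Python sketch is illustrative only (it is not a Nutanix tool), and the node values at the bottom are hypothetical placeholders:

```python
def check_co_cluster(hc_nodes, co_nodes):
    """Validate CO node sizing rules.

    hc_nodes: list of dicts with 'cvm_vcpus' and 'nic_gbps' per HC node.
    co_nodes: list of dicts with 'cores' and 'nic_gbps' per CO node.
    Returns a list of rule violations (empty list means all checks pass).
    """
    problems = []
    if len(hc_nodes) < 3:
        problems.append("cluster needs at least 3 HC nodes")
    if len(co_nodes) > len(hc_nodes) / 2:
        problems.append("CO:HC ratio exceeds 1:2")
    # Total CVM vCPUs must cover all cores available on CO nodes.
    if sum(n["cvm_vcpus"] for n in hc_nodes) < sum(n["cores"] for n in co_nodes):
        problems.append("total CVM vCPUs must be >= total CO cores")
    # HC NIC bandwidth must be at least twice the CO NIC bandwidth.
    hc_bw = sum(n["nic_gbps"] for n in hc_nodes)
    co_bw = sum(n["nic_gbps"] for n in co_nodes)
    if hc_bw < 2 * co_bw:
        problems.append("HC NIC bandwidth must be >= 2x CO NIC bandwidth")
    return problems

# Hypothetical example: four HC nodes (12 CVM vCPUs, quad 25 GbE each) and
# one CO node (32 cores, dual 25 GbE).
hc = [{"cvm_vcpus": 12, "nic_gbps": 100}] * 4
co = [{"cores": 32, "nic_gbps": 50}]
print(check_co_cluster(hc, co))  # [] -> all checks pass
```

Running checks like these before an Expand Cluster operation can save a failed add-node attempt.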

Restrictions

Nutanix does not support the following features or tasks on a CO node in this release:

  1. Host boot disk replacement
  2. Network segmentation
  3. Virtual Switch configuration: Use bridge configurations instead.

Supported AOS Versions

Nutanix supports compute-only nodes on AOS releases 5.11 or later.

Supported Hardware Platforms

Compute-only nodes are supported on the following hardware platforms.

  • All the NX series hardware
  • Dell XC Core
  • Cisco UCS

Networking Configuration

To perform network tasks on a compute-only node, such as creating or modifying bridges, uplink bonds, or uplink load balancing, use the manage_ovs commands with the --host flag, as shown in the following example:

Note: If a cluster contains storage-only AHV nodes and the compute-only nodes run ESXi or Hyper-V, deployment of the default virtual switch vs0 fails. In such cases, the Prism Element, Prism Central, and CLI workflows for virtual switch management are unavailable to manage the bridges and bonds. Use the manage_ovs command options to manage the bridges and bonds instead.
nutanix@cvm$ manage_ovs --host IP_address_of_co_node --bridge_name bridge_name create_single_bridge

Replace IP_address_of_co_node with the IP address of the CO node and bridge_name with the name of the bridge that you want to create.

Note: Run the manage_ovs commands for a CO node from any CVM running on a hyperconverged node.

Perform the networking tasks for each CO node in the cluster individually.
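Because each CO node must be configured individually, generating the command for every CO node can be scripted. The following is a minimal Python sketch; the IP addresses and bridge name are hypothetical placeholders, and the printed commands are meant to be run from a CVM on an HC node:

```python
# Hypothetical CO node IPs and bridge name; substitute your own values.
co_nodes = ["10.10.10.21", "10.10.10.22"]
bridge = "br1"

for node in co_nodes:
    # Print the manage_ovs command to run from a CVM on an HC node.
    print(f"manage_ovs --host {node} --bridge_name {bridge} create_single_bridge")
```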

For more information about networking configuration of the AHV hosts, see Host Network Management in the AHV Administration Guide.

Adding a Compute-Only Node to an AHV Cluster

About this task

Perform the following procedure to add a compute-only node to a Nutanix cluster.

Procedure

  1. Log on to the Prism Element web console.
  2. Do one of the following:
    • Click the gear icon in the main menu and select Expand Cluster in the Settings page.
    • Go to the hardware dashboard (see Hardware Dashboard) and click Expand Cluster .
  3. In the Select Host screen, scroll down and, under Manual Host Discovery , click Discover Hosts Manually .
    Figure. Discover Hosts Manually Click to enlarge

  4. Click Add Host .
    Figure. Add Host Click to enlarge

  5. Under Host or CVM IP , type the IP address of the AHV host and click Save .
    This node does not have a Controller VM and you must therefore provide the IP address of the AHV host.
  6. Click Discover and Add Hosts .
    Prism Element discovers this node and the node appears in the list of nodes in the Select Host screen.
  7. Select the node to display the details of the compute-only node.
  8. Click Next .
  9. In the Configure Host screen, click Expand Cluster .

    The add node process begins and Prism Element performs a set of checks before the node is added to the cluster.

    Check the progress of the operation in the Tasks menu of the Prism Element web console. The operation takes approximately five to seven minutes to complete.

  10. Check the Hardware Diagram view to verify if the node is added to the cluster.
    You can identify a node as a CO node if the Prism Element web console does not display an IP address for the CVM.

Host Network Management

Network management in an AHV cluster consists of the following tasks:

  • Configuring Layer 2 switching through virtual switches and Open vSwitch bridges. When configuring a virtual switch, you configure bridges, bonds, and VLANs.
  • Optionally changing the IP address, netmask, and default gateway that were specified for the hosts during the imaging process.

Virtual Networks (Layer 2)

Each VM network interface is bound to a virtual network. Each virtual network is bound to a single VLAN; trunking VLANs to a virtual network is not supported. Networks are designated by the Layer 2 type ( vlan ) and the VLAN number.

By default, each virtual network maps to a virtual switch, such as the default virtual switch vs0 . However, you can change this setting to map a virtual network to a custom virtual switch. The user is responsible for ensuring that the specified virtual switch exists on all hosts, and that the physical switch ports for the virtual switch uplinks are properly configured to receive VLAN-tagged traffic.

For more information about virtual switches, see About Virtual Switch.

A VM NIC must be associated with a virtual network. You can change the virtual network of a vNIC without deleting and recreating the vNIC.

Managed Networks (Layer 3)

A virtual network can have an IPv4 configuration, but it is not required. A virtual network with an IPv4 configuration is a managed network ; one without an IPv4 configuration is an unmanaged network . A VLAN can have at most one managed network defined. If a virtual network is managed, every NIC is assigned an IPv4 address at creation time.

A managed network can optionally have one or more non-overlapping DHCP pools. Each pool must be entirely contained within the network's managed subnet.

If the managed network has a DHCP pool, the NIC automatically gets assigned an IPv4 address from one of the pools at creation time, provided at least one address is available. Addresses in the DHCP pool are not reserved. That is, you can manually specify an address belonging to the pool when creating a virtual adapter. If the network has no DHCP pool, you must specify the IPv4 address manually.
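The containment rule for DHCP pools can be checked with Python's standard ipaddress module. The subnet and pool ranges below are hypothetical; this is an illustrative sketch, not a Nutanix validation tool:

```python
import ipaddress

def pool_in_subnet(subnet, pool_start, pool_end):
    """Return True if the whole DHCP pool lies inside the managed subnet."""
    net = ipaddress.ip_network(subnet)
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    # Both endpoints must be in the subnet and the range must be well ordered.
    return start in net and end in net and start <= end

print(pool_in_subnet("192.168.10.0/24", "192.168.10.50", "192.168.10.99"))   # True
print(pool_in_subnet("192.168.10.0/24", "192.168.10.200", "192.168.11.10"))  # False
```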

All DHCP traffic on the network is rerouted to an internal DHCP server, which allocates IPv4 addresses. DHCP traffic on the virtual network (that is, between the guest VMs and the Controller VM) does not reach the physical network, and vice versa.

A network must be configured as managed or unmanaged when it is created. It is not possible to convert one to the other.

Figure. AHV Networking Architecture Click to enlarge AHV Networking Architecture image

Prerequisites for Configuring Networking

Change the configuration from the factory default to the recommended configuration. See AHV Networking Recommendations.

AHV Networking Recommendations

Nutanix recommends that you perform the following OVS configuration tasks from the Controller VM, as described in this documentation:

  • Viewing the network configuration
  • Configuring uplink bonds with desired interfaces using the Virtual Switch (VS) configurations.
  • Assigning the Controller VM to a VLAN

For performing other network configuration tasks such as adding an interface to a bridge and configuring LACP for the interfaces in a bond, follow the procedures described in the AHV Networking best practices documentation.

Nutanix recommends that you configure the network as follows:

Table 1. Recommended Network Configuration
Network Component Best Practice
Virtual Switch

Do not modify the OpenFlow tables of any bridges configured in any VS configurations in the AHV hosts.

Do not rename default virtual switch vs0. You cannot delete the default virtual switch vs0.

Do not delete or rename OVS bridge br0.

Do not modify the native Linux bridge virbr0.

Switch Hops Nutanix nodes send storage replication traffic to each other in a distributed fashion over the top-of-rack network. One Nutanix node can, therefore, send replication traffic to any other Nutanix node in the cluster. The network should provide low and predictable latency for this traffic. Ensure that there are no more than three switches between any two Nutanix nodes in the same cluster.
Switch Fabric

A switch fabric is a single leaf-spine topology or all switches connected to the same switch aggregation layer. The Nutanix VLAN shares a common broadcast domain within the fabric. Connect all Nutanix nodes that form a cluster to the same switch fabric. Do not stretch a single Nutanix cluster across multiple, disconnected switch fabrics.

Every Nutanix node in a cluster should therefore be in the same L2 broadcast domain and share the same IP subnet.

WAN Links A WAN (wide area network) or metro link connects different physical sites over a distance. As an extension of the switch fabric requirement, do not place Nutanix nodes in the same cluster if they are separated by a WAN.
VLANs

Add the Controller VM and the AHV host to the same VLAN. Place all CVMs and AHV hosts in a cluster in the same VLAN. By default the CVM and AHV host are untagged, shown as VLAN 0, which effectively places them on the native VLAN configured on the upstream physical switch.

Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.

Nutanix recommends configuring the CVM and hypervisor host VLAN as the native, or untagged, VLAN on the connected switch ports. This native VLAN configuration allows for easy node addition and cluster expansion. By default, new Nutanix nodes send and receive untagged traffic. If you use a tagged VLAN for the CVM and hypervisor hosts instead, you must configure that VLAN while provisioning the new node, before adding that node to the Nutanix cluster.

Use tagged VLANs for all guest VM traffic and add the required guest VM VLANs to all connected switch ports for hosts in the Nutanix cluster. Limit guest VLANs for guest VM traffic to the smallest number of physical switches and switch ports possible to reduce broadcast network traffic load. If a VLAN is no longer needed, remove it.
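If you use a tagged VLAN for the CVM and hypervisor host as described above, the tag is applied on each node individually. As a hedged sketch, the commands commonly used for this on AHV are printed below rather than executed; VLAN ID 10 is a placeholder, and you should verify the exact commands against the documentation for your AOS version:

```python
# Hedged sketch: VLAN ID 10 is a placeholder; verify the exact commands
# against your AOS version's documentation before running them.
vlan_id = 10

# Run on the AHV host to tag the internal port of bridge br0:
print(f"ovs-vsctl set port br0 tag={vlan_id}")

# Run on the Controller VM to assign the CVM to the same VLAN:
print(f"change_cvm_vlan {vlan_id}")
```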

Default VS bonded port (br0-up)

Aggregate the fastest links of the same speed on the physical host to a VS bond on the default vs0 and provision VLAN trunking for these interfaces on the physical switch.

By default, interfaces in the bond in the virtual switch operate in the recommended active-backup mode.
Note: The mixing of bond modes across AHV hosts in the same cluster is not recommended and not supported.
1 GbE and 10 GbE interfaces (physical host)

If 10 GbE or faster uplinks are available, Nutanix recommends that you use them instead of 1 GbE uplinks.

Recommendations for 1 GbE uplinks are as follows:

  • If you plan to use 1 GbE uplinks, do not include them in the same bond as the 10 GbE interfaces.

    Nutanix recommends that you do not use uplinks of different speeds in the same bond.

  • If you choose to configure only 1 GbE uplinks, then when migration of memory-intensive VMs becomes necessary, power off the VMs and power them on on a new host instead of using live migration. In this context, memory-intensive VMs are VMs whose memory changes at a rate that exceeds the bandwidth offered by the 1 GbE uplinks.

    Nutanix recommends the manual procedure for memory-intensive VMs because live migration, which you initiate either manually or by placing the host in maintenance mode, might appear prolonged or unresponsive and might eventually fail.

    Use the aCLI on any CVM in the cluster to start the VMs on another AHV host:

    nutanix@cvm$ acli vm.on vm_list host=host

    Replace vm_list with a comma-delimited list of VM names and replace host with the IP address or UUID of the target host.

  • If you must use only 1 GbE uplinks, add them into a bond to increase bandwidth and use the balance-tcp (LACP) or balance-slb bond mode.
IPMI port on the hypervisor host Do not use VLAN trunking on switch ports that connect to the IPMI interface. Configure the switch ports as access ports for management simplicity.
Upstream physical switch

Nutanix does not recommend the use of Fabric Extenders (FEX) or similar technologies for production use cases. While initial, low-load implementations might run smoothly with such technologies, poor performance, VM lockups, and other issues might occur as implementations scale upward (see Knowledge Base article KB1612). Nutanix recommends the use of 10Gbps, line-rate, non-blocking switches with larger buffers for production workloads.

Cut-through versus store-and-forward selection depends on network design. In designs with no oversubscription and no speed mismatches you can use low-latency cut-through switches. If you have any oversubscription or any speed mismatch in the network design, then use a switch with larger buffers. Port-to-port latency should be no higher than 2 microseconds.

Use fast-convergence technologies (such as Cisco PortFast) on switch ports that are connected to the hypervisor host.

Physical Network Layout Use redundant top-of-rack switches in a traditional leaf-spine architecture. This simple, flat network design is well suited for a highly distributed, shared-nothing compute and storage architecture.

Add all the nodes that belong to a given cluster to the same Layer-2 network segment.

Other network layouts are supported as long as all other Nutanix recommendations are followed.

Jumbo Frames

The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500 byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on network interfaces of a CVM to higher values.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV hosts and guest VMs if the applications on your guest VMs require them. If you choose to use jumbo frames on hypervisor hosts, be sure to enable them end to end in the desired network and consider both the physical and virtual network infrastructure impacted by the change.
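"End to end" means every device in the path must carry the larger MTU; a single 1,500-byte hop silently defeats jumbo frames through fragmentation or drops. The following is an illustrative check over a hypothetical list of path MTUs (the hop names are placeholders, not values read from a real network):

```python
def jumbo_ok(path_mtus, required=9000):
    """Return the first hop whose MTU is below the requirement, or None."""
    for hop, mtu in path_mtus:
        if mtu < required:
            return hop
    return None

# Hypothetical path: guest vNIC -> host uplink -> ToR switch.
path = [("guest-nic", 9000), ("br0-up", 9000), ("tor-switch", 1500)]
print(jumbo_ok(path))  # tor-switch -- this hop breaks jumbo frames
```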

Controller VM Do not remove the Controller VM from either the OVS bridge br0 or the native Linux bridge virbr0.
Rack Awareness and Block Awareness Block awareness and rack awareness provide smart placement of Nutanix cluster services, metadata, and VM data to help maintain data availability, even when you lose an entire block or rack. The same network requirements for low latency and high throughput between servers in the same cluster still apply when using block and rack awareness.
Note: Do not use features like block or rack awareness to stretch a Nutanix cluster between different physical sites.
Oversubscription

Oversubscription occurs when an intermediate network device or link does not have enough capacity to allow line rate communication between the systems connected to it. For example, if a 10 Gbps link connects two switches and four hosts connect to each switch at 10 Gbps, the connecting link is oversubscribed. Oversubscription is often expressed as a ratio—in this case 4:1, as the environment could potentially attempt to transmit 40 Gbps between the switches with only 10 Gbps available. Achieving a ratio of 1:1 is not always feasible. However, you should keep the ratio as small as possible based on budget and available capacity. If there is any oversubscription, choose a switch with larger buffers.

In a typical deployment where Nutanix nodes connect to redundant top-of-rack switches, storage replication traffic between CVMs traverses multiple devices. To avoid packet loss due to link oversubscription, ensure that the switch uplinks consist of multiple interfaces operating at a faster speed than the Nutanix host interfaces. For example, for nodes connected at 10 Gbps, the inter-switch connection should consist of multiple 10 Gbps or 40 Gbps links.
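The oversubscription arithmetic from the example above is simple enough to script. This sketch only illustrates the calculation; the numbers are from the hypothetical example in the text:

```python
def oversubscription_ratio(host_count, host_gbps, uplink_gbps):
    """Worst-case ratio of host-facing bandwidth to inter-switch bandwidth."""
    return (host_count * host_gbps) / uplink_gbps

# Example from the text: four hosts at 10 Gbps behind a single 10 Gbps link.
print(oversubscription_ratio(4, 10, 10))  # 4.0, i.e. a 4:1 ratio

# Two 40 Gbps inter-switch links bring the ratio below 1:1.
print(oversubscription_ratio(4, 10, 80))  # 0.5
```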

The following diagrams show sample network configurations using Open vSwitch and Virtual Switch.

Figure. Virtual Switch Click to enlarge Displaying Virtual Switch mechanism

Figure. AHV Bridge Chain Click to enlarge Displaying Virtual Switch mechanism

Figure. Default factory configuration of Open vSwitch in AHV Click to enlarge

Figure. Open vSwitch Configuration Click to enlarge

IP Address Management

IP Address Management (IPAM) is a feature of AHV that allows it to assign IP addresses automatically to VMs by using DHCP. You can configure each virtual network with a specific IP address subnet, associated domain settings, and IP address pools available for assignment to VMs.

An AHV network is defined as a managed network or an unmanaged network based on the IPAM setting.

Managed Network

Managed network refers to an AHV network in which IPAM is enabled.

Unmanaged Network

Unmanaged network refers to an AHV network in which IPAM is not enabled or is disabled.

You enable or disable IPAM in the Create Network dialog box when you create a virtual network for guest VMs. See the Configuring a Virtual Network for Guest VM Interfaces topic in the Prism Web Console Guide .
Note: You can enable IPAM only when you are creating a virtual network. You cannot enable or disable IPAM for an existing virtual network.

IPAM enabled or disabled status has implications. For example, when you want to reconfigure the IP address of a Prism Central VM, the procedure to do so may involve additional steps for managed networks (that is, networks with IPAM enabled) where the new IP address belongs to an IP address range different from the previous IP address range. See Reconfiguring the IP Address and Gateway of Prism Central VMs in Prism Central Guide .

Traffic Marking for Quality of Service

To prioritize outgoing (or egress) traffic as required, you can configure quality of service on the traffic for a cluster.

There are two distinct types of outgoing or egress traffic:

  • Management traffic (mgmt)
  • Data services (data-svc)

Data services traffic consists of the following protocols:

Table 1. Data Services Protocols

NFS (Nutanix Files)

  Source ports (TCP): 445, 2049, 20048, 20049, 20050, and 7508.

  Source ports (UDP): 2049, 20048, 20049, 20050, and 7508.

  Source and destination ports (TCP for Replicator-dr): 7515.

SMB (Nutanix Files)

  Source ports (TCP): 445, 2049, 20048, 20049, 20050, and 7508.

  Source ports (UDP): 2049, 20048, 20049, 20050, and 7508.

  Source and destination ports (TCP for Replicator-dr): 7515.

Cluster-to-cluster replications, external or inter-site (Stargate and Cerebro)

  Destination ports: 2009 and 2020 on the CVM.

Node-to-node replications, internal or intra-site (Stargate and Cerebro)

  Destination ports: 2009 and 2020 on the CVM.

iSCSI (Nutanix Files and Volumes)

  Source ports: 3260, 3261, and 3205 on the CVM.

  Destination ports: 3260, 3261, and 3205 on AHV.

Traffic other than data services traffic is management traffic. Traffic marking for QoS is disabled by default.

When you enable QoS, you can mark both types of traffic with QoS values. AOS interprets the values as hexadecimal even if you provide them in decimal notation. When you view or get the QoS configuration enabled on the cluster, nCLI displays the QoS values in hexadecimal format (0xXX, where XX is a hexadecimal value in the range 00–3f).
Note: Set any QoS value in the range 0x0–0x3f. The default QoS values for the traffic are as follows:
  • Management traffic (mgmt) = 0x10
  • Data services (data-svc) = 0xa
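Because nCLI reports values in hexadecimal while operators often think in decimal, a small conversion helper avoids bookkeeping mistakes (for example, decimal 16 is reported as 0x10). This is an illustrative sketch, not part of nCLI:

```python
def qos_hex(value):
    """Format a QoS value the way nCLI reports it, rejecting out-of-range input."""
    if not 0x0 <= value <= 0x3f:
        raise ValueError("QoS value must be in the range 0x0-0x3f (decimal 0-63)")
    return f"0x{value:x}"

print(qos_hex(16))  # 0x10 -- the default for management traffic
print(qos_hex(10))  # 0xa  -- the default for data services traffic
```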

Configuring Traffic Marking for QoS

Configure Quality of Service (QoS) for management and data services traffic using nCLI.

About this task

To perform the following operations for QoS on the egress traffic of a cluster, use the nCLI commands in this section:

  • Enable traffic marking for QoS on the cluster. QoS traffic marking is disabled by default.
  • View or get the QoS configuration enabled on the cluster.
  • Set QoS values for all traffic types or specific traffic types.
  • Disable QoS on the cluster.

When you run any of the QoS configuration commands and the command succeeds, the console displays the following output indicating the successful command run:

QoSUpdateStatusDTO(status=true, message=null)

Where:

  • status=true indicates that the command succeeded.
  • message=null indicates that there is no error.

When you run any of the QoS configuration commands and the command fails, the console displays the following sample output indicating the failure:

QoSUpdateStatusDTO(status=false, message=QoS is already enabled.)

Where:

  • status=false indicates that the command failed.
  • message=QoS is already enabled. indicates why the command failed. This sample error message indicates that the net enable-qos command failed because it was run again after QoS was already enabled.

Procedure

  • To enable QoS on a cluster, run the following command:
    ncli> net enable-qos [data-svc="data-svc value"][mgmt="mgmt value"]

    If you run the command as net enable-qos without the options, AOS enables QoS with the default values ( mgmt=0x10 and data-svc=0xa ).

    Note: After you run the net enable-qos command, if you run it again, the command fails and AOS displays the following output:
    QoSUpdateStatusDTO(status=false, message=QoS is already enabled.)
    Note: If you need to change the QoS values after you enable QoS, run the net edit-qos command with the options ( data-svc or mgmt , or both) as necessary.
    Note: Set any QoS value in the range 0x0–0x3f.
  • To view or get the QoS configuration enabled on a cluster, run the following command:
    ncli> net get-qos
    Note: When you get the QoS configuration enabled on the cluster, nCLI provides the QoS values in hexadecimal format (0xXX where XX is hexadecimal value in the range 00–3f).

    A sample output on the console is as follows:

    QoSDTO(status=true, isEnabled=true, mgmt=0x10, dataSvc=0xa, message=null)

    Where:

    • status=true indicates that the net get-qos command passed. status=false indicates that the net get-qos command failed. See the message= value for the failure error message.
    • isEnabled=true indicates that QoS is enabled. isEnabled=false indicates that QoS is not enabled.
    • mgmt=0x10 indicates that the QoS value for management traffic ( mgmt option) is set to the hexadecimal value 0x10 . If you disabled QoS, this parameter is displayed as mgmt=null .
    • dataSvc=0xa indicates that the QoS value for data services traffic ( data-svc option) is set to the hexadecimal value 0xa . If you disabled QoS, this parameter is displayed as dataSvc=null .
    • message=null indicates that there is no error message. The message= parameter provides the failure error message if the command fails.
  • To set the QoS values for the traffic types on a cluster after you enabled QoS on the cluster, run the following command:
    ncli> net edit-qos [data-svc="data-svc value"][mgmt="mgmt value"]

    You can provide QoS values between 0x0 and 0x3f for one or both options. The value is the hexadecimal representation of a decimal value between 0 and 63, inclusive.

  • To disable QoS on a cluster, run the following command:
    ncli> net disable-qos

    A sample output on the console is as follows:

    QoSDTO(status=true, isEnabled=false, mgmt=null, dataSvc=null, message=null)
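When scripting around these commands, the QoSDTO output line can be parsed mechanically. The sample string below is the net get-qos output shown earlier; the parser itself is an illustrative sketch, not a Nutanix utility:

```python
import re

def parse_qos_dto(line):
    """Parse a QoSDTO(...) status line into a dict (None for 'null' fields)."""
    body = re.match(r"QoSDTO\((.*)\)", line).group(1)
    fields = {}
    for pair in body.split(", "):
        key, _, value = pair.partition("=")
        fields[key] = None if value == "null" else value
    return fields

dto = parse_qos_dto("QoSDTO(status=true, isEnabled=true, mgmt=0x10, dataSvc=0xa, message=null)")
print(dto["mgmt"], dto["dataSvc"])  # 0x10 0xa
```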

Layer 2 Network Management

AHV uses virtual switch (VS) to connect the Controller VM, the hypervisor, and the guest VMs to each other and to the physical network. Virtual switch is configured by default on each AHV node and the VS services start automatically when you start a node.

To configure virtual networking in an AHV cluster, you need to be familiar with virtual switch. This documentation gives you a brief overview of virtual switch and the networking components that you need to configure to enable the hypervisor, Controller VM, and guest VMs to connect to each other and to the physical network.

About Virtual Switch

A virtual switch (VS) is used to manage multiple bridges and uplinks.

The VS configuration is designed to provide flexibility in configuring virtual bridge connections. A virtual switch (VS) defines a collection of AHV nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, the default virtual switch vs0 is an aggregation of the br0 bridge and br0-up uplinks of all the nodes.

After you configure a VS, you can use the VS as reference for physical network management instead of using the bridge names as reference.

For an overview of virtual switches, see Virtual Switch Considerations.

For information about OVS, see About Open vSwitch.

Virtual Switch Workflow

A virtual switch (VS) defines a collection of AHV compute nodes and the uplink ports on each node. It is an aggregation of the same OVS bridge on all the compute nodes in a cluster. For example, the default virtual switch vs0 is an aggregation of the br0 bridge of all the nodes.

The system creates the default virtual switch vs0 connecting the default bridge br0 on all the hosts in the cluster during installation of or upgrade to the compatible versions of AOS and AHV. Default virtual switch vs0 has the following characteristics:

  • The default virtual switch cannot be deleted.

  • The default bridges br0 on all the nodes in the cluster map to vs0. Thus, vs0 is not empty and has at least one uplink configured.

  • The default management connectivity to a node is mapped to default bridge br0 that is mapped to vs0.

  • The default parameter values of vs0 (Name, Description, MTU, and Bond Type) can be modified, subject to the preceding characteristics.

  • The default virtual switch is configured with the Active-Backup uplink bond type.

    For more information about bond types, see the Bond Type table.

The virtual switch aggregates the same bridges on all nodes in the cluster. On each node, the bridge (for example, br1) connects to a physical Ethernet port such as eth3 through the corresponding uplink (for example, br1-up). The uplink ports of the bridges are connected to the same physical network. For example, the following illustration shows that vs0 is mapped to the br0 bridge, which in turn connects through uplink br0-up to various physical Ethernet ports on different nodes.

Figure. Virtual Switch Click to enlarge Displaying Virtual Switch mechanism

Uplink configuration uses bonds to improve traffic management. The bond types are defined for the aggregated OVS bridges. A new bond type - No uplink bond - provides a no-bonding option. A virtual switch configured with the No uplink bond uplink bond type has 0 or 1 uplinks.

When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.

If you change the uplink configuration of vs0, AOS applies the updated settings to all the nodes in the cluster one after the other (the rolling update process). To update the settings in a cluster, AOS performs the following tasks when configuration method applied is Standard :

  1. Puts the node in maintenance mode (migrates VMs out of the node)
  2. Applies the updated settings
  3. Checks connectivity with the default gateway
  4. Exits maintenance mode
  5. Proceeds to apply the updated settings to the next node

AOS does not put the nodes in maintenance mode when the Quick configuration method is applied.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Table 1. Bond Types

Active-Backup

  Use case: Recommended. Default configuration, which transmits all traffic over a single active adapter.

  Maximum VM NIC throughput: 10 Gb. Maximum host throughput: 10 Gb.

Active-Active with MAC pinning (also known as balance-slb)

  Use case: Works with caveats for multicast traffic. Increases host bandwidth utilization beyond a single 10 Gb adapter. Places each VM NIC on a single adapter at a time. Do not use this bond type with link aggregation protocols such as LACP.

  Maximum VM NIC throughput: 10 Gb. Maximum host throughput: 20 Gb.

Active-Active (also known as LACP with balance-tcp)

  Use case: LACP and link aggregation required. Increases host and VM bandwidth utilization beyond a single 10 Gb adapter by balancing VM NIC TCP and UDP sessions among adapters. Also used when network switches require LACP negotiation.

  The default LACP settings are:

    • Speed: Fast (1s)
    • Mode: Active fallback-active-backup
    • Priority: Default. This is not configurable.

  Maximum VM NIC throughput: 20 Gb. Maximum host throughput: 20 Gb.

No Uplink Bond

  Use case: No uplink or a single uplink on each host. A virtual switch configured with the No uplink bond uplink bond type has 0 or 1 uplinks. When you configure a virtual switch with any other bond type, you must select at least two uplink ports on every node.

  Maximum VM NIC throughput: not applicable. Maximum host throughput: not applicable.

Note the following points about the uplink configuration.

  • Virtual switches are not enabled in a cluster that has one or more compute-only nodes. See Virtual Switch Limitations and Virtual Switch Requirements.
  • If you select the Active-Active policy, you must manually enable LAG and LACP on the corresponding ToR switch for each node in the cluster.
  • If you reimage a cluster with the Active-Active policy enabled, the default virtual switch (vs0) on the reimaged cluster is once again the Active-Backup policy. The other virtual switches are removed during reimage.
  • Nutanix recommends configuring LACP with fallback to active-backup or individual mode on the ToR switches. The configuration and behavior varies based on the switch vendor. Use a switch configuration that allows both switch interfaces to pass traffic after LACP negotiation fails.
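As a hedged sketch, enabling LACP with balance-tcp on a bond from the CLI is commonly done with manage_ovs options similar to the following. The flag names follow common manage_ovs usage and should be verified against your AOS version; the bridge and interface names are placeholders, and the command is printed rather than executed:

```python
# Hedged sketch: flag names follow common manage_ovs usage; verify against
# your AOS version's documentation. Bridge and interface names are placeholders.
bridge, interfaces = "br0", "eth0,eth1"
cmd = (f"manage_ovs --bridge_name {bridge} --interfaces {interfaces} "
       "--bond_mode balance-tcp --lacp_mode fast --lacp_fallback true "
       "update_uplinks")
print(cmd)
```

Remember that the corresponding ToR switch ports must have LACP enabled before this change, or fallback to active-backup must be configured on the switch.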

Virtual Switch Considerations

Virtual Switch Deployment

A VS configuration is deployed by using a rolling update of the cluster. After the VS configuration (creation or update) is received and execution starts, every node is first put into maintenance mode before the VS configuration is created or modified on that node. This is the Standard method, which is the recommended default for configuring a VS.

You can also select the Quick method of configuration, in which the rolling update does not put the nodes in maintenance mode. The VS configuration task is marked as successful when the configuration succeeds on the first node. Any configuration failure on successive nodes triggers corresponding NCC alerts; there is no change to the task status.

Note: If you are modifying an existing bond, AHV removes the bond and then re-creates the bond with the specified interfaces.

Ensure that the interfaces you want to include in the bond are physically connected to the Nutanix appliance before you run the command described in this topic. If the interfaces are not physically connected to the Nutanix appliance, the interfaces are not added to the bond.

Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide, and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584, pass for virtual switch deployments.

The VS configuration is stored and reapplied at system reboot.

The VM NIC configuration also displays the VS details. When you Update VM configuration or Create NIC for a VM, the NIC details show the virtual switches that can be associated. This view allows you to change a virtual network and the associated virtual switch.

To change the virtual network, select the virtual network in the Subnet Name dropdown list in the Create NIC or Update NIC dialog box.

Figure. Create VM - VS Details

Figure. VM NIC - VS Details

Impact of Installation of or Upgrade to Compatible AOS and AHV Versions

See Virtual Switch Requirements for information about minimum and compatible AOS and AHV versions.

When you upgrade the AOS to a compatible version from an older version, the upgrade process:

  • Triggers the creation of the default virtual switch vs0, which is mapped to bridge br0 on all the nodes.

  • Validates bridge br0 and its uplinks for consistency in terms of MTU and bond-type on every node.

    If valid, it adds the bridge br0 of each node to the virtual switch vs0.

    If br0 configuration is not consistent, the system generates an NCC alert which provides the failure reason and necessary details about it.

    The system migrates only the bridge br0 on each node to the default virtual switch vs0 because the connectivity of bridge br0 is guaranteed.

  • Does not migrate any other bridges to any other virtual switches during upgrade. You need to manually migrate the other bridges after the installation or upgrade is complete.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Bridge Migration

After upgrading to a compatible version of AOS, you can migrate bridges other than br0 that existed on the nodes. When you migrate the bridges, the system converts the bridges to virtual switches.

See Virtual Switch Migration Requirements in Virtual Switch Requirements.

Note: You can migrate only those bridges that are present on every compute node in the cluster. See the Migrating Bridges after Upgrade topic in the Prism Web Console Guide.
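
Bridges can also be migrated individually from the aCLI by using the net.migrate_br_to_virtual_switch command listed under VS Management. The following is a hedged sketch; the bridge and virtual switch names are examples, and you should confirm the exact argument syntax for your AOS release:

nutanix@cvm$ acli net.migrate_br_to_virtual_switch br1 vs_name=vs1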

Cluster Scaling Impact

VS management for cluster scaling (addition or removal of nodes) is seamless.

Node Removal

When you remove a node, the system detects the removal, automatically removes the node from all the VS configurations that include it, and generates an internal system update. For example, suppose a node has two virtual switches, vs1 and vs2, configured in addition to the default vs0. When you remove the node from the cluster, the system automatically removes the node from the vs1 and vs2 configurations with an internal system update.

Node Addition

When you add a new node or host to a cluster, the bridges or virtual switches on the new node are treated in the following manner:

Note: If a host already included in a cluster is removed and then added back, it is treated as a new host.
  • The system validates the default bridge br0 and uplink bond br0-up to check if it conforms to the default virtual switch vs0 already present on the cluster.

    If br0 and br0-up conform, the system includes the new host and its uplinks in vs0.

    If br0 and br0-up do not conform, then the system generates an NCC alert.

  • The system does not automatically add any other bridge configured on the new host to any other virtual switch in the cluster.

    It generates NCC alerts for all the other non-default virtual switches.

  • You can manually include the host in the required non-default virtual switches. Update a non-default virtual switch to include the host.

    For information about updating a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in Prism Web Console Guide .

    For information about updating a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide .

VS Management

You can manage virtual switches from Prism Central or Prism Web Console. You can also use aCLI or REST APIs to manage them. See the Acropolis API Reference and Command Reference guides for more information.

You can also use the appropriate aCLI commands for virtual switches from the following list:

  • net.create_virtual_switch

  • net.list_virtual_switch

  • net.get_virtual_switch

  • net.update_virtual_switch

  • net.delete_virtual_switch

  • net.migrate_br_to_virtual_switch

  • net.disable_virtual_switch
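
For example, the list and get commands can be used together to inspect the current virtual switch configuration from any Controller VM. This is an illustrative session; the output fields vary by release:

nutanix@cvm$ acli net.list_virtual_switch
nutanix@cvm$ acli net.get_virtual_switch vs0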

About Open vSwitch

Open vSwitch (OVS) is an open-source software switch implemented in the Linux kernel and designed to work in a multiserver virtualization environment. By default, OVS behaves like a Layer 2 learning switch that maintains a MAC address learning table. The hypervisor host and VMs connect to virtual ports on the switch.

Each hypervisor hosts an OVS instance, and all OVS instances combine to form a single switch. As an example, the following diagram shows OVS instances running on two hypervisor hosts.

Figure. Open vSwitch

Default Factory Configuration

The factory configuration of an AHV host includes a default OVS bridge named br0 (configured with the default virtual switch vs0) and a native Linux bridge called virbr0.

Bridge br0 includes the following ports by default:

  • An internal port with the same name as the default bridge; that is, an internal port named br0. This is the access port for the hypervisor host.
  • A bonded port named br0-up. The bonded port aggregates all the physical interfaces available on the node. For example, if the node has two 10 GbE interfaces and two 1 GbE interfaces, all four interfaces are aggregated on br0-up. This configuration is necessary for Foundation to successfully image the node regardless of which interfaces are connected to the network.
    Note:

    Before you begin configuring a virtual network on a node, you must disassociate the 1 GbE interfaces from the br0-up port. This disassociation occurs when you modify the default virtual switch (vs0) and create new virtual switches. Nutanix recommends that you aggregate only the 10 GbE or faster interfaces on br0-up and use the 1 GbE interfaces on a separate OVS bridge deployed in a separate virtual switch.

    See Virtual Switch Management for information about virtual switch management.
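
As a sketch of this recommendation (the bridge and bond names are examples; verify the flags with manage_ovs --help for your AOS version), you can keep only the 10 GbE interfaces on br0-up and aggregate the 1 GbE interfaces on a separate bridge:

nutanix@cvm$ manage_ovs --bridge_name br0 --bond_name br0-up --interfaces 10g update_uplinks
nutanix@cvm$ manage_ovs --bridge_name br1 --bond_name br1-up --interfaces 1g update_uplinks

On clusters where virtual switches are enabled, make these changes through the virtual switch workflows instead of running manage_ovs directly.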

The following diagram illustrates the default factory configuration of OVS on an AHV node:

Figure. Default factory configuration of Open vSwitch in AHV

The Controller VM has two network interfaces by default. As shown in the diagram, one network interface connects to bridge br0. The other network interface connects to a port on virbr0. The Controller VM uses this bridge to communicate with the hypervisor host.

Virtual Switch Requirements

The requirements to deploy virtual switches are as follows:

  1. Virtual switches are supported on AOS 5.19 or later with AHV 20201105.12 or later. Therefore you must install or upgrade to AOS 5.19 or later, with AHV 20201105.12 or later, to use virtual switches in your deployments.

  2. Virtual bridges used for a VS on all the nodes must have the same specification such as name, MTU and uplink bond type. For example, if vs1 is mapped to br1 (virtual or OVS bridge 1) on a node, it must be mapped to br1 on all the other nodes of the same cluster.

Virtual Switch Migration Requirements

The AOS upgrade process initiates the virtual switch migration. The virtual switch migration is successful only when the following requirements are fulfilled:

  • Before migrating to Virtual Switch, all bridge br0 bond interfaces must have the same bond type on all hosts in the cluster. For example, all hosts must use the Active-Backup bond type or balance-tcp. If some hosts use Active-Backup and other hosts use balance-tcp, virtual switch migration fails.
  • Before migrating to Virtual Switch, if using LACP:
    • Confirm that all bridge br0 lacp-fallback parameters on all hosts are set to the case-sensitive value True with manage_ovs show_uplinks | grep lacp-fallback:. Any host with lowercase true causes virtual switch migration failure.
    • Confirm that the LACP speed on the physical switch is set to fast or 1 second. Also ensure that the switch ports are ready to fallback to individual mode if LACP negotiation fails due to a configuration such as no lacp suspend-individual .
  • Before migrating to the Virtual Switch, confirm that the upstream physical switch is set to spanning-tree portfast or spanning-tree port type edge trunk. Failure to do so may lead to a 30-second network timeout, and the virtual switch migration may fail because it uses a non-modifiable 20-second timer.
  • Ensure that the pre-checks listed in the LCM Prechecks section of the Life Cycle Manager Guide, and the Always and Host Disruptive Upgrades types of pre-checks listed in KB-4584, pass for Virtual Switch deployments.

  • For the default virtual switch vs0,
    • All configured uplink ports must be available for connecting the network. In Active-Backup bond type, the active port is selected from any configured uplink port that is linked. Therefore, the virtual switch vs0 can use all the linked ports for communication with other CVMs/hosts.
    • All the host IP addresses in the virtual switch vs0 must be able to resolve the configured gateway using ARP.
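
Before starting the migration, you can verify the bond type and the lacp-fallback value on each host from its Controller VM, for example:

nutanix@cvm$ manage_ovs show_uplinks
nutanix@cvm$ manage_ovs show_uplinks | grep lacp-fallback:

Run the check on every host; all hosts must report the same bond type, and any lacp-fallback value other than True indicates that the migration will fail.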

Virtual Switch Limitations

Virtual Switch Operations During Upgrade

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

MTU Restriction

The Nutanix Controller VM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring higher values of MTU on the network interfaces of a Controller VM.

You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of AHV, ESXi, or Hyper-V hosts and guest VMs if the applications on your guest VMs require such higher MTU values. If you choose to use jumbo frames on the hypervisor hosts, enable the jumbo frames end to end in the specified network, considering both the physical and virtual network infrastructure impacted by the change.
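
To verify that jumbo frames work end to end, you can send a ping with the don't-fragment flag set from a host or guest VM. With a 9,000-byte MTU, the maximum ICMP payload is 8,972 bytes after the 20-byte IP and 8-byte ICMP headers; the target IP address below is a placeholder:

root@ahv# ping -M do -s 8972 -c 3 192.0.2.10

If any device in the path does not support the larger MTU, the ping fails with a fragmentation error.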

Single node and Two-node cluster configuration.

A virtual switch cannot be deployed if your single-node or two-node cluster has any instantiated user VMs. The virtual switch creation or update process involves a rolling restart, which checks for maintenance mode and whether the VMs can be migrated. On a single-node or two-node cluster, instantiated user VMs cannot be migrated, so the virtual switch operation fails.

Therefore, power down all user VMs for virtual switch operations in a single-node or two-node cluster.

Compute-only node is not supported.

Virtual switch is not compatible with Compute-only (CO) nodes. If a CO node is present in the cluster, then the virtual switches are not deployed (including the default virtual switch). You need to use the net.disable_virtual_switch aCLI command to disable the virtual switch workflow if you want to expand a cluster which has virtual switches and includes a CO node.

The net.disable_virtual_switch aCLI command cleans up all the virtual switch entries from the IDF. All the bridges mapped to the virtual switch or switches are retained as they are.

See Compute-Only Node Configuration (AHV Only).

Including a storage-only node in a VS is not necessary.

Virtual switch is compatible with Storage-only (SO) nodes but you do not need to include an SO node in any virtual switch, including the default virtual switch.

Mixed-mode Clusters with AHV Storage-only Nodes
Consider that you have deployed a mixed-node cluster where the compute-only nodes are ESXi or Hyper-V nodes and the storage-only nodes are AHV nodes. In such a case, the default virtual switch deployment fails.

Without the default VS, the Prism Element, Prism Central, and CLI virtual switch workflows required to manage the bridges and bonds are not available. You need to use the manage_ovs command options to update the bridge and bond configurations on the AHV hosts.

Virtual Switch Management

Virtual Switch can be viewed, created, updated or deleted from both Prism Web Console as well as Prism Central.

Virtual Switch Views and Visualization

For information on the virtual switch network visualization in Prism Element Web Console, see the Network Visualization topic in the Prism Web Console Guide .

Virtual Switch Create, Update and Delete Operations

For information about the procedures to create, update and delete a virtual switch in Prism Element Web Console, see the Configuring a Virtual Network for Guest VM Interfaces section in the Prism Web Console Guide .

For information about the procedures to create, update and delete a virtual switch in Prism Central, see the Network Connections section in the Prism Central Guide .

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Uplinks for Virtual Private Cloud Traffic

Starting with a minimum AOS version 6.1.1 with Prism Central version pc.2022.4 and Flow networking controller version 2.1.1, you can use virtual switches to separate traffic of the guest VMs that are networked using Flow Networking Virtual Private Cloud (VPC) configurations.

AHV uses the default virtual switch for the management and other Controller VM traffic (unless you have configured network segmentation to route the Controller VM traffic on another virtual switch). When you enable Flow networking in a cluster, Prism Central with the Flow networking controller and network gateway allows you to deploy Virtual Private Clouds (VPCs) that network guest VMs on hosts within the cluster and on other clusters. By default, AHV uses the default virtual switch vs0 for the VPC (Flow networking) traffic as well.

You can configure AHV to route the VPC (Flow networking) traffic on a different virtual switch, other than the default virtual switch.

Conditions for VPC Uplinks

Certain conditions apply to the use of virtual switches to separate the Controller VM traffic and traffic of the guest VMs that are networked using Virtual Private Cloud (VPC) configurations.

Host IP Addresses in Virtual Switch

The virtual switch selected for Flow networking VPC traffic must have IP addresses configured on the hosts. If the selected virtual switch does not have IP addresses configured on the hosts, then the following error is displayed:

Bridge interface IP address is not configured for host: <host-UUID> on virtual_switch <name-of-selected-virtual-switch>

Configure IP addresses from this subnet on the hosts in the virtual switch.

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

Requirements

Ensure that the default virtual switch vs0 is enabled.

The following conditions apply to the IP addresses that you configure:

  • Ensure that the host IP addresses in the subnet do not overlap with the primary IP addresses of the host configured during installation or the IP addresses used in any other configured virtual switches.
  • Ensure that the host IP addresses in the subnet do not overlap with the IP addresses configured for the backplane operations (using network segmentation).
  • Ensure that the host IP addresses configured on the hosts in the virtual switch are not the network IP address of the subnet. For example, in the subnet 10.10.10.0/24, the network IP address is 10.10.10.0. Ensure that this IP address (10.10.10.0) is not configured as a host IP address in the virtual switch. The failure message is as follows:
    Host IP address cannot be assigned equal to the subnet.
  • Ensure that the host IP addresses configured on the hosts in the virtual switch are not the broadcast IP address of the subnet. For example, in the subnet 10.10.10.0/24, the broadcast IP address is 10.10.10.255. Ensure that this IP address (10.10.10.255) is not configured as a host IP address in the virtual switch. The failure message is as follows:
    Host IP address cannot be assigned equal to the subnet broadcast address.
  • Ensure that the subnet configured in the virtual switch has a prefix length of /30 or less. For example, you can configure a subnet with a prefix of /30 such as 10.10.10.0/30, but not a subnet with a prefix of /31 or /32 such as 10.10.10.0/31 or 10.10.10.0/32. Any subnet that you configure in a virtual switch must have at least two usable IP addresses. The failure message is as follows:
    Prefix length cannot be greater than 30.
  • Ensure that the host IP addresses configured in a virtual switch belong to the same subnet. In other words, you cannot configure host IP addresses from two or more different subnets. For example, if one host IP address is 10.10.10.10 from the subnet 10.10.10.0/24 and another is 10.100.10.10 from the subnet 10.100.10.0/24, the configuration fails. Both hosts must have IP addresses from the 10.10.10.0/24 subnet (or both from the 10.100.10.0/24 subnet). The failure message is as follows:
    Different host IP address subnets found.
  • Ensure that the gateway IP address for the host IP addresses configured in a virtual switch belongs to the same subnet as the host IP addresses. In other words, you cannot configure host IP addresses from one subnet while the gateway IP address is in a different subnet. The failure message is as follows:
    Gateway IP address is not in the same subnet.
Configuring Virtual Switch for VPC Traffic

Configure a new or existing non-default virtual switch for Flow networking VPC traffic.

Before you begin

You need a virtual switch, other than the default virtual switch vs0, that can be used to route the VPC traffic. Create a separate virtual switch that you can use to route the Flow networking VPC traffic.

For information about the procedures to create or update in Prism Element Web Console, see the Configuring a Virtual Network for Guest VMs section in the Prism Web Console Guide .

For information about the procedures to create or update a virtual switch in Prism Central, see Network Connections in the Prism Central Guide .

Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

About this task

Follow these steps to configure the uplinks for guest VMs networked by Flow networking VPCs:

Procedure

  1. Create the virtual switch you want to use for VPC traffic. For example, create vs1 as a virtual switch for the VPC traffic.
    See Creating or Updating a Virtual Switch in the Prism Web Console Guide .
  2. Configure IP addresses for the hosts that you have included in the virtual switch and a gateway IP address for the network.
    Note: You can configure the IP addresses for the hosts when you are creating the virtual switch. Ensure that you add the other necessary options, such as host_uplink_config and bond_uplink, in the net.create_virtual_switch or net.update_virtual_switch commands when you create or update a virtual switch, respectively, with the host IP addresses and gateway IP address.

    See the Command Reference for more information.

    The options are:

    • host_ip_addr_config= : Provide the host UUID and associated IP address with prefix as follows:
      host_ip_addr_config={host-uuid1:host_ip_address/prefix}
      Where there is more than one host on the virtual switch, use a semicolon-separated list as follows:
      host_ip_addr_config={host-uuid1:host_ip_address/prefix;host-uuid2:host_ip_address/prefix;host-uuid3:host_ip_address/prefix}
    • gateway_ip_address= : Provide the gateway IP address as follows:
      gateway_ip_address=IP_address/prefix

    For example, to update the host IP addresses and gateway IP address for virtual switch vs1, the sample command would be as follows:

    nutanix@cvm$ acli net.update_virtual_switch vs1 host_ip_addr_config={ebeae8d8-47cb-40d0-87f9-d03a762ffad7:10.XX.XX.15/24} gateway_ip_address=10.XX.XX.1/24
  3. Set the virtual switch for use with Flow networking VPCs.
    Use the following command:
    nutanix@cvm$ acli net.set_vpc_east_west_traffic_config virtual_switch=virtual-switch-name
    Note: When you run this command, if the virtual switch does not have IP addresses configured for the hosts, the command fails with an error message. See Conditions for VPC Uplinks for more information.

    For example, to set vs1 as the virtual switch for Flow networking VPC traffic, the sample command is as follows:

    nutanix@cvm$ acli net.set_vpc_east_west_traffic_config virtual_switch=vs1
    Note: You can configure the virtual switch to route all traffic. Set the value for the permit_all_traffic= option in the net.set_vpc_east_west_traffic_config command to true to route all the traffic using the virtual switch. The default value for this option is false which allows the virtual switch to route only VPC traffic.

    Do not configure the permit_all_traffic= option if you want to use the virtual switch only for VPC traffic. Configure the permit_all_traffic= option with the value true only when you want the virtual switch to allow all traffic.

  4. You can update the virtual switch that is set for Flow networking VPC traffic using the following command:
    nutanix@cvm$ acli net.update_vpc_east_west_traffic_config virtual_switch=vs1

    To update the virtual switch to allow all traffic, use the permit_all_traffic= option with the value true as follows:

    nutanix@cvm$ acli net.update_vpc_east_west_traffic_config permit_all_traffic=true
  5. Update the subnet to use the new virtual switch for the external traffic, on the Prism Central VM.
    nutanix@pcvm$ atlas_cli subnet.update external_subnet_name virtual_switch_uuid=virtual_switch_uuid

What to do next

  • To verify if the settings are made as required, use the atlas_config.get command and check the output.

    <acropolis> atlas_config.get
    config {
      anc_domain_name_server_list: "10.xxx.xxx.xxx"
      dvs_physnet_mapping_list {
        dvs_uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        physnet: "physnet1"
      }
      enable_atlas_networking: True
      logical_timestamp: 54
      minimum_ahv_version: "20201105.2016"
      ovn_cacert_path: "/home/certs/OvnController/ca.pem"
      ovn_certificate_path: "/home/certs/OvnController/OvnController.crt"
      ovn_privkey_path: "/home/certs/OvnController/OvnController.key"
      ovn_remote_address: "ssl:anc-ovn-external.default.xxxx.nutanix.com:6652"
      vpc_east_west_traffic_config {
        dvs_uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        permit_all_traffic: True
      }
    }

    Where:

    • dvs_physnet_mapping_list provides details of the virtual switch.
    • vpc_east_west_traffic_config provides the configuration for traffic with permit_all_traffic being True . It also provides the UUID of the virtual switch being used for traffic as dvs_uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  • VLAN tagging: See VLAN Configuration.
Clearing, Disabling and Deleting the Virtual Switch

You can disable and delete a non-default virtual switch used for Flow networking VPC traffic.

Before you begin

Before you delete a virtual switch that allows Flow networking VPC traffic, you must clear the virtual switch configuration that assigns the VPC traffic to that virtual switch. Use the net.clear_vpc_east_west_traffic_config command.
Note:

Do not create, update or delete any virtual switch when the AOS or AHV upgrade process is running.

About this task

To disable or delete a virtual switch configured to manage Flow networking VPC traffic, do the following:

Note: After you clear the virtual switch settings using step 1, you can disable and delete the virtual switch in Prism Central.

Procedure

  1. Use the net.clear_vpc_east_west_traffic_config command to remove the settings on the virtual switch or switches (vs1 in this example) configured for Flow networking VPC traffic.
  2. Use the net.disable_virtual_switch virtual_switch=<virtual-switch-name> option to disable the virtual switch.
  3. Use the net.delete_virtual_switch virtual_switch=<virtual-switch-name> option to delete the virtual switch.

Re-Configuring Bonds Across Hosts Manually

If you are upgrading AOS to 5.20, 6.0, or later, you need to migrate the existing bridges to virtual switches. If there are inconsistent bond configurations across hosts before the bridges are migrated, the virtual switches might not be deployed properly after migration. To resolve such issues, you must manually configure the bonds to make them consistent.

About this task

Important: Use this procedure only when you need to modify inconsistent bonds in a migrated bridge across hosts in a cluster that are preventing Acropolis (AOS) from deploying the virtual switch for the migrated bridge.

Do not use ovs-vsctl commands to make bridge-level changes. Use the manage_ovs commands instead.

The manage_ovs command allows you to update the cluster configuration. The changes are applied and retained across host restarts. The ovs-vsctl command allows you to update the live running host configuration but does not update the AOS cluster configuration and the changes are lost at host restart. This behavior of ovs-vsctl introduces connectivity issues during maintenance, such as upgrades or hardware replacements.

ovs-vsctl is usually used during a break/fix situation where a host may be isolated on the network and requires a workaround to gain connectivity before the cluster configuration can actually be updated using manage_ovs .

Note: Disable the virtual switch before you attempt to change the bonds or bridge.

If you hit an issue where the virtual switch is automatically re-created after it is disabled (with AOS versions 5.20.0 or 5.20.1), follow steps 1 and 2 below to disable such an automatically re-created virtual switch again before migrating the bridges. For more information, see KB-3263.

Be cautious when using the disable_virtual_switch command because it deletes all the configurations from the IDF, not only for the default virtual switch vs0, but also any virtual switches that you may have created (such as vs1 or vs2). Therefore, before you use the disable_virtual_switch command, ensure that you check a list of existing virtual switches, that you can get using the acli net.get_virtual_switch command.

Complete this procedure on each host Controller VM that is sharing the bridge that needs to be migrated to a virtual switch.

Procedure

  1. To list the virtual switches, use the following command.
    nutanix@cvm$ acli net.list_virtual_switch
  2. Disable all the virtual switches.
    nutanix@cvm$ acli net.disable_virtual_switch 

    This disables all the virtual switches.

    Note: You can use the nutanix@cvm$ acli net.delete_virtual_switch vs_name command to delete a specific VS and re-create it with the appropriate bond type.
  3. Change the bond type to align with the same bond type on all the hosts for the specified virtual switch.
    nutanix@cvm$ manage_ovs --bridge_name bridge-name --bond_name bond_name --bond_mode bond-type update_uplinks

    Where:

    • bridge-name : Provide the name of the bridge, such as br0 for the virtual switch on which you want to set the uplink bond mode.
    • bond-name : Provide the name of the uplink port such as br0-up for which you want to set the bond mode.
    • bond-type : Provide the bond mode that you require to be used uniformly across the hosts on the named bridge.

    Use the manage_ovs --help command for help on this command.

    Note: To disable LACP, change the bond type from LACP Active-Active (balance-tcp) to Active-Backup (active-backup) or Active-Active with MAC pinning (balance-slb) by setting the bond_mode in this command to active-backup or balance-slb.

    Ensure that you turn off LACP on the connected ToR switch port as well. To avoid blocking of the bond uplinks during the bond type change on the host, ensure that you follow the ToR switch best practices to enable LACP fallback or passive mode.

    To enable LACP, configure bond-type as balance-tcp (Active-Active) with additional variables --lacp_mode fast and --lacp_fallback true .

  4. (If migrating to an AOS version earlier than 5.20.2) Check whether the issue described in the note above occurs and, if it does, disable the virtual switch again.
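
Combining the options described in step 3, a hedged example of enabling LACP on bridge br0 (the bridge and bond names are illustrative) is as follows:

nutanix@cvm$ manage_ovs --bridge_name br0 --bond_name br0-up --bond_mode balance-tcp --lacp_mode fast --lacp_fallback true update_uplinks

To disable LACP on the same bond, set --bond_mode to active-backup or balance-slb and rerun update_uplinks after adjusting the ToR switch configuration.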

What to do next

After making the bonds consistent across all the hosts configured in the bridge, migrate the bridge or enable the virtual switch.

To check whether LACP is enabled or disabled, use the following command.

nutanix@cvm$ manage_ovs show_uplinks

Enabling LACP and LAG (AHV Only)

If you select the Active-Active bond type, you must enable LACP and LAG on the corresponding ToR switch for each node in the cluster one after the other. This section describes the procedure to enable LAG and LACP in AHV nodes and the connected ToR switch.

About this task

Procedure

  1. Change the uplink Bond Type for the virtual switch.
    1. Open the Edit Virtual Switch window.
      • In Prism Central, open Network & Security > Subnets > Network Configuration > Virtual Switch .
      • In Prism Element or Web Console, open Settings > Network Configuration > Virtual Switch
    2. Click the Edit icon of the virtual switch for which you want to configure LAG and LACP.
    3. On the Edit Virtual Switch page, in the General tab, ensure that the Standard option is selected for the Select Configuration Method parameter. Click Next .
      The Standard configuration method puts each node in maintenance mode before applying the updated settings. After applying the updated settings, the node exits from maintenance mode. See Virtual Switch Workflow .
    4. On the Uplink Configuration tab, in Bond Type , select Active-Active .
    5. Click Save .
    The Active-Active bond type configures all AHV hosts with the fast setting for LACP speed, causing the AHV host to request LACP control packets at the rate of one per second from the physical switch. In addition, the Active-Active bond type configuration sets LACP fallback to Active-Backup on all AHV hosts. You cannot modify these default settings after you have configured them in Prism, even by using the CLI.

    This completes the LAG and LACP configuration on the cluster.

Perform the following steps on each node, one at a time.
  1. Put the node and the Controller VM into maintenance mode.
    Before you put a node in maintenance mode, see Verifying the Cluster Health and carry out the necessary checks.

    See Putting a Node into Maintenance Mode using Web Console . Step 6 in this procedure puts the Controller VM in maintenance mode.

  2. Change the settings for the interface on the ToR switch that the node connects to, to match the LACP and LAG setting made on the cluster in step 1 above.
    This is an important step. See the documentation provided by the ToR switch vendor for more information about changing the LACP settings of the switch interface that the node is physically connected to.
    • Nutanix recommends that you enable LACP fallback.

    • Consider the LACP time options (slow and fast). If the switch has a fast configuration, set the LACP time to fast. This prevents an outage due to a mismatch between the LACP speeds of the cluster and the ToR switch. Keep in mind that the Active-Active bond type configuration sets the LACP speed of the cluster to fast.

    Verify that LACP negotiation status is negotiated.

  3. Remove the node and Controller VM from maintenance mode.
    See Exiting a Node from the Maintenance Mode using Web Console . The Controller VM exits maintenance mode during the same process.

What to do next

Do the following after completing the procedure to enable LAG and LACP in all the AHV nodes and the connected ToR switches:
  • Verify that the status of all services on all the CVMs is Up. Run the following command and check if the status of the services is displayed as Up in the output:
    nutanix@cvm$ cluster status
  • Log on to the Prism Element of the node and check that the Data Resiliency Status widget displays OK.
    Figure. Data Resiliency Status

VLAN Configuration

You can set up a VLAN-based segmented virtual network on an AHV node by assigning the ports on virtual bridges managed by virtual switches to different VLANs. VLAN port assignments are configured from the Controller VM that runs on each node.

For best practices associated with VLAN assignments, see AHV Networking Recommendations. For information about assigning guest VMs to a virtual switch and VLAN, see Network Connections in the Prism Central Guide .

Assigning an AHV Host to a VLAN

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.

To assign an AHV host to a VLAN, do the following on every AHV host in the cluster:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the CVM in maintenance mode.
    See Putting a Node into Maintenance Mode using CLI for instructions about how to put a node into maintenance mode.
  3. Assign port br0 (the internal port on the default OVS bridge br0 on the default virtual switch vs0) to the VLAN that you want the host to be on.
    root@ahv# ovs-vsctl set port br0 tag=host_vlan_tag

    Replace host_vlan_tag with the VLAN tag for hosts.

  4. Confirm VLAN tagging on port br0.
    root@ahv# ovs-vsctl list port br0
  5. Check the value of the tag parameter that is shown.
  6. Verify connectivity to the IP address of the AHV host by performing a ping test.
  7. Exit the AHV host and the CVM from the maintenance mode.
    See Exiting a Node from the Maintenance Mode Using CLI for more information.
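The tag verification in steps 4 and 5 can be scripted. The following sketch extracts the tag value from a sample ovs-vsctl list port br0 capture; the sample line and VLAN ID 10 are illustrative, and on the host you would pipe the live command instead:

```shell
# Sketch: confirm the VLAN tag applied to port br0.
# On the AHV host you would run: ovs-vsctl list port br0
port_info='name                : br0
tag                 : 10'

# The tag field holds the host VLAN ID; extract the third column.
tag=$(echo "$port_info" | awk '/^tag/ {print $3}')
echo "br0 VLAN tag: $tag"
```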

Assigning the Controller VM to a VLAN

By default, the public interface of a Controller VM is assigned to VLAN 0. To assign the Controller VM to a different VLAN, change the VLAN ID of its public interface. After the change, you can access the public interface from a device that is on the new VLAN.

About this task

Note: Perform the following procedure during a scheduled downtime. Before you begin, stop the cluster. Once the process begins, hosts and CVMs partially lose network access to each other and VM data or storage containers become unavailable until the process completes.
Note: To avoid losing connectivity to the Controller VM, do not change the VLAN ID when you are logged on to the Controller VM through its public interface. To change the VLAN ID, log on to the internal interface that has IP address 192.168.5.254.

Perform these steps on every Controller VM in the cluster. To assign the Controller VM to a VLAN, do the following:

Procedure

  1. Log on to the AHV host with SSH.
  2. Put the AHV host and the Controller VM in maintenance mode.
    See Putting a Node into Maintenance Mode using CLI for instructions about how to put a node into maintenance mode.
  3. Check the Controller VM status on the host.
    root@host# virsh list

    An output similar to the following is displayed:

    root@host# virsh list
     Id    Name                           State
    ----------------------------------------------------
     1     NTNX-CLUSTER_NAME-3-CVM            running
     3     3197bf4a-5e9c-4d87-915e-59d4aff3096a running
     4     c624da77-945e-41fd-a6be-80abf06527b9 running
    
    root@host# logout
  4. Log on to the Controller VM.
    root@host# ssh nutanix@192.168.5.254

    Accept the host authenticity warning if prompted, and enter the password of the nutanix user on the Controller VM.

  5. Assign the public interface of the Controller VM to a VLAN.
    nutanix@cvm$ change_cvm_vlan vlan_id

    Replace vlan_id with the ID of the VLAN to which you want to assign the Controller VM.

    For example, add the Controller VM to VLAN 201.

    nutanix@cvm$ change_cvm_vlan 201
  6. Confirm VLAN tagging on the Controller VM.
    root@host# virsh dumpxml cvm_name

    Replace cvm_name with the CVM name or CVM ID to view the VLAN tagging information.

    Note: Refer to step 3 for Controller VM name and Controller VM ID.

    An output similar to the following is displayed:

    root@host# virsh dumpxml 1 | grep "tag id" -C10 --color
          <target dev='vnet2'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
        </interface>
        <interface type='bridge'>
          <mac address='50:6b:8d:b9:0a:18'/>
          <source bridge='br0'/>
          <vlan>
               <tag id='201'/> 
          </vlan>
          <virtualport type='openvswitch'>
            <parameters interfaceid='c46374e4-c5b3-4e6b-86c6-bfd6408178b5'/>
          </virtualport>
          <target dev='vnet0'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4'/>
          <alias name='net3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </interface>
    root@host#
  7. Check the value of the tag parameter that is shown.
  8. Restart the network service.
    nutanix@cvm$ sudo service network restart
  9. Verify connectivity to the Controller VM's external IP address by performing a ping test from the same subnet. For example, perform a ping from another Controller VM or directly from the host itself.
  10. Exit the AHV host and the Controller VM from the maintenance mode.
    See Exiting a Node from the Maintenance Mode Using CLI for more information.
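The verification in steps 6 and 7 can be scripted as well. The following sketch extracts the VLAN tag from a sample fragment of the virsh dumpxml output; the XML fragment and VLAN ID 201 are illustrative, and on the host you would pipe the live command instead:

```shell
# Sketch: pull the VLAN tag id from the CVM's libvirt XML.
# On the host you would run: virsh dumpxml cvm_name | grep "tag id"
xml_fragment="<vlan>
  <tag id='201'/>
</vlan>"

# Isolate the numeric id from the tag element.
tag=$(echo "$xml_fragment" | grep -o "tag id='[0-9]*'" | grep -o '[0-9]*')
echo "CVM VLAN tag: $tag"
```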

IGMP Snooping

On an AHV host, when multicast traffic flows to a virtual switch, the host floods the multicast traffic to all the VMs on the specific VLAN. This mechanism is inefficient when many of the VMs on the VLAN do not need that multicast traffic. IGMP snooping allows the host to track which VMs on the VLAN need the multicast traffic and to send the multicast traffic to only those VMs. For example, assume there are 50 VMs on VLAN 100 on virtual switch vs1 and only 25 VMs (the receiver VMs) need to receive the multicast traffic. Turn on IGMP snooping to help the AHV host track the 25 receiver VMs and deliver the multicast traffic to only those 25 VMs instead of flooding it to all 50 VMs.

When IGMP snooping is enabled in a virtual switch on a VLAN, the ToR switch or router queries the VMs about the multicast traffic that the VMs are interested in. When the switch receives a join request from a VM in response to the query, it adds the VM to the multicast list for that source entry as a receiver VM. When the switch sends a query, only the VMs that require the multicast traffic respond; the VMs that do not need the traffic do not respond at all. So, the switch does not add a VM to a multicast group or list unless it receives a response from that VM for the query.

Typically, in a multicast scenario, there is a source entity that casts the multicast traffic. This source may be another VM in this target cluster (that contains the target VMs that need to receive the multicast traffic) or another cluster connected to the target cluster. The host in the target cluster acts as the multicast router. Enable IGMP snooping in the virtual switch that hosts the VLAN connecting the VMs. You must also enable either the native Acropolis IGMP querier on the host or a separate third party querier that you install on the host. The native Acropolis IGMP querier sends IGMP v2 query packets to the VMs.

The IGMP querier sends out queries periodically to keep the multicast groups or lists updated. The query interval is half of the IGMP snooping timeout value that you specify when you enable IGMP snooping. For example, if you have configured the IGMP snooping timeout as 30 seconds, the IGMP querier sends out a query every 15 seconds.
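The relationship between the configured timeout and the query interval is simple arithmetic; the timeout value below is illustrative:

```shell
# The querier sends a query every timeout/2 seconds.
igmp_snooping_timeout=30
query_interval=$((igmp_snooping_timeout / 2))
echo "timeout=${igmp_snooping_timeout}s -> query every ${query_interval}s"
```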

When you enable IGMP snooping and are using the native Acropolis IGMP querier, you must configure the IGMP VLAN list. The IGMP VLAN list is a list of VLANs that the native IGMP Querier must send the query out to. This list value is a comma-separated list of the VLAN IDs that the query needs to be sent to. If you do not provide a list of VLANs, then the native IGMP Querier sends the query to all the VLANs in the switch.

When a VM needs to receive the multicast traffic from a specific multicast source, configure the multicast application on the VM to listen to the queries received by the VM from the IGMP querier. Also, configure the multicast application on the VM to respond to the relevant query, that is, the query for the specific multicast source. The response that the application sends is logged by the virtual switch, which then sends the multicast traffic to that VM instead of flooding it to all the VMs on the VLAN.

A multicast source always sends multicast traffic to a multicast group or list that is indicated by a multicast group IP address.

Enabling or Disabling IGMP Snooping

IGMP snooping helps you manage multicast traffic to specific VMs configured on a VLAN.

About this task

You can enable IGMP snooping only by using aCLI.

Procedure

Run the following command:
net.update_virtual_switch virtual-switch-name enable_igmp_snooping=true enable_igmp_querier=[true | false] igmp_query_vlan_list=VLAN IDs igmp_snooping_timeout=timeout

Provide:

  • virtual-switch-name —The name of the virtual switch in which the VLANs are configured. For example, the name of the default virtual switch is vs0 . Provide the name of the virtual switch exactly as it is configured.
  • enable_igmp_snooping=[true | false] —Provide true to enable IGMP snooping or false to disable it. The default setting is false .
  • enable_igmp_querier=[true | false] —Provide true to enable the native IGMP querier or false to disable it. The default setting is false .
  • igmp_query_vlan_list= VLAN IDs —List of VLAN IDs mapped to the virtual switch for which the IGMP querier is enabled. If this parameter is not set, or is set to an empty list, the querier is enabled for all VLANs of the virtual switch.
  • igmp_snooping_timeout= timeout —An integer indicating time in seconds. For example, you can provide 30 to indicate IGMP snooping timeout of 30 seconds.

    The default timeout is 300 seconds.

    You can set the timeout in the range of 15 - 3600 seconds.
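Putting the parameters together, the following sketch range-checks a proposed timeout against the documented 15-3600 second limits before composing the command to run. The virtual switch name vs0, the VLAN list, and the timeout value are illustrative:

```shell
# Sketch: validate the snooping timeout, then print the aCLI command to run.
timeout=30
if [ "$timeout" -ge 15 ] && [ "$timeout" -le 3600 ]; then
  cmd="acli net.update_virtual_switch vs0 enable_igmp_snooping=true enable_igmp_querier=true igmp_query_vlan_list=100,200 igmp_snooping_timeout=${timeout}"
  echo "$cmd"
else
  echo "timeout ${timeout}s is outside the supported 15-3600 second range" >&2
fi
```

Run the printed command from a CVM; the command form matches the net.update_virtual_switch syntax shown above.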

What to do next

You can verify whether IGMP snooping is enabled or disabled by running the following command:
net.get_virtual_switch virtual-switch-name

The output of this command includes the following sample configuration:

igmp_config {
  enable_querier: True
  enable_snooping: True
 }

The above sample shows that IGMP snooping and the native Acropolis IGMP querier are enabled.

Switch Port ANalyzer on AHV Hosts

Switch Port ANalyzer (SPAN) or port mirroring enables you to mirror traffic from interfaces of the AHV hosts to the VNIC of guest VMs. SPAN mirrors some or all packets from a set of source ports to a set of destination ports. You can mirror inbound, outbound, or bidirectional traffic on a set of source ports. You can then use the mirrored traffic for security analysis and gain visibility of traffic flowing through the set of source ports. SPAN is a useful tool for troubleshooting packets and can prove to be necessary for compliance reasons.

AHV supports the following types of source ports in a SPAN session:

  1. A bond port that is already mapped to a Virtual Switch (VS) such as vs0, vs1, or any other VS you have created.

  2. A non-bond port that is already mapped to a VS such as vs0, vs1, or any other VS you have created.

  3. An uplink port that is not assigned to any VS or bridge on the host.

Important Considerations

Consider the following before you configure SPAN on AHV hosts:

  • In this release, AHV supports mirroring of traffic only from physical interfaces.

  • The SPAN destination VM or guest VM must be running on the same AHV host where the source ports are located.

  • Delete the SPAN session before you delete the SPAN destination VM or VNIC. Otherwise, the state of the SPAN session is displayed as error.

  • AHV does not support SPAN from a member of a bond port. For example, if you have mapped br0-up to bridge br0 with members eth0 and eth1, you cannot create a SPAN session with either eth0 or eth1 as the source port. You must use only br0-up as the source port.

  • AHV supports different types of source ports in one session. For example, you can create a session with br0-up (bond port) and eth5 (single uplink port) on the same host as two different source ports in the same session. You can even have two different bond ports in the same session.

  • One SPAN session supports up to two source and two destination ports.

  • One host supports up to two SPAN sessions.

  • You cannot create a SPAN session on an AHV host that is in the maintenance mode.

  • If you move the uplink interface to another Virtual Switch, the SPAN session fails. Note that the system does not generate an alert in this situation.

  • With TCP Segmentation Offload, multiple packets belonging to the same stream can be coalesced into a single one before being delivered to the SPAN destination VM. With TCP Segmentation Offload enabled, there can be a difference between the number of packets received on the uplink interface and packets forwarded to the SPAN destination VM (session packet count <= uplink interface packet count). However, the byte count at the SPAN destination VM is closer to the number at the uplink interface.

Configuring SPAN on an AHV Host

To configure SPAN on an AHV host, create a SPAN destination VNIC and assign that VNIC to a guest VM (the SPAN destination VM). After you create the VNIC, create a SPAN session specifying the source and destination ports between which you want to run the SPAN session.

Before you begin

Ensure that you have created the guest VM that you want to configure as the SPAN destination VM.
Note: The SPAN destination VM must run on the same AHV host where the source ports are located. Therefore, Nutanix highly recommends that you create or modify the guest VM as an agent VM so that the VM is not migrated from the host.

Recommended command format and an example for modifying a guest VM as an agent VM:

nutanix@cvm$ acli vm.update vm-name agent_vm=true
nutanix@cvm$ acli vm.update span-dest-VM agent_vm=true

In this example, span-dest-VM is the name of the guest VM that you are modifying as an agent VM.

About this task

Perform the following procedure to configure SPAN on an AHV host:

Procedure

  1. Log on to a Controller VM in the cluster with SSH.
  2. Determine the name and UUID of the guest VM that you want to configure as the SPAN destination VM.
    nutanix@cvm$ acli vm.list

    Example:

    nutanix@cvm$ acli vm.list
    VM name       VM UUID
    span-dest-VM  85abfdd5-7419-4f7c-bffa-8f961660e516

    In this example, span-dest-VM is the name and 85abfdd5-7419-4f7c-bffa-8f961660e516 is the UUID of the guest VM.

    Note: If you delete the SPAN destination VM without deleting the SPAN session you create with this SPAN destination VM, the SPAN session State displays kError .
  3. Create a SPAN destination VNIC for the guest VM.
    nutanix@cvm$ acli vm.nic_create vm-name type=kSpanDestinationNic

    Replace vm-name with the name of the guest VM on which you want to configure SPAN.

    Note: Do not include any other parameter when you are creating a SPAN destination VNIC.

    Example:

    nutanix@cvm$ acli vm.nic_create span-dest-VM type=kSpanDestinationNic
    NicCreate: complete
    Note: If you delete the SPAN destination VNIC without deleting the SPAN session you create with this SPAN destination VNIC, the SPAN session State displays kError .
  4. Determine the MAC address of the VNIC.
    nutanix@cvm$ acli vm.nic_get vm-name
    

    Replace vm-name with the name of the guest VM to which you assigned the VNIC.

    Example:

    nutanix@cvm$ acli vm.nic_get span-dest-VM
    x.x.x.x {
      connected: True
      ip_address: "x.x.x.x"
      mac_addr: "50:6b:8d:8b:2c:94"
      network_name: "mgmt"
      network_type: "kNativeNetwork"
      network_uuid: "c14b0092-877e-489b-a399-2749a60b3206"
      type: "kNormalNic"
      uuid: "9dd4f307-2506-4354-86a3-0b99abdeba6c"
      vlan_mode: "kAccess"
    }
    50:6b:8d:de:c6:44 {
      mac_addr: "50:6b:8d:de:c6:44"
      network_type: "kNativeNetwork"
      type: "kSpanDestinationNic"
      uuid: "b59e99bc-6bc7-4fab-ac35-543695c300d1"
    }
    

    Note the MAC address (value of mac_addr ) of the VNIC whose type is set to kSpanDestinationNic .

  5. Determine the UUID of the host whose traffic you want to monitor by using SPAN.
    nutanix@cvm$ acli host.list
  6. Create a SPAN session.
    nutanix@cvm$ acli net.create_span_session span-session-name description="description-text" source_list=\{uuid=host-uuid,type=kHostNic,identifier=source-port-name,direction=traffic-type} dest_list=\{uuid=vm-uuid,type=kVmNic,identifier=vnic-mac-address}

    Replace the variables mentioned in the command for the following parameters with their appropriate values as follows:

    • span-session-name : Replace span-session-name with a name for the session.
    • description (Optional): Replace description-text with a description for the session. This is an optional parameter.
    Note:

    All source_list and dest_list parameters are mandatory inputs. The parameters do not have default values. Provide an appropriate value for each parameter.

    Source list parameters:

    • uuid : Replace host-uuid with the UUID of the host whose traffic you want to monitor by using SPAN. (determined in step 5).
    • type : Specify kHostNic as the type. Only the kHostNic type is supported in this release.
    • identifier : Replace source-port-name with the name of the source port whose traffic you want to mirror. For example, br0-up, eth0, or eth1.
    • direction : Replace traffic-type with kIngress if you want to mirror inbound traffic, kEgress for outbound traffic, or kBiDir for bidirectional traffic.

    Destination list parameters:

    • uuid : Replace vm-uuid with the UUID of the guest VM that you want to configure as the SPAN destination VM. (determined in step 2).
    • type : Specify kVmNic as the type. Only the kVmNic type is supported in this release.
    • identifier : Replace vnic-mac-address with the MAC address of the destination port where you want to mirror the traffic (determined in step 4).
    Note: The syntax for source_list and dest_list is as follows:

    source_list/dest_list=[{key1=value1,key2=value2,..}]

    Each pair of curly brackets includes the details of one source or destination port with a comma-separated list of the key-value pairs. There must not be any space between two key-value pairs.

    One SPAN session supports up to two source and two destination ports. If you want to include an extra port, separate the curly brackets with a semicolon (no space) and list the key-value pairs of the second port in the other curly bracket.

    Example:

    nutanix@cvm$ acli net.create_span_session span1 description="span session 1" source_list=\{uuid=492a2bda-ffc0-486a-8bc0-8ae929471714,type=kHostNic,identifier=br0-up,direction=kBiDir} dest_list=\{uuid=85abfdd5-7419-4f7c-bffa-8f961660e516,type=kVmNic,identifier=50:6b:8d:de:c6:44}
    SpanCreate: complete
  7. Display the list of all SPAN sessions running on a host.
    nutanix@cvm$ acli net.list_span_session

    Example:

    nutanix@cvm$ acli net.list_span_session
    Name   UUID                                  State
    span1  69252eb5-8047-4e3a-8adc-91664a7104af  kActive

    Possible values for State are:

    • kActive : Denotes that the SPAN session is active.
    • kError : Denotes that there is an error and the configuration is not working. For example, if there are two sources and one source is down, the State of the session is displayed as kError .
  8. Display the details of a SPAN session.
    nutanix@cvm$ acli net.get_span_session span-session-name

    Replace span-session-name with the name of the SPAN session whose details you want to view.

    Example:

    nutanix@cvm$ acli net.get_span_session span1
    span1 {
      config {
        datapath_name: "s6925"
        description: "span session 1"
        destination_list {
          nic_type: "kVmNic"
          port_identifier: "50:6b:8d:de:c6:44"
          uuid: "85abfdd5-7419-4f7c-bffa-8f961660e516"
        }
        name: "span1"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        source_list {
          direction: "kBiDir"
          nic_type: "kHostNic"
          port_identifier: "br0-up"
          uuid: "492a2bda-ffc0-486a-8bc0-8ae929471714"
        }
      }
      stats {
        name: "span1"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        state: "kActive"
        stats_list {
          tx_byte_cnt: 67498
          tx_pkt_cnt: 436
        }
      }
    }

    Note the value of the datapath_name field in the SPAN session configuration, which is a unique key that identifies the SPAN session. You might need the unique key to correctly identify the SPAN session for troubleshooting reasons.
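The MAC lookup in step 4 can be scripted. The following sketch pulls the MAC of the kSpanDestinationNic entry out of the vm.nic_get output; the sample fields mirror the example above, and on a CVM you would pipe the live command instead:

```shell
# Sketch: find the MAC of the VNIC whose type is kSpanDestinationNic.
# On a CVM you would pipe: acli vm.nic_get span-dest-VM
nic_output='mac_addr: "50:6b:8d:8b:2c:94"
type: "kNormalNic"
mac_addr: "50:6b:8d:de:c6:44"
type: "kSpanDestinationNic"'

# Remember the last mac_addr seen; print it when the SPAN NIC type appears.
mac=$(echo "$nic_output" | awk '/mac_addr/ {m=$2} /kSpanDestinationNic/ {gsub(/"/,"",m); print m}')
echo "SPAN destination MAC: $mac"
```

The printed MAC is the value to pass as the identifier in the dest_list of net.create_span_session.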

Updating a SPAN Session

You can update any of the details of a SPAN session. When you are updating a SPAN session, specify the values of the parameters you want to update and then specify the rest of the parameters again as you specified them when you created the SPAN session. For example, if you want to change only the name and description, specify the updated name and description and then include the complete details of the source and destination ports again even though you are not updating those details.

About this task

Perform the following procedure to update a SPAN session:

Procedure

  1. Log on to a Controller VM in the cluster with SSH.
  2. Update the SPAN session.
    nutanix@cvm$ acli net.update_span_session span-session-name description="description-text" source_list=\{uuid=host-uuid,type=kHostNic,identifier=source-port-name,direction=traffic-type} dest_list=\{uuid=vm-UUID,type=kVmNic,identifier=vNIC-mac-address}

    The update command includes the same parameters as the create command. See Configuring SPAN on an AHV Host for more information.

    Example:

    nutanix@cvm$ acli net.update_span_session span1 name=span_br0_to_span_dest description="span from br0-up to span-dest VM" source_list=\{uuid=492a2bda-ffc0-486a-8bc0-8ae929471714,type=kHostNic,identifier=br0-up,direction=kBiDir} dest_list=\{uuid=85abfdd5-7419-4f7c-bffa-8f961660e516,type=kVmNic,identifier=50:6b:8d:de:c6:44}
    SpanUpdate: complete
    
    nutanix@cvm$ acli net.list_span_session
    Name                   UUID                                  State
    span_br0_to_span_dest  69252eb5-8047-4e3a-8adc-91664a7104af  kActive
    
    nutanix@cvm$ acli net.get_span_session span_br0_to_span_dest
    span_br0_to_span_dest {
      config {
        datapath_name: "s6925"
        description: "span from br0-up to span-dest VM"
        destination_list {
          nic_type: "kVmNic"
          port_identifier: "50:6b:8d:de:c6:44"
          uuid: "85abfdd5-7419-4f7c-bffa-8f961660e516"
        }
        name: "span_br0_to_span_dest"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        source_list {
          direction: "kBiDir"
          nic_type: "kHostNic"
          port_identifier: "br0-up"
          uuid: "492a2bda-ffc0-486a-8bc0-8ae929471714"
        }
      }
      stats {
        name: "span_br0_to_span_dest"
        session_uuid: "69252eb5-8047-4e3a-8adc-91664a7104af"
        state: "kActive"
        stats_list {
          tx_byte_cnt: 805705
          tx_pkt_cnt: 4792
        }
      }
    }

    In this example, only the name and description were updated. However, complete details of the source and destination ports were included in the command again.

    If you want to change the name of a SPAN session, specify the existing name first and then include the new name by using the "name=" parameter as shown in this example.

Deleting a SPAN Session

Delete the SPAN session if you want to disable SPAN on an AHV host. Nutanix recommends that you delete the SPAN session associated with a SPAN destination VM or SPAN destination VNIC.

About this task

Perform the following procedure to delete a SPAN session:

Procedure

  1. Log on to a Controller VM in the cluster with SSH.
  2. Delete the SPAN session.
    nutanix@cvm$ acli net.delete_span_session span-session-name

    Replace span-session-name with the name of the SPAN session you want to delete.

Enabling RSS Virtio-Net Multi-Queue by Increasing the Number of VNIC Queues

Multi-Queue in VirtIO-net enables you to improve network performance for network I/O-intensive guest VMs or applications running on AHV hosts.

About this task

You can enable VirtIO-net multi-queue by increasing the number of VNIC queues. If an application uses many distinct streams of traffic, Receive Side Scaling (RSS) can distribute the streams across multiple VNIC DMA rings. This increases the amount of RX buffer space by the number of VNIC queues (N). Also, most guest operating systems pin each ring to a particular vCPU, handling the interrupts and ring-walking on that vCPU, thereby achieving N-way parallelism in RX processing. However, if you increase the number of queues beyond the number of vCPUs, you cannot achieve extra parallelism.

The following workloads benefit the most from VirtIO-net multi-queue:

  • VMs where traffic packets are relatively large
  • VMs with many concurrent connections
  • VMs with network traffic moving:
    • Among VMs on the same host
    • Among VMs across hosts
    • From VMs to the hosts
    • From VMs to an external system
  • VMs with high VNIC RX packet drop rate if CPU contention is not the cause

You can increase the number of queues of the AHV VM VNIC to allow the guest OS to use multi-queue VirtIO-net on guest VMs with intensive network I/O. Multi-Queue VirtIO-net scales the network performance by transferring packets through more than one Tx/Rx queue pair at a time as the number of vCPUs increases.

Nutanix recommends that you be conservative when increasing the number of queues. Do not set the number of queues larger than the total number of vCPUs assigned to a VM. Packet reordering and TCP retransmissions increase if the number of queues is larger than the number of vCPUs assigned to a VM. For this reason, start by increasing the queue size to 2. The default queue size is 1. After making this change, monitor the guest VM and network performance. Before you increase the queue size further, verify that the vCPU usage has not dramatically or unreasonably increased.
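The sizing rule above can be expressed as a quick pre-check before you change the VNIC configuration; the vCPU and queue counts below are illustrative:

```shell
# Sketch: ensure the proposed queue count does not exceed the VM's vCPUs.
num_vcpus=4   # from: acli vm.get VM-name | grep num_vcpus
queues=2      # start conservatively at 2 (the default is 1)
if [ "$queues" -le "$num_vcpus" ]; then
  echo "OK: $queues queues for $num_vcpus vCPUs"
else
  echo "Reduce queues: $queues exceeds $num_vcpus vCPUs" >&2
fi
```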

Perform the following steps to make more VNIC queues available to a guest VM. See your guest OS documentation to verify if you must perform extra steps on the guest OS to apply the additional VNIC queues.

Note: You must shut down the guest VM to change the number of queues. Therefore, make this change during a planned maintenance window. The VNIC status might change from Up->Down->Up or a restart of the guest OS might be required to finalize the settings depending on the guest OS implementation requirements.

Procedure

  1. (Optional) Nutanix recommends that you ensure the following:
    1. AHV and AOS are running the latest version.
    2. AHV guest VMs are running the latest version of the Nutanix VirtIO driver package.
      For RSS support, ensure you are running Nutanix VirtIO 1.1.6 or later. See Nutanix VirtIO for Windows for more information about Nutanix VirtIO.
  2. Determine the exact name of the guest VM for which you want to change the number of VNIC queues.
    nutanix@cvm$ acli vm.list

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.list
    VM name          VM UUID
    ExampleVM1       a91a683a-4440-45d9-8dbe-xxxxxxxxxxxx
    ExampleVM2       fda89db5-4695-4055-a3d4-xxxxxxxxxxxx
    ...
  3. Determine the MAC address of the VNIC and confirm the current number of VNIC queues.
    nutanix@cvm$ acli vm.nic_get VM-name

    Replace VM-name with the name of the VM.

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.nic_get VM-name
    ...
    mac_addr: "50:6b:8d:2f:zz:zz"
    ...
    (queues: 2)    <- If there is no output of 'queues', the setting is default (1 queue).
    Note: AOS defines queues as the maximum number of Tx/Rx queue pairs (default is 1).
  4. Check the number of vCPUs assigned to the VM.
    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus

    An output similar to the following is displayed:

    nutanix@cvm$ acli vm.get VM-name | grep num_vcpus
    num_vcpus: 1
  5. Shut down the guest VM.
    nutanix@cvm$ acli vm.shutdown VM-name

    Replace VM-name with the name of the VM.

  6. Increase the number of VNIC queues.
    nutanix@cvm$ acli vm.nic_update VM-name vNIC-MAC-address queues=N

    Replace VM-name with the name of the guest VM, vNIC-MAC-address with the MAC address of the VNIC, and N with the number of queues.

    Note: N must be less than or equal to the number of vCPUs assigned to the guest VM.
  7. Start the guest VM.
    nutanix@cvm$ acli vm.on VM-name

    Replace VM-name with the name of the VM.

  8. Confirm in the guest OS documentation if any additional steps are required to enable multi-queue in VirtIO-net.
    Note: Microsoft Windows has RSS enabled by default.

    For example, for RHEL and CentOS VMs, do the following:

    1. Log on to the guest VM.
    2. Confirm if irqbalance.service is active or not.
      uservm# systemctl status irqbalance.service

      An output similar to the following is displayed:

      irqbalance.service - irqbalance daemon
         Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; enabled; vendor preset: enabled)
         Active: active (running) since Tue 2020-04-07 10:28:29 AEST; Ns ago
    3. Start irqbalance.service if it is not active.
      Note: It is active by default on CentOS VMs. You might have to start it on RHEL VMs.
      uservm# systemctl start irqbalance.service
    4. Run the following command:
      uservm$ ethtool -L ethX combined M

      Replace M with the number of VNIC queues.

    Note the following caveat from the RHEL 7 Virtualization Tuning and Optimization Guide (section 5.4, Network Tuning Techniques):

    "Currently, setting up a multi-queue virtio-net connection can have a negative effect on the performance of outgoing traffic. Specifically, this may occur when sending packets under 1,500 bytes over the Transmission Control Protocol (TCP) stream."

  9. Monitor the VM performance to make sure that the expected network performance increase is observed and that the guest VM vCPU usage is not dramatically increased to impact the application on the guest VM.
    For assistance with the steps described in this document, or if these steps do not resolve your guest VM network performance issues, contact Nutanix Support.

Changing the IP Address of an AHV Host

Change the IP address, netmask, or gateway of an AHV host.

Before you begin

Perform the following tasks before you change the IP address, netmask, or gateway of an AHV host:
Caution: All Controller VMs and hypervisor hosts must be on the same subnet.
Warning: Ensure that you perform the steps in the exact order as indicated in this document.
  1. Verify the cluster health by following the instructions in KB-2852.

    Do not proceed if the cluster cannot tolerate failure of at least one node.

  2. Put the AHV host into the maintenance mode.

    See Putting a Node into Maintenance Mode using CLI for instructions about how to put a node into maintenance mode.

About this task

Perform the following procedure to change the IP address, netmask, or gateway of an AHV host.

Procedure

  1. Edit the settings of port br0, which is the internal port on the default bridge br0.
    1. Log on to the host console as root.

      You can access the hypervisor host console either through IPMI or by attaching a keyboard and monitor to the node.

    2. Open the network interface configuration file for port br0 in a text editor.
      root@ahv# vi /etc/sysconfig/network-scripts/ifcfg-br0
    3. Update entries for host IP address, netmask, and gateway.

      The block of configuration information that includes these entries is similar to the following:

      ONBOOT="yes" 
      NM_CONTROLLED="no" 
      PERSISTENT_DHCLIENT=1
      NETMASK="subnet_mask" 
      IPADDR="host_ip_addr" 
      DEVICE="br0" 
      TYPE="ethernet" 
      GATEWAY="gateway_ip_addr"
      BOOTPROTO="none"
      • Replace host_ip_addr with the IP address for the hypervisor host.
      • Replace subnet_mask with the subnet mask for host_ip_addr.
      • Replace gateway_ip_addr with the gateway address for host_ip_addr.
    4. Save your changes.
    5. Restart network services.

      systemctl restart network.service
    6. Assign the host to a VLAN. For information about how to add a host to a VLAN, see Assigning an AHV Host to a VLAN.
    7. Verify network connectivity by pinging the gateway, other CVMs, and AHV hosts.
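    The edit-and-restart sequence in this step can be sketched in shell. The following is a sketch only, using placeholder addresses and operating on a temporary copy of an ifcfg-style file so it is safe to run anywhere; on the AHV host itself, the target file is /etc/sysconfig/network-scripts/ifcfg-br0 and the edit is made as root.

```shell
# Sketch only (placeholder addresses): demonstrate the edit on a temporary
# copy of an ifcfg-style file. On a real AHV host, the target file is
# /etc/sysconfig/network-scripts/ifcfg-br0 (edit as root).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
ONBOOT="yes"
DEVICE="br0"
IPADDR="10.10.10.3"
NETMASK="255.255.255.0"
GATEWAY="10.10.10.1"
BOOTPROTO="none"
EOF
# Rewrite the address entries in place (example values).
sed -i -e 's/^IPADDR=.*/IPADDR="10.10.20.4"/' \
       -e 's/^GATEWAY=.*/GATEWAY="10.10.20.1"/' "$cfg"
new_ip=$(grep '^IPADDR' "$cfg")
echo "$new_ip"
rm -f "$cfg"
# On the host, follow the edit with:
#   systemctl restart network.service
# and then verify connectivity by pinging the gateway.
```

    On the host the same sed edits apply directly to ifcfg-br0, but editing with vi as shown in the procedure is equally valid.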
  2. Log on to the Controller VM that is running on the AHV host whose IP address you changed and restart genesis.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed:

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

    See Controller VM Access for information about how to log on to a Controller VM.

    Genesis takes a few minutes to restart.

  3. Verify that the IP address of the hypervisor host has changed. Run the following nCLI command from any CVM other than the one in the maintenance mode.
    nutanix@cvm$ ncli host list 

    An output similar to the following is displayed:

    nutanix@cvm$ ncli host list 
        Id                        : aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee::1234  
        Uuid                      : ffffffff-gggg-hhhh-iiii-jjjjjjjjjjj 
        Name                      : XXXXXXXXXXX-X 
        IPMI Address              : X.X.Z.3 
        Controller VM Address     : X.X.X.1 
        Hypervisor Address        : X.X.Y.4 <- New IP Address 
    ... 
  4. Stop the Acropolis service on all the CVMs.
    1. Stop the Acropolis service on all the CVMs in the cluster.
      nutanix@cvm$ allssh genesis stop acropolis
      Note: You cannot manage your guest VMs after the Acropolis service is stopped.
    2. Verify that the Acropolis service is DOWN on all the CVMs, except the one in the maintenance mode.
      nutanix@cvm$ cluster status | grep -v UP 

      An output similar to the following is displayed:

      nutanix@cvm$ cluster status | grep -v UP 
      
      2019-09-04 14:43:18 INFO zookeeper_session.py:143 cluster is attempting to connect to Zookeeper 
      
      2019-09-04 14:43:18 INFO cluster:2774 Executing action status on SVMs X.X.X.1, X.X.X.2, X.X.X.3 
      
      The state of the cluster: start 
      
      Lockdown mode: Disabled 
              CVM: X.X.X.1 Up 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.2 Up, ZeusLeader 
                                 Acropolis DOWN       [] 
              CVM: X.X.X.3 Maintenance
  5. From any CVM in the cluster, start the Acropolis service.
    nutanix@cvm$ cluster start 
  6. Verify that all processes on all the CVMs, except the one in the maintenance mode, are in the UP state.
    nutanix@cvm$ cluster status | grep -v UP 
  7. Exit the AHV host and the CVM from the maintenance mode.
    See Exiting a Node from the Maintenance Mode Using CLI for more information.

Virtual Machine Management

The following topics describe various aspects of virtual machine management in an AHV cluster.

Supported Guest VM Types for AHV

The compatibility matrix available on the Nutanix Support portal includes the latest supported AHV guest VM OSes.

AHV Configuration Maximums

The Nutanix configuration maximums available on the Nutanix support portal include all the latest configuration limits applicable to AHV. Select the appropriate AHV version to view version-specific information.

Creating a VM (AHV)

In AHV clusters, you can create a new virtual machine (VM) through the Prism Element web console.

About this task

Note: Use Prism Central to create a VM with the memory overcommit feature enabled. The Prism Element web console does not allow you to enable memory overcommit while creating a VM. If you create a VM using the Prism Element web console and later want to enable memory overcommit, update the VM in Prism Central and enable memory overcommit on the Update VM page.

When creating a VM, you can configure all of its components, such as the number of vCPUs and the amount of memory, but you cannot attach a volume group to the VM. Attaching a volume group is possible only when you modify a VM.
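Besides the Prism Element workflow described in this topic, a VM can also be created from any CVM with aCLI. The following is a minimal sketch with example names and sizes; the storage container name default and the network name vlan0 are assumptions, and you should verify the exact argument spellings for your AOS version with acli <command> help before running these commands.

```shell
nutanix@cvm$ acli vm.create example-vm num_vcpus=2 num_cores_per_vcpu=1 memory=4G
nutanix@cvm$ acli vm.disk_create example-vm create_size=50G container=default
nutanix@cvm$ acli vm.nic_create example-vm network=vlan0
nutanix@cvm$ acli vm.on example-vm
```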

To create a VM, do the following:

Procedure

  1. In the VM dashboard, click the Create VM button.
    Note: This option does not appear in clusters that do not support this feature.
    The Create VM dialog box appears.
    Figure. Create VM Dialog Box

  2. Do the following in the indicated fields:
    1. Name : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone : Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC .
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Use this VM as an agent VM : Select this option to make this VM an agent VM.

      You can use this option for the VMs that must be powered on before the rest of the VMs (for example, to provide network functions before the rest of the VMs are powered on on the host) and must be powered off after the rest of the VMs are powered off (for example, during maintenance mode operations). Agent VMs are never migrated to any other host in the cluster. If an HA event occurs or the host is put in maintenance mode, agent VMs are powered off and are powered on on the same host once that host comes back to a normal state.

      If an agent VM is powered off, you can manually start that agent VM on another host and the agent VM now permanently resides on the new host. The agent VM is never migrated back to the original host. Note that you cannot migrate an agent VM to another host while the agent VM is powered on.

    5. vCPU(s) : Enter the number of virtual CPUs to allocate to this VM.
    6. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    7. Memory : Enter the amount of memory (in GiB) to allocate to this VM.
  3. (For GPU-enabled AHV clusters only) To configure GPU access, click Add GPU in the Graphics section, and then do the following in the Add GPU dialog box:
    Figure. Add GPU Dialog Box

    For more information, see GPU and vGPU Support .

    1. To configure GPU pass-through, in GPU Mode , click Passthrough , select the GPU that you want to allocate, and then click Add .
      If you want to allocate additional GPUs to the VM, repeat the procedure as many times as you need to. Make sure that all the allocated pass-through GPUs are on the same host. If all specified GPUs of the type that you want to allocate are in use, you can proceed to allocate the GPU to the VM, but you cannot power on the VM until a VM that is using the specified GPU type is powered off.

      For more information, see GPU and vGPU Support .

    2. To configure virtual GPU access, in GPU Mode , click virtual GPU , select a GRID license, and then select a virtual GPU profile from the list.
      Note: This option is available only if you have installed the GRID host driver on the GPU hosts in the cluster.

      For more information about the NVIDIA GRID host driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide .

      You can assign multiple virtual GPUs to a VM. A vGPU is assigned to the VM only if one is available when the VM starts up.

      Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support .

      Note:

      Multiple vGPUs are supported on the same VM only if you select the highest vGPU profile type.

      After you add the first vGPU, to add multiple vGPUs, see Adding Multiple vGPUs to the Same VM .

  4. Select one of the following firmware options to boot the VM.
    • Legacy BIOS : Select legacy BIOS to boot the VM with legacy BIOS firmware.
    • UEFI : Select UEFI to boot the VM with UEFI firmware. UEFI firmware supports larger hard drives, faster boot time, and provides more security features. For more information about UEFI firmware, see UEFI Support for VM .

    If you select UEFI, you can enable the following features:

    • Secure Boot : Select this option to enable UEFI secure boot policies for your guest VMs. For more information about Secure Boot, see Secure Boot Support for VMs .
    • Windows Defender Credential Guard : Select this option to enable the Windows Defender Credential Guard feature of Microsoft Windows operating systems that allows you to securely isolate user credentials from the rest of the operating system. Follow the detailed instructions described in Windows Defender Credential Guard Support in AHV to enable this feature.
      Note: To add virtual TPM, see Creating AHV VMs with vTPM (aCLI) .
  5. To attach a disk to the VM, click the Add New Disk button.
    The Add Disk dialog box appears.
    Figure. Add Disk Dialog Box

    Do the following in the indicated fields:
    1. Type : Select the type of storage device, DISK or CD-ROM , from the drop-down list.
      The following fields and options vary depending on whether you choose DISK or CD-ROM .
    2. Operation : Specify the device contents from the drop-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Empty CD-ROM to create a blank CD-ROM device. (This option appears only when CD-ROM is selected in the previous field.) A CD-ROM device is needed when you intend to provide a system image from CD-ROM.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
      • Select Clone from Image Service to copy an image that you have imported by using image service feature onto the disk. For more information about the Image Service feature, see Configuring Images and Image Management in the Prism Self Service Administration Guide .
    3. Bus Type : Select the bus type from the drop-down list.

      The options displayed in the Bus Type drop-down list vary based on the storage device Type selected in Step a.

      • For device DISK , select the SCSI , SATA , PCI , or IDE bus type.
      • For device CD-ROM , select either the IDE or SATA bus type.
      Note: SCSI is the preferred bus type and is used in most cases. Ensure that you have installed the VirtIO drivers in the guest OS.
      Caution: Use SATA, PCI, or IDE for compatibility purposes when the guest OS does not have VirtIO drivers to support SCSI devices. This may have performance implications.
      Note: For AHV 5.16 and later, you cannot use an IDE device if Secure Boot is enabled for the UEFI Mode boot configuration.
    4. ADSF Path : Enter the path to the desired system image.
      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /storage_container_name/iso_name.iso . For example, to clone an image from myos.iso in a storage container named crt1 , enter /crt1/myos.iso . When you type the storage container name ( /storage_container_name/ ), a list of the ISO files in that storage container appears (assuming one or more ISO files were previously copied to that storage container).
    5. Image : Select the image that you have created by using the image service feature.
      This field appears only when Clone from Image Service is selected. It specifies the image to copy.
    6. Storage Container : Select the storage container to use from the drop-down list.
      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.
    7. Size : Enter the disk size in GiB.
    8. Index : Displays Next Available by default.
    9. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    10. Repeat this step to attach additional devices to the VM.
  6. To create a network interface for the VM, click the Add New NIC button.
    Prism console displays the Create NIC dialog box.
    Note: To create or update a SPAN destination type VM or vNIC, use the command-line interface. Prism does not support SPAN destination type configurations. See Switch Port ANalyzer on AHV Hosts .

    Figure. Create NIC Dialog Box

    Do the following in the indicated fields:
    1. Subnet Name : Select the target virtual LAN from the drop-down list.
      The list includes all defined networks (see Network Configuration For VM Interfaces ).
      Note: Selecting an IPAM-enabled subnet from the drop-down list displays the Private IP Assignment information, which shows the number of free IP addresses available in the subnet and in the IP pool.
    2. Network Connection State : Select the state in which you want the network to operate after VM creation. The options are Connected or Disconnected .
    3. Private IP Assignment : This is a read-only field and displays the following:
      • Network Address/Prefix : The network IP address and prefix.
      • Free IPs (Subnet) : The number of free IP addresses in the subnet.
      • Free IPs (Pool) : The number of free IP addresses available in the IP pools for the subnet.
    4. Assignment Type : This field appears for IPAM-enabled networks. Select Assign with DHCP to assign an IP address to the VM automatically using DHCP. For more information, see IP Address Management .
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create additional network interfaces for the VM.
    Note: Nutanix guarantees a unique VM MAC address within a cluster. Two VMs in different clusters can have the same MAC address.
    Note: The Acropolis leader generates the MAC address for a VM on AHV. The first 24 bits of the MAC address are set to 50-6b-8d ( 0101 0000 0110 1011 1000 1101 ) and are reserved by Nutanix, the 25th bit is set to 1 (reserved by the Acropolis leader), and bits 26 through 48 are randomly generated.
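    The MAC layout in the note above can be illustrated with a short bash sketch. This is illustrative only, not a Nutanix tool, and the bit placement is an interpretation of the note: the 50-6b-8d prefix is kept, the 25th bit of the address is forced to 1, and the remaining 23 bits are randomized.

```shell
# Illustration only (not a Nutanix tool): compose a MAC address following
# the layout described in the note above. Requires bash for $RANDOM.
rand=$(( ((RANDOM << 8) ^ RANDOM) & 0x7FFFFF ))   # 23 random bits
low24=$(( 0x800000 | rand ))                      # force the 25th MAC bit to 1
mac=$(printf '50:6b:8d:%02x:%02x:%02x' \
      $(( low24 >> 16 )) $(( (low24 >> 8) & 0xFF )) $(( low24 & 0xFF )))
echo "$mac"
```

    Every address produced this way starts with 50:6b:8d and has a fourth octet of 0x80 or higher, because the 25th bit is always set.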
  7. To configure affinity policy for this VM, click Set Affinity .
    The Set VM Host Affinity dialog box appears.
    1. Select the host or hosts on which you want to configure the affinity for this VM.
    2. Click Save .
      The selected host or hosts are listed. This configuration is permanent. The VM is not moved from this host or these hosts even in the case of an HA event, and the affinity takes effect once the VM starts.
  8. To customize the VM by using Cloud-init (for Linux VMs) or Sysprep (for Windows VMs), select the Custom Script check box.
    Fields required for configuring Cloud-init and Sysprep, such as options for specifying a configuration script or answer file and text boxes for specifying paths to required files, appear below the check box.
    Figure. Create VM Dialog Box (custom script fields)

  9. To specify a user data file (Linux VMs) or answer file (Windows VMs) for unattended provisioning, do one of the following:
    • If you uploaded the file to a storage container on the cluster, click ADSF path , and then enter the path to the file.

      Enter the ADSF prefix ( adsf:// ) followed by the absolute path to the file. For example, if the user data is in /home/my_dir/cloud.cfg , enter adsf:///home/my_dir/cloud.cfg . Note the use of three slashes.

    • If the file is available on your local computer, click Upload a file , click Choose File , and then upload the file.
    • If you want to create or paste the contents of the file, click Type or paste script , and then use the text box that is provided.
  10. To copy one or more files to a location on the VM (Linux VMs) or to a location in the ISO file (Windows VMs) during initialization, do the following:
    1. In Source File ADSF Path , enter the absolute path to the file.
    2. In Destination Path in VM , enter the absolute path to the target directory and the file name.
      For example, if the source file entry is /home/my_dir/myfile.txt , then the entry for the Destination Path in VM should be /<directory_name>/<file_name> , for example, /mnt/myfile.txt .
    3. To add another file or directory, click the button beside the destination path field. In the new row that appears, specify the source and target details.
  11. When all the field entries are correct, click the Save button to create the VM and close the Create VM dialog box.
    The new VM appears in the VM table view.
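For the Custom Script step above, a minimal Cloud-init user-data file for a Linux VM might look like the following. The content is illustrative only; the hostname, user name, and commands are examples, not values the guide prescribes.

```yaml
#cloud-config
hostname: example-vm
users:
  - name: example-user
    groups: wheel
    sudo: ALL=(ALL) NOPASSWD:ALL
runcmd:
  - echo "provisioned by cloud-init" > /etc/motd
```

Upload a file like this to a storage container (and reference it with the adsf:/// prefix), upload it from your local computer, or paste its contents into the text box.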

Managing a VM (AHV)

You can use the web console to manage virtual machines (VMs) in AHV managed clusters.

About this task

Note: Use Prism Central to update a VM if you want to enable memory overcommit for it. The Prism Element web console does not allow you to enable memory overcommit while updating a VM. You can enable memory overcommit on the Update VM page in Prism Central.

After creating a VM (see Creating a VM (AHV)), you can use the web console to start or shut down the VM, launch a console window, update the VM configuration, take a snapshot, attach a volume group, migrate the VM, clone the VM, or delete the VM.

Note: Your available options depend on the VM status, type, and permissions. Unavailable options are grayed out.

To accomplish one or more of these tasks, do the following:

Procedure

  1. In the VM dashboard, click the Table view.
  2. Select the target VM in the table (top section of screen).
    The Summary line (middle of screen) displays the VM name with a set of relevant action links on the right. You can also right-click on a VM to select a relevant action.

    The possible actions are Manage Guest Tools , Launch Console , Power on (or Power off ), Take Snapshot , Migrate , Clone , Update , and Delete .

    Note: The VM pause and resume feature is not supported on AHV.
    The following steps describe how to perform each action.
    Figure. VM Action Links

  3. To manage guest tools, click Manage Guest Tools .
    You can also enable NGT applications (self-service restore, Volume Snapshot Service, and application-consistent snapshots) as part of managing guest tools.
    1. Select Enable Nutanix Guest Tools check box to enable NGT on the selected VM.
    2. Select Mount Nutanix Guest Tools to mount NGT on the selected VM.
      Ensure that the VM has at least one empty IDE CD-ROM slot to attach the ISO.
    3. To enable self-service restore feature for Windows VMs, click Self Service Restore (SSR) check box.
      The Self-Service Restore feature is enabled for the VM. The guest VM administrator can restore the desired file or files from the VM. For more information about the self-service restore feature, see Self-Service Restore in the Data Protection and Recovery with Prism Element guide.

    4. After you select the Enable Nutanix Guest Tools check box, the VSS snapshot feature is enabled by default.
      After this feature is enabled, Nutanix native in-guest VmQuiesced Snapshot Service (VSS) agent takes snapshots for VMs that support VSS.
      Note:

      The AHV VM snapshots are not application consistent. The AHV snapshots are taken from the VM entity menu by selecting a VM and clicking Take Snapshot .

      The application consistent snapshots feature is available with Protection Domain based snapshots and Recovery Points in Prism Central. For more information, see Conditions for Application-consistent Snapshots in the Data Protection and Recovery with Prism Element guide.

    5. Click Submit .
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
      Note:
      If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.
      nutanix@cvm$ ncli ngt mount vm-id=virtual_machine_id

      For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

      nutanix@cvm$ ncli ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-
      c1601e759987
  4. To launch a console window, click the Launch Console action link.
    This opens a Virtual Network Computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The console window includes four menu options (top right):
    • Clicking the Mount ISO button displays the following window that allows you to mount an ISO image to the VM. To mount an image, select the desired image and CD-ROM drive from the drop-down lists and then click the Mount button.
      Figure. Mount Disk Image Window

      Note: For information about how to select CD-ROM as the storage device when you intend to provide a system image from CD-ROM, see Add New Disk in Creating a VM (AHV).
    • Clicking the C-A-D icon button sends a Ctrl+Alt+Del command to the VM.
    • Clicking the camera icon button takes a screenshot of the console window.
    • Clicking the power icon button allows you to power on/off the VM. These are the same options that you can access from the Power On Actions or Power Off Actions action link below the VM table (see next step).
    Figure. Virtual Network Computing (VNC) Window

  5. To start or shut down the VM, click the Power on (or Power off ) action link.

    Power on begins immediately. If you power off the VM, you are prompted to select one of the following options:

    • Power Off . Hypervisor performs a hard power off action on the VM.
    • Power Cycle . Hypervisor performs a hard restart action on the VM.
    • Reset . Hypervisor performs an ACPI reset action through the BIOS on the VM.
    • Guest Shutdown . Operating system of the VM performs a graceful shutdown.
    • Guest Reboot . Operating system of the VM performs a graceful restart.
    Note: If you perform power operations such as Guest Reboot or Guest Shutdown by using the Prism Element web console or API on Windows VMs, these operations might silently fail without any error messages if a screen saver is running in the Windows VM at that time. Perform the same power operations again immediately so that they succeed.
  6. To make a snapshot of the VM, click the Take Snapshot action link.

    For more information, see Virtual Machine Snapshots.

  7. To migrate the VM to another host, click the Migrate action link.
    This displays the Migrate VM dialog box. Select the target host from the drop-down list (or select the System will automatically select a host option to let the system choose the host) and then click the Migrate button to start the migration.
    Figure. Migrate VM Dialog Box

    Note: Nutanix recommends live migrating VMs while they are under light load. If VMs are migrated while heavily utilized, the migration may fail because of limited bandwidth.
  8. To clone the VM, click the Clone action link.

    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box. A cloned VM inherits most of the configuration (except the name) of the source VM. Enter a name for the clone, and then click the Save button to create the clone. You can optionally override some of the configurations before clicking the Save button. For example, you can override the number of vCPUs, the memory size, the boot priority, NICs, or the guest customization.

    Note:
    • You can clone up to 250 VMs at a time.
    • You cannot override the secure boot setting while cloning a VM unless the source VM already has the secure boot setting enabled.

    Figure. Clone VM Window

  9. To modify the VM configuration, click the Update action link.

    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Modify the configuration as needed, and then save the configuration. In addition to modifying the configuration, you can attach a volume group to the VM and enable flash mode on the VM. If you attach a volume group to a VM that is part of a protection domain, the VM is not protected automatically. Add the VM to the same Consistency Group manually.

    (For GPU-enabled AHV clusters only) You can add pass-through GPUs if a VM is already using GPU pass-through. You can also change the GPU configuration from pass-through to vGPU or vGPU to pass-through, change the vGPU profile, add more vGPUs, and change the specified vGPU license. However, you need to power off the VM before you perform these operations.

    You can add new network adapters or NICs using the Add New NIC option. You can also modify the network used by an existing NIC. See Limitation for vNIC Hot-Unplugging and Creating a VM (AHV) before you modify the NIC network or create a new NIC for a VM.

    Note: To create or update a SPAN destination type VM or vNIC, use the command-line interface. Prism does not support SPAN destination type configurations. See Switch Port ANalyzer on AHV Hosts .

    Figure. VM Update Dialog Box

    Note: If you delete a vDisk attached to a VM and snapshots associated with this VM exist, space associated with that vDisk is not reclaimed unless you also delete the VM snapshots.
    To increase the memory allocation and the number of vCPUs on your VMs while the VMs are powered on (hot-pluggable), do the following:
    1. In the vCPUs field, you can increase the number of vCPUs on your VMs while the VMs are powered on.
    2. In the Number of Cores Per vCPU field, you can change the number of cores per vCPU only if the VMs are powered off.
      Note: This is not a hot-pluggable feature.
    3. In the Memory field, you can increase the memory allocation on your VMs while the VMs are powered on.
    For more information about hot-pluggable vCPUs and memory, see Virtual Machine Memory and CPU Hot-Plug Configurations in the AHV Administration Guide .
    To attach a volume group to the VM, do the following:
    1. In the Volume Groups section, click Add volume group , and then do one of the following:
      • From the Available Volume Groups list, select the volume group that you want to attach to the VM.
      • Click Create new volume group , and then, in the Create Volume Group dialog box, create a volume group (see Creating a Volume Group). After you create a volume group, select it from the Available Volume Groups list.
      Repeat these steps until you have added all the volume groups that you want to attach to the VM.
    2. Click Add .
  10. To enable flash mode on the VM, click the Enable Flash Mode check box.
    • After you enable this feature on the VM, the status is updated in the VM table view. To view the status of individual virtual disks (disks that are flashed to the SSD), click the update disk icon in the Disks pane in the Update VM window.
    • You can disable the flash mode feature for individual virtual disks. To update the flash mode for individual virtual disks, click the update disk icon in the Disks pane and deselect the Enable Flash Mode check box.
    Figure. Update VM Resources - VM Flash Mode

    Figure. Update VM Resources - VM Disk Flash Mode

  11. To delete the VM, click the Delete action link. A window prompt appears; click the OK button to delete the VM.
    The deleted VM disappears from the list of VMs in the table.

Limitation for vNIC Hot-Unplugging

If you detach (hot-unplug) a vNIC from a VM with a guest OS installed, AOS displays the detach result as successful, but whether the detach actually succeeds depends on the status of the ACPI mechanism in the guest OS.

The following table describes the vNIC detach observations and the workaround applicable based on the guest OS response to the ACPI request:

Table 1. vNIC Detach - Observations and Workaround

Detach procedure (vNIC hot-unplug):
  • Using Prism Central: See the Managing a VM (AHV) topic in the Prism Central Guide.
  • Using the Prism Element web console: See Managing a VM (AHV).
  • Using aCLI: Log on to the CVM with SSH and run one of the following commands:

    nutanix@cvm$ acli vm.nic_delete <vm_name> <nic mac address>

    or

    nutanix@cvm$ acli vm.nic_update <vm_name> <nic mac address> connected=false

    Replace the following attributes in the above commands:

    • <vm_name> with the name of the guest VM from which the vNIC is to be detached.
    • <nic mac address> with the MAC address of the vNIC that is to be detached.

If the guest OS responds to the ACPI request:
  • AOS behavior: vNIC detach is reported as successful. The logs show: Device detached successfully
  • Actual detach result: vNIC detach is successful.
  • Workaround: No action needed.

If the guest OS does not respond to the ACPI request:
  • AOS behavior: vNIC detach is reported as successful.
  • Actual detach result: vNIC detach is not successful.
  • Workaround: Power cycle the VM for a successful vNIC detach.
Note: In most cases, the ACPI mechanism failure occurs when no guest OS is installed on the VM.

Virtual Machine Snapshots

You can generate snapshots of virtual machines (VMs), either manually or automatically. VM snapshots serve purposes such as the following:

  • Disaster recovery
  • Testing - as a safe restoration point in case something goes wrong during testing
  • Migrating VMs
  • Creating multiple instances of a VM

A snapshot is a point-in-time state of entities such as VMs and volume groups, and is used for restoration and replication of data. You can generate snapshots and store them locally or remotely. Snapshots are a mechanism to capture the delta changes that have occurred over time. Snapshots are primarily used for data protection and disaster recovery. Snapshots are not autonomous like backups, in the sense that they depend on the underlying VM infrastructure and other snapshots to restore the VM. Snapshots consume fewer resources than a full autonomous backup. Typically, a VM snapshot captures the following:

  • The state, including the power state (for example, powered on, powered off, or suspended), of the VM.
  • The data, including all the files that make up the VM. This data also includes the data from disks, configurations, and devices, such as virtual network interface cards.

VM Snapshots and Snapshots for Disaster Recovery

The VM Dashboard only allows you to generate VM snapshots manually. You cannot select VMs and schedule snapshots of the VMs using the VM dashboard. The snapshots generated manually have very limited utility.

Note: These snapshots (stored locally) cannot be replicated to other sites.

You can schedule and generate snapshots as part of the disaster recovery process using Nutanix DR solutions. AOS generates snapshots when you protect a VM with a protection domain using the Data Protection dashboard in the Prism web console (see the Data Protection and Recovery with Prism Element guide). Similarly, AOS generates Recovery Points (snapshots are called Recovery Points in Prism Central) when you protect a VM with a protection policy using the Data Protection dashboard in Prism Central (see the Leap Administration Guide).

For example, in the Data Protection dashboard in Prism Web Console, you can create schedules to generate snapshots using various RPO schemes, such as asynchronous replication with frequency intervals of 60 minutes or more, or NearSync replication with frequency intervals of as little as 20 seconds up to 15 minutes. These schemes also create snapshots in addition to the scheduled ones: asynchronous replication schedules generate an extra snapshot every 6 hours, and NearSync schedules generate an extra snapshot every hour.

Similarly, you can use the options in the Data Protection section of Prism Central to generate Recovery Points using the same RPO schemes.
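The extra-snapshot behavior described above can be quantified with a quick calculation. This sketch counts the snapshots an asynchronous schedule with a 60-minute interval would generate in one day, based on the behavior stated in this guide; the interval value is illustrative.

```shell
# Illustrative arithmetic: a 60-minute asynchronous schedule produces one
# scheduled snapshot per hour, plus one extra snapshot every 6 hours.
interval_min=60
scheduled=$(( 24 * 60 / interval_min ))   # scheduled snapshots per day
extras=$(( 24 / 6 ))                      # extra snapshots per day
total=$(( scheduled + extras ))
echo "scheduled=${scheduled} extras=${extras} total=${total}"
# → scheduled=24 extras=4 total=28
```

Changing `interval_min` lets you estimate the daily snapshot count for other asynchronous schedule frequencies.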

Windows VM Provisioning

Nutanix VirtIO for Windows

Nutanix VirtIO is a collection of drivers for paravirtual devices that enhance the stability and performance of virtual machines on AHV.

Nutanix VirtIO is available in two formats:

  • To install Windows in a VM on AHV, use the VirtIO ISO.
  • To update VirtIO for Windows, use the VirtIO MSI installer file.

Use Nutanix Guest Tools (NGT) to install the Nutanix VirtIO package. For more information about installing the Nutanix VirtIO package using NGT, see NGT Installation in the Prism Web Console Guide .

VirtIO Requirements

Requirements for Nutanix VirtIO for Windows.

VirtIO supports the following operating systems:

  • Microsoft Windows server version: Windows 2008 R2 or later
  • Microsoft Windows client version: Windows 7 or later
Note: On Windows 7 and Windows Server 2008 R2, install Microsoft KB3033929 or update the operating system with the latest Windows Update to enable support for SHA2 certificates.
Caution: The VirtIO installation or upgrade may fail if multiple Windows VSS snapshots are present in the guest VM. The failure is caused by a timeout during installation of the Nutanix VirtIO SCSI pass-through controller driver.

It is recommended to clean up the VSS snapshots or temporarily disconnect the drive that contains them. Ensure that you delete only the snapshots that are no longer needed. For more information about how to identify a VirtIO installation or upgrade failure caused by multiple Windows VSS snapshots, see KB-12374.
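The cleanup described above uses the built-in Windows `vssadmin` tool inside the guest VM, from an elevated command prompt. The sketch below prints the command sequence for review rather than executing it; the `/for=C:` target and `/oldest` selector are illustrative choices, so adjust them to the drive and snapshots that actually need removal.

```shell
# Guest-side VSS cleanup checklist (run inside the Windows VM from an
# elevated prompt). Printed here for review, not executed.
vss_cleanup_plan() {
  echo 'vssadmin list shadows'                     # inspect existing VSS snapshots
  echo 'vssadmin delete shadows /for=C: /oldest'   # remove the oldest snapshot on C:
}
vss_cleanup_plan
```

Review the `vssadmin list shadows` output before deleting anything, since only snapshots that are no longer needed should be removed.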

Installing or Upgrading Nutanix VirtIO for Windows

Download Nutanix VirtIO and the Nutanix VirtIO Microsoft installer (MSI). The MSI installs and upgrades the Nutanix VirtIO drivers.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install it.

Procedure

  1. Go to the Nutanix Support portal, select Downloads > AHV , and click VirtIO .
  2. Select the appropriate VirtIO package.
    • If you are creating a new Windows VM, download the ISO file. The installer is available on the ISO if your VM does not have Internet access.
    • If you are updating drivers in a Windows VM, download the MSI installer file.
    Figure. Search filter and VirtIO options

  3. Run the selected package.
    • For the ISO: Upload the ISO to the cluster, as described in the Configuring Images topic in Prism Web Console Guide .
    • For the MSI: open the downloaded file to run the MSI.
  4. Read and accept the Nutanix VirtIO license agreement. Click Install .
    Figure. Nutanix VirtIO Windows Setup Wizard

    The Nutanix VirtIO setup wizard shows a status bar and completes installation.

Manually Installing or Upgrading Nutanix VirtIO

Manually install or upgrade Nutanix VirtIO.

Before you begin

Make sure that your system meets the VirtIO requirements described in VirtIO Requirements.

About this task

Note: To automatically install Nutanix VirtIO, see Installing or Upgrading Nutanix VirtIO for Windows.

If you have already installed Nutanix VirtIO, use the following procedure to upgrade VirtIO to the latest version. If you have not yet installed Nutanix VirtIO, use the following procedure to install it.

Procedure

  1. Go to the Nutanix Support portal, select Downloads > AHV , and click VirtIO .
  2. Do one of the following:
    • Extract the VirtIO ISO into the same VM where you load Nutanix VirtIO, for easier installation.

      If you choose this option, proceed directly to step 7.

    • Download the VirtIO ISO for Windows to your local machine.

      If you choose this option, proceed to step 3.

  3. Upload the ISO to the cluster, as described in the Configuring Images topic of Prism Web Console Guide .
  4. Locate the VM where you want to install the Nutanix VirtIO ISO and update the VM.
  5. Add the Nutanix VirtIO ISO by clicking Add New Disk and complete the indicated fields.
    • TYPE : CD-ROM
    • OPERATION : CLONE FROM IMAGE SERVICE
    • BUS TYPE : IDE
    • IMAGE : Select the Nutanix VirtIO ISO
  6. Click Add .
  7. Log on to the VM and browse to Control Panel > Device Manager .
  8. Note: Select the x86 subdirectory for 32-bit Windows, or the amd64 subdirectory for 64-bit Windows.
    Open the device categories and locate the Nutanix devices listed below. For each device, right-click it, select Update Driver Software , and browse to the drive containing the VirtIO ISO. Follow the wizard instructions until you receive installation confirmation.
    1. System Devices > Nutanix VirtIO Balloon Drivers
    2. Network Adapter > Nutanix VirtIO Ethernet Adapter .
    3. Storage Controllers > Nutanix VirtIO SCSI pass-through Controller
      The Nutanix VirtIO SCSI pass-through controller prompts you to restart your system. Restart at any time to install the controller.
      Figure. List of Nutanix VirtIO downloads

Creating a Windows VM on AHV with Nutanix VirtIO

Create a Windows VM in AHV, or migrate a Windows VM from a non-Nutanix source to AHV, with the Nutanix VirtIO drivers.

Before you begin

  • Upload the Windows installer ISO to your cluster as described in the Configuring Images topic in Web Console Guide .
  • Upload the Nutanix VirtIO ISO to your cluster as described in the Configuring Images topic in Web Console Guide .

About this task

To install a new or migrated Windows VM with Nutanix VirtIO, complete the following.

Procedure

  1. Log on to the Prism web console using your Nutanix credentials.
  2. At the top-left corner, click Home > VM .
    The VM page appears.
  3. Click + Create VM in the corner of the page.
    The Create VM dialog box appears.
    Figure. Create VM dialog box

  4. Complete the indicated fields.
    1. NAME : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Timezone : Select the timezone that you want the VM to use. If you are creating a Linux VM, select (UTC) UTC .
      Note:

      The RTC of Linux VMs must be in UTC, so select the UTC timezone if you are creating a Linux VM.

      Windows VMs preserve the RTC in the local timezone, so set up the Windows VM with the hardware clock pointing to the desired timezone.

    4. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    5. MEMORY : Enter the amount of memory for the VM (in GiB).
  5. If you are creating a Windows VM, add a Windows CD-ROM to the VM.
    1. Click the pencil icon next to the CD-ROM that is already present and fill out the indicated fields.
      • OPERATION : CLONE FROM IMAGE SERVICE
      • BUS TYPE : IDE
      • IMAGE : Select the Windows OS install ISO.
    2. Click Update .
      The current CD-ROM opens in a new window.
  6. Add the Nutanix VirtIO ISO.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE : CD-ROM
      • OPERATION : CLONE FROM IMAGE SERVICE
      • BUS TYPE : IDE
      • IMAGE : Select the Nutanix VirtIO ISO.
    2. Click Add .
  7. Add a new disk for the hard drive.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE : DISK
      • OPERATION : ALLOCATE ON STORAGE CONTAINER
      • BUS TYPE : SCSI
      • STORAGE CONTAINER : Select the appropriate storage container.
      • SIZE : Enter the number for the size of the hard drive (in GiB).
    2. Click Add to add the disk.
  8. If you are migrating a VM, create a disk from the disk image.
    1. Click Add New Disk and complete the indicated fields.
      • TYPE : DISK
      • OPERATION : CLONE FROM IMAGE
      • BUS TYPE : SCSI
      • CLONE FROM IMAGE SERVICE : Click the drop-down menu and choose the image you created previously.
    2. Click Add to add the disk.
  9. Optionally, after you have migrated or created a VM, add a network interface card (NIC).
    1. Click Add New NIC .
    2. In the VLAN ID field, choose the VLAN ID according to network requirements and enter the IP address, if necessary.
    3. Click Add .
  10. Click Save .

What to do next

Install Windows by following Installing Windows on a VM.

Installing Windows on a VM

Install a Windows virtual machine.

Before you begin

Create a Windows VM.

Procedure

  1. Log on to the web console.
  2. Click Home > VM to open the VM dashboard.
  3. Select the Windows VM.
  4. In the center of the VM page, click Power On .
  5. Click Launch Console .
    The Windows console opens in a new window.
  6. Select the desired language, time and currency format, and keyboard information.
  7. Click Next > Install Now .
    The Windows setup dialog box shows the operating systems to install.
  8. Select the Windows OS you want to install.
  9. Click Next and accept the license terms.
  10. Click Next > Custom: Install Windows only (advanced) > Load Driver > OK > Browse .
  11. Choose the Nutanix VirtIO driver.
    1. Select the Nutanix VirtIO CD drive.
    2. Expand the Windows OS folder and click OK .
    Figure. Select the Nutanix VirtIO drivers for your OS

    The Select the driver to install window appears.
  12. Select the VirtIO SCSI driver ( vioscsi.inf ) and click Next .
    Figure. Select the Driver for Installing Windows on a VM

    The amd64 folder contains drivers for 64-bit operating systems. The x86 folder contains drivers for 32-bit operating systems.
    Note: From Nutanix VirtIO driver version 1.1.5, the driver package contains Windows Hardware Quality Lab (WHQL) certified driver for Windows.
  13. Select the allocated disk space for the VM and click Next .
    Windows shows the installation progress, which can take several minutes.
  14. Enter your user name and password information and click Finish .
    Installation can take several minutes.
    Once you complete the logon information, Windows setup completes installation.
  15. Follow the instructions in Installing or Upgrading Nutanix VirtIO for Windows to install other drivers which are part of Nutanix VirtIO package.

Windows Defender Credential Guard Support in AHV

AHV enables you to use the Windows Defender Credential Guard security feature on Windows guest VMs.

The Windows Defender Credential Guard feature of Microsoft Windows operating systems allows you to securely isolate user credentials from the rest of the operating system. This protects guest VMs from credential theft attacks such as Pass-the-Hash or Pass-the-Ticket.

See the Microsoft documentation for more information about the Windows Defender Credential Guard security feature.

Windows Defender Credential Guard Architecture in AHV

Figure. Architecture Click to enlarge

Windows Defender Credential Guard uses Microsoft virtualization-based security to isolate user credentials in the virtualization-based security (VBS) module in AHV. When you enable Windows Defender Credential Guard on an AHV guest VM, the guest VM runs both the Windows OS and the VBS module on top of AHV. Each Windows guest VM that has Credential Guard enabled has its own VBS module to securely store credentials.

Windows Defender Credential Guard Requirements

Ensure the following to enable Windows Defender Credential Guard:

  1. The AOS, AHV, and Windows versions support Windows Defender Credential Guard:
    • AOS version must be 5.19 or later.
    • AHV version must be 20201007.1 or later.
    • The guest OS must be Windows Server 2016 or later, or Windows 10 Enterprise or later.
  2. UEFI, Secure Boot, and machine type q35 are enabled in the Windows VM from AOS.

    The Prism Element workflow to enable Windows Defender Credential Guard includes the workflow to enable these features.
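Before enabling the feature, it can help to confirm the cluster meets the AOS minimum listed above. This sketch compares version strings with version-aware sorting; the current version value is a placeholder, so substitute the actual version reported by Prism or `ncli cluster info`.

```shell
# Sketch: check a cluster's AOS version against the 5.19 minimum for
# Credential Guard. `current_aos` is illustrative; obtain the real value
# from Prism or `ncli cluster info`.
min_aos="5.19"
current_aos="5.20"
highest=$(printf '%s\n%s\n' "$min_aos" "$current_aos" | sort -V | tail -n 1)
if [ "$highest" = "$current_aos" ]; then
  echo "AOS ${current_aos} meets the ${min_aos} minimum"
fi
```

`sort -V` orders dotted version strings numerically per component, which avoids the pitfalls of plain lexicographic comparison (for example, "5.9" vs "5.19").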

Limitations

  • Windows Defender Credential Guard is not supported on hosts with AMD CPUs.
  • If you enable Windows Defender Credential Guard for your AHV guest VMs, the following optional configurations are not supported:

    • vTPM (Virtual Trusted Platform Modules) to store MS policies.
      Note: vTPM is supported with AOS 6.5.1 or later and AHV 20220304.242 or later release versions only.
    • DMA protection (vIOMMU).
    • Nutanix Live Migration.
    • Cross hypervisor DR of Credential Guard VMs.
Caution: Use of Windows Defender Credential Guard in your AHV clusters impacts VM performance. If you enable Windows Defender Credential Guard on AHV guest VMs, VM density drops by ~15–20%. This expected performance impact is due to nested virtualization overhead added as a result of enabling credential guard.

Enabling Windows Defender Credential Guard Support in AHV Guest VMs

You can enable Windows Defender Credential Guard when you are either creating a VM or updating a VM.

About this task

Perform the following procedure to enable Windows Defender Credential Guard:

Procedure

  1. Enable Windows Defender Credential Guard when you are either creating a VM or updating a VM. Do one of the following:
    • If you are creating a VM, see step 2.
    • If you are updating a VM, see step 3.
  2. If you are creating a Windows VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click Create VM .
    3. Fill in the mandatory fields to configure a VM.
    4. Under Boot Configuration , select UEFI , and then select the Secure Boot and Windows Defender Credential Guard options.
      Figure. Enable Windows Defender Credential Guard

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    5. Proceed to configure other attributes for your Windows VM.
      See Creating a Windows VM on AHV with Nutanix VirtIO for more information.
    6. Click Save .
    7. Turn on the VM.
  3. If you are updating an existing VM, do the following:
    1. Log on to the Prism Element web console.
    2. In the VM dashboard, click the Table view, select the VM, and click Update .
    3. Under Boot Configuration , select UEFI , and then select the Secure Boot and Windows Defender Credential Guard options.
      Note:

      If the VM is configured to use BIOS, install the guest OS again.

      If the VM is already configured to use UEFI, skip the step to select Secure Boot.

      See UEFI Support for VM and Secure Boot Support for VMs for more information about these features.

    4. Click Save .
    5. Turn on the VM.
  4. Enable Windows Defender Credential Guard in the Windows VM by using group policy.
    See the Enable Windows Defender Credential Guard by using the Group Policy procedure of the Manage Windows Defender Credential Guard topic in the Microsoft documentation to enable VBS, Secure Boot, and Windows Defender Credential Guard for the Windows VM.
  5. Open command prompt in the Windows VM and apply the Group Policy settings:
    > gpupdate /force

    If you have not enabled Windows Defender Credential Guard (step 4) and perform this step (step 5), a warning similar to the following is displayed:

    Updating policy...
     
    Computer Policy update has completed successfully.
     
    The following warnings were encountered during computer policy processing:
     
    Windows failed to apply the {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings. {F312195E-3D9D-447A-A3F5-08DFFA24735E} settings might have its own log file. Please click on the "More information" link.
    User Policy update has completed successfully.
     
    For more detailed information, review the event log or run GPRESULT /H GPReport.html from the command line to access information about Group Policy results.
    

    Event Viewer displays a warning for the group policy with an error message that indicates Secure Boot is not enabled on the VM.

    To view the warning message in Event Viewer, do the following:

    • In the Windows VM, open Event Viewer .
    • Go to Windows Logs -> System and click the warning with the Source as GroupPolicy (Microsoft-Windows-GroupPolicy) and Event ID as 1085 .
    Figure. Warning in Event Viewer

    Note: Ensure that you follow the steps in the order that is stated in this document to successfully enable Windows Defender Credential Guard.
  6. Restart the VM.
  7. Verify if Windows Defender Credential Guard is enabled in your Windows VM.
    1. Start a Windows PowerShell terminal.
    2. Run the following command.
      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'

      An output similar to the following is displayed.

      PS > Get-CimInstance -ClassName Win32_DeviceGuard -Namespace 'root\Microsoft\Windows\DeviceGuard'
      AvailableSecurityProperties              	: {1, 2, 3, 5}
      CodeIntegrityPolicyEnforcementStatus     	: 0
      InstanceIdentifier                       	: 4ff40742-2649-41b8-bdd1-e80fad1cce80
      RequiredSecurityProperties               	: {1, 2}
      SecurityServicesConfigured               	: {1}
      SecurityServicesRunning                  	: {1}
      UsermodeCodeIntegrityPolicyEnforcementStatus : 0
      Version                                  	: 1.0
      VirtualizationBasedSecurityStatus        	: 2
      PSComputerName 
      

      Confirm that both SecurityServicesConfigured and SecurityServicesRunning have the value { 1 } .

    Alternatively, you can verify if Windows Defender Credential Guard is enabled by using System Information (msinfo32):

    1. In the Windows VM, open System Information by typing msinfo32 in the search field next to the Start menu.
    2. Verify if the values of the parameters are as indicated in the following screen shot:
      Figure. Verify Windows Defender Credential Guard

Windows Subsystem for Linux (WSL2) Support on AHV

AHV supports WSL2, which enables you to run a Linux environment on a Windows OS without a dedicated VM or a dual-boot setup.

For more information about WSL, refer to What is the Windows Subsystem for Linux? topic in Microsoft Technical Documentation.
Note:
  • Both hardware and software support are required to enable a guest VM to communicate with its nested guest VMs in a WSL2 setup.
  • System performance can be affected in a WSL2 environment by specific workloads and by the lack of processor hardware features that enhance the virtualization environment.
  • VM live migration is currently not supported for WSL. You must power off the VM during any AOS or AHV upgrades.

Limitations

The following table lists the limitations that apply to WSL2, by AOS and AHV version:

Table 1. Limitations for WSL2

AOS 6.5.1 with AHV 20201105.30411 (the default AHV bundled with AOS 6.5.1): the following optional configurations are not supported:

  • vTPM (Virtual Trusted Platform Modules) to store MS policies
  • Hosts with AMD CPUs
  • DMA protection (vIOMMU)
  • Nutanix Live Migration
  • Cross hypervisor DR of WSL2 VMs

AOS 6.5.1 or later with AHV 20220304.242 or later: the following optional configurations are not supported:

  • Hosts with AMD CPUs
  • DMA protection (vIOMMU)
  • Nutanix Live Migration
  • Cross hypervisor DR of WSL2 VMs

Enabling WSL2 on AHV

This section describes how to enable WSL2 on AHV.

Before you begin

Ensure that AOS 6.5.1 or later and AHV 20220304.242 or later are deployed at your site.

About this task

Note:

In the following procedure, ensure that you replace the <VM_name> with the actual guest VM name.

To configure WSL2 on AHV:

Procedure

  1. Power off the guest VM on which you want to configure WSL2.
  2. Log on to any CVM in the cluster with SSH.
  3. Run the following command to enable the guest VM to support WSL2:
    nutanix@CVM~ $ acli vm.update <VM_name> hardware_virtualization=true
    Note: In case you need to create a new guest VM (Windows VM) on AHV with Nutanix VirtIO, see Creating a Windows VM on AHV with Nutanix VirtIO.
  4. (Optional) Retrieve the guest VM details using the following command to check whether all the attributes are correctly set for the guest VM:
    nutanix@CVM~ $ acli vm.get <VM_name>

    Observe the following log attributes to verify whether the infrastructure to support WSL2 is configured successfully in the guest VM:

    hardware_virtualization: True
  5. Power on the guest VM using the following command:
    nutanix@CVM~ $ acli vm.on <VM_name>
  6. Enable WSL2 on the Windows OS. For information on how to install WSL, refer to the Install WSL topic in the Microsoft Technical Documentation.
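The CVM-side steps of the procedure above can be collected into one reviewable sequence. In this sketch, the VM name is illustrative and the aCLI commands (including `vm.off` for the initial power-off) are printed for review rather than executed, so the order can be checked before running them on a cluster.

```shell
# Sketch: CVM-side command sequence for enabling WSL2 support on a guest
# VM. VM name is illustrative; commands are printed, not executed.
VM_NAME="win11-wsl"
wsl2_enable_plan() {
  echo "acli vm.off ${VM_NAME}"                                    # power off the guest
  echo "acli vm.update ${VM_NAME} hardware_virtualization=true"    # enable nested virtualization
  echo "acli vm.get ${VM_NAME}"                                    # verify hardware_virtualization: True
  echo "acli vm.on ${VM_NAME}"                                     # power the guest back on
}
wsl2_enable_plan
```

After running the real commands on a CVM, check the `acli vm.get` output for `hardware_virtualization: True` before enabling WSL2 inside Windows.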

Affinity Policies for AHV

As an administrator of an AHV cluster, you can create VM-Host affinity policies for virtual machines on an AHV cluster. By defining these policies, you can control the placement of virtual machines on the hosts within a cluster.

Note: VMs with Host affinity policies can only be migrated to the hosts specified in the affinity policy. If only one host is specified, the VM cannot be migrated or started on another host during an HA event. For more information, see Non-Migratable Hosts.

You can define affinity policies for a VM at two levels:

Affinity Policies defined in Prism Element

In Prism Element, you can define affinity policies at VM level during the VM create or update operation. You can use an affinity policy to specify that a particular VM can only run on the members of the affinity host list.

Affinity Policies defined in Prism Central

In Prism Central, you can define category-based VM-Host affinity policies, where a set of VMs can be affined to run only on a particular set of hosts. Category-based affinity policy enables you to easily manage affinities for a large number of VMs.

Affinity Policies Defined in Prism Element

In Prism Element, you can define scheduling policies for virtual machines on an AHV cluster at a VM level. By defining these policies, you can control the placement of a virtual machine on specific hosts within a cluster.

You can define two types of affinity policies in Prism Element.

VM-Host Affinity Policy

The VM-host affinity policy controls the placement of a VM. You can use this policy to specify that a selected VM can only run on the members of the affinity host list. This policy checks and enforces where a VM can be hosted when you power on or migrate the VM.
Note:
  • If you apply the VM-host affinity policy, it limits Acropolis HA and Acropolis Dynamic Scheduling (ADS): because the policy is mandatorily enforced, a virtual machine cannot be powered on or migrated to a host that does not conform to the requirements of the affinity policy.
  • The VM-host anti-affinity policy is not supported.
  • VMs configured with host affinity settings retain these settings if the VM is migrated to a new cluster. Remove the VM-host affinity policies from a VM before migrating it to another cluster: the VM retains the UUID of the original host, which prevents the VM from restarting on the destination cluster. Protecting such VMs succeeds, but some disaster recovery operations, such as migration, fail, and attempts to power on these VMs also fail.

You can define the VM-host affinity policies by using Prism Element during the VM create or update operation. For more information, see Creating a VM (AHV).
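Given the migration note above, clearing a VM's host affinity before moving it to another cluster can also be done from a CVM. The `vm.affinity_unset` command name below is assumed from aCLI and should be verified on your AOS version; the VM name is illustrative and the command is composed for review rather than executed.

```shell
# Sketch: compose the aCLI command that clears host affinity for a VM
# before cross-cluster migration. Command name assumed from aCLI; verify
# it on your AOS version. VM name is illustrative.
vm="app-vm-01"
cmd="acli vm.affinity_unset ${vm}"
echo "${cmd}"
```

Alternatively, you can clear the affinity host list through the VM update dialog in Prism Element before starting the migration.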

VM-VM Anti-Affinity Policy

You can use this policy to specify anti-affinity between virtual machines. The VM-VM anti-affinity policy keeps the specified virtual machines apart, so that if a problem occurs on one host, you do not lose both virtual machines. However, this is a preferential policy: it does not prevent the Acropolis Dynamic Scheduling (ADS) feature from taking necessary action in case of resource constraints.
Note:
  • Currently, you can only define VM-VM anti-affinity policy by using aCLI. For more information, see Configuring VM-VM Anti-Affinity Policy.
  • The VM-VM affinity policy is not supported.
Note: If a VM is cloned that has the affinity policies configured, then the policies are not automatically applied to the cloned VM. However, if a VM is restored from a DR snapshot, the policies are automatically applied to the VM.

Limitations of Affinity Rules

Even if a host is removed from a cluster, the host UUID is not removed from the host-affinity list of a VM.

Configuring VM-VM Anti-Affinity Policy

To configure VM-VM anti-affinity policies, you must first define a group and then add all the VMs on which you want to define VM-VM anti-affinity policy.

About this task

Note: Currently, the VM-VM affinity policy is not supported.

Perform the following procedure to configure the VM-VM anti-affinity policy.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Create a group.
    nutanix@cvm$ acli vm_group.create group_name

    Replace group_name with the name of the group.

  3. Add the VMs on which you want to define anti-affinity to the group.
    nutanix@cvm$ acli vm_group.add_vms group_name vm_list=vm_name

    Replace group_name with the name of the group. Replace vm_name with the name of the VMs that you want to define anti-affinity on. In case of multiple VMs, you can specify comma-separated list of VM names.

  4. Configure VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_set group_name

    Replace group_name with the name of the group.

    After you configure the group and power on the VMs, the VMs in the group attempt to start on different hosts. However, this is a preferential policy: it does not prevent the Acropolis Dynamic Scheduling (ADS) feature from taking necessary action in case of resource constraints.
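The three-step workflow above can be reviewed as one command sequence before running it. The group and VM names in this sketch are illustrative, and the aCLI commands are printed for review rather than executed.

```shell
# Sketch: the VM-VM anti-affinity workflow as one reviewable sequence.
# Group and VM names are illustrative; commands are printed, not executed.
group="web-tier"
vms="web01,web02,web03"
antiaffinity_plan() {
  echo "acli vm_group.create ${group}"                      # create the VM group
  echo "acli vm_group.add_vms ${group} vm_list=${vms}"      # add the VMs to keep apart
  echo "acli vm_group.antiaffinity_set ${group}"            # apply the anti-affinity policy
}
antiaffinity_plan
```

Run the real commands in this order from a CVM; the policy takes effect the next time the grouped VMs are powered on.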

Removing VM-VM Anti-Affinity Policy

Perform the following procedure to remove the VM-VM anti-affinity policy.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Remove the VM-VM anti-affinity policy.
    nutanix@cvm$ acli vm_group.antiaffinity_unset group_name

    Replace group_name with the name of the group.

    The VM-VM anti-affinity policy is removed for the VMs in the group, and they can start on any host during the next power-on operation (as determined by the ADS feature).

Affinity Policies Defined in Prism Central

In Prism Central, you can define category-based VM-Host affinity policies, where a set of VMs can be affined to run only on a particular set of hosts. Category-based affinity policy enables you to easily manage affinities for a large number of VMs. In case of any changes to the affined hosts, you only need to update the category of the host, and it updates the affinity policy for all the affected VMs.

This policy checks and enforces where a VM can be hosted when you start or migrate the VM. If there are no resources available on any of the affined hosts, the VM does not get started.

Note:

If you create a VM-Host affinity policy for a VM that is configured for asynchronous replication, you must create similar categories and corresponding policies on the remote site as well. If you define similar categories and policies on the remote site, affinity policies will be applied when the VMs are migrated to the remote site.

Limitations of Affinity Policies

Affinity policies created in Prism Central have the following limitations:

  • Only a super admin can create, modify, or delete affinity policies.
  • The minimum supported versions for VM-Host affinity policies are version 6.1 for Prism Element and version 2022.1 for Prism Central.
  • You cannot apply VM-Host affinity policy on a VM that is enabled for synchronous replication. Also, you cannot enable synchronous replication on a VM that is associated with a VM-Host affinity policy.
  • Host category attachment or detachment takes around 5 minutes to get reflected in the applicable affinity policies.

    When you assign a category to a host and map the host category to the affinity policy, you can observe that the host count gets updated immediately on the Entities page.

    The following figure shows the host count on Entities page:

    Figure. Host Count - Entities Page

    However, the system takes approximately 5 minutes to update the host count on Affinity Policies page.
    Note: The delay in host count update is due to the usage of different APIs to derive the host count on Entities and Affinity Policies pages.

    The following figure shows the host count on Affinity Policies page after delay of approximately 5 minutes:

    Figure. Host Count - Affinity Policies Page

    For information about how to create a category, see Creating a Category topic in Prism Central Guide .

    For information about how to assign a category to host, see Associating Hosts with Categories.

    For information about how to create the affinity policy and map the host category to the affinity policy, see Creating an Affinity Policy.

Affinity Policy Configuration Workflow

About this task

To set up an affinity policy, do the following:

Procedure

  1. Create categories for the following entities:
    1. VMs
    2. Hosts
    For information about creating a category, see Creating a Category.
  2. Apply the VM categories to the VMs and host categories to the hosts.
    For information about associating categories with VMs, see Associating VMs with Categories topic in the Prism Central Guide (see Prism) . For information about associating categories with hosts, see Associating hosts with Categories .
  3. Create the affinity policy. See Creating an Affinity Policy topic in the Prism Central Guide (see Prism) .

Associating VMs with Categories

About this task

To associate categories with VMs, do the following:

Procedure

  1. In Prism Central, in the Entities menu, go to Compute & Storage > VMs .
  2. Select the VMs that you want to associate with a category.
  3. In the Actions menu that is displayed, click Manage Categories .
  4. In the Search for a category field, type the name of the category or value that you want to add, and then select the category and value from the list that is displayed.
  5. Click the add button to add more search fields, and repeat this step for each category that you want to add. To remove a category, click the remove button beside the field.
  6. Click Save .

Associating Hosts with Categories

About this task

To associate categories with hosts, do the following:

Procedure

  1. In Prism Central, in the Entities menu, go to Hardware > Hosts .
  2. Select the hosts that you want to associate with a category.
  3. In the Actions menu that is displayed, click Manage Categories .
  4. In the Search for a category field, type the name of the category or value that you want to add, and then select the category and value from the list that is displayed.
  5. Click the add button to add as many search fields and repeat this step for each category that you want to add. To remove a category, click the remove button beside the field.
  6. Click Save .

Creating an Affinity Policy

About this task

To create an affinity policy, do the following:

Before you begin

Complete the following tasks:
  • Configure the categories that you need to associate with the VMs. Associate this category with all the relevant VMs. Alternatively, you can associate the category with the VMs after creation of the affinity policy.
  • Configure the categories that you need to associate with the hosts. Associate this category with all the relevant hosts. Alternatively, you can associate the category with the hosts after creation of the affinity policy.
Note: If you have configured any legacy affinity policy (non-category-based affinity policy) associated with the VMs, you must first remove those legacy affinity policies to allow the creation of category-based affinity policies associated with the same VMs.

Procedure

  1. In the VMs Summary View, go to Policies > Affinity Policies , and then click Create .
  2. In the Create Affinity Policy page, enter a name and description (optional) for the policy.
    Figure. Create Affinity Policy

  3. Click inside the VM Categories search field and select the category you want to associate with the VMs.
  4. Click inside the Host Categories search field and select the category you associated with the hosts.
  5. Click Create .

Updating an Affinity Policy

You can update an affinity policy. Updates to an affinity policy can result in a policy violation. Prism Central attempts to correct the violation by executing a series of actions.

About this task

To update an affinity policy, do the following:

Procedure

  1. In the entity menu, go to Policies > Affinity Policies , and then click the affinity policy that you want to update.
  2. From the Actions menu that is displayed, click Update .
  3. The Update Affinity Policy page that is displayed includes the same options and settings as the Create Affinity Policy page.
    Figure. Update Affinity Policy

  4. Update the settings that you want, and then click Save .

Affinity Policies Summary View

To access the affinity policies dashboard, select Compute & Storage > VMs > Policies > Affinity Policies from the Entities menu in Prism Central .

Note: This section describes the information and options that appear in the affinity policies dashboard.
  • See Entity Exploring chapter in the Prism Central Guide (see Prism) for instructions on how to view and organize that information in various ways.
  • See Affinity Policies Defined in Prism Central topic in the Prism Central Guide (see Prism) for information on how to create or modify the affinity policies.

The affinity policies dashboard displays a list of current policies that includes the name and type for each policy.

Figure. Affinity Policies Dashboard

The following table describes the fields that appear in the affinity policies list. A dash (-) is displayed in a field when a value is not available or not applicable.

Table 1. Affinity Policies List Fields
Parameter Description Values
Name Displays the policy name. (name)
VMs Displays the count of VMs associated with this policy. (number of VMs)
Hosts Displays the count of hosts associated with this policy. (number of hosts)
VM Compliance Status Displays the compliance status of the VMs associated with this policy. If the policy is being applied and the compliance status is not yet known, the status is displayed as Pending.

If a VM is part of multiple VM-Host affinity policies, the oldest policy is applied on the VM. For the rest of the policies, the VM is displayed as non-compliant.

(number of VMs Compliant/Non Compliant/Pending)
Modified By Displays the name of the user who last modified the policy. (user)
Last Modified Displays the date and time when the policy was last modified. (date & time)
Affinity Policies Details View

To access the details page for an Affinity policy, click the desired policy name in the list (see Affinity Policies Summary View). The affinity policy details page includes the following tabs:

  • Summary : On the Summary tab, you can view the Overview , Associations , and Compliance Status sections. All three sections display information related to the policy.

    The Summary tab view also includes options to Update , Delete , and Re-Enforce the policy. If any of the VMs become non-compliant, you can use the Re-Enforce option to re-enforce the policy after fixing the cause of the non-compliance.

  • Entities : On the Entities tab, you can view the details of the VMs and Hosts entities that are associated with the policy. The details displayed on the Hosts tab include the host name, cluster name, and the category used to associate, while the details displayed on the VMs tab include the VM name, host name, cluster name, category used to associate, and VM compliance status. If a VM is non-compliant, the cause of non-compliance is also displayed along with the status.
Figure. Affinity Policies Details View

Non-Migratable Hosts

VMs with GPU, CPU passthrough, PCI passthrough, and host affinity policies are not migrated to other hosts in the cluster. Such VMs are treated differently in scenarios that require VMs to migrate to other hosts in the cluster.

Table 1. Scenarios Where VMs Are Required to Migrate to Other Hosts
Scenario Behavior
One-click upgrade VM is powered off.
Life-cycle management (LCM) Pre-check for LCM fails and the VMs are not migrated.
Rolling restart VM is powered off.
AHV host maintenance mode Use the tunable option to shut down the VMs while putting the node in maintenance mode. For more information, see Putting a Node into Maintenance Mode using CLI.

Performing Power Operations on VMs by Using Nutanix Guest Tools (aCLI)

You can initiate safe and graceful power operations such as soft shutdown and restart of the VMs running on the AHV hosts by using the aCLI. Nutanix Guest Tools (NGT) initiates and performs the soft shutdown and restart operations within the VM. This workflow ensures a safe and graceful shutdown or restart of the VM. You can create a pre-shutdown script that you can choose to run before a shutdown or restart of the VM. In the pre-shutdown script, include any tasks or checks that you want to run before a VM is shut down or restarted. You can choose to cancel the power operation if the pre-shutdown script fails. If the script fails, an alert (guest_agent_alert) is generated in the Prism web console.

Before you begin

Ensure that you have met the following prerequisites before you initiate the power operations:
  1. NGT is enabled on the VM. All operating systems that NGT supports are supported for this feature.
  2. NGT version running on the Controller VM and guest VM is the same.
  3. (Optional) If you want to run a pre-shutdown script, place the script in the following locations depending on your VMs:
    • Windows VMs: installed_dir\scripts\power_off.bat

      The file name of the script must be power_off.bat .

    • Linux VMs: installed_dir/scripts/power_off

      The file name of the script must be power_off .

About this task

Note: You can also perform these power operations by using the V3 API calls. For more information, see developer.nutanix.com.

Perform the following steps to initiate the power operations:

Procedure

  1. Log on to a Controller VM with SSH.
  2. Do one of the following:
    • Soft shut down the VM.
      nutanix@cvm$ acli vm.guest_shutdown vm_name enable_script_exec=[true or false] fail_on_script_failure=[true or false]

      Replace vm_name with the name of the VM.

    • Restart the VM.
      nutanix@cvm$ acli vm.guest_reboot vm_name enable_script_exec=[true or false] fail_on_script_failure=[true or false]

      Replace vm_name with the name of the VM.

    Set the value of enable_script_exec to true to run your pre-shutdown script and set the value of fail_on_script_failure to true to cancel the power operation if the pre-shutdown script fails.
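For Linux VMs, a minimal pre-shutdown script might look like the following sketch. The helper name pre_shutdown and the service name in the comment are illustrative, not part of the NGT interface; NGT only requires that the file be named power_off and that a non-zero exit status signal failure.

```shell
#!/bin/sh
# Hypothetical pre-shutdown script; save as installed_dir/scripts/power_off.
# NGT runs it before the guest shutdown or restart when
# enable_script_exec=true is passed to acli.
pre_shutdown() {
    sync                                  # flush filesystem buffers
    # A real script would stop services and run checks here, for example:
    #   systemctl stop myapp.service || return 1
    return 0
}

# A non-zero exit status marks the script as failed; with
# fail_on_script_failure=true the power operation is then cancelled.
pre_shutdown
```

On Windows VMs, the equivalent logic goes in installed_dir\scripts\power_off.bat.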

UEFI Support for VM

UEFI firmware is a successor to legacy BIOS firmware that supports larger hard drives, faster boot time, and provides more security features.

VMs with UEFI firmware have the following advantages:

  • Boot faster
  • Avoid legacy option ROM address constraints
  • Include robust reliability and fault management
  • Use UEFI drivers
Note:
  • Nutanix supports the starting of VMs with UEFI firmware in an AHV cluster. However, if a VM is added to a protection domain and later restored on a different cluster, the VM loses boot configuration. To restore the lost boot configuration, see Setting up Boot Device.
  • Nutanix also provides limited support for VMs migrated from a Hyper-V cluster.

You can create or update VMs with UEFI firmware by using aCLI commands, the Prism Element web console, or the Prism Central web console. For more information about creating a VM by using the Prism Element web console or Prism Central web console, see Creating a VM (AHV). For information about creating a VM by using aCLI, see Creating UEFI VMs by Using aCLI.

Note: If you are creating a VM by using aCLI commands, you can define the location of the storage container for UEFI firmware and variables. The Prism Element and Prism Central web consoles do not provide an option to define the storage container that stores the UEFI firmware and variables.

For more information about the supported OSes for the guest VMs, see the AHV Guest OS section in the Compatibility and Interoperability Matrix document.

Creating UEFI VMs by Using aCLI

In AHV clusters, you can create a virtual machine (VM) to start with UEFI firmware by using Acropolis CLI (aCLI). This topic describes the procedure to create a VM by using aCLI. See the "Creating a VM (AHV)" topic for information about how to create a VM by using the Prism Element web console.

Before you begin

Ensure that the VM has an empty vDisk.

About this task

Perform the following procedure to create a UEFI VM by using aCLI:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. Create a UEFI VM.
    
    nutanix@cvm$ acli vm.create vm-name uefi_boot=true
    A VM is created with UEFI firmware. Replace vm-name with a name of your choice for the VM. By default, the UEFI firmware and variables are stored in an NVRAM container. To specify the storage container in which the UEFI firmware and variables are stored, run the following command instead.
    nutanix@cvm$ acli vm.create vm-name uefi_boot=true nvram_container=NutanixManagementShare
    Replace NutanixManagementShare with a storage container in which you want to store the UEFI variables.
    The UEFI variables are stored in a default NVRAM container. Nutanix recommends that you choose a storage container with a replication factor of at least 2 (RF2) to ensure VM high availability in node failure scenarios. For more information about the RF2 storage policy, see Failure and Recovery Scenarios in the Prism Web Console Guide document.
    Note: When you update the location of the storage container, clear the UEFI configuration and update the location of nvram_container to a container of your choice.
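As a sketch, creating a UEFI VM with an explicit NVRAM container and then inspecting its configuration might look as follows. The VM name uefi-vm01 is a placeholder, and the exact fields shown by vm.get vary by AOS release:

```
nutanix@cvm$ acli vm.create uefi-vm01 uefi_boot=true nvram_container=NutanixManagementShare
nutanix@cvm$ acli vm.get uefi-vm01
```

Check the boot section of the vm.get output to confirm that UEFI boot is enabled before installing the guest OS.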

What to do next

Go to the UEFI BIOS menu and configure the UEFI firmware settings. For more information about accessing and setting the UEFI firmware, see Getting Familiar with UEFI Firmware Menu.

Getting Familiar with UEFI Firmware Menu

After you launch a VM console from the Prism Element web console, the UEFI firmware menu allows you to do the following tasks for the VM.

  • Changing default boot resolution
  • Setting up boot device
  • Changing the boot time-out value

Changing Boot Resolution

You can change the default boot resolution of your Windows VM from the UEFI firmware menu.

Before you begin

Ensure that the VM is powered on.

About this task

Perform the following procedure to change the default boot resolution of your Windows VM by using the UEFI firmware menu.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide .
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter the UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes downtime. Reset the VM only during off-production hours or during a maintenance period.
    Figure. UEFI Firmware Menu

  4. Use the up or down arrow key to go to Device Manager and press Enter .
    The Device Manager page appears.
  5. In the Device Manager screen, use the up or down arrow key to go to OVMF Platform Configuration and press Enter .
    Figure. OVMF Settings

    The OVMF Settings page appears.
  6. In the OVMF Settings page, use the up or down arrow key to go to the Change Preferred field and use the right or left arrow key to increase or decrease the boot resolution.
    The default boot resolution is 1280 x 1024.
  7. Do one of the following.
    • To save the changed resolution, press the F10 key.
    • To go back to the previous screen, press the Esc key.
  8. Select Reset and click Submit in the Power off/Reset dialog box to restart the VM.
    After you restart the VM, the OS displays the changed resolution.

Setting up Boot Device

You cannot set the boot order for UEFI VMs by using the aCLI, Prism Central web console, or Prism Element web console. You can change the boot device for a UEFI VM by using the UEFI firmware menu.

Before you begin

Ensure that the VM is powered on.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide .
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter the UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes downtime. Reset the VM only during off-production hours or during a maintenance period.
  4. Use the up or down arrow key to go to Boot Manager and press Enter .
    The Boot Manager screen displays the list of available boot devices for the VM.
    Figure. Boot Manager

  5. In the Boot Manager screen, use the up or down arrow key to select the boot device and press Enter .
    The boot device is saved. After you select and save the boot device, the VM boots up with the new boot device.
  6. To go back to the previous screen, press Esc .

Changing Boot Time-Out Value

The boot time-out value determines how long (in seconds) the boot menu is displayed before the default boot entry is loaded. This topic describes the procedure to change the default boot time-out value of 0 seconds.

About this task

Ensure that the VM is powered on.

Procedure

  1. Log on to the Prism Element web console.
  2. Launch the console for the VM.
    For more details about launching console for the VM, see Managing a VM (AHV) section in Prism Web Console Guide .
  3. To go to the UEFI firmware menu, press the F2 key on your keyboard.
    Tip: To enter the UEFI menu, open the VM console, select Reset in the Power off/Reset VM dialog box, and immediately press F2 when the VM starts to boot.
    Important: Resetting the VM causes downtime. Reset the VM only during off-production hours or during a maintenance period.
  4. Use the up or down arrow key to go to Boot Maintenance Manager and press Enter .
    Figure. Boot Maintenance Manager

  5. In the Boot Maintenance Manager screen, use the up or down arrow key to go to the Auto Boot Time-out field.
    The default boot time-out value is 0 seconds.
  6. In the Auto Boot Time-out field, enter the time-out value and press Enter .
    Note: The valid boot time-out value ranges from 1 second to 9 seconds.
    The boot time-out value is changed. The VM starts after the defined time-out elapses.
  7. To go back to the previous screen, press Esc .

Secure Boot Support for VMs

The pre-operating system environment is vulnerable to attacks by malicious loaders. Secure Boot addresses this vulnerability by using policies and certificates present in the UEFI firmware to ensure that only properly signed and authenticated components are allowed to execute.

Supported Operating Systems

For more information about the supported OSes for the guest VMs, see the AHV Guest OS section in the Compatibility and Interoperability Matrix document.

Secure Boot Considerations

This section provides the limitations and requirements to use Secure Boot.

Limitations

Secure Boot for guest VMs has the following limitations:

  • Nutanix does not support converting a VM that uses IDE disks or legacy BIOS to VMs that use Secure Boot.
  • The minimum supported version of the Nutanix VirtIO package for Secure boot-enabled VMs is 1.1.6.
  • Secure boot VMs do not permit CPU, memory, or PCI disk hot plug.

Requirements

Following are the requirements for Secure Boot:

  • Secure Boot is supported only on the Q35 machine type.

Creating/Updating a VM with Secure Boot Enabled

You can enable Secure Boot with UEFI firmware, either while creating a VM or while updating a VM by using aCLI commands or Prism Element web console.

See Creating a VM (AHV) for instructions about how to enable Secure Boot by using the Prism Element web console.

Creating a VM with Secure Boot Enabled

About this task

To create a VM with Secure Boot enabled:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. To create a VM with Secure Boot enabled:
    nutanix@cvm$ acli vm.create <vm_name> secure_boot=true machine_type=q35
    Note: Specifying the machine type is required to enable the secure boot feature. UEFI is enabled by default when the Secure Boot feature is enabled.

Updating a VM to Enable Secure Boot

About this task

To update a VM to enable Secure Boot:

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. To update a VM to enable Secure Boot, ensure that the VM is powered off.
    nutanix@cvm$ acli vm.update <vm_name> secure_boot=true machine_type=q35
    Note:
    • If you disable only the Secure Boot flag, the machine type remains q35 unless you explicitly change it.
    • UEFI is enabled by default when the Secure Boot feature is enabled. Disabling Secure Boot does not revert the UEFI flags.
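Putting the steps together, a typical update sequence might look like this sketch, which powers the VM off, enables Secure Boot, and powers it back on. The VM name secure-vm01 is a placeholder, and vm.off and vm.on are the standard aCLI power commands:

```
nutanix@cvm$ acli vm.off secure-vm01
nutanix@cvm$ acli vm.update secure-vm01 secure_boot=true machine_type=q35
nutanix@cvm$ acli vm.on secure-vm01
```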

Virtual Machine Network Management

Virtual machine network management involves configuring connectivity for guest VMs through virtual switches, VLANs, and VPCs.

For information about creating or updating a virtual switch and other VM network options, see Network and Security Management in Prism Central Guide . Virtual switch creation and updates are also covered in Network Management in Prism Web Console Guide .

Virtual Machine Memory and CPU Hot-Plug Configurations

Memory and CPUs are hot-pluggable on guest VMs running on AHV. While a VM is powered on, you can increase its memory allocation and change its number of vCPUs (sockets). However, you cannot change the number of cores per socket while the VM is powered on.

Note: You cannot decrease the memory allocation and the number of CPUs on your VMs while the VMs are powered on.

You can change the memory and CPU configuration of your VMs by using the Acropolis CLI (aCLI) (see Managing a VM (AHV) in the Prism Web Console Guide or see Managing a VM (AHV) and Managing a VM (Self Service) in the Prism Central Guide).

See the AHV Guest OS Compatibility Matrix for information about operating systems on which you can hot plug memory and CPUs.

Memory OS Limitations

  1. On Linux operating systems, the Linux kernel might not make the hot-plugged memory online. If the memory is not online, you cannot use the new memory. Perform the following procedure to make the memory online.
    1. Identify the memory block that is offline.

      Display the state of a specific memory block.

      $ cat /sys/devices/system/memory/memoryXXX/state 
      

      Display the state of all memory blocks (the pattern "line" matches both online and offline).

      $ grep line /sys/devices/system/memory/*/state 
      
    2. Make the memory online.
      $ echo online > /sys/devices/system/memory/memoryXXX/state 
      
  2. If your VM has CentOS 7.2 as the guest OS and less than 3 GB of memory, hot plugging more memory to that VM so that the final memory size is greater than 3 GB results in a memory-overflow condition. To resolve the issue, restart the guest OS (CentOS 7.2) with the following kernel parameter:
    swiotlb=force 
    

CPU OS Limitation

On CentOS operating systems, if the hot-plugged CPUs are not displayed in /proc/cpuinfo , you might have to bring the CPUs online. For each hot-plugged CPU, run the following command to bring the CPU online.

$ echo 1 > /sys/devices/system/cpu/cpu<n>/online  

Replace <n> with the number of the hot plugged CPU.
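The per-block and per-CPU steps above can be scripted. The following sketch is one way to do it, not a Nutanix-provided tool; the sysfs roots are parameterized so the logic can be exercised against a scratch directory, and on a real guest you would run the functions as root with no arguments:

```shell
#!/bin/sh
# Bring all offline memory blocks online (run as root in the guest).
online_all_memory() {
    sysmem="${1:-/sys/devices/system/memory}"
    for state in "$sysmem"/memory*/state; do
        [ -f "$state" ] || continue
        if grep -q offline "$state"; then
            echo online > "$state"     # ask the kernel to online this block
        fi
    done
}

# Bring all offline (hot-plugged) CPUs online (run as root in the guest).
online_all_cpus() {
    syscpu="${1:-/sys/devices/system/cpu}"
    for f in "$syscpu"/cpu[0-9]*/online; do
        [ -f "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            echo 1 > "$f"              # mark this CPU online
        fi
    done
}
```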

Hot-Plugging the Memory and CPUs on Virtual Machines (AHV)

About this task

Perform the following procedure to hot plug the memory and CPUs on the AHV VMs.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Update the memory allocation for the VM.
    nutanix@cvm$ acli vm.update vm-name memory=new_memory_size 
    

    Replace vm-name with the name of the VM and new_memory_size with the memory size.

  3. Update the number of CPUs on the VM.
    nutanix@cvm$ acli vm.update vm-name num_vcpus=n 
    

    Replace vm-name with the name of the VM and n with the number of CPUs.
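For example, to grow a hypothetical VM named myvm to 16 GB of memory and 8 vCPUs while it is running (hot plug can only increase these values):

```
nutanix@cvm$ acli vm.update myvm memory=16G
nutanix@cvm$ acli vm.update myvm num_vcpus=8
```

On Linux guests, remember to bring the new memory and CPUs online as described in the OS limitations above.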

Virtual Machine Memory Management (vNUMA)

AHV hosts support Virtual Non-uniform Memory Access (vNUMA) on virtual machines. You can enable vNUMA on VMs when you create or modify the VMs to optimize memory performance.

Non-uniform Memory Access (NUMA)

In a NUMA topology, the memory access times of a VM depend on the memory location relative to a processor. A VM accesses memory local to a processor faster than the non-local memory. If the VM uses both CPU and memory from the same physical NUMA node, you can achieve optimal resource utilization. If you are running the CPU on one NUMA node (for example, node 0) and the VM accesses the memory from another node (node 1) then memory latency occurs. Ensure that the virtual topology of VMs matches the physical hardware topology to achieve minimum memory latency.

Virtual Non-uniform Memory Access (vNUMA)

vNUMA optimizes the memory performance of virtual machines that require more vCPUs or memory than the capacity of a single physical NUMA node. In a vNUMA topology, you can create multiple vNUMA nodes where each vNUMA node includes vCPUs and virtual RAM. When you assign a vNUMA node to a physical NUMA node, the vCPUs can intelligently determine the memory latency (high or low). Low memory latency within a vNUMA node results in low latency in the physical NUMA node as well.

vNUMA vCPU hard-pinning

When you configure NUMA and hyper-threading, you ensure that the VM can schedule on virtual peers. You also expose the NUMA topology to the VM. While this configuration helps you limit the amount of memory that is available to each virtual NUMA node, the distribution underneath, in the hardware, still occurs randomly.

Enable virtual CPU (vCPU) hard-pinning in the topology to define which NUMA node the vCPUs (and hyper-threads or peers) are located on and how much memory that NUMA node has. vCPU hard-pinning also allows you to see a proper mapping of vCPU to CPU set (virtual CPU to physical core or hyper-thread). It ensures that a VM is never scheduled on a different core or peer that is not defined in the hard-pin configuration. It also results in memory being allocated and distributed correctly across the configured mapping.

While vCPU hard-pinning gives a benefit to scheduling operations and memory operations, it also has a couple of caveats.

  • Acropolis Dynamic Scheduling (ADS) is not NUMA aware, so the high availability (HA) process is not NUMA aware. This lack of awareness can lead to potential issues when a host fails.

  • When you start a VM, a background process zeroes out the memory pages allocated to the VM. The more memory a VM has, the longer this process takes. Consider a deployment with 10 VMs: nine have 4 GB of RAM and one has 4.5 TB of RAM. The process takes perhaps a couple of seconds on the smaller VMs but potentially a couple of minutes on the large one. This time lag can lead to a problem: the smaller VMs are already running on a socket, and when the large-memory VM tries to power on, that socket or its cores are unavailable. The unavailability can result in a boot failure and an error message when starting the VM.

    The workaround is to use affinity rules and ensure that large VMs that have vCPU hard-pinning configured have a failover node available to them, with a different affinity rule for the non-pinned VMs.

For information about configuring vCPU hard-pinning, see Enabling vNUMA on Virtual Machines.

Enabling vNUMA on Virtual Machines

Before you begin

Before you enable vNUMA, see AHV Best Practices Guide under Solutions Documentation .

About this task

Perform the following procedure to enable vNUMA on your VMs running on the AHV hosts.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Check how many NUMA nodes are available on each AHV host in the cluster.
    nutanix@cvm$ hostssh "numactl --hardware"

    The console displays an output similar to the following:

    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
    node 0 size: 128837 MB
    node 0 free: 862 MB
    node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
    node 1 size: 129021 MB
    node 1 free: 352 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128859 MB
    node 0 free: 1076 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129000 MB
    node 1 free: 436 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128859 MB
    node 0 free: 701 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129000 MB
    node 1 free: 357 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128838 MB
    node 0 free: 1274 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129021 MB
    node 1 free: 424 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10
    ============= 10.x.x.x ============
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
    node 0 size: 128837 MB
    node 0 free: 577 MB
    node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
    node 1 size: 129021 MB
    node 1 free: 612 MB
    node distances:
    node   0   1
      0:  10  21
      1:  21  10

    The example output shows that each AHV host has two NUMA nodes.

  3. Do one of the following:
    • Enable vNUMA if you are creating a VM.
      nutanix@cvm$ acli vm.create <vm_name> num_vcpus=x \
      num_cores_per_vcpu=x memory=xG \
      num_vnuma_nodes=x
    • Enable vNUMA if you are modifying an existing VM.
      nutanix@cvm$ acli vm.update <vm_name> \
      num_vnuma_nodes=x
    Replace <vm_name> with the name of the VM on which you want to enable vNUMA. Replace x with appropriate values for the following parameters:
    • num_vcpus : Type the number of vCPUs for the VM.
    • num_cores_per_vcpu : Type the number of cores per vCPU.
    • memory : Type the memory in GB for the VM.
    • num_vnuma_nodes : Type the number of vNUMA nodes for the VM.

    For example:

    nutanix@cvm$ acli vm.create test_vm num_vcpus=20 memory=150G num_vnuma_nodes=2

    This command creates a VM with 2 vNUMA nodes, 10 vCPUs and 75 GB memory for each vNUMA node.

What to do next

To configure vCPU hard-pinning on existing VMs, do the following:
nutanix@cvm$ acli vm.update <vm_name> num_vcpus=x
nutanix@cvm$ acli vm.update <vm_name> num_cores_per_vcpu=x
nutanix@cvm$ acli vm.update <vm_name> num_threads_per_core=x
nutanix@cvm$ acli vm.update <vm_name> num_vnuma_nodes=x
nutanix@cvm$ acli vm.update <vm_name> vcpu_hard_pin=true

For example,

nutanix@cvm$ acli vm.update <vm_name> num_vcpus=3
nutanix@cvm$ acli vm.update <vm_name> num_cores_per_vcpu=28
nutanix@cvm$ acli vm.update <vm_name> num_threads_per_core=2
nutanix@cvm$ acli vm.update <vm_name> num_vnuma_nodes=3
nutanix@cvm$ acli vm.update <vm_name> vcpu_hard_pin=true
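After the VM boots, you can check from inside the guest that the virtual topology matches what you configured. As a sketch, assuming the numactl package is installed in the guest:

```
guest$ numactl --hardware
```

The output should list one NUMA node per configured num_vnuma_nodes value, with the vCPUs and memory split across the nodes.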

GPU and vGPU Support

AHV supports GPU-accelerated computing for guest VMs. You can configure either GPU pass-through or a virtual GPU.
Note: You can configure either pass-through or a vGPU for a guest VM but not both.

This guide describes the concepts related to the GPU and vGPU support in AHV. For the configuration procedures, see the Prism Web Console Guide.

For driver installation instructions, see the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide .

Note: VMs with GPU are not migrated to other hosts in the cluster. For more information, see Non-Migratable Hosts.

Supported GPUs

The following GPUs are supported:
Note: These GPUs are supported only by the AHV version that is bundled with the AOS release.
  • NVIDIA® Ampere® A10
  • NVIDIA® Ampere® A16
  • NVIDIA® Ampere® A30
  • NVIDIA® Ampere® A40
  • NVIDIA® Ampere® A100
  • NVIDIA® Quadro® RTX 6000
  • NVIDIA® Quadro® RTX 8000
  • NVIDIA® Tesla® M10
  • NVIDIA® Tesla® M60
  • NVIDIA® Tesla® P40
  • NVIDIA® Tesla® P100
  • NVIDIA® Tesla® P4
  • NVIDIA® Tesla® T4 16 GB
  • NVIDIA® Tesla® V100 16 GB
  • NVIDIA® Tesla® V100 32 GB
  • NVIDIA® Tesla® V100S 32 GB

GPU Pass-Through for Guest VMs

AHV hosts support GPU pass-through for guest VMs, allowing applications on VMs direct access to GPU resources. The Nutanix user interfaces provide a cluster-wide view of GPUs, allowing you to allocate any available GPU to a VM. You can also allocate multiple GPUs to a VM. However, in a pass-through configuration, only one VM can use a GPU at any given time.

Host Selection Criteria for VMs with GPU Pass-Through

When you power on a VM with GPU pass-through, the VM is started on the host that has the specified GPU, provided that the Acropolis Dynamic Scheduler determines that the host has sufficient resources to run the VM. If the specified GPU is available on more than one host, the Acropolis Dynamic Scheduler ensures that a host with sufficient resources is selected. If sufficient resources are not available on any host with the specified GPU, the VM is not powered on.

If you allocate multiple GPUs to a VM, the VM is started on a host if, in addition to satisfying Acropolis Dynamic Scheduler requirements, the host has all of the GPUs that are specified for the VM.

If you want a VM to always use a GPU on a specific host, configure host affinity for the VM.
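The host-selection rule described above can be sketched as follows. This is a hypothetical model for illustration only (the host list, GPU names, and resource fields are invented), not the actual Acropolis Dynamic Scheduler implementation:

```python
def pick_host(hosts, requested_gpus, cpu_needed, mem_needed):
    """Return the name of the first eligible host, or None."""
    for host in hosts:
        # Each host maps GPU device name -> in-use flag; pass-through
        # requires every requested GPU to be present and free.
        free_gpus = {g for g, in_use in host["gpus"].items() if not in_use}
        if not set(requested_gpus) <= free_gpus:
            continue  # host lacks one of the specified GPUs, or it is in use
        if host["free_cpu"] >= cpu_needed and host["free_mem"] >= mem_needed:
            return host["name"]
    return None  # no eligible host: the VM is not powered on

hosts = [
    {"name": "host-1", "gpus": {"Tesla-P40": True}, "free_cpu": 8, "free_mem": 64},
    {"name": "host-2", "gpus": {"Tesla-P40": False, "Tesla-T4": False},
     "free_cpu": 16, "free_mem": 128},
]
print(pick_host(hosts, ["Tesla-P40"], cpu_needed=4, mem_needed=32))  # host-2
```

Note how host-1 is skipped even though it has the requested GPU model, because that GPU is already assigned to another VM.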

Support for Graphics and Compute Modes

AHV supports running GPU cards in either graphics mode or compute mode. If a GPU is running in compute mode, Nutanix user interfaces indicate the mode by appending the string compute to the model name. No string is appended if a GPU is running in the default graphics mode.

Switching Between Graphics and Compute Modes

If you want to change the mode of the firmware on a GPU, put the host in maintenance mode, and then flash the GPU manually by logging on to the AHV host and performing standard procedures as documented for Linux VMs by the vendor of the GPU card.

Typically, you restart the host immediately after you flash the GPU. After restarting the host, redo the GPU configuration on the affected VM, and then start the VM. For example, consider that you want to re-flash an NVIDIA Tesla® M60 GPU that is running in graphics mode. The Prism web console identifies the card as an NVIDIA Tesla M60 GPU. After you re-flash the GPU to run in compute mode and restart the host, redo the GPU configuration on the affected VMs by adding back the GPU, which is now identified as an NVIDIA Tesla M60.compute GPU, and then start the VM.

Supported GPU Cards

For a list of supported GPUs, see Supported GPUs.

Limitations

GPU pass-through support has the following limitations:

  • Live migration of VMs with a GPU configuration is not supported. Live migration of VMs is necessary when the BIOS, BMC, and the hypervisor on the host are being upgraded. During these upgrades, VMs that have a GPU configuration are powered off and then powered on automatically when the node is back up.
  • VM pause and resume are not supported.
  • You cannot hot add VM memory if the VM is using a GPU.
  • Hot add and hot remove support is not available for GPUs.
  • You can change the GPU configuration of a VM only when the VM is turned off.
  • The Prism web console does not support console access for VMs that are configured with GPU pass-through. Before you configure GPU pass-through for a VM, set up an alternative means to access the VM. For example, enable remote access over RDP.

    Removing GPU pass-through from a VM restores console access to the VM through the Prism web console.

Configuring GPU Pass-Through

For information about configuring GPU pass-through for guest VMs, see Creating a VM (AHV) in the "Virtual Machine Management" chapter of the Prism Web Console Guide.

NVIDIA GRID Virtual GPU Support on AHV

AHV supports NVIDIA GRID technology, which enables multiple guest VMs to use the same physical GPU concurrently. Concurrent use is made possible by dividing a physical GPU into discrete virtual GPUs (vGPUs) and allocating those vGPUs to guest VMs. Each vGPU is allocated a fixed range of the physical GPU's frame buffer and uses all the GPU processing cores in a time-sliced manner.

Virtual GPUs are of different types (vGPU types are also called vGPU profiles) and differ by the amount of physical GPU resources allocated to them and the class of workload that they target. The number of vGPUs into which a single physical GPU can be divided therefore depends on the vGPU profile that is used on a physical GPU.

Each physical GPU supports more than one vGPU profile, but a physical GPU cannot run multiple vGPU profiles concurrently. After a vGPU of a given profile is created on a physical GPU (that is, after a vGPU is allocated to a VM that is powered on), the GPU is restricted to that vGPU profile until it is freed up completely. To understand this behavior, consider that you configure a VM to use an M60-1Q vGPU. When the VM is powering on, it is allocated an M60-1Q vGPU instance only if a physical GPU that supports M60-1Q is either unused or already running the M60-1Q profile and can accommodate the requested vGPU.

If an entire physical GPU that supports M60-1Q is free at the time the VM is powering on, an M60-1Q vGPU instance is created for the VM on the GPU, and that profile is locked on the GPU. In other words, until the physical GPU is completely freed up again, only M60-1Q vGPU instances can be created on that physical GPU (that is, only VMs configured with M60-1Q vGPUs can use that physical GPU).
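The profile-locking behavior described above can be sketched as follows. This is a hypothetical illustration (the PhysicalGpu class and the per-profile capacities are invented), not NVIDIA or Nutanix code:

```python
class PhysicalGpu:
    def __init__(self, supported_profiles):
        self.supported = supported_profiles  # profile -> max vGPUs per GPU
        self.active_profile = None           # None means the GPU is free
        self.allocated = 0

    def can_allocate(self, profile):
        if profile not in self.supported:
            return False
        if self.active_profile is None:
            return True                      # a free GPU accepts any profile
        return (self.active_profile == profile
                and self.allocated < self.supported[profile])

    def allocate(self, profile):
        if not self.can_allocate(profile):
            raise ValueError("profile locked or GPU full")
        self.active_profile = profile        # lock the GPU to this profile
        self.allocated += 1

    def release(self):
        self.allocated -= 1
        if self.allocated == 0:
            self.active_profile = None       # GPU fully freed: lock cleared

gpu = PhysicalGpu({"M60-1Q": 8, "M60-8Q": 1})
gpu.allocate("M60-1Q")
print(gpu.can_allocate("M60-8Q"))  # False: the GPU is locked to M60-1Q
```

Once the last M60-1Q vGPU is released, the lock clears and any supported profile can be placed on the GPU again.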

Note: NVIDIA does not support Windows Guest VMs on the C-series NVIDIA vGPU types. See the NVIDIA documentation on Virtual GPU software for more information.

NVIDIA Grid Host Drivers and License Installation

To enable guest VMs to use vGPUs on AHV, you must install NVIDIA drivers on the guest VMs, install the NVIDIA GRID host driver on the hypervisor, and set up an NVIDIA GRID License Server.

See the NVIDIA Grid Host Driver for Nutanix AHV Installation Guide for details about the workflow to enable guest VMs to use vGPUs on AHV and the NVIDIA GRID host driver installation instructions.

vGPU Profile Licensing

vGPU profiles are licensed through an NVIDIA GRID license server. The choice of license depends on the type of vGPU that the applications running on the VM require. Licenses are available in various editions, and the vGPU profile that you want might be supported by more than one license edition.

Note: If the specified license is not available on the licensing server, the VM starts up and functions normally, but the vGPU runs with reduced capability.

You must determine the vGPU profile that the VM requires, install an appropriate license on the licensing server, and configure the VM to use that license and vGPU type. For information about licensing for different vGPU types, see the NVIDIA GRID licensing documentation.

Guest VMs check out a license from the licensing server over the network when starting up and return the license when shutting down. When a license is checked back in, the vGPU is returned to the vGPU resource pool.

When powered on, guest VMs use a vGPU in the same way that they use a physical GPU that is passed through.
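The license checkout lifecycle can be sketched as follows. This is a hypothetical illustration (the LicenseServer class and the "GRID-vWS" edition are invented placeholders), not the NVIDIA licensing API; it models the documented behavior that a VM still starts when no license is available, but the vGPU runs with reduced capability:

```python
class LicenseServer:
    def __init__(self, licenses):
        self.available = dict(licenses)   # edition -> count, e.g. {"GRID-vWS": 2}

    def checkout(self, edition):
        if self.available.get(edition, 0) > 0:
            self.available[edition] -= 1
            return True
        return False                      # not available: vGPU runs degraded

    def checkin(self, edition):
        self.available[edition] = self.available.get(edition, 0) + 1

def power_on(server, edition):
    # The VM powers on either way; only the vGPU capability differs.
    licensed = server.checkout(edition)
    return {"running": True, "vgpu_full_capability": licensed}

server = LicenseServer({"GRID-vWS": 1})
vm1 = power_on(server, "GRID-vWS")   # gets the license
vm2 = power_on(server, "GRID-vWS")   # starts anyway, reduced capability
print(vm1["vgpu_full_capability"], vm2["vgpu_full_capability"])  # True False
```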

Supported GPU Cards

For a list of supported GPUs, see Supported GPUs.

High Availability Support for VMs with vGPUs

Nutanix conditionally supports high availability (HA) of VMs that have NVIDIA GRID vGPUs configured. The cluster does not reserve any specific resources to guarantee HA for VMs with vGPUs; in the event of a node failure, the vGPU VMs are restarted on a best-effort basis. A VM with vGPUs can be restarted on another (failover) host only if that host has a vGPU profile available that is identical to the one configured on the VM. The system attempts to restart the VM after the event, but if the failover host has insufficient memory or vGPU resources, the VM fails to start after failover.

The following conditions are applicable to HA of VMs with vGPUs:

  • Memory is not reserved for the VM on the failover host by the HA process. When the VM fails over, if sufficient memory is not available, the VM cannot power on.
  • vGPU resource is not reserved on the failover host. When the VM fails over, if the required vGPU resources are not available on the failover host, the VM cannot power on.

Limitations for vGPU Support

vGPU support on AHV has the following limitations:

  • You cannot hot-add memory to VMs that have a vGPU.
  • The Prism web console does not support console access for a VM that is configured with multiple vGPUs. The Prism web console supports console access for a VM that is configured with a single vGPU only.

    Before you add multiple vGPUs to a VM, set up an alternative means to access the VM. For example, enable remote access over RDP. For Linux VMs, instead of RDP, use Virtual Network Computing (VNC) or equivalent.

Console Support for VMs with vGPU

As with other VMs, you can access a VM that has a single vGPU by using the console. You can enable or disable console support only for a VM with one vGPU configured; enabling console support for a VM with multiple vGPUs is not supported. By default, console support for a vGPU VM is disabled.

Recovery of vGPU Console-enabled VMs

With AHV, you can efficiently recover vGPU console-enabled guest VMs. When you perform DR of vGPU console-enabled guest VMs, the VMs recover with the vGPU console. However, the guest VMs fail to recover when you perform cross-hypervisor disaster recovery (CHDR).

For AHV with minimum AOS versions 6.1, 6.0.2.4 and 5.20:

  • vGPU-enabled VMs can be recovered when protected by protection domains in PD-based DR or by protection policies in Leap-based solutions using Asynchronous, NearSync, or Synchronous (Leap only) replication.
    Note: GPU Passthrough is not supported.
  • If both site A and site B have the same GPU boards (and the same assignable vGPU profiles), failovers work seamlessly. With protection domains, no additional steps are required: GPU profiles are restored correctly and vGPU console settings persist after recovery. With Leap DR, vGPU console settings do not persist after recovery.
  • If site A and site B have different GPU boards and vGPU profiles, you must manually remove the vGPU profile before you power on the VM in site B.

The vGPU console settings are persistent after recovery and all failovers are supported for the following:

Table 1. Persistent vGPU Console Settings with Failover Support
Recovery using For vGPU enabled AHV VMs
Protection domain based DR Yes
VMware SRM with Nutanix SRA Not applicable

For information about this behavior, see the Recovery of vGPU-enabled VMs topic in the Data Protection and Recovery with Prism Element guide.

See Enabling or Disabling Console Support for vGPU VMs for more information about configuring the support.

For SRA and SRM support, see the Nutanix SRA documentation.

ADS Support for VMs with vGPUs

AHV supports Acropolis Dynamic Scheduling (ADS) for VMs with vGPUs.

Note: ADS support requires that live migration of VMs with vGPUs be operational in the cluster. See Live Migration of vGPU-enabled VMs for the minimum NVIDIA and AOS versions that support live migration of VMs with vGPUs.

When a number of VMs with vGPUs are running on a host and you enable ADS support for the cluster, the Lazan manager invokes VM migration tasks to resolve resource hotspots or fragmentation in the cluster so that incoming vGPU VMs can power on. The Lazan manager can migrate vGPU-enabled VMs to other hosts in the cluster only if:

  • The other hosts have vGPU resources that are compatible with or identical to those of the source host (the host running the vGPU-enabled VMs).

  • Host affinity is not set for the vGPU-enabled VM.

For more information about limitations, see Live Migration of vGPU-enabled VMs and Limitations of Live Migration Support.

For more information about ADS, see Acropolis Dynamic Scheduling in AHV.

Multiple Virtual GPU Support

Prism Central and the Prism Element web console can deploy VMs with multiple virtual GPU instances. This support harnesses the NVIDIA GRID virtual GPU (vGPU) capability of assigning multiple vGPU instances to a single VM.

Note: Multiple vGPUs on the same VM are supported on NVIDIA Virtual GPU software version 10.1 (440.53) or later.

You can deploy virtual GPUs of different types. The number of vGPUs into which a single physical GPU can be divided depends on the vGPU profile that is used on the physical GPU. Each physical GPU on a GPU board supports more than one type of vGPU profile. For example, a Tesla® M60 GPU device provides vGPU profiles such as M60-0Q, M60-1Q, M60-2Q, M60-4Q, and M60-8Q.

You can add multiple vGPUs of only the same vGPU profile to a single VM. For example, consider that you configure a VM on a node that has one NVIDIA Tesla® M60 GPU board. Tesla® M60 provides two physical GPUs, each supporting one M60-8Q (profile) vGPU, for a total of two M60-8Q vGPUs on the entire host.

For restrictions on configuring multiple vGPUs on the same VM, see Restrictions for Multiple vGPU Support.

For steps to add multiple vGPUs to the same VM, see Creating a VM (AHV) and Adding Multiple vGPUs to a VM in the Prism Web Console Guide or Creating a VM through Prism Central (AHV) and Adding Multiple vGPUs to a VM in the Prism Central Guide.

Restrictions for Multiple vGPU Support

You can configure multiple vGPUs subject to the following restrictions:

  • All the vGPUs that you assign to one VM must be of the same type. In the preceding example, with the Tesla® M60 GPU device, you can assign multiple M60-8Q vGPU profiles; you cannot assign one vGPU of the M60-1Q type and another of the M60-8Q type.

    Note: You can configure any number of vGPUs of the same type on a VM. However, the cluster calculates a maximum number of vGPUs of the same type per VM. This number is defined as max_instances_per_vm. This number is variable and changes based on the GPU resources available in the cluster and the number of VMs deployed. If the number of vGPUs of a specific type that you configured on a VM exceeds the max_instances_per_vm number, then the VM fails to power on and the following error message is displayed:
    Operation failed: NoHostResources: No host has enough available GPU for VM <name of VM>(UUID of VM).
    You could try reducing the GPU allotment...

    When you configure multiple vGPUs on a VM, after you select the appropriate vGPU type for the first vGPU assignment, Prism (Prism Central and Prism Element Web Console) automatically restricts the selection of vGPU type for subsequent vGPU assignments to the same VM.

    Figure. vGPU Type Restriction Message

    Note:

    You can use CLI (acli) to configure multiple vGPUs of multiple types to the same VM. See Acropolis Command-Line Interface (aCLI) for information about aCLI. Use the vm.gpu_assign <vm.name> gpu=<gpu-type> command multiple times, once for each vGPU, to add multiple vGPUs of multiple types to the same VM.

    See the GPU board and software documentation for information about the combinations of the number and types of vGPUs profiles supported by the GPU resources installed in the cluster. For example, see the NVIDIA Virtual GPU Software Documentation for the vGPU type and number combinations on the Tesla® M60 board.

  • Using Prism, you can configure multiple vGPUs only of the highest vGPU type. The highest vGPU type is based on the driver deployed in the cluster. In the preceding example, on a Tesla® M60 device, you can configure multiple vGPUs only of the M60-8Q type; Prism prevents you from configuring multiple vGPUs of any other type, such as M60-2Q.

    Figure. Message showing the restriction on the number of vGPUs of the specified type

    Note:

    You can use CLI (acli) to configure multiple vGPUs of other available types. See Acropolis Command-Line Interface (aCLI) for the aCLI information. Use the vm.gpu_assign <vm.name> gpu=<gpu-type> command multiple times, once for each vGPU, to configure multiple vGPUs of other available types.

    See the GPU board and software documentation for more information.

  • Configure either a passthrough GPU or vGPUs on the same VM. You cannot configure both passthrough GPU and vGPUs. Prism automatically disallows such configurations after the first GPU is configured.

  • The VM powers on only if the requested type and number of vGPUs are available in the host.

    In the preceding example, a VM configured with two M60-8Q vGPUs fails to power on if another VM sharing the same GPU board is already using one M60-8Q vGPU. The Tesla® M60 GPU board allows only two M60-8Q vGPUs, and one of them is already in use, so the required vGPUs are unavailable.

  • Multiple vGPUs on the same VM are supported on NVIDIA Virtual GPU software version 10.1 (440.53) or later. Ensure that the relevant GRID version license is installed and select it when you configure multiple vGPUs.
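The restrictions above can be summarized as a simple validation sketch. This is a hypothetical checker written for illustration (the function and its error strings are invented, loosely modeled on the NoHostResources message shown earlier); it is not Prism code:

```python
def validate_gpu_config(vgpu_profiles, has_passthrough, max_instances_per_vm):
    """Return "ok" or an error string for a proposed VM GPU configuration."""
    if has_passthrough and vgpu_profiles:
        return "error: cannot mix a passthrough GPU and vGPUs on one VM"
    if len(set(vgpu_profiles)) > 1:
        return "error: all vGPUs on a VM must be of the same type"
    if len(vgpu_profiles) > max_instances_per_vm:
        return "error: NoHostResources, try reducing the GPU allotment"
    return "ok"

print(validate_gpu_config(["M60-8Q", "M60-8Q"], False, 2))  # ok
print(validate_gpu_config(["M60-1Q", "M60-8Q"], False, 4))  # same-type error
```

In the real cluster, max_instances_per_vm is computed dynamically from available GPU resources, so the limit can change as VMs are deployed.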
Adding Multiple vGPUs to the Same VM

About this task

You can add multiple vGPUs of the same vGPU type to:

  • A new VM when you create it.

  • An existing VM when you update it.

Important:

Before you add multiple vGPUs to the VM, see Multiple Virtual GPU Support and Restrictions for Multiple vGPU Support in the AHV Administration Guide.

After you add the first vGPU, do the following on the Create VM or Update VM dialog box (the main dialog box) to add more vGPUs:

Procedure

  1. Click Add GPU.
  2. In the Add GPU dialog box, click Add.

    The License field is grayed out because you cannot select a different license when you add a vGPU for the same VM.

    The VGPU Profile is also auto-selected because you can only select the additional vGPU of the same vGPU type as indicated by the message at the top of the dialog box.

    Figure. Add GPU for multiple vGPUs

  3. In the main dialog box, you see the newly added vGPU.
    Figure. New vGPUs Added

  4. Repeat the steps for each vGPU addition you want to make.

Live Migration of vGPU-enabled VMs

You can perform live migration of VMs enabled with virtual GPUs (vGPU-enabled VMs). The primary advantage of live migration support is that unproductive downtime is avoided: the vGPUs continue to run while the VMs that are running them are seamlessly migrated in the background. Stun times are very low, so graphics users barely notice the migration.

Note: Live migration of VMs with vGPUs is supported for vGPUs created with minimum NVIDIA Virtual GPU software version 10.1 (440.53).
Table 1. Minimum Versions
Component Supports With Minimum Version
AOS Live migration within the same cluster 5.18.1
AHV Live migration within the same cluster 20190916.294
AOS Live migration across cluster 6.1
AHV Live migration across cluster 20201105.30142
Important: In an HA event involving any GPU node, the node locality of the affected vGPU-enabled VMs is not restored after GPU node recovery. The affected vGPU-enabled VMs are not migrated back to their original GPU host intentionally to avoid extended VM stun time expected during migration. If vGPU-enabled VM node locality is required, migrate the affected vGPU-enabled VMs to the desired host manually.
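As a sketch, the minimum-version table above can be expressed as a simple check. The helper below is hypothetical and only compares dotted version strings against the documented AOS minimums; it is not a Nutanix API:

```python
def ver_tuple(v):
    """Convert a dotted version string such as "6.0.2.4" to a comparable tuple."""
    return tuple(int(x) for x in v.split("."))

def migration_supported(aos_version, across_clusters=False):
    # Per the table: AOS 5.18.1 for within-cluster, AOS 6.1 for cross-cluster.
    minimum = "6.1" if across_clusters else "5.18.1"
    return ver_tuple(aos_version) >= ver_tuple(minimum)

print(migration_supported("5.20"))                        # True (within cluster)
print(migration_supported("5.20", across_clusters=True))  # False
```

The corresponding AHV build numbers (20190916.294 and 20201105.30142) would be checked the same way in a complete implementation.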

Virtual GPU Limitations

Important frame buffer and VM stun time considerations are:

  • The GPU board vendor (for example, NVIDIA for the Tesla M60) provides the maximum frame buffer size of the vGPU types (for example, the M60-8Q type) that can be configured on VMs. However, actual frame buffer usage may be lower than the maximum size.

  • The VM stun time depends on the number of vGPUs configured on the VM being migrated. Stun time may be longer when multiple vGPUs are operating on the VM.

    The stun time also depends on network factors such as the bandwidth available for use during the migration.

For information about the limitations applicable to the live migration support, see Limitations of Live Migration Support and Restrictions for Multiple vGPU Support.

For information about the steps to live migrate a VM with vGPUs, see Migrating Live a VM with Virtual GPUs in the Prism Central Guide and Migrating Live a VM with Virtual GPUs in the Prism Web Console Guide.

Limitations of Live Migration Support
  • Live migration is supported only for VMs configured with single or multiple virtual GPUs. It is not supported for VMs configured with passthrough GPUs.

  • Live migration to a host in another cluster is supported only if the VM is protected by a protection policy with a Synchronous replication schedule.

  • The target host for the migration must have adequate and available GPU resources, with the same vGPU types as configured for the VMs to be migrated, to support the vGPUs on the VMs that need to be migrated.

    See Restrictions for Multiple vGPU Support for more details.

  • The vGPU-enabled VMs that need to be migrated live cannot be protected with high availability.
  • Ensure that the VM is not powered off.
  • Ensure that you have the right GPU software license (an appropriate license of the NVIDIA GRID software version) that supports live migration of vGPUs. The source and target hosts must have the same license type.

Enabling or Disabling Console Support for vGPU VMs

About this task

Enable or disable console support for a VM with only one vGPU configured. Enabling console support for a VM with multiple vGPUs is not supported. By default, console support for a vGPU VM is disabled.

To enable or disable console support for each VM with vGPUs, do the following:

Procedure

  1. Run the following aCLI command to check if console support is enabled or disabled for the VM with vGPUs.
    nutanix@cvm$ acli vm.get vm-name

    Where vm-name is the name of the VM for which you want to check the console support status.

    The step result includes the following parameter for the specified VM:
    gpu_console=False

    Where False indicates that console support is not enabled for the VM. This parameter is displayed as True when you enable console support for the VM. The default value for gpu_console= is False since console support is disabled by default.

    Note: The console may not display the gpu_console parameter in the output of the vm.get command if the gpu_console parameter was not previously enabled.
  2. Run the following aCLI command to enable or disable console support for the VM with vGPU:
    nutanix@cvm$ acli vm.update vm-name gpu_console=true | false

    Where:

    • true —indicates that you are enabling console support for the VM with vGPU.
    • false —indicates that you are disabling console support for the VM with vGPU.
  3. Run the vm.get command to verify that the gpu_console value is true (console support is enabled) or false (console support is disabled), as you configured it.

    If the value in the vm.get command output is not what you expect, perform a guest shutdown of the VM with vGPU. Next, run the vm.on vm-name aCLI command to turn the VM on again. Then run the vm.get command and check the gpu_console= value.

  4. Click a VM name in the VM table view to open the VM details page. Click Launch Console.
    The Console opens but only a black screen is displayed.
  5. Click on the console screen. Press one of the following key combinations based on the operating system from which you are accessing the cluster.
    • For Apple Mac OS: Control+Command+2
    • For MS Windows: Ctrl+Alt+2
    The console is fully enabled and displays the content.

PXE Configuration for AHV VMs

You can configure a VM to boot over the network in a Preboot eXecution Environment (PXE). Booting over the network is called PXE booting and does not require the use of installation media. When starting up, a PXE-enabled VM communicates with a DHCP server to obtain information about the boot file it requires.

Configuring PXE boot for an AHV VM involves performing the following steps:

  • Configuring the VM to boot over the network.
  • Configuring the PXE environment.

The procedure for configuring a VM to boot over the network is the same for managed and unmanaged networks. The procedure for configuring the PXE environment differs for the two network types, as follows:

  • An unmanaged network does not perform IPAM functions and gives VMs direct access to an external Ethernet network. Therefore, the procedure for configuring the PXE environment for AHV VMs is the same as for a physical machine or a VM that is running on any other hypervisor. VMs obtain boot file information from the DHCP or PXE server on the external network.
  • A managed network intercepts DHCP requests from AHV VMs and performs IP address management (IPAM) functions for the VMs. Therefore, you must add a TFTP server and the required boot file information to the configuration of the managed network. VMs obtain boot file information from this configuration.

A VM that is configured to use PXE boot boots over the network on subsequent restarts until the boot order of the VM is changed.

Configuring the PXE Environment for AHV VMs

The procedure for configuring the PXE environment for a VM on an unmanaged network is similar to the procedure for configuring a PXE environment for a physical machine on the external network and is beyond the scope of this document. This procedure configures a PXE environment for a VM in a managed network on an AHV host.

About this task

To configure a PXE environment for a VM on a managed network on an AHV host, do the following:

Procedure

  1. Log on to the Prism web console, click the gear icon, and then click Network Configuration in the menu.
  2. On Network Configuration > Subnets tab, click the Edit action link of the network for which you want to configure a PXE environment.
    The VMs that require the PXE boot information must be on this network.
  3. In the Update Subnet dialog box:
    1. Select the Enable IP address management check box and complete the following configurations:
      • In the Network IP Prefix field, enter the network IP address, with prefix, of the subnet that you are updating.
      • In the Gateway IP Address field, enter the gateway IP address of the subnet that you are updating.
      • To provide DHCP settings for the VM, select the DHCP Settings check box and provide the following information.
Fields Description and Values
Domain Name Servers

Provide a comma-separated list of DNS IP addresses.

Example: 8.8.8.8, 9.9.9.9

Domain Search

Enter the VLAN domain name. Use only the domain name format.

Example: nutanix.com

TFTP Server Name

Enter a valid host name of the TFTP server that hosts the boot file. The IP address of the TFTP server must be reachable by the virtual machines so that they can download the boot file.

Example: tftp_vlan103

Boot File Name

The name of the boot file that the VMs need to download from the TFTP host server.

Example: boot_ahv202010

  1. Under IP Address Pools, click Create Pool to add IP address pools for the subnet.

    (Mandatory for Overlay type subnets) This section provides the Network IP Prefix and Gateway IP fields for the subnet.

    (Optional for VLAN type subnets) Select the check box to display the Network IP Prefix and Gateway IP fields and configure the IP address details.

  2. (Optional, and for VLAN networks only) Select the Override DHCP Server check box and enter an IP address in the DHCP Server IP Address field.

    You can configure a DHCP server using the Override DHCP Server option only in case of VLAN networks.

    The DHCP Server IP address (reserved IP address for the Acropolis DHCP server) is visible only to VMs on this network and responds only to DHCP requests. If this box is not checked, the DHCP Server IP Address field is not displayed and the DHCP server IP address is generated automatically. The automatically generated address is network_IP_address_subnet.254 , or if the default gateway is using that address, network_IP_address_subnet.253 .

    Usually, the default DHCP server IP is configured as the last usable IP in the subnet (for example, 10.0.0.254 for the 10.0.0.0/24 subnet). If you want to use a different IP address in the subnet as the DHCP server IP, use the override option.

  3. Click Close.
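The default DHCP server address rule described above can be illustrated with the standard Python ipaddress module. This is a sketch of the documented behavior (last usable IP, falling back to the second-to-last if the default gateway uses it), not Nutanix code:

```python
import ipaddress

def default_dhcp_server_ip(subnet_cidr, gateway_ip):
    """Pick the automatically generated Acropolis DHCP server address."""
    hosts = list(ipaddress.ip_network(subnet_cidr).hosts())
    candidate = hosts[-1]              # last usable IP, e.g. .254 for a /24
    if str(candidate) == gateway_ip:
        candidate = hosts[-2]          # gateway took it: fall back to .253
    return str(candidate)

print(default_dhcp_server_ip("10.0.0.0/24", "10.0.0.1"))    # 10.0.0.254
print(default_dhcp_server_ip("10.0.0.0/24", "10.0.0.254"))  # 10.0.0.253
```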

Configuring a VM to Boot over a Network

To enable a VM to boot over the network, update the VM's boot device setting. Currently, the only user interface that enables you to perform this task is the Acropolis CLI (aCLI).

About this task

To configure a VM to boot from the network, do the following:

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Create a VM.
    
    nutanix@cvm$ acli vm.create vm num_vcpus=num_vcpus memory=memory

    Replace vm with a name for the VM, and replace num_vcpus and memory with the number of vCPUs and amount of memory that you want to assign to the VM, respectively.

    For example, create a VM named nw-boot-vm.

    nutanix@cvm$ acli vm.create nw-boot-vm num_vcpus=1 memory=512
  3. Create a virtual interface for the VM and place it on a network.
    nutanix@cvm$ acli vm.nic_create vm network=network

    Replace vm with the name of the VM and replace network with the name of the network. If the network is an unmanaged network, make sure that a DHCP server and the boot file that the VM requires are available on the network. If the network is a managed network, configure the DHCP server to provide TFTP server and boot file information to the VM. See Configuring the PXE Environment for AHV VMs.

    For example, create a virtual interface for VM nw-boot-vm and place it on a network named network1.

    nutanix@cvm$ acli vm.nic_create nw-boot-vm network=network1
  4. Obtain the MAC address of the virtual interface.
    nutanix@cvm$ acli vm.nic_list vm

    Replace vm with the name of the VM.

    For example, obtain the MAC address of VM nw-boot-vm.

    nutanix@cvm$ acli vm.nic_list nw-boot-vm
    00-00-5E-00-53-FF
  5. Update the boot device setting so that the VM boots over the network.
    nutanix@cvm$ acli vm.update_boot_device vm mac_addr=mac_addr

    Replace vm with the name of the VM and mac_addr with the MAC address of the virtual interface that the VM must use to boot over the network.

    For example, update the boot device setting of the VM named nw-boot-vm so that the VM uses the virtual interface with MAC address 00-00-5E-00-53-FF.

    nutanix@cvm$ acli vm.update_boot_device nw-boot-vm mac_addr=00-00-5E-00-53-FF
  6. Power on the VM.
    nutanix@cvm$ acli vm.on vm_list [host="host"]

    Replace vm_list with the name of the VM. Replace host with the name of the host on which you want to start the VM.

    For example, start the VM named nw-boot-vm on a host named host-1.

    nutanix@cvm$ acli vm.on nw-boot-vm host="host-1"

Uploading Files to DSF for Microsoft Windows Users

If you are a Microsoft Windows user, you can securely upload files to DSF by using the following procedure.

Procedure

  1. Use WinSCP, with SFTP selected, to connect to a Controller VM through port 2222 and start browsing the DSF datastore.
    Note: The root directory displays storage containers and you cannot change it. You can only upload files to one of the storage containers and not directly to the root directory. To create or delete storage containers, you can use the Prism user interface.
  2. Authenticate by using your Prism username and password or, for advanced users, the public key that is managed through the Prism cluster lockdown user interface.

Enabling Load Balancing of vDisks in a Volume Group

AHV hosts support load balancing of vDisks in a volume group for guest VMs. Load balancing of vDisks in a volume group enables IO-intensive VMs to use the storage capabilities of multiple Controller VMs (CVMs).

About this task

If you enable load balancing on a volume group, the guest VM communicates directly with each CVM hosting a vDisk. Each vDisk is served by a single CVM. Therefore, to use the storage capabilities of multiple CVMs, create more than one vDisk for a file system and use OS-level striped volumes to spread the workload. This configuration improves performance and prevents storage bottlenecks.
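The reasoning above can be illustrated with a small sketch. The mapping below is hypothetical (the vDisk-to-CVM assignments are invented) and only shows why striping a file system across vDisks spreads IO across CVMs; it is not Nutanix code:

```python
def cvm_for_chunk(chunk_index, vdisk_to_cvm, stripe_width):
    """Map a striped chunk to the CVM serving the vDisk it lands on."""
    vdisk = chunk_index % stripe_width          # round-robin striping
    return vdisk_to_cvm[vdisk]

# With load balancing enabled, each vDisk is served by a single CVM.
vdisk_to_cvm = {0: "CVM-A", 1: "CVM-B", 2: "CVM-C"}
served_by = [cvm_for_chunk(i, vdisk_to_cvm, 3) for i in range(6)]
print(served_by)  # sequential chunks alternate across all three CVMs
```

With a single vDisk (stripe width 1), every chunk would land on the same CVM, which is the bottleneck that striping avoids.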

Note:
  • vDisk load balancing is disabled by default for volume groups that are directly attached to VMs.

    However, vDisk load balancing is enabled by default for volume groups that are attached to VMs by using a data services IP address.

  • If you use the web console to clone a volume group that has load balancing enabled, the volume group clone does not have load balancing enabled by default. To enable load balancing on the volume group clone, set the load_balance_vm_attachments parameter to true by using aCLI or the REST API.
  • You can attach a maximum of 10 load-balanced volume groups per guest VM.
  • For Linux VMs, ensure that the SCSI device timeout is 60 seconds. For information about how to check and modify the SCSI device timeout, see the Red Hat documentation at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/task_controlling-scsi-command-timer-onlining-devices.

Perform the following procedure to enable load balancing of vDisks by using aCLI.

Procedure

  1. Log on to a Controller VM with SSH.
  2. Do one of the following:
    • Enable vDisk load balancing if you are creating a volume group.
      nutanix@cvm$ acli vg.create vg_name load_balance_vm_attachments=true

      Replace vg_name with the name of the volume group.

    • Enable vDisk load balancing if you are updating an existing volume group.
      nutanix@cvm$ acli vg.update vg_name load_balance_vm_attachments=true

      Replace vg_name with the name of the volume group.

      Note: To modify an existing volume group, you must first detach all the VMs that are attached to that volume group before you enable vDisk load balancing.
  3. Verify if vDisk load balancing is enabled.
    nutanix@cvm$ acli vg.get vg_name

    An output similar to the following is displayed:

    nutanix@cvm$ acli vg.get ERA_DB_VG_xxxxxxxx
    ERA_DB_VG_xxxxxxxx {
      attachment_list {
        vm_uuid: "xxxxx"
    .
    .
    .
    .
    iscsi_target_name: "xxxxx"
    load_balance_vm_attachments: True
    logical_timestamp: 4
    name: "ERA_DB_VG_xxxxxxxx"
    shared: True
    uuid: "xxxxxx"
    }

    If vDisk load balancing is enabled on a volume group, load_balance_vm_attachments: True is displayed in the output. The output does not display the load_balance_vm_attachments: parameter at all if vDisk load balancing is disabled.

  4. (Optional) Disable vDisk load balancing.
    nutanix@cvm$ acli vg.update vg_name load_balance_vm_attachments=false

    Replace vg_name with the name of the volume group.

VM High Availability in Acropolis

If you have not modified the High Availability (HA) configuration from previous AOS releases, best effort VM availability is implemented by default in Acropolis.

If you have not enabled HA and a host fails, the VMs from the failed host are restarted on the other hosts in the cluster wherever capacity is available. When the failed host rejoins the cluster, the VMs are migrated back to it. This type of VM high availability is implemented without reserving any resources. Because admission control is not enforced, there may not be sufficient capacity available to start all the VMs.
Note: Nutanix does not support VMs that are running with 100% remote storage for High Availability. The VMs must have at least one local disk that is present on the cluster.

By default, VM HA uses Guarantee mode with the segment-based reservation method, not the host-based reservation method.

In segment-based reservation, the cluster is divided into segments to ensure that enough capacity is reserved for any host failure. Each segment corresponds to the largest VM that is guaranteed to be restarted if a failure occurs. The other factor is the number of host failures that can be tolerated. Using these inputs, the scheduler implements admission control so that enough resources are always reserved to restart the VMs upon failure of any host in the cluster.

The segment size ensures that the largest VM can be powered on during an HA failover even when the cluster is fully loaded (that is, fully used except for the reserved segments). Segments are reserved in such a way that, for each host, enough resources are set aside to tolerate the failure of any host in the cluster. Multiple VMs may fit into a single segment. If anything changes in the cluster, the reservation is computed again. The total resources reserved for segments can be more than the resources used by running VMs; this guarantees a successful failover even if the segments become fragmented. The actual amount of reserved resources depends on the current load of the cluster, but it is typically 1 to 1.25 times the resource usage of the most loaded host.
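The admission-control arithmetic above can be illustrated with a toy model. This is not the actual Acropolis scheduler algorithm; the function below is a hypothetical simplification that reserves whole segments, each sized to the largest VM, to cover the usage of the most loaded host(s) that may fail.

```python
import math

# Toy model only: NOT the Acropolis scheduler. Reserve whole segments
# (each sized to the largest VM) to cover the memory usage of the most
# loaded host(s) whose failure must be tolerated.
def reserved_for_ha(host_usage_gb, largest_vm_gb, failures_to_tolerate=1):
    segment = largest_vm_gb
    # Plan for the worst case: the most loaded host(s) failing.
    worst = sorted(host_usage_gb, reverse=True)[:failures_to_tolerate]
    # Round each failed host's usage up to whole segments.
    return sum(math.ceil(usage / segment) * segment for usage in worst)

# Three hosts using 100, 80, and 60 GB; the largest VM is 32 GB.
# 100 GB of usage needs 4 segments of 32 GB, so 128 GB is reserved,
# roughly in line with the 1-1.25x ballpark described above.
print(reserved_for_ha([100, 80, 60], 32))  # 128
```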

If a host enters maintenance mode (for example, during a host upgrade), you might not be protected against further host failures, because maintenance mode uses the reservations made for HA to migrate VMs off the host. The hypervisor upgrade itself occurs without difficulty: from the user's perspective, it is the same as a host failure except that the VMs are migrated (instead of restarted) and hence no runtime state is lost. The HA status goes through the same states as it does when a host failure occurs.

Segment-based reservation is the default method that Acropolis uses to enable VM high availability.
Caution: Do not replace the segment-based reservation method with the host-based (reserve host) reservation method.

Fault Detection for High Availability

From Acropolis version 6.1 (with minimum supported AHV version 20201105.2229) onwards, the fault detection mechanism for High Availability checks the heartbeat of the management services on the host. If any one of these services is down, the host is marked as disconnected.

In addition to this, the fault detection mechanism also performs the following health checks for the host:

  • Root file system corruption
  • Read-only root file system

If any of these health checks detects a problem, the host is marked as disconnected.

If the host remains in the disconnected status for 40 seconds, the VMs running on the affected host are automatically restarted (based on resource availability).

You can view the alerts raised for any of the above-mentioned checks in the Activity > Alerts view in Prism UI.

Enabling High Availability for the Cluster

In Acropolis managed clusters, you can enable high availability for the cluster to ensure that VMs can be migrated and restarted on another host in case of failure.

About this task

After you enable high availability for the cluster, the cluster goes through the following state changes when a host fails.
  • OK: This state implies that the cluster is protected against a host failure.
  • Healing: The healing period is the time during which Acropolis brings the cluster back to the protected state. This state has two phases. The first phase occurs when the host fails: the VMs are restarted on the available hosts. After all the VMs are restarted, if there are enough resources to protect against another failure, the HA status of the cluster returns to the OK state; otherwise, the cluster goes into the Critical state. The second phase occurs when the host recovers from the failure. Because no VMs are present on the recovered host, a restore locality task runs during this phase to migrate the VMs back. Apart from restoring the locality of the VMs, this task ensures that the cluster returns to the same state it was in before the HA failure. Once the task finishes, the HA status returns to the OK state.
  • Critical: If a host is down, the HA status of the cluster goes into the Critical state because the cluster cannot tolerate any more host failures. Bring the host back online to protect your cluster against further host failures.
    Note: On a lightly loaded cluster, HA can go directly back to the OK state if enough resources are reserved to protect against another host failure. VM start and migrate operations are restricted in the Critical state because Acropolis continuously tries to bring the HA status back to the OK state.

Procedure

  1. Click the gear icon in the main menu and then select Manage VM High Availability in the Settings page.
    Note: This option does not appear in clusters that do not support this feature.
    The Manage VM High Availability dialog box appears.
    Figure. Manage VM High Availability Window

  2. Check the Enable HA Reservation box and then click the Save button to enable.

Viewing list of restarted VMs after an HA event

This section describes how to view the list of VMs that are restarted after an HA event in an AHV cluster.

About this task

If an AHV host becomes inaccessible or fails due to some unplanned event, the AOS restarts the VMs across the remaining hosts in the cluster.

To view the list of restarted VMs after an HA event:

Procedure

  1. Log in to Prism Central or Prism web console.
  2. View the list of restarted VMs on either of the following pages:
    • Events page:
      1. In Prism Central, navigate to Activity > Events from the entities menu to access the Events page.

        In the Prism web console, navigate to Alerts > Events from the main menu to access the Events page.

      2. Locate or search for the following string, and hover over or click the string:

        VMs restarted due to HA failover

        The system displays the list of restarted VMs in the Summary page and as a hover text for the selected event.

        For example:

        VMs restarted due to HA failover: <VM_Name1>, <VM_Name2>, <VM_Name3>, <VM_Name4>. VMs were running on host X.X.X.1 prior to HA.

      Here, <VM_Name1>, <VM_Name2>, <VM_Name3>, and <VM_Name4> represent the actual VM names in your cluster.

    • Tasks page:
      1. In Prism Central, navigate to Activity > Tasks from the entities menu to access the Tasks page.

        In the Prism web console, navigate to Tasks from the main menu to access the Tasks page.

      2. Locate or search for the following task, and click Details:

        HA failover

        The system displays a list of related tasks for the HA failover event.

      3. Locate or search for the following related task, and click Details:

        Host restart all VMs

        The system displays the Restart VM group task for the HA failover event.

      4. In the Entity Affected column, click Details, or hover over the VMs text for the Restart VM group task.

      The system displays the list of restarted VMs:

      Figure. List of restarted VMs

Live vDisk Migration Across Storage Containers

vDisk migration allows you to change the container of a vDisk. You can migrate vDisks across storage containers while they are attached to guest VMs without the need to shut down or delete VMs (live migration). You can either migrate all vDisks attached to a VM or migrate specific vDisks to another container.

In a Nutanix solution, you group vDisks into storage containers and attach them to guest VMs. AOS applies storage policies such as replication factor, encryption, compression, deduplication, and erasure coding at the storage container level. If you apply a storage policy to a storage container, AOS enables that policy on all the vDisks in the container. If you want to change the policies of a vDisk (for example, from RF2 to RF3), create another container with the required policy and move the vDisk to that container. Live migration lets you move vDisks across containers even while those vDisks are attached to a running VM, so you can efficiently manage storage policies for guest VMs.

General Considerations

You cannot migrate images or volume groups.

You cannot perform the following operations during an ongoing vDisk migration:

  • Clone the VM
  • Resize the VM
  • Take a snapshot
Note: During vDisk migration, the logical usage of a vDisk can exceed the total capacity of the vDisk. This occurs because the logical usage includes the space occupied in both the source and destination containers. Once the migration is complete, the logical usage of the vDisk returns to its normal value.

Migration of vDisks stalls if sufficient storage space is not available in the target storage container. Ensure that the target container has sufficient storage space before you begin migration.

Disaster Recovery Considerations

Consider the following points if you have a disaster recovery and backup setup:

  • You cannot migrate the vDisks of a VM that is protected by a protection domain or protection policy; the migration fails for such a VM. If you want to migrate the vDisks of such a VM, do the following:
    • Remove the VM from the protection domain or protection policy.
    • Migrate the vDisks to the target container.
    • Add the VM back to the protection domain or protection policy.
    • Configure the remote site with the details of the new container.

  • If you are using a third-party backup solution, AOS temporarily blocks snapshot operations for a VM if vDisk migration is in progress for that VM.

Migrating a vDisk to Another Container

You can either migrate all vDisks attached to a VM or migrate specific vDisks to another container.

About this task

Perform the following procedure to migrate vDisks across storage containers:

Procedure

  1. Log on to a CVM in the cluster with SSH.
  2. Do one of the following:
    • Migrate all vDisks of a VM to the target storage container.
      nutanix@cvm$ acli vm.update_container vm-name container=target-container wait=false

      Replace vm-name with the name of the VM whose vDisks you want to migrate and target-container with the name of the target container.

    • Migrate specific vDisks by using either the UUID of the vDisk or address of the vDisk.

      Migrate specific vDisks by using the UUID of the vDisk.

      nutanix@cvm$ acli vm.update_container vm-name device_uuid_list=device_uuid container=target-container wait=false

      Replace vm-name with the name of VM, device_uuid with the device UUID of the vDisk, and target-container with the name of the target storage container.

      Run nutanix@cvm$ acli vm.get <vm-name> to determine the device UUID of the vDisk.

      You can migrate multiple vDisks at a time by specifying a comma-separated list of device UUIDs of the vDisks.

      Alternatively, you can migrate vDisks by using the address of the vDisk.

      nutanix@cvm$ acli vm.update_container vm-name disk_addr_list=disk-address container=target-container wait=false

      Replace vm-name with the name of VM, disk-address with the address of the disk, and target-container with the name of the target storage container.

      Run nutanix@cvm$ acli vm.get <vm-name> to determine the address of the vDisk.

      Following is the format of the vDisk address:

      bus.index

      Following is a section of the output of the acli vm.get vm-name command:

      disk_list {
           addr {
             bus: "scsi"
             index: 0
           }

      Combine the values of bus and index as shown in the following example:

      nutanix@cvm$ acli vm.update_container TestUVM_1 disk_addr_list=scsi.0 container=test-container-17475

      You can migrate multiple vDisks at a time by specifying a comma-separated list of vDisk addresses.
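If a VM has many disks, assembling the disk_addr_list string by hand is tedious. The following is a small illustrative helper (not part of aCLI; the function name is made up) that extracts bus.index addresses from acli vm.get output:

```python
import re

# Hypothetical helper: pull "bus.index" disk addresses out of the
# disk_list sections of `acli vm.get` output, joined with commas so the
# result can be pasted into disk_addr_list=...
def disk_addresses(vm_get_output):
    pattern = r'addr\s*{\s*bus:\s*"(\w+)"\s*index:\s*(\d+)'
    return ",".join(f"{bus}.{index}"
                    for bus, index in re.findall(pattern, vm_get_output))

sample = '''
disk_list {
     addr {
       bus: "scsi"
       index: 0
     }
}
disk_list {
     addr {
       bus: "scsi"
       index: 1
     }
}
'''
print(disk_addresses(sample))  # scsi.0,scsi.1
```

You could then run the migration as, for example, acli vm.update_container TestUVM_1 disk_addr_list=scsi.0,scsi.1 container=target-container.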

  3. Check the status of the migration in the Tasks menu of the Prism Element web console.
  4. (Optional) Cancel the migration if you no longer want to proceed with it.
    nutanix@cvm$ ecli task.cancel task_list=task-ID

    Replace task-ID with the ID of the migration task.

    Determine the task ID as follows:

    nutanix@cvm$ ecli task.list

    In the Type column of the tasks list, look for VmChangeDiskContainer. This task type indicates a vDisk migration; note the ID of that task.

    Note: Consider the following points about canceling a migration:
    • If you cancel an ongoing migration, AOS retains the vDisks that have not yet been migrated in the source container. AOS does not migrate vDisks that have already been migrated to the target container back to the source container.
    • If sufficient storage space is not available in the original storage container, migration of vDisks back to the original container stalls. To resolve the issue, ensure that the source container has sufficient storage space.

Memory Overcommit

The Memory Overcommit feature enables you to optimize the physical memory utilization of a VM. It allows the host to automatically detect whether the memory is under-utilized or over-utilized for a VM. Using Prism Central, you can enable the Memory Overcommit feature. For information on how to enable the Memory Overcommit feature, see Memory Overcommit Management.

After you enable the Memory Overcommit feature, the host performs the following actions:

  • Automatically combines and uses the excess or unutilized memory of a VM, or even uses the disk space to swap out VM memory.
  • Uses the reclaimed physical memory to provision the memory requirements of another VM, or to create another VM.
Note: Memory Overcommit enables the host to optimize and overcommit memory for any existing or new VM within the host. It does not support using reclaimed memory across hosts. For example, overcommitted memory of VM1 on Host1 cannot be used by VMs on Host2; it can only be used by another VM on Host1.

With the Memory Overcommit feature enabled on the host, the total memory assigned to all the VMs by the host can be greater than the total physical memory installed on the host. For more information on VM memory allocation with Memory Overcommit feature, see the deployment example described in Memory Overcommit Deployment Example.

Note: The Memory Overcommit feature is only available with AOS version 6.0.2, with a recommended minimum AHV version of 20201105.30142. The AHV version 20201105.30007 is the absolute minimum version with some Memory Overcommit support. For more information, see the Acropolis Family Release Notes.

Deployment Workflow

You can enable or disable the Memory Overcommit feature using Prism Central on new or existing VMs.

Note: Ensure the VMs are shut down before you enable or disable the Memory Overcommit feature.

Memory Overcommit is useful in test and development environments to verify the performance of the VM. Based on the memory usage of the VM, AHV calculates an appropriate memory size for each VM enabled with Memory Overcommit.

Essential Concepts

Over Provisioned VM

A VM with a memory allocation that is more than the actual requirement to run the application on it is called an Over Provisioned VM. For example, a VM with an allocation of 10 GB memory to run an SQL server that only requires 7 GB of memory to operate is categorized as an Over Provisioned VM. In this case, the VM is over provisioned by 3 GB. This over provisioned memory is unused by the VM. The host can use this unutilized memory to add a new VM or allocate more memory to an existing VM, even if the physical memory is not available on the host.

Memory Overcommitted VM

A memory overcommitted VM is a VM that has the Memory Overcommit feature enabled and shares its unutilized memory with the host. By default, each VM is guaranteed to have at least 25% of its memory provisioned as physical memory. The remaining 75% is either unutilized in the VM and reclaimed for use in other VMs or swapped out to the host swap disks.
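The default guarantee described above implies a simple floor on the physical backing of each VM. A minimal sketch of that arithmetic (the 25% floor comes from this guide; the function name is illustrative):

```python
# Illustrative only: the minimum physically backed memory for a
# memory-overcommitted VM, given the default 25% guarantee.
def guaranteed_physical_gb(assigned_gb, floor=0.25):
    return assigned_gb * floor

# A 32 GB VM keeps at least 8 GB in physical memory; the remaining
# 24 GB can be reclaimed for other VMs or swapped out to host swap.
print(guaranteed_physical_gb(32))  # 8.0
```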

Memory Overcommit is required for the following types of VMs:

  • A VM with unutilized memory due to over provisioning.
  • A VM with workloads that are not sensitive to slower memory access. Swapping VM memory to the host swap disks increases memory access latency.

Host Swap

After you install or upgrade the cluster to AOS 6.0.2 with AHV 20201105.30007 or later, Memory Overcommit is available for the cluster. Swap disks are created on ADSF for every host in the cluster.

By default, Memory Overcommit reclaims memory from guest VMs that have unutilized memory. The host swap is used when a guest VM cannot give up memory because of the amount of memory it is actively using.

Note: The performance of host swap depends on the hardware, the workload, and the guest VM. Swap space defined in the operating system of the guest VM (guest VM swap) may perform better than host swap. Nutanix recommends increasing guest VM swap instead of relying on host swap; for example, you can increase the guest VM swap up to the RAM size of the guest VM.

Memory Overcommit Deployment Example

Example: A 128 GB host with three VMs (64 GB + 32 GB + 32 GB)

Observations Before Memory Overcommit

Observe the following attributes before you enable Memory Overcommit:

  • The physical memory of the host is equal to the total memory of all the VMs.
  • The host cannot add a new VM as the host's physical memory is fully committed.

Action

Enable the Memory Overcommit feature on all the three VMs.

Observations After Memory Overcommit

If the VMs are not fully utilizing their allocated memory, the host allows you to create a new VM with a memory size that is equal to or less than the amount of unutilized memory. When the new VM is started, the host attempts to reclaim the unutilized memory from the running VMs and makes it available for the new VM.

The following table shows the example of a 128 GB host with the total utilized and unutilized memory:

Table 1. Memory Overcommit Example
Host 1 (128 GB)
VMs VM Memory (GB) Utilized Memory (GB) Unutilized Memory (GB)
VM1 64 GB 48 GB 16 GB
VM2 32 GB 20 GB 12 GB
VM3 32 GB 24 GB 8 GB
Total 128 GB 92 GB 36 GB

In this example, the host can use the 36 GB of unutilized physical memory to create new VMs with an aggregate size of up to 36 GB. Even though the unutilized memory can be reclaimed from running VMs, all of a new VM's memory must be allocated from physical memory. Therefore, you can start a new VM only with memory less than or equal to the total memory that is reclaimable from the other VMs.
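The arithmetic behind Table 1 can be sketched as:

```python
# (assigned, utilized) memory in GB for each VM from Table 1.
vms = {"VM1": (64, 48), "VM2": (32, 20), "VM3": (32, 24)}

# Unutilized (reclaimable) memory per VM, and the aggregate the host
# can use for new VMs.
unutilized = {name: assigned - used for name, (assigned, used) in vms.items()}
print(unutilized)                # {'VM1': 16, 'VM2': 12, 'VM3': 8}
print(sum(unutilized.values()))  # 36
```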

In the following table, you can observe the following attributes:

  • VMs 1, 2, and 3 are over-committed by 16 GB, 12 GB, and 8 GB respectively (36 GB cumulatively), which the host can reclaim.
  • The information on the memory reclaim process is also indicated in the Utilized Memory (GB) column.
  • The host creates a new VM, VM4, with an allocated memory of 32 GB.

In this scenario, if VM4 is using only 28 GB of its allocated 32 GB, it is possible to start another VM with up to 8 GB of memory by reclaiming the remaining unutilized memory (4 GB from VM1 and 4 GB from VM4).

Table 2. Memory Overcommit Example
Host 1 (128 GB)
VMs VM Memory (GB) Utilized Memory (GB) Unutilized Memory (GB)
VM1 64 GB 48 GB 4 GB
VM2 32 GB 20 GB
VM3 32 GB 24 GB
VM4 32 GB 28 GB 4 GB
Total 160 GB 120 GB 8 GB
Note: Before the Memory Overcommit feature is enabled, the physical memory of the host is equal to the total memory of all the VMs. After the memory overcommit feature is enabled on the host, the total of all the memory assigned to VMs by the host can appear to be greater than the total physical memory installed on the host.
Important: The above example is subject to the considerations mentioned in the Requirements and Limitations sections. A guest VM reclaims the memory that was given up to start a new VM only when it comes under memory pressure. In that case, the system identifies that the guest VM requires additional memory and attempts to rebalance memory across the other VMs based on their memory utilization. If the system cannot relieve the memory pressure with unutilized memory alone, it falls back to host swap. For example, if VM1 in the above table requires an additional 10 GB of memory, the system attempts to recover the unutilized 1 GB from VM2, 1 GB from VM3, and 4 GB from VM4. Even after this action, a shortfall of 4 GB remains, so the system allocates the remaining 4 GB from host swap.
Note: ADS tracks memory usage on each host and, if a host is over-committed on memory, migrates VMs to mitigate the memory hotspot.
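The rebalancing described in the Important note above reduces to simple arithmetic. A sketch of the worked example (VM names and amounts taken from the tables above):

```python
# VM1 needs 10 GB more; the host first reclaims unutilized memory from
# the other VMs, then covers any shortfall from host swap.
needed = 10
reclaimable = {"VM2": 1, "VM3": 1, "VM4": 4}  # unutilized GB per VM

reclaimed = sum(reclaimable.values())
from_host_swap = max(0, needed - reclaimed)
print(reclaimed, from_host_swap)  # 6 4
```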

Requirements for Memory Overcommit

Memory Overcommit is only available with AOS 6.0.2 and later, and has the following requirements:

  • Ensure that you have an AOS version with AHV 20201105.30007 or later. See the Release Notes.
  • Install the latest VirtIO driver package.

    The latest VirtIO drivers ensure minimal impact on the performance of the VM enabled with Memory Overcommit.

  • With a minimum Prism Central version of pc.2022.4, you can enable or disable the Memory Overcommit feature using Prism Central on new or existing VMs.

Limitations of Memory Overcommit

Memory overcommit has the following limitations:

  • You can enable or disable Memory Overcommit only while the VM is powered off.
  • Power off the VM enabled with memory overcommit before you change the memory allocation for the VM.

    For example, you cannot update the memory of a VM that is enabled with memory overcommit when it is still running. The system displays the following alert: InvalidVmState: Cannot complete request in state on .

  • Memory overcommit is not supported with VMs that use GPU passthrough and vNUMA.

    For example, you cannot update a VM to a vNUMA VM when it is enabled with memory overcommit. The system displays the following alert: InvalidArgument: Cannot use memory overcommit feature for a vNUMA VM error .

  • Memory overcommit can reduce the performance and the performance predictability of the VM.

    For example, migrating a VM enabled with Memory Overcommit takes longer than migrating a VM not enabled with Memory Overcommit.

  • There may be a temporary spike in the aggregate memory usage in the cluster during the migration of a VM enabled with Memory Overcommit from one node to another.

    For example, when you migrate a VM from Node A to Node B, the total memory used in the cluster during migration is greater than the memory usage before the migration.

    The memory usage of the cluster eventually drops back to pre-migration levels when the cluster reclaims the memory for other VM operations.

  • Using Memory Overcommit heavily can cause a spike in the disk space utilization in the cluster. This spike is caused because the Host Swap uses some of the disk space in the cluster.

    If the VMs do not have a swap disk, then in case of memory pressure, AHV uses space from the swap disk created on ADSF to provide memory to the VM. This can lead to an increase in disk space consumption on the cluster.

  • All DR operations except Cross Cluster Live Migration (CCLM) are supported.

    A VM that has Memory Overcommit enabled fails over (that is, the VM is created on the remote site) as a fixed-size VM. You can enable Memory Overcommit on this VM after the failover is complete.

Total Memory Allocation Mechanism on Linux and Windows VM

When you enable the Memory Overcommit on a Linux VM and Windows VM with the same memory size, you can observe the following difference in memory allocation mechanism between Linux VM and Windows VM:

  • In the Linux VM case, the system adjusts the Total Memory to accommodate the balloon driver memory consumption.
  • In the Windows VM case, the system adjusts the Used Memory (memory usage) to accommodate the balloon driver memory consumption. The Total Memory always remains the same as assigned during initial provisioning.
The following table provides the information about the memory allocation observations made on the Linux VM and Windows VM:
Note:
  • The cache, shared, and buffer memories are intentionally not shown in the following table. The system adjusts the cache memory in the Free Memory.
  • The following attributes are used in the following table:
    • X - Indicates the Total Memory assigned to the VM during initial provisioning.
    • Y - Indicates the memory consumed by the Linux or Windows Balloon Drivers
    • Z - Indicates the Used Memory (VM memory usage)
Table 1. Memory Allocation - Linux VM and Windows VM
Total Memory assigned to the VM (in GB) VM OS Balloon Drivers Memory Consumption (GB) Memory Allocation Observation on VM
Used Memory (in GB) Free Memory (in GB) Total Memory (in GB)
X Linux Y Z X - Z X - Y
Windows Z + Y X - (Y +Z) X
For example, if X = 16, Y = 0.5, and Z = 2.5
16 Linux 0.5 2.5 13.5 15.5
Windows 3 13 16
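The table's formulas can be checked programmatically. A sketch using the worked example above, where X is the total memory assigned, Y the balloon driver consumption, and Z the memory used by the guest (function names are illustrative):

```python
# Formulas from the table above: Linux adjusts Total Memory for the
# balloon driver, while Windows adjusts Used Memory and keeps Total fixed.
def linux_view(x, y, z):
    return {"used": z, "free": x - z, "total": x - y}

def windows_view(x, y, z):
    return {"used": z + y, "free": x - (y + z), "total": x}

# Worked example: X = 16, Y = 0.5, Z = 2.5
print(linux_view(16, 0.5, 2.5))    # {'used': 2.5, 'free': 13.5, 'total': 15.5}
print(windows_view(16, 0.5, 2.5))  # {'used': 3.0, 'free': 13.0, 'total': 16}
```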

Memory Overcommit Management

Enabling Memory Overcommit While Creating a VM

This procedure helps you enable memory overcommit in a VM that you create in Prism Central. You need a minimum Prism Central version of pc.2022.4 for this procedure.

Before you begin

You cannot enable memory overcommit by using the Prism Element web console; enable memory overcommit on a VM that you create by using Prism Central. If you create a VM by using the Prism web console, you can enable memory overcommit on that VM afterward by using Prism Central. For the procedure to enable memory overcommit on a VM that you have already created, see Enabling Memory Overcommit on Existing VMs.

About this task

When you are creating a VM, you can enable memory overcommit for it. Follow the procedure to create a VM.

Procedure

  1. On the VMs dashboard, click Create VM .
    The Create VM page opens.
  2. On the Configuration tab, provide the details necessary to create the VM and select the Enable Memory Overcommit checkbox.
    Figure. Create VM - Enable Memory Overcommit

  3. Click Next .
  4. Provide other details necessary for the VM creation in the Resources and Management tabs, and click Next .

    For more information, see the Creating a VM (AHV) topic in the Prism Central Guide.

  5. In the Review tab, under Configuration ensure that the Memory Overcommit configuration is displayed as Enabled .
    If Memory Overcommit configuration is not displayed as Enabled , click Edit to go back to the Configuration tab and select the Enable Memory Overcommit checkbox.
  6. Click Save .

What to do next

The Tasks page displays the VM creation task. After it is successfully completed, check the VMs dashboard to verify that the VM is created and the value in the Memory Overcommit column displays Enabled .
Note: The default General list view does not provide the Memory Overcommit column. Create your own customized view and add the Memory Overcommit column to that view. You can also add other columns to your customized view.

Click the newly created VM to open the VM details page. In the Summary tab, the Properties widget displays Enabled for the Memory Overcommit property.

Figure. VM Details Page - Memory Overcommit Enabled

Enabling Memory Overcommit on Existing VMs

The procedures in this section help you enable memory overcommit on one or more existing VMs. You need a minimum Prism Central version of pc.2022.4 for this procedure.

Before you begin

Ensure that the VM on which you want to enable memory overcommit is powered off. If the VM is powered on or is in Soft Shutdown state, the Enable Memory Overcommit action is not available in the Actions dropdown list.

About this task

To enable memory overcommit on one or more existing VMs, follow this procedure.

Procedure

  1. On the VMs dashboard ( List tab), select the VM or VMs for which you want to enable memory overcommit.
    You can select a single VM, or select multiple VMs and enable memory overcommit on all of them as a bulk update.
  2. In the Actions dropdown list, click the Enable Memory Overcommit action.
    Figure. VM(s) Selection to Enable Memory Overcommit

    Note: For an individual VM, you can also select the VM, click Actions > Update and in the Configuration tab of the Update VM page, select the Enable Memory Overcommit checkbox.

    Click Next on the Configuration , Resources and the Management tabs, and Save on the Review tab.

What to do next

You can check the update tasks on the Tasks page. If you selected multiple VMs, the Tasks page displays the update for each VM as a separate Update VM task.

You can verify that the status is Enabled on the details page of each VM, or in the List tab using a customized view that includes the Memory Overcommit column.

Disabling Memory Overcommit on Existing VMs

This procedure describes how to disable memory overcommit on one or more existing VMs. It requires Prism Central pc.2022.4 or later.

Before you begin

Ensure that the VM on which you want to disable memory overcommit is powered off. If the VM is powered on or is in Soft Shutdown state, the Disable Memory Overcommit action is not available in the Actions dropdown list.

About this task

To disable memory overcommit on one or more existing VMs by updating them, follow this procedure.

Procedure

  1. On the VMs dashboard ( List tab), select the VM or VMs for which you want to disable memory overcommit.
    You can select a single VM, or select multiple VMs and disable memory overcommit on all of them as a bulk update.
  2. In the Actions dropdown list, click the Disable Memory Overcommit action.
    Figure. VM Selection to Disable Memory Overcommit

    Note: For an individual VM, you can also select the VM, click Actions > Update and in the Configuration tab of the Update VM page, select the Disable Memory Overcommit checkbox.

    Click Next on the Configuration, Resources, and Management tabs, and then click Save on the Review tab.

What to do next

You can check the update tasks on the Tasks page. If you selected multiple VMs, the Tasks page displays the update for each VM as a separate Update VM task.

You can verify that the status is Disabled on the details page of each VM, or in the List tab using a customized view that includes the Memory Overcommit column.

Enabling Memory Overcommit using CLI

You can configure memory overcommit on a new VM, or on an existing VM by shutting it down and enabling memory overcommit on it.

About this task

Perform the following procedure to enable memory overcommit on an existing VM after shutting it down:

Procedure

  1. Log on to the Controller VM using SSH.
  2. At the CVM prompt, type acli to enter the Acropolis CLI mode.
  3. Shut down the VM to enforce an update on the VM.
    acli> vm.shutdown vm-name

    Replace vm-name with the name of the VM.

  4. Update the VM to enable memory overcommit.
    acli> vm.update vm-name memory_overcommit=True

    Replace vm-name with the name of the VM.

    Note: Use the wildcard * to match VMs that share a naming scheme and enable memory overcommit on all of them.
    acli> vm.update vm-name* memory_overcommit=True

    Replace vm-name* with the common name prefix followed by the wildcard.

    For example, if Host 1 has six VMs (VM1, VM2, VM3, FileServer, SQL Server, VM4), the vm.update VM* memory_overcommit=True command updates only VM1, VM2, VM3, and VM4. It does not update the FileServer and SQL Server VMs on the host.

    Note: Be careful when using the wildcard * alone, without a name prefix. Doing so updates all the VMs in the cluster, including VMs that you may not want to configure with memory overcommit.
  5. Verify the memory overcommit configuration.

    In the command output, the memory_overcommit parameter must be set to True.

    acli> vm.get vm-name

    Replace vm-name with the name of the VM.

  6. Start the VM.
    acli> vm.on vm-name

    Replace vm-name with the name of the VM.
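The wildcard behavior described above can be illustrated locally. Below is a minimal shell sketch, where `match_glob` is an illustrative helper (not part of acli) and the VM names are the hypothetical ones from the example; acli performs the equivalent pattern match on the cluster side:

```shell
#!/bin/sh
# Show which VM names a glob pattern selects; acli applies the
# equivalent pattern match on the cluster side.
match_glob() {
  pattern="$1"; shift
  for name in "$@"; do
    case "$name" in
      $pattern) echo "would update: $name" ;;
      *)        echo "skipped: $name" ;;
    esac
  done
}

# Hypothetical VMs from the example above.
match_glob 'VM*' VM1 VM2 VM3 FileServer "SQL Server" VM4
```

Running this prints "would update" for VM1, VM2, VM3, and VM4, and "skipped" for FileServer and SQL Server, mirroring the example.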

Disabling Memory Overcommit using CLI

You can disable memory overcommit on an existing VM by shutting it down and changing the memory overcommit configuration of the VM.

About this task

Perform the following procedure to disable memory overcommit on an existing VM:

Procedure

  1. Log on to the Controller VM using SSH.
  2. At the CVM prompt, type acli to enter the acropolis CLI mode.
  3. Shut down the VM.
    acli> vm.shutdown vm-name

    Replace vm-name with the name of the VM.

  4. Update the VM to disable memory overcommit.
    acli> vm.update vm-name memory_overcommit=False

    Replace vm-name with the name of the VM.

    Note: Use the wildcard * to match VMs that share a naming scheme and disable memory overcommit on all of them.
    acli> vm.update vm-name* memory_overcommit=False

    Replace vm-name* with the common name prefix followed by the wildcard.

    For example, if Host 1 has six VMs (VM1, VM2, VM3, FileServer, SQL Server, VM4), the vm.update VM* memory_overcommit=False command updates only VM1, VM2, VM3, and VM4. It does not update the FileServer and SQL Server VMs on the host.

    Note: Be careful when using the wildcard * alone, without a name prefix. Doing so updates all the VMs in the cluster, including VMs that you may not want to configure with memory overcommit.
  5. Start the VM.
    acli> vm.on vm-name

    Replace vm-name with the name of the VM.
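The enable and disable sequences above differ only in the flag value, so they can be wrapped in one helper. Here is a dry-run shell sketch: it prints the acli commands instead of executing them (they must run on a CVM), and `set_overcommit` and the sample VM name are illustrative:

```shell
#!/bin/sh
# Print the acli command sequence that toggles memory overcommit
# on a VM. Dry run: the commands are echoed, not executed.
set_overcommit() {
  vm="$1"     # VM name as shown in Prism or acli
  value="$2"  # True to enable, False to disable
  echo "acli vm.shutdown $vm"
  echo "acli vm.update $vm memory_overcommit=$value"
  echo "acli vm.get $vm"   # inspect output to confirm the new value
  echo "acli vm.on $vm"
}

set_overcommit myvm True    # enable on the hypothetical VM "myvm"
set_overcommit myvm False   # disable again
```

To actually apply the change, you would run the printed commands in an SSH session on a Controller VM, as in the procedures above.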

OVAs

An Open Virtual Appliance (OVA) file is a tar archive created by converting a virtual machine (VM) into an Open Virtualization Format (OVF) package for easy distribution and deployment. OVAs help you quickly create, move, or deploy VMs on different hypervisors.

Prism Central helps you perform the following operations with OVAs:

  • Export an AHV VM as an OVA file.

  • Upload OVAs of VMs or virtual appliances (vApps). You can import (upload) an OVA file with the QCOW2 or VMDK disk formats from a URL or the local machine.

  • Deploy an OVA file as a VM.

  • Download an OVA file to your local machine.

  • Rename an OVA file.

  • Delete an OVA file.

  • Track or monitor the tasks associated with OVA operations in Tasks .

Access to OVA operations is based on your role. See Role Details View in the Prism Central Guide to check whether your role allows you to perform OVA operations.

For information about:

  • Restrictions applicable to OVA operations, see OVA Restrictions.

  • The OVAs dashboard, see OVAs View in the Prism Central Guide .

  • Exporting a VM as an OVA, see Exporting a VM as an OVA in the Prism Central Guide .

  • Other OVA operations, see OVA Management in the Prism Central Guide .

OVA Restrictions

You can perform the OVA operations subject to the following restrictions:

  • Export to or upload OVAs with one of the following disk formats:
    • QCOW2: Default disk format auto-selected in the Export as OVA dialog box.
    • VMDK: When you export a VM, deselect QCOW2 and select VMDK, if required, before you submit the export request.
    • When you export a VM or upload an OVA and the VM or OVA does not have any disks, the disk format is irrelevant.
  • Upload an OVA to multiple clusters using a URL as the source for the OVA. You can upload an OVA only to a single cluster when you use the local OVA File source.
  • Perform the OVA operations only with appropriate permissions. You can run the OVA operations that you have permissions for, based on your assigned user role.
  • The OVA that results from exporting a VM on AHV is compatible with any AHV version 5.18 or later.
  • The minimum supported versions for performing OVA operations are AOS 5.18, Prism Central 2020.8, and AHV-20190916.253.

VM Template Management

In Prism Central, you can create VM templates to manage the golden image of a VM. A VM template is a master copy of a virtual machine: it captures the VM configuration and contents, including the guest operating system and the applications installed on the VM. You can use this template to deploy multiple VMs across clusters.
Note: You can create or manage a VM template only as an admin user.

Limitations of VM Template Feature

The current implementation of the VM template feature has the following limitations:

  • You cannot create a VM template in the following conditions:
    • If the VM is not on AHV.
    • If the VM is an agent or a PC VM.
    • If the VM has any volume group attached to it.
    • If the VM is undergoing vDisk migration.
    • If the VM has disks located on RF1 containers.
  • Templates do not copy the following attributes from the source VMs:
    • Host affinity attributes
    • HA priority attributes
    • Nutanix Guest Tools (NGT) installation

    You must reconfigure the above-listed attributes on the deployed VMs.

Creating a VM Template

About this task

To create a VM template, do the following:

Procedure

  1. Select the VM in the List tab under the VMs dashboard.
    Note: Before selecting a VM for creating a template, ensure that the VM is powered off.
  2. In the Actions list, select Create VM Template .

    Alternatively, on the details page of the VM, select More and click Create VM Template .

  3. In the Create Template from VM dialog box, do the following in the indicated fields:
    1. In the VM Template Name field, enter a name for the template.
    2. In the Description field, enter a description for the template. Description is an optional field.
    3. In the Guest Customization field, select your options for guest operating system (OS) customization.
      You can choose to provide a customization option for the OS of the VMs that will be deployed using this template. In the Script Type field, select Sysprep (Windows) for customizing the Windows OS, and Cloud-init (Linux) for customizing the Linux OS. For each of these script types, you can choose to either upload a custom script or opt for a guided setup in the Configuration Method field.
      • If you select Custom Script , you can either upload a script to customize the guest OS of the VMs, or you can copy-paste the script in the text box.
      • If you select Guided Setup , you must enter the authentication information such as username, password, locale, hostname, domain join, and license key. The information that you enter is used for customizing the OS of the VMs that are deployed using this template.

      You can also specify if you want to allow the template users to override the guest customization settings of the template while deploying the VM. If you select a script type and then allow the users to override the settings, the users can change the settings only for the configuration method. For example, the users can change the authentication information at the time of deploying a VM from template, or they can change from a guided setup to a custom script.

      If you select No Customization at the time of creating the template but allow the users to override the guest customization settings, it gives the maximum customization control to the users. They can customize the script type and the configuration method.

      Note: If you opt for a guest customization script, ensure that the script is in a valid format. Guest customization scripts are not validated, so even if the VM deployment succeeds, the script may not work as expected; this becomes apparent only when you observe the VM after it is deployed.
      Figure. Create VM Template

  4. Click Next .
    On the next page, you can review the configuration details, resource details, network details, and management details. Except for the name, description, and guest customization options that you chose on the previous page, you cannot modify any other settings while creating the template.
  5. Click Save to save the inputs and create a template.

    The new template appears in the templates entity page list.

Deploying VM from a Template

About this task

After creating a VM template, you can use the template to deploy any number of VMs across clusters.

To deploy a VM from a template, do the following:

Procedure

  1. You can use any of the following methods to deploy VMs:
    • Select the target template in the templates dashboard (see VM Template Summary View topic in the Prism Central Guide ) and click the Deploy VMs button. By default, the active version of the template is used for deployment.
    • Go to the details page of a selected template (see VM Template Details View topic in the Prism Central Guide ) and click the Deploy VMs button. By default, the active version of the template is used for deployment.
    • Go to the details page of a selected template (see VM Template Details View) and then go to Versions view. Select the desired version that you want to use for VM deployment, and click the Deploy VMs button. Here, you can choose any active or non-active version for the deployment.
  2. The Deploy VM from Template page appears, showing the Quick Deploy view by default. Click the Advanced Deploy button to switch to the Advanced Deploy view, where you can view and modify some VM properties and network settings.
    • Quick Deploy : To deploy VMs using the quick deploy method, provide inputs in the indicated fields:
      • Name : Enter a name for the VM.
      • Cluster : Select the cluster where you want to deploy the VM.
      • Number of VMs : Enter the number of VMs that you want to deploy.
      • Starting Index Number : Enter the starting index number for the VMs when you are deploying multiple VMs simultaneously. These index numbers are used in the VM names. For example, if you are deploying two VMs and specify the starting index number as 5, the VMs are named vm_name-5 and vm_name-6.
      • Guest Customization : The template can have any one of the following options for guest OS customization: No Customization , Sysprep (Windows) , or Cloud-init (Linux) . For Sysprep (Windows) , or Cloud-init (Linux) , you can choose to either upload a custom script or opt for a guided setup. The fields are enabled for modification only if the template allows you to override its guest customization settings while deploying the VM. If you are not allowed to override the guest customization settings, the settings that are already provided in the template are used for VM deployment.
      • Click Next to verify the configuration details of the VMs to be deployed.
      • Click Deploy to deploy the VMs.
      Figure. Deploy VM from Template using Quick Deploy Method

    • Advanced Deploy : To deploy VMs using the advanced deploy method, provide inputs in the indicated fields and review the information displayed under various tabs:
      • Configuration : Provide inputs for name and description (optional) of the VM, cluster where you want to deploy the VM, Number of VMs to be deployed, and starting index number (only if deploying multiple VMs). In this tab, you can also view and modify the VM properties such as CPU, core per CPU, and memory.
      • Resources : Review the configuration settings for the VM resources such as disks, networks, and boot configuration. Here, you can modify the network settings but cannot modify any other settings.
      • Management : If the template allows you to modify the settings for guest OS customization, provide inputs for the same.
      • Review : Review the information displayed on this tab. Click Deploy to deploy the VMs.
      Figure. Deploy VM from Template using Advanced Deploy Method
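The Starting Index Number naming rule described under Quick Deploy (two VMs with starting index 5 become vm_name-5 and vm_name-6) can be sketched as a small shell helper. `deploy_names` is an illustrative name, and hyphenated names are assumed as in the example:

```shell
#!/bin/sh
# Generate the VM names that result from a base name, a starting
# index, and a VM count, per the Starting Index Number field.
deploy_names() {
  base="$1"; start="$2"; count="$3"
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "${base}-$((start + i))"
    i=$((i + 1))
  done
}

deploy_names vm_name 5 2   # prints vm_name-5 and vm_name-6
```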

Managing a VM Template

About this task

After creating a template (see Creating a VM Template), you can update the guest OS of the source VM of the template, complete guest OS update, cancel guest OS update, update the configuration of the template to create a version, or delete the template.

You can perform these tasks by using any of the following methods:

  • Select the target template in the templates dashboard (see VM Template Summary View topic in the Prism Central Guide ) and choose the required action from the Actions menu.
  • Go to the details page of a selected template (see VM Template Details View topic in the Prism Central Guide ) and select the desired action.
Note: The available actions appear in bold; other actions are not available. The available actions depend on the current state of the template and your permissions.

Procedure

  • To update the guest OS of the template, select Update Guest OS . Click Proceed .
    This step deploys a temporary VM and gives you access to that VM. You can also access the new VM from VM dashboard. You must start the VM, log on to the VM, and update the guest OS of that VM.
    Figure. Update Guest OS

    After you update the guest OS of the temporary VM, complete the guest OS update from the Prism Central UI. Alternatively, you can cancel the guest OS update if you want to discontinue the process.

    The temporary VM is deleted automatically after you complete or cancel the guest OS update process.

  • To complete the process of guest OS update that you had earlier initiated, select Complete Guest OS Update . You must select this option only after successful update of the guest OS of the temporary VM.
  • To cancel the process of guest OS update that you had earlier initiated, select Cancel Guest OS Update .
  • To modify the template configuration, select Update Configuration . You cannot modify the configuration of the initial version of the template, but you can update the configuration settings to create a new version of the template. In the Select a version to Upgrade window, select the version of the template that you want to modify and create a new version on top of it.

    The Update Template Configuration window displays the following sections:

    • Configuration : You can view the name of the base version that you want to update, the change notes for that version, the cluster name, and the VM properties (CPU, cores per CPU, and memory). In this section, you can modify only the VM properties.
    • Resources : You can view the information about disks, networks, and boot configuration. In this section, you can modify only network resources.
    • Management : In this section, you can modify the guest customization settings.
    • Review : In this section, you can review and modify the configuration settings that you are allowed to modify. You must provide a name and change notes for the new version. You can also choose to set this new version as active version. An active version is the version of the template that by default gets deployed when you click the Deploy VMs button.

    Click Save to save the settings and create a new version of the template.

    Figure. Update Template Configuration

  • To delete a template, select Delete Template . A window prompt appears; click the OK button to delete the template.

Hyper-V Administration for Acropolis

AOS 5.20

Product Release Date: 2021-05-17

Last updated: 2022-09-20

Node Management

Logging on to a Controller VM

If you need to access a Controller VM on a host that has not been added to SCVMM or Hyper-V Manager, use this method.

Procedure

  1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  2. Log on to the Controller VM.
    > ssh nutanix@192.168.5.254

    Accept the host authenticity warning if prompted.

Placing the Controller VM and Hyper-V Host in Maintenance Mode

It is recommended that you place the Controller VM and Hyper-V host into maintenance mode when performing any maintenance or patch installation for the cluster.

Before you begin

Migrate the VMs that are running on the node to other nodes in the cluster.

About this task

Caution: Verify the data resiliency status of your cluster. You can place only one node in maintenance mode per cluster.

To place the Controller VM and Hyper-V host in maintenance mode, do the following.

Procedure

  1. Log on to the Controller VM with SSH and get the CVM host ID.
    nutanix@cvm$ ncli host ls
  2. Run the following command to place the CVM in maintenance mode.
    nutanix@cvm$ ncli host edit id=host_id enable-maintenance-mode=true
    Replace host_id with the CVM host ID.
  3. Log on to the Hyper-V host with Remote Desktop Connection and pause the Hyper-V host in the failover cluster using PowerShell.
    > Suspend-ClusterNode

Shutting Down a Node in a Cluster (Hyper-V)

Shut down a node in a Hyper-V cluster.

Before you begin

Shut down guest VMs that are running on the node, or move them to other nodes in the cluster.

In a Hyper-V cluster, you do not need to put the node in maintenance mode before you shut down the node. Shutting down the guest VMs running on the node (or moving them to another node) and then shutting down the CVM is sufficient.

About this task

Caution: Verify the data resiliency status of your cluster. If the cluster only has replication factor 2 (RF2), you can only shut down one node for each cluster. If an RF2 cluster would have more than one node shut down, shut down the entire cluster.

Perform the following procedure to shut down a node in a Hyper-V cluster.

Procedure

  1. Log on to the Controller VM with SSH and shut down the Controller VM.
    nutanix@cvm$ cvm_shutdown -P now
    Note: Always use the cvm_shutdown command to reset or shut down the Controller VM. The cvm_shutdown command notifies the cluster that the Controller VM is unavailable.

  2. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  3. Do one of the following to shut down the node.
    • > shutdown /s /t 0
    • > Stop-Computer -ComputerName localhost

    See the Microsoft documentation for up-to-date and additional details about how to shut down a Hyper-V node.

Starting a Node in a Cluster (Hyper-V)

After you start or restart a node in a Hyper-V cluster, verify that the Controller VM (CVM) is powered on and that the CVM is added back to the metadata.

About this task

Perform the following steps to start a node in a Hyper-V cluster.

Procedure

  1. Power on the node. Do one of the following:
    • Press the power button on the front of the physical hardware server.
    • Use a remote tool such as iDRAC, iLO, or IPMI depending on your hardware.
  2. Log on to Hyper-V Manager and start PowerShell.
  3. Determine if the Controller VM is running.
    > Get-VM | Where {$_.Name -match 'NTNX.*CVM'}
    • If the Controller VM is off, a line similar to the following should be returned:
      NTNX-13SM35230026-C-CVM Stopped -           -             - Opera...

      Make a note of the Controller VM name in the second column.

    • If the Controller VM is on, a line similar to the following should be returned:
      NTNX-13SM35230026-C-CVM Running 2           16384             05:10:51 Opera...
  4. If the CVM is not powered on, power on the CVM by using Hyper-V Manager.
  5. Log on to the CVM with SSH and verify if the CVM is added back to the metadata.
    nutanix@cvm$ nodetool -h 0 ring

    The state of the IP address of the CVM you started must be Normal as shown in the following output.

    nutanix@cvm$ nodetool -h 0 ring
    Address         Status State      Load            Owns    Token                                                          
                                                              kV0000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     1.84 GB         25.00%  000000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     1.79 GB         25.00%  FV0000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     825.49 MB       25.00%  V00000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     1.87 GB         25.00%  kV0000000000000000000000000000000000000000000000000000000000
  6. Power on or failback the guest VMs by using Hyper-V Manager or Failover Cluster Manager.
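Checking for the Normal state in step 5 can be scripted rather than read by eye. Below is a shell sketch that parses nodetool ring output for a given CVM IP; `ring_state_normal` is an illustrative helper, and the sample lines stand in for real ring output:

```shell
#!/bin/sh
# Exit 0 if the given CVM IP is reported Up/Normal in the
# nodetool ring output read from stdin.
ring_state_normal() {
  awk -v ip="$1" '$1 == ip && $2 == "Up" && $3 == "Normal" { ok = 1 }
                  END { exit ok ? 0 : 1 }'
}

# Sample lines standing in for real output; on a CVM you would run:
#   nodetool -h 0 ring | ring_state_normal <cvm-ip>
printf '%s\n' \
  '10.1.1.4  Up  Normal  1.84 GB  25.00%  0000' \
  '10.1.1.5  Up  Leaving 1.79 GB  25.00%  FV00' |
  ring_state_normal 10.1.1.4 && echo '10.1.1.4 is Normal'
```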

Enabling 1 GbE Interfaces (Hyper-V)

If 10 GbE networking is specified during cluster setup, 1 GbE interfaces are disabled on Hyper-V nodes. Follow these steps if you need to enable the 1 GbE interfaces later.

About this task

To enable the 1 GbE interfaces, do the following on each host:

Procedure

  1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  2. List the network adapters.
    > Get-NetAdapter | Format-List Name,InterfaceDescription,LinkSpeed

    Output similar to the following is displayed.

    Name                 : vEthernet (InternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #3
    LinkSpeed            : 10 Gbps
    
    Name                 : vEthernet (ExternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #2
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet 3
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection #2
    LinkSpeed            : 10 Gbps
    
    Name                 : NetAdapterTeam
    InterfaceDescription : Microsoft Network Adapter Multiplexor Driver
    LinkSpeed            : 20 Gbps
    
    Name                 : Ethernet 4
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection #2
    LinkSpeed            : 0 bps
    
    Name                 : Ethernet 2
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection
    LinkSpeed            : 1 Gbps

    Make a note of the Name of the 1 GbE interfaces you want to enable.

  3. Configure the interface.

    Replace interface_name with the name of the 1 GbE interface as reported by Get-NetAdapter .

    1. Enable the interface.
      > Enable-NetAdapter -Name "interface_name"
    2. Add the interface to the NIC team.
      > Add-NetLBFOTeamMember -Team NetAdapterTeam -Name "interface_name"

      If you want to configure the interface as a standby for the 10 GbE interfaces, include the parameter -AdministrativeMode Standby

    Perform these steps once for each 1 GbE interface you want to enable.
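Picking the 1 GbE interface names out of the Get-NetAdapter listing in step 2 can be automated with a small text filter. The shell sketch below runs against sample output like the one above; `gige_names` is illustrative, matching on the I350 description assumes the hardware shown above, and on the host itself you would filter with PowerShell instead:

```shell
#!/bin/sh
# Print the Name of each adapter whose InterfaceDescription marks
# it as an I350 1 GbE port, from Get-NetAdapter Format-List output.
gige_names() {
  awk -F': ' '
    /^Name /                 { name = $2 }
    /^InterfaceDescription / && $2 ~ /I350 Gigabit/ { print name }'
}

# Sample records taken from the listing above.
gige_names <<'EOF'
Name                 : Ethernet 4
InterfaceDescription : Intel(R) I350 Gigabit Network Connection #2
LinkSpeed            : 0 bps

Name                 : Ethernet 2
InterfaceDescription : Intel(R) I350 Gigabit Network Connection
LinkSpeed            : 1 Gbps
EOF
```

For the sample above, this prints Ethernet 4 and Ethernet 2, the names you would then pass to Enable-NetAdapter and Add-NetLBFOTeamMember.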

Changing the Hyper-V Host Password

The cluster software needs to be able to log on to each host as Administrator to perform standard cluster operations, such as querying the status of VMs in the cluster. Therefore, after changing the Administrator password, it is critical to update the cluster configuration with the new password.

About this task

Tip: Although it is not required for the Administrator user to have the same password on all hosts, doing so makes cluster management and support much easier. If you do select a different password for one or more hosts, make sure to note the password for each host.

Procedure

  1. Change the Administrator password on all hosts.
    Perform these steps on every Hyper-V host in the cluster.
    1. Log on to the Hyper-V host with Remote Desktop Connection.
    2. Press Ctrl+Alt+End to display the management screen.
    3. Click Change a Password .
    4. Enter the old password and the new password in the specified fields and click the right arrow button.
    5. Click OK to acknowledge the password change.
  2. Update the Administrator user password for all hosts in the cluster configuration.
    Warning: If you do not perform this step, the web console no longer shows correct statistics and alerts, and other cluster operations fail.
    1. Log on to any CVM in the cluster with SSH.
    2. Find the host IDs.

      On the clusters running the AOS release 4.5.x, type:

      nutanix@cvm$ ncli host list | grep -E 'ID|Hypervisor Address'

      On the clusters running the AOS release 4.6.x or later, type:

      nutanix@cvm$ ncli host list | grep -E 'Id|Hypervisor Address'

      Note the host ID for each hypervisor host.

    3. Update the hypervisor host password.
      nutanix@cvm$ ncli managementserver edit name=host_addr \
       password='host_password' 
      nutanix@cvm$ ncli host edit id=host_id \
       hypervisor-password='host_password'
      • Replace host_addr with the IP address of the hypervisor host.
      • Replace host_id with a host ID you determined in the preceding step.
      • Replace host_password with the Administrator password on the corresponding hypervisor host.

      Perform this step for every hypervisor host in the cluster.
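Step 2 repeats the same two ncli commands for every host. Below is a dry-run shell sketch that prints the commands for a list of host ID and address pairs; the IDs, IPs, and `update_host_passwords` helper are illustrative, and it echoes the commands rather than executing them, since they must run in a CVM session:

```shell
#!/bin/sh
# Print the ncli commands that record a new hypervisor password for
# each host. Dry run: echo instead of executing in a CVM session.
update_host_passwords() {
  password="$1"; shift
  for pair in "$@"; do          # each argument is id:address
    host_id=${pair%%:*}
    host_addr=${pair#*:}
    echo "ncli managementserver edit name=$host_addr password='$password'"
    echo "ncli host edit id=$host_id hypervisor-password='$password'"
  done
}

# Illustrative host IDs and addresses.
update_host_passwords 'N3wP@ss' 7:10.0.0.21 8:10.0.0.22
```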

Changing a Host IP Address

Perform these steps once for every hypervisor host in the cluster. Complete the entire procedure on a host before proceeding to the next host.

Before you begin

Remove the host from the failover cluster and domain before changing the host IP address.

Procedure

  1. Configure networking on the node by following Configuring Host Networking for Hyper-V Manually.
  2. Log on to every Controller VM in the cluster and restart genesis.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed.

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

Changing the VLAN ID for Controller VM

About this task

Perform the following procedure to change the VLAN ID of the Controller VM.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console and run the following PowerShell command to get the VLAN settings configured.
    > Get-VMNetworkAdapterVlan
  2. Change the VLAN ID.
    > Set-VMNetworkAdapterVlan -VMName cvm_name -VMNetworkAdapterName External -Access -VlanID vlan_ID
    Replace cvm_name with the name of the Nutanix Controller VM.

    Replace vlan_ID with the new VLAN ID.

    Note: The VM name of the Nutanix Controller VM must begin with NTNX-

Configuring VLAN for Hyper-V Host

About this task

Perform the following procedure to configure Hyper-V host VLANs.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console.
  2. Start a PowerShell prompt and run the following command to create a variable for the ExternalSwitch.
    > $netAdapter = Get-VMNetworkAdapter -Name "ExternalSwitch" -ManagementOS
  3. Set a new VLAN ID for the ExternalSwitch.
    > Set-VMNetworkAdapterVlan -VMNetworkAdapter $netAdapter -Access -VlanId vlan_ID
    Replace vlan_ID with the new VLAN ID.
    You can now communicate to the Hyper-V host on the new subnet.

Configuring Host Networking for Hyper-V Manually

Perform the following procedure to manually configure the Hyper-V host networking.

About this task

Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console and start a Powershell prompt.
  2. List the network adapters.
    > Get-NetAdapter | Format-List Name,InterfaceDescription,LinkSpeed

    Output similar to the following is displayed.

    Name                 : vEthernet (InternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #3
    LinkSpeed            : 10 Gbps
    
    Name                 : vEthernet (ExternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #2
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet 3
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection #2
    LinkSpeed            : 10 Gbps
    
    Name                 : NetAdapterTeam
    InterfaceDescription : Microsoft Network Adapter Multiplexor Driver
    LinkSpeed            : 20 Gbps
    
    Name                 : Ethernet 4
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection #2
    LinkSpeed            : 0 bps
    
    Name                 : Ethernet 2
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection
    LinkSpeed            : 0 bps

    Make a note of the InterfaceDescription for the vEthernet adapter that links to the physical interface you want to modify.

  3. Start the Server Configuration utility.
    > sconfig
  4. Select Networking Settings by typing 8 and pressing Enter .
  5. Change the IP settings.
    1. Select a network adapter by typing the Index number of the adapter you want to change (refer to the InterfaceDescription you found in step 2) and pressing Enter .
      Warning: Do not select the network adapter with the IP address 192.168.5.1 . This IP address is required for the Controller VM to communicate with the host.
    2. Select Set Network Adapter Address by typing 1 and pressing Enter .
    3. Select Static by typing S and pressing Enter .
    4. Enter the IP address for the host and press Enter .
    5. Enter the subnet mask and press Enter .
    6. Enter the IP address for the default gateway and press Enter .
      The host networking settings are changed.
  6. (Optional) Change the DNS servers.
    DNS servers must be configured for a host to be part of a domain. You can either change the DNS servers in the sconfig utility or with setup_hyperv.py .
    1. Select Set DNS Servers by typing 2 .
    2. Enter the primary and secondary DNS servers and press Enter .
      The DNS servers are updated.
  7. Exit the Server Configuration utility by typing 4 and pressing Enter then 15 and pressing Enter .

Joining a Host to a Domain Manually

About this task

For information about how to join a host to a domain by using utilities provided by Nutanix, see Joining the Cluster and Hosts to a Domain . Perform these steps for each Hyper-V host in the cluster to manually join a host to a domain.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console and start a PowerShell prompt.
  2. Join the host to the domain and rename it.
    > Add-Computer -DomainName domain_name -NewName node_name `
     -Credential domain_name\domain_admin_user -Restart -Force
    • Replace domain_name with the name of the domain for the host to join.
    • Replace node_name with a new name for the host.
    • Replace domain_admin_user with the domain administrator username.
    The host restarts and joins the domain.

Changing CVM Memory Configuration (Hyper-V)

About this task

You can increase the memory reserved for each Controller VM in your cluster by using the 1-click Controller VM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. For more information about CVM memory sizing recommendations and instructions about how to increase the CVM memory, see Increasing the Controller VM Memory Size in the Prism Web Console Guide .

Hyper-V Configuration

Before configuring Nutanix storage on Hyper-V, ensure that you meet the Hyper-V installation requirements. For more information, see Hyper-V Installation Requirements. After you configure all the prerequisites for installing and setting up Hyper-V, join the Hyper-V cluster and its constituent hosts to the domain, and then create a failover cluster.

Hyper-V Installation Requirements

Ensure that the following requirements are met before installing Hyper-V.

Windows Active Directory Domain Controller

Requirements:

  • For a fresh installation, you need a version of Nutanix Foundation that is compatible with the version of Windows Server you want to install.
    Note: To install Windows Server 2016, you need Foundation 3.11.2 or later. For more information, see the Field Installation Guide.
  • The primary domain controller version must be at least Windows Server 2008 R2.
    Note: If you have a Volume Shadow Copy Service (VSS)-based backup tool (for example, Veeam), the functional level of Active Directory must be 2008 or higher.
  • Active Directory Web Services (ADWS) must be installed and running. By default, connections are made over TCP port 9389, and firewall policies must enable an exception on this port for ADWS.

    To test that ADWS is installed and running on a domain controller, log on to a Windows host other than the domain controller host, one that is joined to the same domain and has the RSAT-AD-PowerShell feature installed, by using a domain administrator account, and run the following PowerShell command. If the command prints the name of the domain controller, ADWS is installed and the port is open.

> (Get-ADDomainController).Name
  • The domain controller must run a DNS server.
    Note: If any of the above requirements are not met, you need to manually create an Active Directory computer object for the Nutanix storage in the Active Directory, and add a DNS entry for the name.
  • Ensure that the Active Directory domain is configured correctly for consistent time synchronization.
  • Place the AD server in a separate virtual or physical host residing in storage that is not dependent on the domains that the AD server manages.
    Note: Do not run a virtual Active Directory domain controller (DC) on a Nutanix Hyper-V cluster and join the cluster to the same domain.

Accounts and Privileges:

  • An Active Directory account with permission to create new Active Directory computer objects in the container or Organizational Unit (OU) where Nutanix nodes are placed. The credentials of this account are not stored anywhere.
  • An account that has sufficient privileges to join a Windows host to a domain. The credentials of this account are not stored anywhere. These credentials are only used to join the hosts to the domain.

Additional Information Required:

  • The IP address of the primary domain controller.
    Note: The primary domain controller IP address is set as the primary DNS server on all the Nutanix hosts. It is also set as the NTP server in the Nutanix storage cluster to keep the Controller VM, host, and Active Directory time synchronized.
  • The fully qualified domain name to which the Nutanix hosts and the storage cluster are going to be joined.

SCVMM

Note: Relevant only if you have SCVMM in your environment.

Requirements:

  • The SCVMM version must be at least 2016 and it must be installed on Windows Server 2016. If you have SCVMM on an earlier release, upgrade it to 2016 before you register a Nutanix cluster running Hyper-V.
  • Kerberos authentication for storage is optional for Windows Server 2012 R2 (see Enabling Kerberos for Hyper-V), but it is required for Windows Server 2016. However, for Kerberos authentication to work with Windows Server 2016, the Active Directory server must reside outside the Nutanix cluster.
  • The SCVMM server must allow PowerShell remoting.

    To test this scenario, log on to a Windows host other than the SCVMM host (for example, the domain controller) by using the SCVMM administrator account, and run the following PowerShell command. If it prints the name of the SCVMM server, PowerShell remoting to the SCVMM server is not blocked.

    > Invoke-Command -ComputerName scvmm_server -ScriptBlock {hostname} -Credential MYDOMAIN\username

    Replace scvmm_server with the SCVMM host name and MYDOMAIN with Active Directory domain name.

    Note: If the SCVMM server does not allow PowerShell remoting, you can perform the SCVMM setup manually by using the SCVMM user interface.
  • The ipconfig command must run in a PowerShell window on the SCVMM server. To verify, run the following command.

    > Invoke-Command -ComputerName scvmm_server_name -ScriptBlock {ipconfig} -Credential MYDOMAIN\username

    Replace scvmm_server_name with the SCVMM host name and MYDOMAIN with Active Directory domain name.

  • The SMB client configuration in the SCVMM server should have RequireSecuritySignature set to False. To verify, run the following command.

    > Invoke-Command -ComputerName scvmm_server_name -ScriptBlock {Get-SMBClientConfiguration | FL RequireSecuritySignature}

    Replace scvmm_server_name with the SCVMM host name.

    A domain policy can set this value to True. In that case, modify the domain policy to set it to False. You can also change the value from True to False directly, but the change does not persist if a policy reverts it to True. To change the value, run the following command in PowerShell on the SCVMM host after logging on as a domain administrator.

    Set-SMBClientConfiguration -RequireSecuritySignature $False -Force

    If you are changing the value from True to False, confirm that the policies applied to the SCVMM host have the correct value. On the SCVMM host, run rsop.msc to review the resultant set of policy details, and verify the value by navigating to Servername > Computer Configuration > Windows Settings > Security Settings > Local Policies > Security Options: Policy Microsoft network client: Digitally sign communications (always). The value displayed in RSOP must be Disabled or Not Defined for the change to persist. If RSOP shows the value as Enabled, update the group policies that apply to the SCVMM server in the domain to change it to Disabled; otherwise, RequireSecuritySignature changes back to True at a later time. After setting the policy in Active Directory and propagating it to the domain controllers, refresh the SCVMM server policy by running gpupdate /force , and confirm in RSOP that the value is Disabled .
    Note: If security signing is mandatory, then you need to enable Kerberos in the Nutanix cluster. In this case, it is important to ensure that the time remains synchronized between the Active Directory server, the Nutanix hosts, and the Nutanix Controller VMs. The Nutanix hosts and the Controller VMs set their NTP server as the Active Directory server, so it should be sufficient to ensure that Active Directory domain is configured correctly for consistent time synchronization.

Accounts and Privileges:

  • When adding a host or a cluster to the SCVMM, the run-as account you are specifying for managing the host or cluster must be different from the service account that was used to install SCVMM.
  • The run-as account must be a domain account and must have local administrator privileges on the Nutanix hosts. This can be a domain administrator account. When the Nutanix hosts are joined to the domain, domain administrator accounts automatically receive administrator privileges on the hosts. If the domain account used as the run-as account in SCVMM is not a domain administrator account, you need to manually add it to the list of local administrators on each host by running sconfig .
  • An SCVMM domain account with administrator privileges on SCVMM and PowerShell remote execution privileges.
  • If you want to install SCVMM server, a service account with local administrator privileges on the SCVMM server.

IP Addresses

  • One IP address for each Nutanix host.
  • One IP address for each Nutanix Controller VM.
  • One IP address for each Nutanix host IPMI interface.
  • One IP address for the Nutanix storage cluster.
  • One IP address for the Hyper-V failover cluster.
Note: For N nodes, (3*N + 2) IP addresses are required. All IP addresses must be in the same subnet.
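The (3*N + 2) formula can be sketched as a quick planning check. This is an illustrative Python helper, not a Nutanix tool:

```python
def required_ip_addresses(nodes: int) -> int:
    """Return the number of IP addresses a Hyper-V cluster of `nodes` needs.

    Each node needs one host, one Controller VM, and one IPMI address;
    the storage cluster and the failover cluster each need one more.
    """
    per_node = 3      # host + CVM + IPMI
    cluster_wide = 2  # storage cluster IP + failover cluster IP
    return per_node * nodes + cluster_wide

# A four-node cluster needs 3*4 + 2 = 14 addresses, all in the same subnet.
print(required_ip_addresses(4))  # -> 14
```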

DNS Requirements

  • Each Nutanix host must be assigned a name of 15 characters or less, which gets automatically added to the DNS server during domain joining.
  • The Nutanix storage cluster needs to be assigned a name of 15 characters or less, which must be added to the DNS server when the storage cluster is joined to the domain.
  • The Hyper-V failover cluster must be assigned a name of 15 characters or less, which gets automatically added to the DNS server when the failover cluster is created.
  • After the Hyper-V configuration, all names must resolve to an IP address on the Nutanix hosts, the SCVMM server (if applicable), and any other host that needs access to the Nutanix storage, for example, a host running Hyper-V Manager.
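The 15-character rule above applies to host, storage cluster, and failover cluster names alike because all three must be valid NetBIOS names. A minimal validator, sketched as a hypothetical helper (the character blacklist follows Microsoft's documented naming conventions):

```python
import re

def is_valid_netbios_name(name: str) -> bool:
    """Conservative NetBIOS name check: 1-15 characters and none of
    the characters that Microsoft documents as disallowed."""
    if not 1 <= len(name) <= 15:
        return False
    # Reject characters disallowed in NetBIOS computer names.
    return re.search(r'[\\/:*?"<>|]', name) is None

print(is_valid_netbios_name("NTNX-CLUSTER-01"))          # 15 chars -> True
print(is_valid_netbios_name("a-very-long-cluster-name")) # too long -> False
```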

Storage Access Requirements

  • Virtual machine and virtual disk paths must always refer to the Nutanix storage cluster by name, not the external IP address. If you use the IP address, it directs all the I/O to a single node in the cluster and thereby compromises performance and scalability.
    Note: For external non-Nutanix hosts that need to access Nutanix SMB shares, see the Nutanix SMB Shares Connection Requirements from Outside the Cluster topic.

Host Maintenance Requirements

  • When applying Windows updates to the Nutanix hosts, restart the hosts one at a time, ensuring that Nutanix services come up fully in the Controller VM of the restarted host before you update the next host. You can accomplish this by using Cluster-Aware Updating with a Nutanix-provided script, which can be plugged into the Cluster-Aware Update Manager as a pre-update script. This pre-update script ensures that Nutanix services go down on only one host at a time, ensuring availability of storage throughout the update procedure. For more information about Cluster-Aware Updating, see Installing Windows Updates with Cluster-Aware Updating.
    Note: Ensure that automatic Windows updates are not enabled for the Nutanix hosts in the domain policies.

General Host Requirements

  • Hyper-V hosts must have the remote script execution policy set to at least RemoteSigned . A Restricted setting might cause issues when you reboot the CVM.
Note: Nutanix supports the installation of language packs for Hyper-V hosts.

Limitations and Guidelines

Nutanix clusters running Hyper-V have the following limitations. Certain limitations might be attributable to other software or hardware vendors:

Guidelines

Hyper-V 2016 Clusters and Support for Windows Server 2016
  • VHD Set files (.vhds) are a new shared virtual disk model for guest clusters in Windows Server 2016 and are not supported. You can import existing shared .vhdx disks to Windows Server 2016 clusters. New VHDX format sharing is supported. Only fixed-size VHDX sharing is supported.

    Use the PowerShell Add-VMHardDiskDrive command to attach any existing or new VHDX file in shared mode to VMs. For example: Add-VMHardDiskDrive -VMName Node1 -Path \\gogo\smbcontainer\TestDisk\Shared.vhdx -SupportPersistentReservations .

Upgrading Hyper-V Hypervisor Hosts
  • When upgrading hosts to Hyper-V 2016, 2019, and later versions, the local administrator user name and password is reset to the default administrator name Administrator and password of nutanix/4u. Any previous changes to the administrator name and/or password are overwritten.
General Guidelines
  • Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.
  • If you are destroying a cluster and creating a new one and want to reuse the hostnames, failover cluster name, and storage object name of the previous cluster, remove their computer accounts and objects from AD and DNS first.

Limitations

  • Intel Advanced Network Services (ANS) is not compatible with Load Balancing and Failover (LBFO), the built-in NIC teaming feature in Hyper-V. For more information, see the Intel support article, Teaming with Intel® Advanced Network Services .
  • Nutanix does not support the online resizing of the shared virtual hard disks (VHDX files).

Configuration Scenarios

After using Foundation to create a cluster, you can use the Nutanix web console to join the Hyper-V cluster and its constituent hosts to the domain, create the Hyper-V failover cluster, and enable Kerberos.

Note: If you are installing Windows Server 2016, you do not have to enable Kerberos. Kerberos is enabled during cluster creation.

You can then use the setup_hyperv.py script to add host and storage to SCVMM, configure a Nutanix library share in SCVMM, and register Nutanix storage containers as file shares in SCVMM.

Note: You can use the setup_hyperv.py script only with a standalone SCVMM instance. The script does not work with an SCVMM cluster.

The usage of the setup_hyperv.py script is as follows.

nutanix@cvm$ setup_hyperv.py flags command
commands:
register_shares
setup_scvmm

Nonconfigurable Components

The components listed here are configured by the Nutanix manufacturing and installation processes. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts ( admin or nutanix ), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the usage of third-party storage on the host part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

Hyper-V Settings

  • Cluster name (using the web console)
  • Controller VM name
  • Controller VM virtual hardware configuration file (.xml file in Hyper-V version 2012 R2 and earlier and .vmcx file in Hyper-V version 2016 and later). Each AOS version and upgrade includes a specific Controller VM virtual hardware configuration. Therefore, do not edit or otherwise modify the Controller VM virtual hardware configuration file.
  • Host name (you can configure the host name only at the time of creating and expanding the cluster)
  • Internal switch settings (internal virtual switch and internal virtual network adapter) and external network adapter name

    Two virtual switches are created on the Nutanix host, ExternalSwitch and InternalSwitch. Two virtual network adapters are created on the host corresponding to these virtual switches, vEthernet (ExternalSwitch) and vEthernet (InternalSwitch).

    Note: Do not delete these switches and adapters. Do not change the names of the internal virtual switch, internal virtual network adapter, and external virtual network adapter. You can change the name of the external virtual switch. For more information about changing the name of the external virtual switch, see Updating the Cluster After Renaming the Hyper-V External Virtual Switch.
  • Windows roles and features

    Do not install new Windows roles or features on the Nutanix hosts. This especially includes the Multipath IO feature, which can cause the Nutanix storage to become unavailable.

    Do not apply GPOs to the Nutanix nodes that impact Log on as a service. Nutanix recommends that you do not remove the default entries of the following services.

    NT Service\All Services

    NT Virtual Machine\Virtual Machines

  • Note: This best practice helps keep the host operating system free of roles, features, and applications that aren't required to run Hyper-V. For more information, see the Hyper-V should be the only enabled role document in the Microsoft documentation portal.
  • Controller VM pre-configured VM setting of Automatic Start Action
  • Controller VM high-availability setting
  • Controller VM operations: migrating, saving state, or taking checkpoints of the Controller VM

Adding the Cluster and Hosts to a Domain

After completing Foundation of the cluster, add the cluster and its constituent hosts to the Active Directory (AD) domain. Adding the cluster and hosts to the domain facilitates centralized administration and security through other Microsoft services such as Group Policy, and enables administrators to manage the distribution of updates and hotfixes.

Before you begin

  • If you have a VLAN segmented network, verify that you have assigned the VLAN tags to the Hyper-V hosts and Controller VMs. For information about how to configure VLANs for the Controller VM, see the Advanced Setup Guide.
  • Ensure that you have valid credentials of the domain account that has the privileges to create a new computer account or modify an existing computer account in the Active Directory domain. An Active Directory domain created by using non-ASCII text may not be supported. For more information about usage of ASCII or non-ASCII text in Active Directory configuration, see Internationalization (i18n) .

Procedure

  1. Log on to the web console by using one of the Controller VM IP addresses or the cluster virtual IP address.
  2. Click the gear icon in the main menu and select Join Cluster and Hosts to the Domain on the Settings page.
    Figure. Join Cluster and Hosts to the Domain
    A sample image of the Join Cluster and Hosts to the Domain menu used to add a cluster and its constituent hosts to an AD domain.

  3. Enter the fully qualified name of the domain that you want to join the cluster and its constituent hosts to in the Full Domain Name box.
  4. Enter the IP address of the name server in the Name Server IP Address box that can resolve the domain name that you have entered in the Full Domain Name box.
  5. In the Base OU Path box, type the OU (organizational unit) path where the computer accounts must be stored after the host joins a domain. For example, if the organization is nutanix.com and the OU is Documentation, the Base OU Path can be specified as OU=Documentation,DC=nutanix,DC=com
    Specifying the Base OU Path is optional. When you specify the Base OU Path, the computer accounts are stored in the Base OU Path within the Active Directory after the hosts join a domain. If the Base OU Path is not specified, the computer accounts are stored in the default Computers OU.
  6. Enter a name for the cluster in the Nutanix Cluster Name box.
    The cluster name must not exceed 15 characters and must be a valid NetBIOS name.
  7. Enter the virtual IP address of the cluster in the Nutanix Cluster Virtual IP Address box.
    If you have not already configured the virtual IP address of the cluster, you can configure it by using this box.
  8. Enter the prefix that should be used to name the hosts (according to your convention) in the Prefix box.
    • The prefix must not end with a period.
    • The prefix must not exceed 11 characters.
    • The prefix must be a valid NetBIOS name.

      For example, if you enter the prefix Tulip, the hosts are named Tulip-1, Tulip-2, and so on, in increasing order of the external IP address of the hosts.

    If you do not provide a prefix, the default name NTNX-block-number is used. Click Advanced View to see an expanded view of all the hosts in all the blocks of the cluster and to rename them individually.
  9. In the Credentials field, enter the logon name and password of the domain account that has the privileges to create new or modify existing computer accounts in the Active Directory domain.
    Ensure that the logon name is in the DOMAIN\USERNAME format. The cluster and its constituent hosts require these credentials to join the AD domain. Nutanix does not store the credentials.
  10. When all the information is correct, click Join .
    The cluster is added to the domain. All the hosts are renamed, added to the domain, and restarted. Allow the hosts and Controller VMs some time to start up. After the cluster is ready, the logon page is displayed.
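The host-naming convention from the Prefix step (hosts numbered in increasing order of their external IP address) can be sketched as follows. This is a hypothetical helper for illustration; the actual renaming is performed by the web console:

```python
import ipaddress

def host_names(prefix: str, host_ips: list[str]) -> dict[str, str]:
    """Map each host IP to <prefix>-<n>, numbering hosts in
    increasing order of their external IP address."""
    ordered = sorted(host_ips, key=ipaddress.ip_address)
    return {ip: f"{prefix}-{i}" for i, ip in enumerate(ordered, start=1)}

names = host_names("Tulip", ["10.0.0.12", "10.0.0.10", "10.0.0.11"])
print(names["10.0.0.10"])  # -> Tulip-1
```

Sorting by `ipaddress.ip_address` rather than by string avoids misordering addresses such as 10.0.0.9 and 10.0.0.10.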

What to do next

Create a Microsoft failover cluster. For more information, see Creating a Failover Cluster for Hyper-V.

Creating a Failover Cluster for Hyper-V

Before you begin

Perform the following tasks before you create a failover cluster:

Perform the following procedure to create a failover cluster that includes all the hosts in the cluster.

Procedure

  1. Log on to the Prism Element web console by using one of the Controller VM IP addresses or by using the cluster virtual IP address.
  2. Click the gear icon in the main menu and select Configure Failover Cluster from the Settings page.
    Figure. Configure Failover Cluster

  3. Type the failover cluster name in the Failover Cluster Name text box.
    The failover cluster name must not exceed 15 characters and must be a valid NetBIOS name.
  4. Type an IP address for the Hyper-V failover cluster in the Failover Cluster IP Address text box.
    This address is for the cluster of Hyper-V hosts that are currently being configured. It must be unique, different from the cluster virtual IP address and from all other IP addresses assigned to the hosts and Controller VMs. It must be in the same network range as the Hyper-V hosts.
  5. In the Credentials field, type the logon name and password of the domain account that has the privileges to create a new account or modify existing accounts in the Active Directory domain.
    The logon name must be in the format DOMAIN\USERNAME . The credentials are required to create a failover cluster. Nutanix does not store the credentials.
  6. Click Create Cluster .
    A failover cluster is created by the name that has been provided and it includes all the hosts in the cluster.
    For information on manually creating a failover cluster, see Manually Creating a Failover Cluster (SCVMM User Interface).

Manually Creating a Failover Cluster (SCVMM User Interface)

Join the hosts to the domain as described in Adding the Cluster and Hosts to a Domain in the Hyper-V Administration for Acropolis guide.

About this task

Perform the following procedure to manually create a failover cluster for Hyper-V by using System Center VM Manager (SCVMM).

If you are not using SCVMM or are using Hyper-V Manager, see Creating a Failover Cluster for Hyper-V.

Procedure

  1. Start the Failover Cluster Manager utility.
  2. Right-click and select Create Cluster , and click Next .
  3. Enter all the hosts that you want to add to the Failover cluster, and click Next .
  4. Select the No. I do not require support from Microsoft for this cluster, and therefore do not want to run the validation tests. When I click Next continue creating the cluster option, and click Next .
    Note:

    If you select Yes , two tests fail when you run the cluster validation tests. The tests fail because the internal network adapter on each host is configured with the same IP address (192.168.5.1). The network validation tests fail with the following error message:

    Duplicate IP address

    These failures occur even though the internal network is reachable only within a host, so the internal adapter can safely have the same IP address on different hosts. The second test, Validate Network Communication, fails due to the presence of the internal network adapter. Both failures are benign and can be ignored.

  5. Enter a name for the cluster, specify a static IP address, and click Next .
  6. Clear the Add all eligible storage to the cluster check box, and click Next .
  7. Wait until the cluster is created. After you receive the message that the cluster is successfully created, click Finish to exit the Cluster Creation wizard.
  8. Go to Networks in the cluster tree and select Cluster Network 1 and ensure it is in the internal network by verifying the IP address in the summary pane. The IP address must be 192.168.5.0/24 as shown in the following screen shot.
    Figure. Failover Cluster Manager

  9. Click the Action tab on the toolbar and select Live Migration Settings .
  10. Remove Cluster Network 1 from Networks for Live Migration and click OK .
    Note: If you do not perform this step, live migrations fail because the internal network is added to the live migration network lists. Log on to SCVMM, add the cluster to SCVMM, check the host migration setting, and ensure that the internal network is not listed.
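The check in step 8, that Cluster Network 1 is the host-local 192.168.5.0/24 internal network, can be sketched as a small membership test (an illustrative helper, not part of the Nutanix or Microsoft tooling):

```python
import ipaddress

# The Nutanix internal network is 192.168.5.0/24; the internal adapter
# on every host uses 192.168.5.1, which is why it must be excluded from
# live migration networks.
INTERNAL_NET = ipaddress.ip_network("192.168.5.0/24")

def is_internal(ip: str) -> bool:
    """Return True if `ip` belongs to the host-local internal network."""
    return ipaddress.ip_address(ip) in INTERNAL_NET

print(is_internal("192.168.5.1"))  # -> True
print(is_internal("10.4.56.10"))   # -> False
```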

Changing the Failover Cluster IP Address

About this task

Perform the following procedure to change your Hyper-V failover cluster IP address.

Procedure

  1. Open Failover Cluster Manager and connect to your cluster.
  2. Enter the name of any one of the Hyper-V hosts and click OK .
  3. In the Failover Cluster Manager pane, select your cluster and expand Cluster Core Resources .
  4. Right-click the cluster, and select Properties > IP address .
  5. Change the IP address of your failover cluster using the Edit option and click OK .
  6. Click Apply .

Enabling Kerberos for Hyper-V

If you are running Windows Server 2012 R2, perform the following procedure to configure Kerberos to secure the storage. You do not have to perform this procedure for Windows Server 2016 because Kerberos is enabled automatically during failover cluster creation.

Before you begin

  • Join the hosts to the domain as described in Adding the Cluster and Hosts to a Domain.
  • Verify that you have configured a service account for delegation. For more information on enabling delegation, see the Microsoft documentation .

Procedure

  1. Log on to the web console by using one of the Controller VM IP addresses or by using the cluster virtual IP address.
  2. Click the gear icon in the main menu and select Kerberos Management from the Settings page.
    Figure. Enabling Kerberos

  3. Set the Kerberos Required option to enabled.
  4. In the Credentials field, type the logon name and password of the domain account that has the privileges to create and modify the virtual computer object representing the cluster in Active Directory. The credentials are required for enabling Kerberos.
    The logon name must be in the format DOMAIN\USERNAME . Nutanix does not store the credentials.
  5. Click Save .

Configuring the Hyper-V Computer Object by Using Kerberos

About this task

Perform the following procedure to complete the configuration of the Hyper-V Computer Object by using Kerberos and SMB signing (for enhanced security).
Note: Nutanix recommends that you configure Kerberos during a maintenance window to ensure cluster stability and prevent loss of storage access for user VMs.

Procedure

  1. Log on to Domain Controller and perform the following for each Hyper-V host computer object.
    1. Right-click the host object, and go to Properties . In the Delegation tab, select the Trust this computer for delegation to specified services only option, and select Use any authentication protocol .
    2. Click Add to add the cifs service of the Nutanix storage cluster object.
    Figure. Adding the cifs service of the Nutanix storage cluster object

  2. Check the Service Principal Name (SPN) of the Nutanix storage cluster object.
    > Setspn -l name_of_cluster_object

    Replace name_of_cluster_object with the name of the Nutanix storage cluster object.

    Output similar to the following is displayed.

    Figure. SPN Registration

    If the SPN is not registered for the Nutanix storage cluster object, create the SPN by running the following commands.

    > Setspn -S cifs/name_of_cluster_object name_of_cluster_object
    > Setspn -S cifs/FQDN_of_the_cluster_object name_of_cluster_object

    Replace name_of_cluster_object with the name of the Nutanix storage cluster object and FQDN_of_the_cluster_object with the domain name of the Nutanix storage cluster object.

    Example

    > Setspn -S cifs/virat virat
    > Setspn -S cifs/virat.sre.local virat
    
  3. [Optional] To enable SMB signing feature, log on to each Hyper-V host by using RDP and run the following PowerShell command to change the Require Security Signature setting to True .
    > Set-SMBClientConfiguration -RequireSecuritySignature $True -Force
    Caution: The SMB server communicates only with SMB clients that can perform SMB packet signing. Therefore, if you enable the SMB signing feature, you must enable it on all the Hyper-V hosts in the cluster.
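After completing these steps, you can verify the configuration from any Hyper-V host. The following sketch checks the SPN registration and the SMB signing setting; the cluster name virat is the same example used earlier in this procedure.

```powershell
# Confirm the SPNs registered for the Nutanix storage cluster object
Setspn -L virat

# Confirm the local SMB client signing requirement
Get-SmbClientConfiguration | Select-Object RequireSecuritySignature
```

If RequireSecuritySignature is True on any host, confirm that it is True on every host in the cluster, as described in the caution above.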

Disabling Kerberos for Hyper-V

Perform the following procedure to disable Kerberos.

Procedure

  1. Disable SMB signing.
    Log on to each Hyper-V host by using RDP and run the following PowerShell command to change the Require Security Signature setting to False .
    Set-SMBClientConfiguration -RequireSecuritySignature $False -Force
  2. Disable Kerberos from the Prism web console.
    1. Log on to the web console by using a Controller VM IP address or the cluster virtual IP address.
    2. From the gear icon, click Kerberos Management .
    3. Set the Kerberos Required option to disabled.
    4. In the Credentials field, type the logon name and password of the domain account that has the privileges to create and modify the virtual computer object representing the cluster in Active Directory. The credentials are required for disabling Kerberos.
      This logon name must be in the format DOMAIN\USERNAME . Nutanix does not store the credentials.
    5. Click Save .

Setting Up Hyper-V Manager

Perform the following steps to set up Hyper-V Manager.

Before you begin

  • Add the server running Hyper-V Manager to the allowlist by using the Prism user interface. For more information, see Configuring a Filesystem Whitelist in the Prism Web Console Guide .
  • If Kerberos is enabled for accessing storage (by default it is disabled), enable SMB delegation.

Procedure

  1. Log into the Hyper-V Manager.
  2. Right-click the Hyper-V Manager and select Connect to Server .
  3. Type the name of the host that you want to add and click OK .
  4. Right-click the host and select Hyper-V Settings .
  5. Click Virtual Hard Disks and verify that the location for storing virtual hard disk files is the same as the location that you specified during storage container creation.
    For more information, see Creating a Storage Container in the Prism Web Console Guide .
  6. Click Virtual Machines and verify that the location for storing virtual machine configuration files is the same as the location that you specified during storage container creation.
    For more information, see Creating a Storage Container in the Prism Web Console Guide .
    After performing these steps, you are ready to create and manage virtual machines by using Hyper-V Manager.
    Warning: Never define virtual machines created by using Hyper-V on storage that is specified by an IP-based SMB share location.

Cluster Management

Installing Windows Updates with Cluster-Aware Updating

With storage containers that are configured with a replication factor of 2, Nutanix clusters can tolerate only a single node being down at a time. For such clusters, you need a way to update the nodes one at a time.

If your Nutanix cluster runs Microsoft Hyper-V, you can use the Cluster-Aware Updating (CAU) utility, which ensures that only one node is down at a time when Windows updates are applied.

Note: Nutanix does not recommend performing a manual patch installation for a Hyper-V cluster running on the Nutanix platform.

The procedure for configuring CAU for a Hyper-V cluster running on the Nutanix platform is the same as that for a Hyper-V cluster running on any other platform. However, for a Hyper-V cluster running on Nutanix, you need to use a Nutanix pre-update script created specifically for Nutanix clusters. The pre-update script ensures that the CAU utility does not proceed to the next node until the Controller VM on the node that was updated is fully back up, preventing a condition in which multiple Controller VMs are down at the same time.

The CAU utility might not install all the recommended updates, and you might have to install some updates manually. For a complete list of recommended updates, see the following articles in the Microsoft documentation portal.

  • Recommended hotfixes, updates, and known solutions for Windows Server 2012 R2 Hyper-V environments
  • Recommended hotfixes and updates for Windows Server 2012 R2-based failover clusters

Revisit these articles periodically and install any updates that are added to the list.

Note: Ensure that the Nutanix Controller VM and the Hyper-V host are placed in maintenance mode before any maintenance or patch installation. For more information, see Placing the Controller VM and Hyper-V Host in Maintenance Mode.

Preparing to Configure Cluster-Aware Updating

Configure your environment to run the Nutanix pre-update script for Cluster-Aware Updating. The Nutanix pre-update script is named cau_preupdate.ps1 and is, by default, located on each Hyper-V host in C:\Program Files\Nutanix\Utils\ . To ensure smooth configuration, make sure you have everything you need before you begin to configure CAU.

Before you begin

  • Review the required and recommended Windows updates for your cluster.
  • See the Microsoft documentation for information about the Cluster-Aware Updating feature. In particular, see the requirements and best practices for Cluster-Aware Updating in the Microsoft documentation portal.
  • To enable the migration of virtual machines from one node to another, configure the virtual machines for high availability.

About this task

To configure your environment to run the Nutanix pre-update script, do the following:

Procedure

  1. If you plan to use self-updating mode, do the following:
    1. On each Hyper-V host and on the management workstation that you are using to configure CAU, create a directory such that the path to the directory and the directory name do not contain spaces (for example, C:\cau ).
      Note: The location of the directory must be the same on the hosts and the management workstation.
    2. From C:\Program Files\Nutanix\Utils\ on each host, copy the Nutanix pre-update file cau_preupdate.ps1 to the directory you created on the hosts and on the management workstation.

    A directory whose path does not contain spaces is necessary because Microsoft does not support the use of spaces in the PreUpdateScript field. The space in the default path ( C:\Program Files\Nutanix\Utils\ ) prevents the cluster from updating itself in the self-updating mode. However, that space does not cause issues if you update the cluster by using the remote-updating mode. If you plan to use only the remote-updating mode, you can use the pre-update script from its default location. If you plan to use the self-updating mode or both self-updating and remote-updating modes, use a directory whose path does not contain spaces.

  2. On each host, do the following.
    1. Unblock the script file.
      > powershell.exe Unblock-File -Path 'path-to-pre-update-script'

      Replace path-to-pre-update-script with the full path to the pre-update script (for example, C:\cau\cau_preupdate.ps1 ).

    2. Allow Windows PowerShell to run unsigned code.
      > powershell.exe Set-ExecutionPolicy RemoteSigned
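If the cluster has several hosts, steps 1 and 2 can be scripted from the management workstation by using PowerShell remoting. A sketch, assuming remoting is enabled on the hosts; the host names and the C:\cau path are examples:

```powershell
# Hypothetical host names - replace with your Hyper-V host names
$hyperVHosts = @('host1', 'host2', 'host3')

Invoke-Command -ComputerName $hyperVHosts -ScriptBlock {
    # Create a space-free directory for the pre-update script
    New-Item -Path 'C:\cau' -ItemType Directory -Force | Out-Null

    # Copy the Nutanix pre-update script from its default location
    Copy-Item 'C:\Program Files\Nutanix\Utils\cau_preupdate.ps1' 'C:\cau\cau_preupdate.ps1'

    # Unblock the script and allow unsigned local scripts to run
    Unblock-File -Path 'C:\cau\cau_preupdate.ps1'
    Set-ExecutionPolicy RemoteSigned -Force
}
```

Remember to also copy cau_preupdate.ps1 to the same C:\cau path on the management workstation itself.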

Accessing the Cluster-Aware Updating Dialog Box

You configure CAU by using the Cluster-Aware Updating dialog box.

About this task

To access the Cluster-Aware Updating dialog box, do the following:

Procedure

  1. Open Failover Cluster Manager and connect to your cluster.
  2. In the Configure section, click Cluster-Aware Updating .
    Figure. Cluster-Aware Updating Dialog Box. The Cluster-Aware Updating dialog box connects to a failover cluster and displays the nodes in the cluster, a last-update summary, logs of updates in progress, and links to CAU configuration options and wizards.

    The Cluster-Aware Updating dialog box appears. If the dialog box indicates that you are not connected to the cluster, in the Connect to a failover cluster field, enter the name of the cluster, and then click Connect .

Specifying the Nutanix Pre-Update Script in an Updating Run Profile

Specify the Nutanix pre-update script in an Updating Run and save the configuration to an Updating Run Profile in the XML format. This is a one-time task. The XML file contains the configuration for the cluster-update operation. You can reuse this file to drive cluster updates through both self-updating and remote-updating modes.

About this task

To specify the Nutanix pre-update script in an Updating Run Profile, do the following:

Procedure

  1. In the Cluster-Aware Updating dialog box, click Create or modify Updating Run Profile .
    You can see the current location of the XML file under the Updating Run profile to start from: field.
    Note: You cannot overwrite the default CAU configuration file, because non-local administrative users, including the AD administrative users, do not have permissions to modify files in the C:\Windows\System32\ directory.
  2. Click Save As .
  3. Select a new location for the file and rename the file. For example, you can rename the file to msfc_updating_run_profile.xml and save it to the following location: C:\Users\administrator\Documents .
  4. Click Save .
  5. In the Cluster-Aware Updating dialog box, under Cluster Actions , click Configure cluster self-updating options .
  6. Go to Input Settings > Advanced Options and, in the Updating Run options based on: field, click Browse to select the location to which you saved the XML file in an earlier step.
  7. In the Updating Run Profile Editor dialog box, in the PreUpdateScript field, specify the full path to the cau_preupdate.ps1 script. The default full path is C:\Program Files\Nutanix\Utils\cau_preupdate.ps1 . The default path is acceptable if you plan to use only the remote-updating mode. If you plan to use the self-updating mode, place cau_preupdate.ps1 in a directory such that the path does not include spaces. For more information, see Preparing to Configure Cluster-Aware Updating.
    Note: You can also place the script on the SMB file share if you can access the SMB file share from all your hosts and the workstation that you are configuring the CAU from.
  8. Click Save .
    Caution: Do not change the auto-populated ConfigurationName field value. Otherwise, the script fails.
    The CAU configuration is saved to an XML file in the following folder: C:\Windows\System32

What to do next

Save the Updating Run Profile to another location and use it for any other cluster updates.
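Before using the profile, you can quickly confirm that the pre-update script path it references exists and contains no spaces (a requirement for self-updating mode). A sketch, with example paths:

```powershell
# Example locations - adjust to where you saved the profile and script
$profilePath = 'C:\Users\administrator\Documents\msfc_updating_run_profile.xml'

# Show the PreUpdateScript entry stored in the Run Profile
Select-String -Path $profilePath -Pattern 'PreUpdateScript'

# Confirm the script exists at the space-free path
Test-Path 'C:\cau\cau_preupdate.ps1'
```

If the PreUpdateScript value shown by Select-String contains a space, move the script and update the profile as described in Preparing to Configure Cluster-Aware Updating.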

Updating a Cluster by Using the Remote-Updating Mode

You can update the cluster by using the remote-updating mode to verify that CAU is configured and working correctly. Even when you have configured the self-updating mode, you might still need the remote-updating mode, typically for updates that cannot wait until the next self-updating run.

About this task

Note: Do not turn off your workstation until all updates have been installed.
To update a cluster by using the remote-updating mode, do the following:

Procedure

  1. In the Cluster-Aware Updating dialog box, click Apply updates to this cluster .
    The Cluster-Aware Updating Wizard appears.
  2. Read the information on the Getting Started page, and then click Next .
  3. On the Advanced Options page, do the following.
    1. In the Updating Run options based on field, enter the full path to the CAU configuration file that you created in Specifying the Nutanix Pre-Update Script in an Updating Run Profile .
    2. Ensure that the full path to the downloaded script is shown in the PreUpdateScript field and that the value in the CauPluginName field is Microsoft.WindowsUpdatePlugin .
  4. On the Additional Update Options page, do the following.
    1. If you want to include recommended updates, select the Give me recommended updates the same way that I receive important updates check box.
    2. Click Next .
  5. On the Completion page, click Close .
    The update process begins.
  6. In the Cluster-Aware Updating dialog box, click the Log of Updates in Progress tab and monitor the update process.
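A remote-updating run can also be started from PowerShell instead of the wizard, by using the Invoke-CauRun cmdlet. A sketch; the cluster name and script path are examples, and the values must match your saved Run Profile:

```powershell
# Start a remote Updating Run against the failover cluster.
# msfc-cluster and C:\cau\cau_preupdate.ps1 are example values.
Invoke-CauRun -ClusterName msfc-cluster `
    -CauPluginName Microsoft.WindowsUpdatePlugin `
    -PreUpdateScript 'C:\cau\cau_preupdate.ps1' `
    -EnableFirewallRules
```

The cmdlet reports per-node results as the run progresses; as with the wizard, do not turn off the workstation until the run completes.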

Updating a Cluster by Using the Self-Updating Mode

The self-updating mode ensures that the cluster is up-to-date at all times.

About this task

To configure the self-updating mode, do the following:

Procedure

  1. In the Cluster-Aware Updating dialog box, click Configure cluster self-updating options .
    The Configure Self-Updating Options Wizard appears.
  2. Read the information on the Getting Started page, and then click Next .
  3. On the Add Clustered Role page, do the following.
    1. Select the Add the CAU clustered role, with self-updating mode enabled, to this cluster check box.
    2. If you have a prestaged computer account, select the I have a prestaged computer object for the CAU clustered role check box. Otherwise, leave the check box clear.
  4. On the Self-updating schedule page, specify details such as the self-updating frequency and start date.
  5. On the Advanced Options page, do the following.
    1. In the Updating Run options based on field, enter the full path to the CAU configuration file that you created in Specifying the Nutanix Pre-Update Script in an Updating Run Profile .
    2. Ensure that the full path to the Nutanix pre-update script is shown in the PreUpdateScript field and that the value in the CauPluginName field is Microsoft.WindowsUpdatePlugin .
  6. On the Additional Update Options page, do the following.
    1. If you want to include recommended updates, select the Give me recommended updates the same way that I receive important updates check box.
    2. Click Next .
  7. Click Close .
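The equivalent PowerShell configuration uses the Add-CauClusterRole cmdlet. A sketch; the cluster name, schedule, and script path are examples:

```powershell
# Add the CAU clustered role with self-updating enabled.
# Schedule here: the second Sunday of each month (example values).
Add-CauClusterRole -ClusterName msfc-cluster `
    -DaysOfWeek Sunday `
    -WeeksOfMonth 2 `
    -CauPluginName Microsoft.WindowsUpdatePlugin `
    -PreUpdateScript 'C:\cau\cau_preupdate.ps1' `
    -EnableFirewallRules
```

As in the wizard, ensure that the PreUpdateScript path contains no spaces.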

Moving a Hyper-V Cluster to a Different Domain

This topic describes the supported procedure for moving all the hosts on a Nutanix cluster running Hyper-V from one domain to another. For example, you might need to do this when you are ready to transition a test cluster to your production environment. Ensure that you merge all VM checkpoints before moving the VMs to another domain; VMs with multiple checkpoints fail to start in the new domain.

Before you begin

This method involves cluster downtime. Therefore, schedule a maintenance window to perform the following operations.

Procedure

  1. Note: If you are using System Center Virtual Machine Manager (SCVMM) to manage the cluster, remove the cluster from the SCVMM console. Right-click the cluster in the SCVMM console, and select Remove .
    Destroy the Hyper-V failover cluster by using Failover Cluster Manager or PowerShell.
    Note:

    • Remove all the roles from the cluster before destroying the cluster by doing either of the following:
      • Open Failover Cluster Manager, and select Roles from the left navigation pane. Select all the VMs, and select Remove .
      • Log on to any Hyper-V host with domain administrator user credentials and remove the roles with the PowerShell command Get-ClusterGroup | Remove-ClusterGroup -RemoveResources -Force .
    • Destroying the cluster permanently removes any non-VM roles. The VMs are not affected and remain visible in Hyper-V Manager.

    To destroy the cluster, do either of the following:
    • Open Failover Cluster Manager, right-click the cluster, and select More Actions > Destroy Cluster .
    • Log on to any Hyper-V host with domain administrator user credentials and remove the cluster with the PowerShell command Remove-Cluster -Force -CleanupAD , which ensures that all Active Directory objects (all hosts in the Nutanix cluster, the Hyper-V failover cluster object, and the Nutanix storage cluster object) and any corresponding entries are deleted.
  2. Log on to any controller VM in the cluster and remove the Nutanix cluster from the domain by using nCLI; ensure that you also specify the Active Directory administrator user name.
    nutanix@cvm$ ncli cluster unjoin-domain logon-name=domain\username
  3. Log on to each host as the domain administrator user and remove the domain security identifiers from the virtual machines.
    > $d = (Get-WMIObject Win32_ComputerSystem).Domain.Split(".")[0]
    > Get-VMConnectAccess | Where {$_.username.StartsWith("$d\")} | `
      Foreach {Revoke-VMConnectAccess -VMName * -UserName $_.UserName} 
  4. Caution:

    Ensure that all the user VMs are powered off before performing this step.
    Log on to any controller VM in the cluster and remove all hosts in the Nutanix cluster from the domain.
    nutanix@cvm$ allssh 'source /etc/profile > /dev/null 2>&1; winsh "\$x=hostname; netdom \
      remove \$x /domain /force"'
  5. Restart all hosts.
  6. If a controller VM fails to restart, use the Repair-CVM Nutanix PowerShell cmdlet to help you recover from this issue. Otherwise, skip this step and perform the next step.
    1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
    2. Start the controller VM repair process.
      > Repair-CVM
      The CVM will be shutdown. Proceed (Y/N)? Y

      Progress is displayed in the PowerShell command-line shell. When the process is complete, the controller VM configuration information is displayed:

      Using the following configuration:
      
      Name                           Value
      ----                           -----
      internal_adapter_name          Internal
      name                           cvm-host-name
      external_adapter_name          External
      processor_count                8
      memory_weight                  100
      svmboot_iso_path               C:\Program Files\Nutanix\Cvm\cvm_name\svmboot.iso
      nutanix_path                   C:\Program Files\Nutanix
      vm_repository                  C:\Users\Administrator\Virtual Machines
      internal_vswitch_name          InternalSwitch
      processor_weight               200
      external_vswitch_name          ExternalSwitch
      memory_size_bytes              12884901888
      pipe_name                      \\.\pipe\SVMPipe

What to do next

Add the hosts to the new domain as described in Adding the Cluster and Hosts to a Domain.
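Before adding the hosts to the new domain, you can confirm that each host actually left the old domain. A sketch run from any Controller VM, mirroring the allssh/winsh pattern used in step 4:

```powershell
# Run on each host via the CVM: reports whether the host is still domain-joined.
# (Invoked remotely as:
#   nutanix@cvm$ allssh 'source /etc/profile > /dev/null 2>&1; winsh "..."')
(Get-WMIObject Win32_ComputerSystem).PartOfDomain
```

Each host should report False; a host that still reports True did not unjoin cleanly and must be removed from the domain again before you proceed.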

Recover a Controller VM by Using Repair-CVM

The Repair-CVM PowerShell cmdlet can repair an unusable or deleted Controller VM by removing the existing Controller VM (if present) and creating a new one. In the Nutanix enterprise cloud platform design, no data associated with the unusable or deleted Controller VM is lost.

About this task

If a Controller VM already exists and is running, Repair-CVM prompts you to shut down the Controller VM so it can be deleted and re-created. If the Controller VM has been deleted, the cmdlet creates a new one. In all cases, the new CVM automatically powers on and joins the cluster.

A Controller VM is considered unusable when:

  • The Controller VM is accidentally deleted.
  • The Controller VM configuration is accidentally or unintentionally changed and the original configuration parameters are unavailable.
  • The Controller VM fails to restart after unjoining the cluster from a Hyper-V domain as part of a domain move procedure.

To use the cmdlet, log on to the Hyper-V host, type Repair-CVM, and follow any prompts. The repair process creates a new Controller VM based on any available existing configuration information. If the process cannot find the information or the information does not exist, the cmdlet prompts you for:

  • Controller VM name
  • Controller VM memory size in GB
  • Number of processors to assign to the Controller VM
Note: After running this command, you need to manually re-apply all the custom configuration that you have performed, for example increased RAM size.

Procedure

  1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  2. Start the controller VM repair process.
    > Repair-CVM
    The CVM will be shutdown. Proceed (Y/N)? Y

    Progress is displayed in the PowerShell command-line shell. When the process is complete, the controller VM configuration information is displayed:

    Using the following configuration:

    Name                 Value
    ----                           -----
    internal_adapter_name          Internal
    name                           cvm-host-name
    external_adapter_name          External
    processor_count                8
    memory_weight                  100
    svmboot_iso_path               C:\Program Files\Nutanix\Cvm\cvm_name\svmboot.iso
    nutanix_path                   C:\Program Files\Nutanix
    vm_repository                  C:\Users\Administrator\Virtual Machines
    internal_vswitch_name          InternalSwitch
    processor_weight               200
    external_vswitch_name          ExternalSwitch
    memory_size_bytes              12884901888
    pipe_name                      \\.\pipe\SVMPipe

Connect to a Controller VM by Using Connect-CVM

Nutanix installs Hyper-V utilities on each Hyper-V host for troubleshooting and Controller VM access. This procedure describes how to use Connect-CVM to launch the FreeRDP utility to access a Controller VM console when a secure shell (SSH) is not available or cannot be used.

About this task

FreeRDP launches when you run the Connect-CVM cmdlet.

Procedure

  1. Log on to a Hyper-V host in your environment and open a PowerShell command window.
  2. Start Connect-CVM.
    > Connect-CVM
  3. In the authentication dialog box, type the local administrator credentials and click OK .
  4. At the FreeRDP console window, log on to the Controller VM by using the Controller VM credentials.

Changing the Name of the Nutanix Storage Cluster

The name of the Nutanix storage cluster cannot be changed by using the web console.

About this task

To change the name of the Nutanix storage cluster, do the following:

Procedure

  1. Log on to the CVM with SSH.
  2. Unjoin the existing Nutanix storage cluster object from the domain.
    ncli> cluster unjoin-domain logon-name=domain\username
  3. Change the cluster name.
    ncli> cluster edit-params new-name=cluster_name

    Replace cluster_name with the new cluster name.

  4. Create a new AD object corresponding to the new storage cluster name.
    nutanix@cvm$ ncli cluster join-domain cluster-name=new_name domain=domain_name \
    external-ip-address=external_ip_address name-server-ip=dns_ip logon-name=domain\username
  5. Restart genesis on each Controller VM in the cluster.
    nutanix@cvm$ allssh 'genesis restart'
    A new entry for the cluster is created in \Windows\System32\drivers\etc\hosts on the Hyper-V hosts.

Changing the Nutanix Cluster External IP Address

About this task

To change the external IP address of the Nutanix cluster, do the following.

Procedure

  1. Log on to the Controller VM with SSH.
  2. Run the following command to change the cluster external IP address.
    nutanix@cvm$ ncli cluster edit-params external-ip-address external_ip_address
    Replace external_ip_address with the new Nutanix cluster external IP address.

Fast Clone a VM Based on Nutanix SMB Shares by using New-VMClone

This cmdlet fast-clones virtual machines that are based on Nutanix SMB shares. It provides options for creating one or more clones from a given virtual machine.

About this task

Run Get-Help New-VMClone -Full to get detailed help on using the cmdlet with all the options that are available.

Note: This cmdlet does not support creating clones of VMs that have Hyper-V checkpoints.

Procedure

Log on to the Hyper-V host with a Remote Desktop Connection and open a PowerShell command window.
  • The syntax to create single clone is as follows.
    > New-VMClone -VM vm_name -CloneName clone_name -ComputerName computer_name`
     -DestinationUncPath destination_unc_path -PowerOn`
    -Credential prism_credential common_parameters
  • The syntax to create multiple clones is as follows.
    > New-VMClone -VM vm_name -CloneNamePrefix  clone_name_prefix`
    -CloneNameSuffixBegin clone_name_suffix_begin -NCopies n_copies`
    -ComputerName computer_name -DestinationUncPath destination_unc_path -PowerOn`
    -Credential prism_credential -MaxConcurrency max_concurrency common_parameters
  • Replace vm_name with the name of the VM that you are cloning.
  • Replace clone_name with the name of the VM that you are creating.
  • Replace clone_name_prefix with the prefix that should be used for naming the clones.
  • Replace clone_name_suffix_begin with the starting number of the suffix.
  • Replace n_copies with the number of clones that you need to create.
  • Replace computer_name with the name of the computer on which you are creating the clone.
  • Replace destination_unc_path with the path on the Nutanix SMB share where the clone is to be stored.
  • Replace prism_credential with the credential for accessing Prism (the Nutanix management service).
  • Replace max_concurrency with the number of clones that you need to create in parallel.
  • Replace common_parameters with any additional parameters that you want to define. For example, -Verbose flag.

Change the Path of a VM Based on Nutanix SMB shares by using Set-VMPath

This cmdlet repairs the UNC paths in the metadata of VMs that are based on Nutanix SMB shares, and has the following two forms.

About this task

  • Replaces the specified IP address with the supplied DNS name for every occurrence of the IP address in the UNC paths in the VM metadata or configuration file.
  • Replaces the specified SMB server name with the supplied alternative in the UNC paths in the VM metadata, without considering case.
Note: The Set-VMPath cmdlet is not available in the 4.5 release. You can use this cmdlet in releases 4.5.1 and later.

Procedure

Log on to the Hyper-V host with a Remote Desktop Connection and open a PowerShell command window.
  • The syntax to change the IP address to DNS name is as follows.
    > Set-VMPath -VMId vm_id -IPAddress ip_address -DNSName dns_name common_parameters
  • The syntax to change the SMB server name is as follows.
    > Set-VMPath -VMId vm_id -SmbServerName smb_server_name`
    -ReplacementSmbServerName replacement_smb_server_name common_parameters
  • Replace vm_id with the ID of the VM.
  • Replace ip_address with the IP address that you want to replace in the VM metadata or configuration file.
  • Replace dns_name with the DNS name that you want to replace the IP address with.
  • Replace smb_server_name with the SMB server name that you want to replace.
  • Replace replacement_smb_server_name with the SMB server name that you want as a replacement.
  • Replace common_parameters with any additional parameters that you want to define. For example, -Verbose flag.
Note: The target VM must be powered off for the operation to complete.
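For example, a hypothetical invocation that replaces an IP-based path with a DNS name in the metadata of a powered-off VM (the VM name, IP address, and DNS name are illustrative):

```powershell
# Get-VM is the standard Hyper-V cmdlet; its VMId property supplies the VM ID.
$vm = Get-VM -Name app-vm

# Rewrite every occurrence of the IP address in the VM's UNC paths.
Set-VMPath -VMId $vm.VMId -IPAddress 10.0.0.50 -DNSName virat.sre.local -Verbose
```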

Nutanix SMB Shares Connection Requirements from Outside the Cluster

Any external non-Nutanix host that needs to access Nutanix SMB shares must conform to the following requirements.

  • Any external non-Nutanix host that needs to access Nutanix SMB shares must run Windows 8 or later if it is a desktop client, and Windows Server 2012 or later if it is running Windows Server, because SMB 3.0 support is required for accessing Nutanix SMB shares.
  • The IP address of the host must be allowed in the Nutanix storage cluster.
    Note: The SCVMM host IP address is automatically included in the allowlist during the setup. For other IP addresses, you can add those source addresses to the allowlist after the setup configuration is completed by using the Web Console or the nCLI cluster add-to-nfs-whitelist command.
  • For accessing a Nutanix SMB share from Windows 10 or Windows Server 2016, you must enable Kerberos on the Nutanix cluster.
  • If Kerberos is not enabled in the Nutanix storage cluster (the default configuration), the SMB client in the host must not have RequireSecuritySignature set to True. You can verify this setting by running Get-SmbClientConfiguration on the host. For more information about checking the policy, see System Center Virtual Machine Manager Configuration . If the SMB client runs on a Windows desktop instead of Windows Server, the account used to log on to the desktop must not be linked to an external Microsoft account.
  • If Kerberos is enabled in the Nutanix storage cluster, you can access the storage only by using the DNS name of the Nutanix storage cluster, and not by using the external IP address of the cluster.
Warning: Nutanix does not support using SMB shares of Hyper-V for storing anything other than virtual machine disks (for example, VHD and VHDX files) and their associated configuration files. This includes, but is not limited to, using Nutanix SMB shares of Hyper-V for general file sharing, virtual machine and configuration files for VMs running outside the Nutanix nodes, or any other type of hosted repository not based on virtual machine disks.
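From an external Windows host, the following sketch checks the client-side requirements described above (the signing setting and the negotiated SMB dialect):

```powershell
# Must report False when Kerberos is not enabled on the Nutanix cluster
Get-SmbClientConfiguration | Select-Object RequireSecuritySignature

# After connecting to the share, the dialect should be 3.0 or later
Get-SmbConnection | Select-Object ServerName, ShareName, Dialect
```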

Updating the Cluster After Renaming the Hyper-V External Virtual Switch

About this task

You can rename the external virtual switch on your Hyper-V cluster to a name of your choice. After you rename the external virtual switch, you must update the new name in AOS so that AOS upgrades and VM migrations do not fail.

Note: In releases earlier than AOS 5.11, the name of the external virtual switch in your Hyper-V cluster must be ExternalSwitch .

See the Microsoft documentation for instructions about how to rename the external virtual switch.

Perform the following steps after you rename the external virtual switch.

Procedure

  1. Log on to a CVM with SSH.
  2. Restart Genesis on all the CVMs in the cluster.
    nutanix@cvm$ genesis restart
  3. Refresh all the guest VMs.
    1. Log on to a Hyper-V host.
    2. Go to Hyper-V Manager, select the VM and, in Settings , click the Refresh icon.
    See the Microsoft documentation for the updated instructions about how to refresh the guest VMs.

Upgrade to Windows Server Version 2016 and 2019

The following procedures describe how to upgrade earlier releases of Windows Server to Windows Server 2016 and 2019. For information about fresh installation of Windows Server, see Hyper-V Configuration.
Note: If you are upgrading from Windows Server 2012 R2 and if the AOS version is less than 5.11, then upgrade to Windows Server 2016 first and then upgrade to AOS 5.17. Proceed with upgrading to Windows Server 2019 if necessary.

Hyper-V Hypervisor Upgrade Recommendations and Requirements

This section provides the requirements, recommendations, and limitations to upgrade Hyper-V.

Requirements

Note:
  • Starting with Hyper-V 2019, if you do not choose LACP/LAG, SET is the default teaming mode. NX Series G5 and later models support Hyper-V 2019.
  • For Hyper-V 2016, if you do not choose LACP/LAG, the teaming mode is Switch Independent LBFO teaming.
  • For Hyper-V (2016 and 2019), if you choose LACP/LAG, the teaming mode is Switch Dependent LBFO teaming.
  • The platform must not be a light-compute platform.
  • Before upgrading, disable or uninstall third-party antivirus or security filter drivers that modify Windows firewall rules. Windows firewalls must accept inbound and outbound SSH traffic outside of the domain rules.
  • Enable Kerberos when upgrading from Windows Server 2012 R2 to Windows Server 2016. For more information, see Enabling Kerberos for Hyper-V .
    Note: Kerberos is enabled by default when upgrading from Windows Server 2016 to Windows Server 2019.
  • Enable virtual machine migration on the host. Upgrading reimages the hypervisor. Any custom or non-standard hypervisor configurations could be lost after the upgrade is completed.
  • If you are using System Center Virtual Machine Manager (SCVMM) 2012, upgrade to SCVMM 2016 before upgrading to Hyper-V 2016. Similarly, upgrade to SCVMM 2019 before upgrading to Hyper-V 2019.
  • Upgrade using ISOs and the Nutanix JSON file:
    • The Prism web console supports 1-click upgrade ( Upgrade Software dialog box) of Hyper-V 2016 or 2019 by using a metadata upgrade JSON file. This file is available on the Hypervisor Details page of the Nutanix Support portal, together with the Microsoft Hyper-V ISO file.
    • The Hyper-V upgrade JSON file, when used on clusters where Foundation 4.0 or later is installed, is available for Nutanix NX series G4 and later, Dell EMC XC series, and Lenovo HX series platforms. You can upgrade hosts on these platforms to Hyper-V 2016 or Hyper-V 2019 (except on NX series G4 and Lenovo HX series platforms) by using this JSON file.
      Note: Lenovo HX series platforms do not support upgrades to Hyper-V 2019 or later versions.

Limitations

  • When upgrading hosts to Hyper-V 2016, 2019, and later versions, the local administrator user name and password are reset to the default administrator name Administrator and password nutanix/4u. Any previous changes to the administrator name or password are overwritten.
  • VMs with any associated files on local storage are lost.
    • Logical networks are not restored immediately after upgrade. If you configure logical switches, the configuration is not retained and VMs could become unavailable.
    • Any VMs created during the hypervisor upgrade (including as part of disaster recovery operations) and not marked as HA (High Availability) experience unavailability.
    • Disaster recovery: VMs with the Automatic Stop Action property set to Save are marked as CBR Not Capable if they are upgraded to version 8.0 after upgrading the hypervisor. Change the value of Automatic Stop Action to ShutDown or TurnOff when the VM is upgraded so that it is not marked as CBR Not Capable.
  • Enabling Link Aggregation Control Protocol (LACP) for your cluster deployment is not supported when upgrading hypervisor hosts from Windows Server 2012 R2 to 2016. Preupgrade hypervisor host configuration checks fail in this case.
    Note: Enabling LACP is supported on AOS 5.10.10 when upgrading from Windows Server 2012 R2 to 2016. From AOS 5.17 and later, it is enabled for upgrades from Windows Server 2016 to 2019.

Recommendations

Nutanix recommends that you schedule a sufficiently long maintenance window to upgrade your Hyper-V clusters.

  • Budget sufficient time to upgrade: Depending on the number of VMs running on a node before the upgrade, a node could take more than 1.5 hours to upgrade. The total time to upgrade a Hyper-V cluster from Hyper-V 2012 R2 to Hyper-V 2016 is approximately the time per node multiplied by the number of nodes.
    Upgrading from Windows Server 2012 R2 to Windows Server 2019 can take longer, considering the following total time:
    1. Upgrading Windows Server 2012 R2 to Windows Server 2016
    2. Upgrading from AOS 5.16 to AOS 5.17
    3. Upgrading from Windows Server 2016 to Windows Server 2019
    During the upgrade process, Nutanix performs the following operations:
    • Each node in the cluster is reimaged by using Foundation (duration per node: approximately 45 minutes).
    • Each node is restarted once after the reimaging is complete.
    • The VMs are migrated out of the node that is being upgraded (duration: depends on the number of VMs and workloads running on that node).
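
The per-node guidance above can be turned into a rough maintenance-window estimate. The following is a sketch with assumed figures (four nodes, 90 minutes per node to cover reimaging, the restart, and VM migrations), not a Nutanix tool; adjust the numbers for your cluster.

```shell
# Rough maintenance-window estimate for a Hyper-V cluster upgrade.
# Assumed figures: 4 nodes, ~90 minutes per node (45-minute reimage
# plus restart and VM migration headroom).
NODES=4
MINUTES_PER_NODE=90
TOTAL_MINUTES=$((NODES * MINUTES_PER_NODE))
echo "Estimated window: ${TOTAL_MINUTES} minutes ($((TOTAL_MINUTES / 60)) hours)"
```

Remember that an upgrade path through an intermediate release (2012 R2 to 2016, then 2016 to 2019) roughly doubles this figure.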

Upgrading to Windows Server Version 2016 and 2019

About this task

Note:
  • It is possible that clusters running Windows Server 2012 R2 and AOS have time synchronization issues. Therefore, before you upgrade to Windows Server 2016 or Windows Server 2019 and AOS, make sure that the cluster is free from time synchronization issues.
  • Windows Server 2016 also implements Discrete Device Assignment (DDA) for passing through PCI Express devices to guest VMs. This feature is available in Windows Server 2019 too. Therefore, DiskMonitorService, which was used in earlier AOS releases for passing disks through to the CVM, no longer exists. For more information about DDA, see the Microsoft documentation.

Procedure

  1. Make sure that AOS, host, and hypervisor upgrade prerequisites are met.
    See Hyper-V Hypervisor Upgrade Recommendations and Requirements and the Acropolis Upgrade Guide.
  2. Upgrade AOS by either using the one-click upgrade procedure or uploading the installation files manually. You can perform both procedures from the Prism web console.
    • After upgrading AOS and before upgrading your cluster hypervisor, perform a Life Cycle Manager (LCM) inventory, update LCM, and upgrade any recommended firmware. See the Life Cycle Manager documentation for more information.
    • See the Acropolis Upgrade Guide for more details, including recommended installation or upgrade order.
  3. Do one of the following if you want to manage your VMs with SCVMM:
    1. If you register the Hyper-V cluster with an SCVMM installation with a version that is earlier to 2016, do the following in any order:
      • Unregister the cluster from SCVMM.
      • Upgrade SCVMM to version 2016. See Microsoft documentation for this upgrade procedure.
        Note: Similarly, do the same when upgrading from Hyper-V 2016 to 2019. Upgrade SCVMM to version 2019 and register the cluster to SCVMM 2019.
    2. If you do not have SCVMM, deploy SCVMM 2016 / 2019. See Microsoft documentation for this installation procedure.
    Regardless of whether you deploy a new instance of SCVMM 2016 or you upgrade an existing SCVMM installation, do not register the Hyper-V cluster with SCVMM now. To minimize the steps in the overall upgrade workflow, register the cluster with SCVMM 2016 after you upgrade the Hyper-V hosts.
  4. If you are upgrading from Windows Server 2012 R2 to Windows Server 2016, then enable Kerberos. See Enabling Kerberos for Hyper-V.
  5. Upgrade the Hyper-V hosts.
  6. After the cluster is up, add the cluster to SCVMM 2016. The procedure for adding the cluster to SCVMM 2016 is the procedure used for earlier versions of SCVMM. See Registering a Cluster with SCVMM.
  7. Any log redirection (for example, SCOM log redirection) configurations are lost during the hypervisor upgrade process. Reconfigure log redirection.

System Center Virtual Machine Manager Configuration

System Center Virtual Machine Manager (SCVMM) is a management platform for Hyper-V clusters. Nutanix provides a utility for joining Hyper-V hosts to a domain and adding Hyper-V hosts and storage to SCVMM. If you cannot or do not want to use this utility, you must join the hosts to the domain and add the hosts and storage to SCVMM manually.

Note: The Validate Cluster feature of the Microsoft System Center VM Manager (SCVMM) is not supported for Nutanix clusters managed by SCVMM.

SCVMM Configuration

After joining the cluster and its constituent hosts to the domain and creating a failover cluster, you can configure SCVMM.

Registering a Cluster with SCVMM

Perform the following procedure to register a cluster with SCVMM.

Before you begin

  • Join the hosts in the Nutanix cluster to a domain manually or by following Adding the Cluster and Hosts to a Domain.
  • Make sure that the hosts are not registered with SCVMM.

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Verify that the status of all services on all the CVMs are Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM: <host IP-Address> Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
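
Scanning this output by hand is error-prone on a large cluster. As a sketch, you can filter for any service whose state column is not UP; the here-document below stands in for real output so the example is self-contained, and on a live CVM you would pipe `cluster status` instead.

```shell
# Flag any CVM service whose state column is not UP.
# The sample function mimics a fragment of `cluster status` output.
cluster_status_sample() {
cat <<'EOF'
Zeus   UP       [9935, 9980]
Stargate   DOWN   []
Prism   UP       [25718]
EOF
}
cluster_status_sample | awk '$2 != "UP" {print $1 " is " $2}'
```

If the filter prints nothing, all sampled services are Up and you can proceed to the next step.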
  3. Add the Nutanix hosts and storage to SCVMM.
    nutanix@cvm$ setup_hyperv.py setup_scvmm

    This script performs the following functions.

    • Adds the cluster to SCVMM.
    • Sets up the library share in SCVMM.
    • Unregisters the deleted storage containers from SCVMM.
    • Registers the new storage containers in SCVMM.

    Alternatively, you can specify all the parameters as given in the following steps as command-line arguments. If you do so, enclose the values in single quotation marks since the Controller VM shell does not otherwise correctly interpret the backslash (\).

    The utility prompts for the necessary parameters, for example:

    Getting the cluster configuration ... Done
    Getting information about each host ... Done
    The hosts are joined to domain hyperv.nutanix.com
    
    Please enter the domain account username that has local administrator rights on
    the hosts: hyperv.nutanix.com\Administrator
    Please enter the password for hyperv.nutanix.com\Administrator:
    Verifying credentials for accessing localhost ... Done
    
    Please enter the name of the SCVMM server: scvmmhyperv
    Getting the SCVMM server IP address ... 10.4.34.44
    Adding 10.4.34.44 to the IP address whitelist ... Done
    
    Please enter the domain account username (e.g. username@corp.contoso.com or
     CORP.CONTOSO.COM\username) that has administrator rights on the SCVMM server
    and is a member of the domain administrators group (press ENTER for hyperv.nutanix.com\Administrator):
    Verifying credentials for accessing scvmmhyperv ... Done
    
    Verifying SCVMM service account ... HYPERV\scvmm
    
    All nodes are already part of the Hyper-V failover cluster msfo-tulip.
    Preparing to join the Nutanix storage cluster to domain ... Already joined
    Creating an SCVMM run-as account ... hyperv-Administrator
    Verifying the DNS entry tulip.hyperv.nutanix.com -> 10.4.36.191 ... Done
    Verifying that the Hyper-V failover cluster IP address has been added to DNS ... 10.4.36.192
    Verifying SCVMM security settings ... Done
    Initiating adding the failover cluster to SCVMM ... Done
    Step 2 of adding the failover cluster to SCVMM ... Done
    Final step of adding the failover cluster to SCVMM ... Done
    Querying registered Nutanix library shares ... None
    Add a Nutanix share to the SCVMM library for storing VM templates, useful for deploying VMs using Fast File Copy ([Y]/N)? Y
    Querying the registered library servers ... Done
    Using library server scvmmhyperv.hyperv.nutanix.com.
    Please enter the name of the Nutanix library share to be created (press ENTER for "msfo-tulip-library"): 
    Creating container msfo-tulip-library ... Done
    Registering msfo-tulip-library as a library share with server scvmmhyperv.hyperv.nutanix.com in SCVMM ... Done
    Please enter the Prism password: 
    Registering the SMI-S provider with SCVMM ... Done
    Configuring storage in SCVMM ... Done
    Registered default-container-11962
    
    1. Type the domain account username and password.
      This username must include the fully-qualified domain name, for example hyperv.nutanix.com\Administrator .
    2. Type the SCVMM server name.
      The name must resolve to an IP address.
    3. Type the SCVMM username and password if they are different from the domain account; otherwise press Enter to use the domain account.
    4. Choose whether to create a library share.
      Add a Nutanix share to the SCVMM library for storing VM templates, useful for
       deploying VMs using Fast File Copy ([Y]/N)?

      If you choose to create a library share, output similar to the following is displayed.

      Querying the registered library servers ... Done
      Add a Nutanix share to the SCVMM library for storing VM templates, useful for deploying VMs using Fast File Copy ([Y]/N)? Y
      Querying the registered library servers ... Done
      Using library server scvmmhyperv.hyperv.nutanix.com.
      Please enter the name of the Nutanix library share to be created (press ENTER
       for "NTNX-HV-library"):
      Creating container NTNX-HV-library ... Done
      Registering NTNX-HV-library as a library share with server scvmmhyperv.hyperv.nutanix.com ... Done
      
      Finally the following output is displayed.
      Registering the SMI-S provider with SCVMM ... Done
      Configuring storage in SCVMM ... Done
      Registered share ctr1
      
      Setup complete.
    Note: You can also register a Nutanix cluster by using SCVMM. For more information, see Adding Hosts and Storage to SCVMM Manually (SCVMM User Interface).
    Warning: If you change the Prism password, you must also change the Prism Run As account in SCVMM.

Adding Hosts and Storage to SCVMM Manually (SCVMM User Interface)

If you are unable to add hosts and storage to SCVMM by using the utility provided by Nutanix, you can add the hosts and storage to SCVMM by using the SCVMM user interface.

Before you begin

  • Verify that the SCVMM server IP address is on the cluster allowlist.
  • Verify that the SCVMM library server has a run-as account specified. Right-click the library server, click Properties , and ensure that Library management credential is populated.

Procedure

  1. Log into the SCVMM user interface and click VMs and Services .
  2. Right-click All Hosts and select Add Hyper-V Hosts and Clusters , and click Next .
    The Specify the Credentials to use for discovery screen appears.
  3. Click Browse and select an existing Run As Account or create a new Run As Account by clicking Create Run As Account . Click OK and then click Next .
    The Specify the search scope for virtual machine host candidates screen appears.
  4. Type the failover cluster name in the Computer names text box, and click Next .
  5. Select the failover cluster that you want to add, and click Next .
  6. Select Reassociate this host with this VMM environment check box, and click Next .
    The Confirm the settings screen appears.
  7. Click Finish .
    Warning: If you are adding the cluster for the first time, the addition action fails with the following error message.
    Error (10400)
    Before Virtual Machine Manager can perform the current operation, the virtualization server must be restarted.

    Remove the cluster that you were adding and perform the same procedure again.

  8. Register a Nutanix SMB share as a library share in SCVMM by clicking Library and then adding the Nutanix SMB share.
    1. Right-click the Library Servers and click Add Library Shares .
    2. Click Add Unmanaged Share and type the SMB file share path, click OK , and click Next .
    3. Click Add Library Shares .
      If all the parameters are correct, the library share is added.
  9. Register the Nutanix SMI-S provider.
    1. Go to Settings > Security > Run As Accounts and click Create Run As Account .
    2. Enter the Prism user name and password, de-select Validate domain credentials , and click Finish .
      Note:

      Only local Prism accounts are supported. Even if AD authentication is configured in Prism, the SMI-S provider cannot use it for authentication.

    3. Go to Fabric > Storage > Providers .
    4. Right-click Providers and select Add Storage Devices .
    5. Select the SAN and NAS devices discovered and managed by a SMI-S provider check box, and click Next .
    6. Specify the protocol and address of the storage SMI-S provider.
      • In the Protocol drop-down menu, select SMI-S CIMXML .
      • In the Provider IP Address or FQDN text box, provide the Nutanix storage cluster name. For example, clus-smb .
        Note: The Nutanix storage cluster name is not the same as the Hyper-V cluster name. You should get the storage cluster name from the cluster details in the web console.
      • Select the Use Secure sockets layer SSL connection check box.
      • In the Run As Account field, click Browse and select the Prism Run As Account that you have created earlier, and click Next .
      Note: If you encounter the following error when attempting to add an SMI-S provider, see KB 5070:
      Could not retrieve a certificate from the <clustername> server because of the error:
      The request was aborted: Could not create SSL/TLS secure channel.
    7. Click Import to verify the identity of the storage provider.
      The discovery process starts and at the completion of the process, the storage is displayed.
    8. Click Next and select all the SMB shares exported by the Nutanix cluster except the library share and click Next .
    9. Click Finish .
      The newly added provider is displayed under Providers. Go to Storage > File Clusters to verify that the Managed column is Yes .
  10. Add the file shares to the Nutanix cluster by navigating to VMs and Services .
    1. Right-click the cluster name and select Properties .
    2. Go to File Share Storage , and click Add to add file shares to the cluster.
    3. From the File share path drop-down menu, select all the shares that you want to add, and click OK .
    4. Right-click the cluster and click Refresh . Wait for the refresh job to finish.
    5. Right-click the cluster name and select Properties > File Share Storage . You should see the access status with a green check mark, which means that the shares are successfully added.
    6. Select all the virtual machines in the cluster, right-click, and select Refresh .

SCVMM Operations

You can perform operational procedures, such as placing a host in maintenance mode, on a Hyper-V node by using SCVMM.

Placing a Host in Maintenance Mode

If you place a host that is managed by SCVMM in maintenance mode, by default the Controller VM running on the host is placed in a saved state, which might create issues. Perform the following procedure to properly place a host in maintenance mode.

Procedure

  1. Log into the Controller VM of the host that you are planning to place in maintenance mode by using SSH and shut down the Controller VM.
    nutanix@cvm$ cvm_shutdown -P now

    Wait for the Controller VM to completely shut down.

  2. Select the host and place it in the maintenance mode by navigating to the Host tab in the Host group and clicking Start Maintenance Mode .
    Wait for the operation to complete before performing any maintenance activity on the host.
  3. After the maintenance activity is completed, bring out the host from the maintenance mode by navigating to the Host tab in the Host group and clicking Stop Maintenance Mode .
  4. Start the Controller VM manually.

Migration Guide

AOS 5.20

Product Release Date: 2021-05-17

Last updated: 2022-07-14

This Document Has Been Removed

Nutanix Move is the Nutanix-recommended tool for migrating a VM. Please see the Move documentation at the Nutanix Support portal.


vSphere Administration Guide for Acropolis

AOS 5.20

Product Release Date: 2021-05-17

Last updated: 2022-08-04

Overview

Nutanix Enterprise Cloud delivers a resilient, web-scale hyperconverged infrastructure (HCI) solution built for supporting your virtual and hybrid cloud environments. The Nutanix architecture runs a storage controller called the Nutanix Controller VM (CVM) on every Nutanix node in a cluster to form a highly distributed, shared-nothing infrastructure.

All CVMs work together to aggregate storage resources into a single global pool that guest VMs running on the Nutanix nodes can consume. The Nutanix Distributed Storage Fabric manages storage resources to preserve data and system integrity if there is node, disk, application, or hypervisor software failure in a cluster. Nutanix storage also enables data protection and High Availability that keep critical data and guest VMs protected.

This guide describes the procedures and settings required to deploy a Nutanix cluster running in the VMware vSphere environment. To know more about the VMware terms referred to in this document, see the VMware Documentation.

Hardware Configuration

See the Field Installation Guide for information about how to deploy and create a Nutanix cluster running ESXi for your hardware. After you create the Nutanix cluster by using Foundation, use this guide to perform the management tasks.

Limitations

For information about ESXi configuration limitations, see Nutanix Configuration Maximums webpage.

Nutanix Software Configuration

The Nutanix Distributed Storage Fabric aggregates local SSD and HDD storage resources into a single global unit called a storage pool. In this storage pool, you can create several storage containers, which the system presents to the hypervisor and uses to host VMs. You can apply a different set of compression, deduplication, and replication factor policies to each storage container.

Storage Pools

A storage pool on Nutanix is a group of physical disks from one or more tiers. Nutanix recommends configuring only one storage pool for each Nutanix cluster.

Replication factor
Nutanix supports a replication factor of 2 or 3. Setting the replication factor to 3 instead of 2 adds an extra data protection layer at the cost of more storage space for the copy. For use cases where applications provide their own data protection or high availability, you can set a replication factor of 1 on a storage container.
Containers
The Nutanix storage fabric presents usable storage to the vSphere environment as an NFS datastore. The replication factor of a storage container determines its usable capacity and resiliency: replication factor 2 tolerates one component failure, and replication factor 3 tolerates two component failures. When you create a Nutanix cluster, three storage containers are created by default. Nutanix recommends that you do not delete these storage containers. You can rename the storage container named default-xxx and use it as the main storage container for hosting VM data.
Note: The available capacity and the vSphere maximum of 2,048 VMs limit the number of VMs a datastore can host.

Capacity Optimization

  • Nutanix recommends enabling inline compression unless otherwise advised.
  • Nutanix recommends disabling deduplication for all workloads except VDI.

    For mixed-workload Nutanix clusters, create a separate storage container for VDI workloads and enable deduplication on that storage container.

Nutanix CVM Settings

CPU
Keep the default settings as configured by the Foundation during the hardware configuration.

Change the CPU settings only if Nutanix Support recommends it.

Memory
Most workloads use less than 32 GB RAM memory per CVM. However, for mission-critical workloads with large working sets, Nutanix recommends more than 32 GB CVM RAM memory.
Tip: You can increase CVM RAM memory up to 64 GB using the Prism one-click memory upgrade procedure. For more information, see Increasing the Controller VM Memory Size in the Prism Web Console Guide .
Networking
The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1500 bytes for all network interfaces by default. The standard 1500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on a network interface of CVMs to higher values.
Caution: Do not use jumbo frames for the Nutanix CVM.
Caution: Do not change the vSwitchNutanix or the internal vmk (VMkernel) interface.
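
A quick way to confirm the CVM interface MTU is to inspect the interface configuration and compare it against the supported 1500-byte value. The sketch below parses an `ip link`-style line, shown here as a sample string so the check is self-contained; on a CVM you would feed it the real output for the interface in question.

```shell
# Check an interface MTU against the supported 1500-byte value.
# The argument is a line in `ip link show` format.
check_mtu() {
  local link_line="$1"
  local mtu
  mtu=$(echo "$link_line" | sed -n 's/.*mtu \([0-9]*\).*/\1/p')
  if [ "$mtu" = "1500" ]; then
    echo "MTU OK"
  else
    echo "MTU $mtu unsupported"
  fi
}
# Sample line standing in for: ip link show eth0
check_mtu "2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state UP"
```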

Nutanix Cluster Settings

Nutanix recommends that you do the following.

  • Map a Nutanix cluster to only one vCenter Server.

    Due to the way the Nutanix architecture distributes data, there is limited support for mapping a Nutanix cluster to multiple vCenter Servers. Some Nutanix products (Move, Era, Calm, Files, Prism Central) and features (the disaster recovery solution) are unstable when a Nutanix cluster maps to multiple vCenter Servers.

  • Configure a Nutanix cluster with replication factor 2 or replication factor 3.
    Tip: Nutanix recommends using replication factor 3 for clusters with more than 16 nodes. Replication factor 3 requires at least five nodes so that the data remains online even if two nodes fail concurrently.
  • Use the advertised capacity feature to ensure that the resiliency capacity is equivalent to one node of usable storage for replication factor 2 or two nodes for replication factor 3.

    The advertised capacity of a storage container must equal the total usable cluster space minus the capacity of either one or two nodes. For example, in a 4-node cluster with 20 TB usable space per node with replication factor 2, the advertised capacity of the storage container must be 60 TB. That spares 20 TB capacity to sustain and rebuild one node for self-healing. Similarly, in a 5-node cluster with 20 TB usable space per node with replication factor 3, advertised capacity of the storage container must be 60 TB. That spares 40 TB capacity to sustain and rebuild two nodes for self-healing.

  • Use the default storage container and mount it on all the ESXi hosts in the Nutanix cluster.

    You can also create a single storage container. If you are creating multiple storage containers, ensure that all the storage containers follow the advertised capacity recommendation.

  • Configure the vSphere cluster according to settings listed in vSphere Cluster Settings Checklist.
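
The advertised-capacity rule described above (reserve one node of usable capacity for replication factor 2, two nodes for replication factor 3) can be sketched as a small shell function. This is an illustration of the arithmetic, not a Nutanix utility.

```shell
# Advertised capacity = (nodes - reserved nodes) * usable TB per node,
# where one node is reserved for RF2 and two nodes for RF3.
advertised_capacity() {
  local nodes=$1 tb_per_node=$2 rf=$3
  local reserve=$(( rf - 1 ))   # 1 node for RF2, 2 nodes for RF3
  echo $(( (nodes - reserve) * tb_per_node ))
}
advertised_capacity 4 20 2   # 4-node RF2 cluster, 20 TB/node -> 60 (TB)
advertised_capacity 5 20 3   # 5-node RF3 cluster, 20 TB/node -> 60 (TB)
```

Both calls reproduce the 60 TB figures from the examples above, leaving 20 TB and 40 TB respectively for self-healing rebuilds.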

Software Acceptance Level

Foundation sets the software acceptance level of an ESXi image to CommunitySupported by default. If you must raise the software acceptance level, run the following command to set it to the maximum supported level of PartnerSupported .

root@esxi# esxcli software acceptance set --level=PartnerSupported

Scratch Partition Settings

ESXi uses the scratch partition (/scratch) to dump the logs when it encounters a purple screen of death (PSOD) or a kernel dump. The Foundation install automatically creates this partition on the SATA DOM or M.2 device with the ESXi installation. Moving the scratch partition to any location other than the SATA DOM or M.2 device can cause issues with LCM, 1-click hypervisor updates, and the general stability of the Nutanix node.

vSphere Networking

vSphere on the Nutanix platform enables you to dynamically configure, balance, or share logical networking components across various traffic types. To ensure availability, scalability, performance, management, and security of your infrastructure, configure virtual networking when designing a network solution for Nutanix clusters.

You can configure networks according to your requirements. For detailed information about vSphere virtual networking and different networking strategies, see the Nutanix vSphere storage solution document. This chapter describes the configuration elements required to run VMware vSphere on the Nutanix Enterprise infrastructure.

Virtual Networking Configuration Options

vSphere on Nutanix supports the following types of virtual switches.

vSphere Standard Switch (vSwitch)
vSphere Standard Switch (vSS) with Nutanix vSwitch is the default configuration for Nutanix deployments and suits most use cases. A vSwitch detects which VMs are connected to each virtual port and uses that information to forward traffic to the correct VMs. You can connect a vSwitch to physical switches by using physical Ethernet adapters (also referred to as uplink adapters) to join virtual networks with physical networks. This type of connection is similar to connecting physical switches together to create a larger network.
Tip: A vSwitch works like a physical Ethernet switch.
vSphere Distributed Switch (vDS)

Nutanix recommends vSphere Distributed Switch (vDS) coupled with network I/O control (NIOC version 2) and load-based teaming. This combination provides simplicity, ensures traffic prioritization if there is contention, and reduces operational management overhead. A vDS acts as a single virtual switch across all associated hosts on a datacenter. It enables VMs to maintain consistent network configuration as they migrate across multiple hosts. For more information about vDS, see NSX-T Support on Nutanix Platform.

Nutanix recommends setting all vNICs as active on the port group and dvPortGroup unless otherwise specified. The following table lists the naming convention, port groups, and the corresponding VLAN Nutanix uses for various traffic types.

Table 1. Port Groups and Corresponding VLAN
Port group VLAN Description
MGMT_10 10 VM kernel interface for host management traffic
VMOT_20 20 VM kernel interface for vMotion traffic
FT_30 30 Fault tolerance traffic
VM_40 40 VM traffic
VM_50 50 VM traffic
NTNX_10 10 Nutanix CVM to CVM cluster communication traffic (public interface)
Svm-iscsi-pg N/A Nutanix CVM to internal host traffic
VMK-svm-iscsi-pg N/A VM kernel port for CVM to hypervisor communication (internal)

All Nutanix configurations use an internal-only vSwitch for the NFS communication between the ESXi host and the Nutanix CVM. This vSwitch remains unmodified regardless of the virtual networking configuration for ESXi management, VM traffic, vMotion, and so on.

Caution: Do not modify the internal-only vSwitch (vSwitch-Nutanix). vSwitch-Nutanix facilitates internal communication between the CVM and the hypervisor.

VMware NSX Support

Running VMware NSX on Nutanix infrastructure ensures that VMs always have access to fast local storage and compute, and consistent network addressing and security, without the burden of physical infrastructure constraints. The supported scenario connects the Nutanix CVM to a traditional VLAN network, with guest VMs inside NSX virtual networks. For more information, see the Nutanix vSphere storage solution document.

NSX-T Support on Nutanix Platform

The Nutanix platform relies on communication with vCenter to work with networks backed by vSphere Standard Switch (vSS) or vSphere Distributed Switch (vDS). NSX-T introduces a new management plane that manages networks independently of the compute manager (vCenter), so network configuration information is available only through the NSX-T Manager. The Nutanix infrastructure workflows (AOS upgrades, LCM upgrades, and so on) are therefore modified to collect the network configuration information from the NSX-T Manager.

Figure. Nutanix and the NSX-T Workflow Overview Click to enlarge Nutanix and NSX-T Workflow Overview

The Nutanix platform supports the following in the NSX-T configuration.

  • ESXi hypervisor only.
  • vSS and vDS virtual switch configurations.
  • Nutanix CVM connection to VLAN backed NSX-T segments only.
  • The NSX-T Manager credentials registration using the CLI.

The Nutanix platform does not support the following in the NSX-T configuration.

  • Network segmentation with N-VDS.
  • Nutanix CVM connection to overlay NSX-T segments.
  • Link Aggregation/LACP for the uplinks backing the NVDS host switch connecting Nutanix CVMs.
  • The NSX-T Manager credentials registration through Prism.

NSX-T Segments

Nutanix supports NSX-T logical segments on Nutanix clusters running the ESXi hypervisor. All infrastructure workflows, including Foundation, 1-click upgrades, and AOS upgrades, are validated to work in NSX-T configurations where the CVM is backed by an NSX-T VLAN logical segment.

NSX-T has the following types of segments.

VLAN backed
VLAN backed segments operate similarly to the standard port group in a vSphere switch. A port group is created on the NVDS, and VMs that are connected to the port group have their network packets tagged with the configured VLAN ID.
Overlay backed
Overlay backed segments use the Geneve overlay to create a logical L2 network over an L3 network. Encapsulation occurs at the transport node (the NVDS module on the host).

Multicast Filtering

Enabling multicast snooping on a vDS with a Nutanix CVM attached affects the ability of the CVM to discover and add new nodes to the Nutanix cluster (the cluster expand option in Prism and the Nutanix CLI).

Creating Segment for NVDS

This procedure provides details about creating a segment for nVDS.

About this task

To check the vSwitch configuration of the host and verify that the NSX-T network supports the CVM port group, perform the following steps.

Procedure

  1. Log on to vCenter Server and go to the NSX-T Manager.
  2. Click Networking , and go to Connectivity > Segments in the left pane.
  3. Click ADD SEGMENT under the SEGMENTS tab on the right pane and specify the following information.
    Figure. Create New Segment Click to enlarge Create New Segment

    1. Segment Name : Enter a name for the segment.
    2. Transport Zone : Select the VLAN-based transport zone.
      This transport zone is associated with the NSX switch when you configure it.
    3. VLAN : Enter 0 for the native VLAN.
  4. Click Save to create a segment for NVDS.
  5. Click Yes when the system prompts you to continue configuring the segment.
    The newly created segment appears below the prompt.
    Figure. New Segment Created Click to enlarge New Segment Created

Creating NVDS Switch on the Host by Using NSX-T Manager

This procedure provides instructions to create an NVDS switch on the ESXi host. The management and CVM external interfaces of the host are migrated to the NVDS switch.

About this task

To create an NVDS switch and configure the NSX-T Manager, do the following.

Procedure

  1. Log on to NSX-T Manager.
  2. Click System , and go to Configuration > Fabric > Nodes in the left pane.
    Figure. Add New Node Click to enlarge Add New Node

  3. Click ADD HOST NODE under the HOST TRANSPORT NODES tab in the right pane.
    1. Specify the following information in the Host Details dialog box.
      Figure. Add Host Details Click to enlarge Add Host Details

        1. Name : Enter an identifiable ESXi host name.
        2. Host IP : Enter the IP address of the ESXi host.
        3. Username : Enter the username used to log on to the ESXi host.
        4. Password : Enter the password used to log on to the ESXi host.
        5. Click Next to move to the NSX configuration.
    2. Specify the following information in the Configure NSX dialog box.
      Figure. Configure NSX Click to enlarge Configure NSX

        1. Mode : Select the Standard option.

          Nutanix recommends the Standard mode only.

        2. Name : Displays the default name of the virtual switch that appears on the host. You can edit the default name and provide an identifiable name as per your configuration requirements.
        3. Transport Zone : Select the transport zone that you selected in Creating Segment for NVDS.

          These segments operate similarly to the standard port group in a vSphere switch. A port group is created on the NVDS, and VMs that are connected to the port group have their network packets tagged with the configured VLAN ID.

        4. Uplink Profile : Select an uplink profile for the new nVDS switch.

          This selected uplink profile represents the NICs connected to the host. For more information about uplink profiles, see the VMware Documentation .

        5. LLDP Profile : Select the LLDP profile for the new nVDS switch.

          For more information about LLDP profiles, see the VMware Documentation .

        6. Teaming Policy Uplink Mapping : Map the uplinks with the physical NICs of the ESXi host.
          Note: To verify the active physical NICs on the host, select ESXi host > Configure > Networking > Physical Adapters .

          Click the Edit icon and enter the name of the active physical NIC in the ESXi host selected for migration to the NVDS.

          Note: Always migrate one physical NIC at a time to avoid connectivity failure with the ESXi host.
        7. PNIC only Migration : Set the toggle to Yes if there are no VMkernel adapters (vmks) associated with the PNIC selected for migration from the vSS switch to the NVDS switch.
        8. Network Mapping for Install : Click Add Mapping to migrate the VMkernels (vmks) to the NVDS switch.
        9. Network Mapping for Uninstall : Use this option to revert the migration of the VMkernels.
  4. Click Finish to add the ESXi host to the NVDS switch.
    The newly created NVDS switch appears on the ESXi host.
    Figure. NVDS Switch Created Click to enlarge NVDS Switch Created

Registering NSX-T Manager with Nutanix

After migrating the external interfaces of the host and the CVM to the NVDS switch, you must register the cluster with the NSX-T Manager so that Genesis is informed of the configuration. This registration helps Genesis communicate with the NSX-T Manager and avoid failures during LCM, 1-click, and AOS upgrades.

About this task

This procedure demonstrates the AOS upgrade error encountered when the NSX-T Manager is not registered with Nutanix, and shows how to register the NSX-T Manager with Nutanix to resolve the issue.

To register the NSX-T Manager with Nutanix, do the following.

Procedure

  1. Log on to the Prism Element web console.
  2. Select VM > Settings > Upgrade Software > Upgrade > Pre-upgrade to upgrade AOS on the host.
    Figure. Upgrade AOS Click to enlarge

  3. The upgrade process throws an error if the NSX-T Manager is not registered with Nutanix.
    Figure. AOS Upgrade Error for Unregistered NSX-T Click to enlarge

    The AOS upgrade determines whether the NSX-T networks support the CVM and its VLAN, and then attempts to get the VLAN information of those networks. To get the VLAN information for the CVM, the NSX-T Manager information must be configured in the Nutanix cluster.

  4. To fix this upgrade issue, log on to any Controller VM (CVM) in the cluster through SSH.
  5. Go to the cluster bin directory.
    nutanix@cvm$ cd ~/cluster/bin
  6. Verify if the NSX-T Manager was registered with the CVM earlier.
    nutanix@cvm$ ~/cluster/bin$ ./nsx_t_manager -l

    If the NSX-T Manager was not registered earlier, you get the following message.

    No NSX-T manager configured in the cluster
  7. Register the NSX-T Manager with the CVM if it was not registered earlier. Specify the credentials of the NSX-T Manager to the CVM.
    nutanix@cvm$ ~/cluster/bin$ ./nsx_t_manager -a
    IP address: 10.10.10.10
    Username: admin
    Password: 
    /usr/local/nutanix/cluster/lib/py/requests-2.12.0-py2.7.egg/requests/packages/urllib3/connectionpool.py:843:
     InsecureRequestWarning: Unverified HTTPS request is made. Adding certificate verification is strongly advised. 
    See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
    Successfully persisted NSX-T manager information
  8. Verify the registration of NSX-T Manager with the CVM.
    nutanix@cvm$ ~/cluster/bin$ ./nsx_t_manager -l

    If there are no errors, the system displays a similar output.

    IP address: 10.10.10.10
    Username: admin
  9. In the Prism Element web console, click Pre-upgrade to continue the AOS upgrade procedure.

    The AOS upgrade is completed successfully.

Networking Components

IP Addresses

All CVMs and ESXi hosts have two network interfaces.
Note: Foundation creates an empty eth2 interface on the CVM during deployment. The eth2 interface is used for the backplane when backplane traffic isolation (Network Segmentation) is enabled in the cluster. For more information about the backplane interface and traffic segmentation, see the Securing Traffic Through Network Segmentation section in the Security Guide .
Interface IP address vSwitch
ESXi host vmk0 User-defined vSwitch0
CVM eth0 User-defined vSwitch0
ESXi host vmk1 192.168.5.1 vSwitchNutanix
CVM eth1 192.168.5.2 vSwitchNutanix
CVM eth1:1 192.168.5.254 vSwitchNutanix
CVM eth2 User-defined vSwitch0
Note: The ESXi and CVM interfaces on vSwitch0 cannot use IP addresses in any subnets that overlap with subnet 192.168.5.0/24.
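The restriction in the note above can be checked mechanically. The following is a minimal shell sketch that relies on the internal subnet being exactly a /24, so a simple prefix match suffices; the function name and sample addresses are hypothetical:

```shell
# Sketch: flag any proposed vSwitch0 address that collides with the
# 192.168.5.0/24 subnet reserved for vSwitchNutanix.
overlaps_internal() {
  case "$1" in
    192.168.5.*) echo yes ;;   # collides with the internal subnet
    *)           echo no  ;;   # safe to use on vSwitch0
  esac
}

overlaps_internal 10.1.1.5
overlaps_internal 192.168.5.9
```

Any address for which this check prints `yes` must not be assigned to an ESXi or CVM interface on vSwitch0.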

vSwitches

A Nutanix node is configured with the following two vSwitches.

  • vSwitchNutanix

    Local communications between the CVM and the ESXi host use vSwitchNutanix. vSwitchNutanix has no uplinks.

    Caution: To manage network traffic between VMs with greater control, create more port groups on vSwitch0. Do not modify vSwitchNutanix.
    Figure. vSwitchNutanix Configuration Click to enlarge vSwitchNutanix Configuration

  • vSwitch0

    All other external communications, such as CVM to a different host (in case of HA redirection), use vSwitch0, which has uplinks to the physical network interfaces. Because network segmentation is disabled by default, the backplane traffic uses vSwitch0.

    vSwitch0 has the following two networks.

    • Management Network

      HA, vMotion, and vCenter communications use the Management Network.

    • VM Network

      All VMs use the VM Network.

    Caution:
    • The Nutanix CVM uses the standard Ethernet maximum transmission unit (MTU) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on a network interface of CVMs to higher values.
    • You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of ESXi hosts and guest VMs if the applications on your guest VMs require them. If you choose to use jumbo frames on hypervisor hosts, ensure that you enable them end-to-end in the desired network and consider both the physical and virtual network infrastructure impacted by the change.
    Figure. vSwitch0 Configuration Click to enlarge vSwitch0 Configuration
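If you do enable jumbo frames, a common end-to-end check is to ping with the don't-fragment bit set and the largest payload that fits in a 9,000-byte MTU. The payload is the MTU minus 28 bytes of IP and ICMP headers; the target address below is hypothetical:

```shell
# Largest ICMP payload for a 9,000-byte MTU:
# 9000 - 20 (IP header) - 8 (ICMP header) = 8972 bytes
PAYLOAD=$((9000 - 20 - 8))
echo "$PAYLOAD"

# From an ESXi host, the end-to-end check might look like this (hypothetical IP):
#   vmkping -d -s 8972 192.168.10.20
# -d sets the don't-fragment bit, so the ping fails if any hop in the path
# does not support jumbo frames.
```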

Configuring Host Networking (Management Network)

After you create the Nutanix cluster by using Foundation, configure networking for your ESXi hosts.

About this task

Figure. Configure Management Network Click to enlarge Ip Configuration image

Procedure

  1. On the ESXi host console, press F2 and then provide the ESXi host logon credentials.
  2. Press the down arrow key until Configure Management Network highlights and then press Enter .
  3. Select Network Adapters and then press Enter .
  4. Ensure that the connected network adapters are selected.
    If they are not selected, press the Space key to select them and then press Enter to return to the previous screen.
    Figure. Network Adapters Click to enlarge Select a Network Adapters
  5. If a VLAN ID needs to be configured on the Management Network, select VLAN (optional) and press Enter . In the dialog box, provide the VLAN ID and press Enter .
    Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.
  6. Select IP Configuration and press Enter .
    Figure. Configure Management Network Click to enlarge IP Address Configuration
  7. If necessary, highlight the Set static IP address and network configuration option and press Space to update the setting.
  8. Provide values for the following: IP Address , Subnet Mask , and Default Gateway fields based on your environment and then press Enter .
  9. Select DNS Configuration and press Enter .
  10. If necessary, highlight the Use the following DNS server addresses and hostname option and press Space to update the setting.
  11. Provide values for the Primary DNS Server and Alternate DNS Server fields based on your environment and then press Enter .
  12. Press Esc and then Y to apply all changes and restart the management network.
  13. Select Test Management Network and press Enter .
  14. Press Enter to start the network ping test.
  15. Verify that the default gateway and DNS servers reported by the ping test match those that you specified earlier in the procedure and then press Enter .

    Ensure that the tested addresses pass the ping test. If they do not, confirm that the correct IP addresses are configured.

    Figure. Test Management Network Click to enlarge Test Management Network

    Press Enter to close the test window.

  16. Press Esc to log off.

Changing a Host IP Address

About this task

To change a host IP address, perform the following steps once for each hypervisor host in the Nutanix cluster. Complete the entire procedure on a host before proceeding to the next host.
Caution: The cluster cannot tolerate duplicate host IP addresses. For example, when swapping IP addresses between two hosts, temporarily change one host IP address to an interim unused IP address. Changing this IP address avoids having two hosts with identical IP addresses on the cluster. Then complete the address change or swap on each host using the following steps.
Note: All CVMs and hypervisor hosts must be on the same subnet. The hypervisor can be multihomed provided that one interface is on the same subnet as the CVM.

Procedure

  1. Configure networking on the Nutanix node. For more information, see Configuring Host Networking (Management Network).
  2. Update the host IP addresses in vCenter. For more information, see Reconnecting a Host to vCenter.
  3. Log on to every CVM in the Nutanix cluster and restart Genesis service.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed.

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

Reconnecting a Host to vCenter

About this task

If you modify the IP address of a host, you must reconnect the host to vCenter. To reconnect the host to vCenter, perform the following procedure.

Procedure

  1. Log on to vCenter with the web client.
  2. Right-click the host with the changed IP address and select Disconnect .
  3. Right-click the host again and select Remove from Inventory .
  4. Right-click the Nutanix cluster and then click Add Hosts... .
    1. Enter the IP address or fully qualified domain name (FQDN) of the host you want to reconnect in the IP address or FQDN field under New hosts .
    2. Enter the host logon credentials in the User name and Password fields, and click Next .
      If a security or duplicate management alert appears, click Yes .
    3. Review the Host Summary and click Next .
    4. Click Finish .
    You can see the host with the updated IP address in the left pane of vCenter.

Selecting a Management Interface

Nutanix tracks the management IP address for each host and uses that IP address to open an SSH session into the host to perform management activities. If the selected vmk interface is not accessible through SSH from the CVMs, activities that require interaction with the hypervisor fail.

If multiple vmk interfaces are present on a host, Nutanix uses the following rules to select a management interface.

  1. Assigns weight to each vmk interface.
    • If the vmk is configured for management traffic under the network settings of ESXi, the weight assigned is 4. Otherwise, the weight assigned is 0.
    • If the IP address of the vmk belongs to the same IP subnet as the eth0 interface of the CVM, 2 is added to its weight.
    • If the IP address of the vmk belongs to the same IP subnet as the eth2 interface of the CVM, 1 is added to its weight.
  2. The vmk interface that has the highest weight is selected as the management interface.

Example of Selection of Management Network

Consider an ESXi host with following configuration.

  • vmk0 IP address and mask: 2.3.62.204, 255.255.255.0
  • vmk1 IP address and mask: 192.168.5.1, 255.255.255.0
  • vmk2 IP address and mask: 2.3.63.24, 255.255.255.0

Consider a CVM with following configuration.

  • eth0 inet address and mask: 2.3.63.31, 255.255.255.0
  • eth2 inet address and mask: 2.3.62.12, 255.255.255.0

According to the rules, the following weights are assigned to the vmk interfaces.

  • vmk0 = 4 + 0 + 1 = 5
  • vmk1 = 0 + 0 + 0 = 0
  • vmk2 = 0 + 2 + 0 = 2

Because vmk0 has the highest weight, the vmk0 interface is used as the management IP address for the ESXi host.
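The selection rules can be sketched as a small shell function. The three inputs correspond to the management-traffic tag and the eth0/eth2 subnet matches; the function name and the yes/no inputs are illustrative, not part of the product:

```shell
# Compute the management-interface weight for a vmk interface.
# $1: vmk is tagged for management traffic (yes/no)   -> +4
# $2: vmk is in the same subnet as CVM eth0 (yes/no)  -> +2
# $3: vmk is in the same subnet as CVM eth2 (yes/no)  -> +1
vmk_weight() {
  local w=0
  [ "$1" = yes ] && w=$((w + 4))
  [ "$2" = yes ] && w=$((w + 2))
  [ "$3" = yes ] && w=$((w + 1))
  echo "$w"
}

vmk_weight yes no yes   # vmk0 in the example: 4 + 0 + 1 = 5
vmk_weight no  no  no   # vmk1: 0 + 0 + 0 = 0
vmk_weight no  yes no   # vmk2: 0 + 2 + 0 = 2
```

The interface with the highest weight wins, which matches the example: vmk0 (weight 5) becomes the management interface.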

To verify that vmk0 interface is selected for management IP address, use the following command.

root@esx# esxcli network ip interface tag get -i vmk0

You see the following output.

Tags: Management, VMotion

For the other two interfaces, no tags are displayed.

If you want any other interface to act as the management IP address, enable management traffic on that interface by following the procedure described in Selecting a New Management Interface.

Selecting a New Management Interface

You can mark a vmk interface as the management interface on an ESXi host by using the following method.

Procedure

  1. Log on to vCenter with the web client.
  2. Do the following on the ESXi host.
    1. Go to Configure > Networking > VMkernel adapters .
    2. Select the interface on which you want to enable the management traffic.
    3. Click Edit settings of the port group to which the vmk belongs.
    4. Select the Management check box under the Enabled services option to enable management traffic on the vmk interface.
  3. Open an SSH session to the ESXi host and enable the management traffic on the vmk interface.
    root@esx# esxcli network ip interface tag add -i vmkN --tagname=Management

    Replace vmkN with the vmk interface where you want to enable the management traffic.

Updating Network Settings

After you configure networking of your vSphere deployments on Nutanix Enterprise Cloud, you may want to update the network settings.

  • To know about the best practice of ESXi network teaming policy, see Network Teaming Policy.

  • To migrate an ESXi host networking from a vSphere Standard Switch (vSwitch) to a vSphere Distributed Switch (vDS) with LACP/LAG configuration, see Migrating to a New Distributed Switch with LACP/LAG.

  • To migrate an ESXi host networking from a vSphere standard switch (vSwitch) to a vSphere Distributed Switch (vDS) without LACP, see Migrating to a New Distributed Switch without LACP/LAG.


Network Teaming Policy

On an ESXi host, a NIC teaming policy allows you to bundle two or more physical NICs into a single logical link to provide increased network bandwidth and link redundancy to a vSwitch. The NIC teaming policies in the ESXi networking configuration for a vSwitch consist of the following.

  • Route based on originating virtual port.
  • Route based on IP hash.
  • Route based on source MAC hash.
  • Explicit failover order.

In addition to the NIC teaming policies mentioned earlier, vDS offers one extra teaming policy: Route based on physical NIC load.

When Foundation or Phoenix imaging is performed on a Nutanix cluster, the following two standard virtual switches are created on ESXi hosts:

  • vSwitch0
  • vSwitchNutanix

On vSwitch0, the Nutanix best practice guide (see Nutanix vSphere Networking Solution Document) provides the following recommendations for NIC teaming:

  • vSwitch. Route based on originating virtual port
  • vDS. Route based on physical NIC load

On vSwitchNutanix, there are no uplinks to the virtual switch, so there is no NIC teaming configuration required.
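The recommendations above can be captured in a tiny lookup sketch; the function name is illustrative and the strings simply restate the Nutanix best-practice recommendations for vSwitch0 uplinks:

```shell
# Map a virtual switch type to the Nutanix-recommended load-balancing
# policy for vSwitch0 uplinks, per the best-practice guide cited above.
recommended_policy() {
  case "$1" in
    vSwitch) echo "Route based on originating virtual port" ;;
    vDS)     echo "Route based on physical NIC load" ;;
    *)       echo "unknown switch type" ;;
  esac
}

recommended_policy vSwitch
recommended_policy vDS
```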

Migrate from a Standard Switch to a Distributed Switch

This topic provides detailed information about how to migrate from a vSphere Standard Switch (vSS) to a vSphere Distributed Switch (vDS).

The following are the two types of virtual switches (vSwitch) in vSphere.

  • vSphere standard switch (vSwitch) (see vSphere Standard Switch (vSwitch) in vSphere Networking).
  • vSphere Distributed Switch (vDS) (see vSphere Distributed Switch (vDS) in vSphere Networking).
Tip: For more information about vSwitches and the associated network concepts, see the VMware Documentation .

For migrating from a vSS to a vDS with LACP/LAG configuration, see Migrating to a New Distributed Switch with LACP/LAG.

For migrating from a vSS to a vDS without LACP/LAG configuration, see Migrating to a New Distributed Switch without LACP/LAG.

Standard Switch Configuration

The standard switch configuration consists of the following.

vSwitchNutanix
vSwitchNutanix handles internal communication between the CVM and the ESXi host. There are no uplink adapters associated with this vSwitch. Administrators must not modify the settings of this virtual switch or its port groups; the only members of this port group must be the CVM and its host. Modifying this virtual switch configuration can disrupt communication between the host and the CVM.
vSwitch0
vSwitch0 consists of the vmk (VMkernel) management interface, vMotion interface, and VM port groups. This virtual switch connects to uplink network adapters that are plugged into a physical switch.

Planning the Migration

It is important to plan and understand the migration process. An incorrect configuration can disrupt communication, which can require downtime to resolve.

Consider the following while or before planning the migration.

  • Read Nutanix Best Practice Guide for VMware vSphere Networking .

  • Understand the various teaming and load-balancing algorithms on vSphere.

    For more information, see the VMware Documentation .

  • Confirm communication on the network through all the connected uplinks.
  • Confirm access to the host using IPMI when there are network connectivity issues during migration.

    Access the host to troubleshoot the network issue or move the management network back to the standard switch depending on the issue.

  • Confirm that the hypervisor external management IP address and the CVM IP address are in the same public subnet for the data path redundancy functionality to work.
  • When performing migration to the distributed switch, migrate one host at a time and verify that networking is working as desired.
  • Do not migrate the port groups and vmk (VMkernel) interfaces that are on vSwitchNutanix to the distributed switch (dvSwitch).

Unassigning Physical Uplink of the Host for Distributed Switch

By default, all the physical adapters connect to vSwitch0 on the host. A distributed switch must have a physical uplink connected to it to carry traffic. To provide that uplink, unassign a physical adapter from vSwitch0 and assign it to the new distributed switch.

About this task

To unassign the physical uplink of the host, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Networking > Virtual Switches .
  4. Click MANAGE PHYSICAL ADAPTERS tab and select the active adapters from the Assigned adapters that you want to unassign from the list of physical adapters of the host.
    Figure. Managing Physical Adapters Click to enlarge Managing Physical Adapters

  5. Click X on the top.
    The selected adapter is unassigned from the list of physical adapters of the host.
    Tip: Ping the host to check and confirm if you are able to communicate with the active physical adapter of the host. If you lose network connectivity to the ESXi host during this test, review your network configuration.

Migrating to a New Distributed Switch without LACP/LAG

Migrating to a new distributed switch without LACP/LAG consists of the following workflow.

  1. Creating a Distributed Switch
  2. Creating Port Groups on the Distributed Switch
  3. Configuring Port Group Policies

Creating a Distributed Switch

Connect to vCenter and create a distributed switch.

About this task

To create a distributed switch, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Distributed Switch Creation Click to enlarge Distributed Switch Creation

  3. Right-click the host, select Distributed Switch > New Distributed Switch , and specify the following information in the New Distributed Switch dialog box.
    1. Name and Location : Enter name for the distributed switch.
    2. Select Version : Select a distributed switch version that is compatible with all your hosts in that datacenter.
    3. Configure Settings : Select the number of uplinks you want to connect to the distributed switch.
      Select Create a default port group checkbox to create a port group. To configure a port group later, see Creating Port Groups on the Distributed Switch.
    4. Ready to complete : Review the configuration and click Finish .
    A new distributed switch is created with the default uplink port group. The uplink port group is the port group to which the uplinks connect. This uplink is different from the vmk (VMkernel) or the VM port groups.
    Figure. New Distributed Switch Created in the Host Click to enlarge New Distributed Switch Created in the Host

Creating Port Groups on the Distributed Switch

Create one or more vmk (VMkernel) port groups and VM port groups depending on the vSphere features you plan to use and the physical network layout. The best practice is to have the vmk Management interface, vmk vMotion interface, and vmk iSCSI interface on separate port groups.

About this task

To create port groups on the distributed switch, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Creating Distributed Port Groups Click to enlarge Creating Distributed Port Groups

  3. Right-click the host, select Distributed Switch > Distributed Port Group > New Distributed Port Group , and follow the wizard to create the remaining distributed port group (vMotion interface and VM port groups).
    You need the following port groups because you are migrating from the standard switch to the distributed switch.
    • VMkernel Management interface . Use this port group to connect to the host for all management operations.
    • VMNetwork . Use this port group to connect to the new VMs.
    • vMotion . This port group is an internal interface and the host will use this port during failover for vMotion traffic.
    Note: Nutanix recommends that you use static port binding instead of ephemeral port binding when you create a port group.
    Figure. Distributed Port Groups Created Click to enlarge Distributed Port Groups Created

    Note: The port group for vmk management interface is created during the distributed switch creation. See Creating a Distributed Switch for more information.

Configuring Port Group Policies

To configure port groups, you must configure VLANs, Teaming and failover, and other distributed port groups policies at the port group layer or at the distributed switch layer. Refer to the following topics to configure the port group policies.

  1. Configuring Policies on the Port Group Layer
  2. Configuring Policies on the Distributed Switch Layer
  3. Adding ESXi Host to the Distributed Switch

Configuring Policies on the Port Group Layer

Ensure that the distributed switches port groups have VLANs tagged if the physical adapters of the host have a VLAN tagged to them. Update the policies for the port group, VLANs, and teaming algorithms to configure the physical network switch. Configure the load balancing policy as per the network configuration requirements on the physical switch.

About this task

To configure the port group policies, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Configure Port Group Policies on the Distributed Switch Click to enlarge Configure Port Group Policies on the Distributed Switch

  3. Right-click the host, select Distributed Switch > Distributed Port Group > Edit Settings , and follow the wizard to configure the VLAN, Teaming and failover, and other options.
    Note: For more information about configuring port group policies, see the VMware Documentation .
  4. Click OK to complete the configuration.
  5. Repeat steps 2–4 to configure the other port groups.
Configuring Policies on the Distributed Switch Layer

You can configure the same policy for all the port groups simultaneously.

About this task

To configure the same policy for all the port groups, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Manage Distributed Port Groups Click to enlarge Manage Distributed Port Groups

  3. Right-click the host, select Distributed Switch > Distributed Port Group > Manage Distributed Port Groups , and specify the following information in Manage Distributed Port Group dialog box.
    1. In the Select port group policies tab, select the port group policies that you want to configure and click Next .
      Note: For more information about configuring port group policies, see the VMware Documentation .
    2. In the Select port groups tab, select the distributed port groups on which you want to configure the policy and click Next .
    3. In the Teaming and failover tab, configure the Load balancing policy, Active uplinks , and click Next .
    4. In the Ready to complete window, review the configuration and click Finish .
Adding ESXi Host to the Distributed Switch

Migrate the management interface and CVM of the host to the distributed switch.

About this task

To migrate the Management interface and CVM of the ESXi host, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Add ESXi Host to Distributed Switch

  3. Right-click the host, select Distributed Switch > Add and Manage Hosts , and specify the following information in Add and Manage Hosts dialog box.
    1. In the Select task tab, select Add hosts to add a new host to the distributed switch and click Next .
    2. In the Select hosts tab, click New hosts to select the ESXi host and add it to the distributed switch.
      Note: Add one host at a time to the distributed switch and then migrate all the CVMs from the host to the distributed switch.
    3. In the Manage physical adapters tab, configure the physical NICs (PNICs) on the distributed switch.
      Tip: For consistent network configuration, you can connect the same physical NIC on every host to the same uplink on the distributed switch.
        1. Select a PNIC from the On other switches/unclaimed section and click Assign uplink .
          Figure. Select Physical Adapter for Uplinking

          Important: If you select physical NICs connected to other switches, those physical NICs migrate to the current distributed switch.
        2. Select the Uplink in the distributed switch to which you want to assign the PNIC of the host and click OK .
        3. Click Next .
    4. In the Manage VMkernel adapters tab, configure the vmk adapters.
        1. Select a VMkernel adapter from the On other switches/unclaimed section and click Assign port group .
        2. Select the port group in the distributed switch to which you want to assign the VMkernel of the host and click OK .
          Figure. Select a Port Group

        3. Click Next .
    5. (optional) In the Migrate VM networking tab, select Migrate virtual machine networking to connect all the network adapters of a VM to a distributed port group.
        1. Select the VM to connect all the network adapters of the VM to a distributed port group, or select an individual network adapter to connect with the distributed port group.
        2. Click Assign port group and select the distributed port group to which you want to migrate the VM or network adapter and click OK .
        3. Click Next .
    6. In the Ready to complete tab, review the configuration and click Finish .
  4. Go to the Hosts and Clusters view in the vCenter web client and then go to Hosts > Configure to review the network configuration of the host.
    Note: Run a ping test to confirm that the networking on the host works as expected.
  5. Repeat steps 2–4 to add the remaining hosts to the distributed switch and migrate the adapters.
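The note in step 4 suggests a ping test after each host is migrated. Where ICMP is filtered, a TCP connection check against the management services gives a similar signal. The sketch below is illustrative only; the host addresses and ports in the comment are assumptions, not values from this guide.

```python
import socket

def tcp_reachable(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False

# Example (hypothetical addresses): after migrating a host, check its
# management interface (SSH, port 22) and the vCenter endpoint (HTTPS, 443).
# for host in ["10.0.0.11", "10.0.0.12"]:
#     print(host, tcp_reachable(host, 22))
```

A successful connection confirms Layer 3 reachability and that the service is listening; it does not replace verifying VLAN tagging on the distributed port groups.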

Migrating to a New Distributed Switch with LACP/LAG

Migrating to a new distributed switch with LACP/LAG consists of the following workflow.

  1. Creating a Distributed Switch
  2. Creating Port Groups on the Distributed Switch
  3. Creating Link Aggregation Group on Distributed Switch
  4. Creating Port Groups to use the LAG
  5. Adding ESXi Host to the Distributed Switch

Creating Link Aggregation Group on Distributed Switch

Using a Link Aggregation Group (LAG) on a distributed switch, you can connect the ESXi host to physical switches by using dynamic link aggregation. You can create multiple LAGs on a distributed switch to aggregate the bandwidth of physical NICs on ESXi hosts that are connected to LACP port channels.

About this task

To create a LAG, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
  3. Right-click the host, select Distributed Switch > Configure > LACP .
    Figure. Create LAG on Distributed Switch

  4. Click New and enter the following details in the New Link Aggregation Group dialog box.
    1. Name : Enter a name for the LAG.
    2. Number of Ports : Enter the number of ports.
      The number of ports must match the number of physical ports per host in the LACP LAG. For example, if the Number of Ports is two, you can attach two physical ports per ESXi host to the LAG.
    3. Mode : Specify the state of the physical switch.
      Based on the configuration requirements, you can set the mode to Active or Passive .
    4. Load balancing mode : Specify the load balancing mode for the physical switch.
      For more information about the various load balancing options, see the VMware Documentation .
    5. VLAN trunk range : Specify the VLANs if you have VLANs configured in your environment.
  5. Click OK .
    LAG is created on the distributed switch.

Creating Port Groups to Use LAG

To use the LAG as the uplink, you must edit the settings of the port group created on the distributed switch.

About this task

To edit the settings on the port group to use LAG, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
  3. Right-click the host, select Management port Group > Edit Settings .
  4. Go to the Teaming and failover tab in the Edit Settings dialog box and specify the following information.
    Figure. Configure the Management Port Group

    1. Load Balancing : Select Route based on IP hash .
    2. Active uplinks : Move the LAG from the Unused uplinks section to the Active uplinks section.
    3. Unused uplinks : Select the physical uplinks ( Uplink 1 and Uplink 2 ) and move them to the Unused uplinks section.
  5. Repeat steps 2–4 to configure the other port groups.

Adding ESXi Host to the Distributed Switch

Add the ESXi host to the distributed switch and migrate the network from the standard switch to the distributed switch. Migrate the management interface and CVM of the ESXi host to the distributed switch.

About this task

To migrate the Management interface and CVM of ESXi host, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Add ESXi Host to Distributed Switch

  3. Right-click the host, select Distributed Switch > Add and Manage Hosts , and specify the following information in Add and Manage Hosts dialog box.
    1. In the Select task tab, select Add hosts to add a new host to the distributed switch and click Next .
    2. In the Select hosts tab, click New hosts to select the ESXi host and add it to the distributed switch.
      Note: Add one host at a time to the distributed switch and then migrate all the CVMs from the host to the distributed switch.
    3. In the Manage physical adapters tab, configure the physical NICs (PNICs) on the distributed switch.
      Tip: For consistent network configuration, you can connect the same physical NIC on every host to the same uplink on the distributed switch.
        1. Select a PNIC from the On other switches/unclaimed section and click Assign uplink .
          Important: If you select physical NICs connected to other switches, those physical NICs migrate to the current distributed switch.
        2. Select the LAG Uplink in the distributed switch to which you want to assign the PNIC of the host and click OK .
        3. Click Next .
    4. In the Manage VMkernel adapters tab, configure the vmk adapters.
      Select the VMkernel adapter that is associated with vSwitch0 as your management VMkernel adapter. Migrate this adapter to the corresponding port group on the distributed switch.
      Note: Do not migrate the VMkernel adapter associated with vSwitchNutanix.
      Note: If there are any VLANs associated with the port group on the standard switch, ensure that the corresponding distributed port group also has the correct VLAN. Verify that the physical network is configured as required.
        1. Select a VMkernel adapter from the On other switches/unclaimed section and click Assign port group .
        2. Select the port group in the distributed switch to which you want to assign the VMkernel of the host and click OK .
        3. Click Next .
    5. (optional) In the Migrate VM networking tab, select Migrate virtual machine networking to connect all the network adapters of a VM to a distributed port group.
        1. Select the VM to connect all the network adapters of the VM to a distributed port group, or select an individual network adapter to connect with the distributed port group.
        2. Click Assign port group and select the distributed port group to which you want to migrate the VM or network adapter and click OK .
        3. Click Next .
    6. In the Ready to complete tab, review the configuration and click Finish .

vCenter Configuration

VMware vCenter enables the centralized management of multiple ESXi hosts. You can either create a vCenter Server or use an existing vCenter Server. To create a vCenter Server, refer to the VMware Documentation .

This section assumes that you already have a vCenter Server and therefore describes the operations that you can perform on an existing vCenter Server. To deploy vSphere clusters running Nutanix Enterprise Cloud, perform the following steps in vCenter.

Tip: For a single-window management of all your ESXi nodes, you can also integrate the vCenter Server with Prism Central. For more information, see Registering a Cluster to the vCenter Server.

1. Create a cluster entity within the existing vCenter inventory and configure its settings according to Nutanix best practices. For more information, see Creating a Nutanix Cluster in the vCenter Server.

2. Configure HA. For more information, see vSphere HA Settings.

3. Configure DRS. For more information, see vSphere DRS Settings.

4. Configure EVC. For more information, see vSphere EVC Settings.

5. Configure override. For more information, see VM Override Settings.

6. Add the Nutanix hosts to the new cluster. For more information, see Adding a Nutanix Node to the vCenter Server.

Registering a Cluster to the vCenter Server

To perform core VM management operations directly from Prism without switching to vCenter Server, you need to register your cluster with the vCenter Server.

Before you begin

Ensure that you have vCenter Server Extension privileges as these privileges provide permissions to perform vCenter registration for the Nutanix cluster.

About this task

Following are some of the important points about registering vCenter Server.

  • Nutanix does not store vCenter Server credentials.
  • Whenever a new node is added to the Nutanix cluster, vCenter Server registration for the new node is performed automatically.

Procedure

  1. Log into the Prism web console.
  2. Click the gear icon in the main menu and then select vCenter Registration in the Settings page.
    The vCenter Server that is managing the hosts in the cluster is auto-discovered and displayed.
  3. Click the Register link.
    The IP address is auto-populated in the Address field. The port number field is also auto-populated with 443. Do not change the port number. For the complete list of required ports, see Port Reference.
  4. Type the administrator user name and password of the vCenter Server in the Admin Username and Admin Password fields.
    Figure. vCenter Registration

  5. Click Register .
    During the registration process, a certificate is generated to communicate with the vCenter Server. If the registration is successful, a relevant message is displayed in the Tasks dashboard. The Host Connection field displays Connected, which implies that all the hosts are being managed by the registered vCenter Server.
    Figure. vCenter Registration

Unregistering a Cluster from the vCenter Server

To unregister the vCenter Server from your cluster, perform the following procedure.

About this task

  • Ensure that you unregister the vCenter Server from the cluster before changing the IP address of the vCenter Server. After you change the IP address of the vCenter Server, register the vCenter Server with the cluster again by using the new IP address.
  • The vCenter Server Registration page displays the registered vCenter Server. If for some reason the Host Connection field changes to Not Connected , it implies that the hosts are being managed by a different vCenter Server. In this case, a new vCenter Server entry appears with the host connection status Connected, and you need to register to this vCenter Server.

Procedure

  1. Log into the Prism web console.
  2. Click the gear icon in the main menu and then select vCenter Registration in the Settings page.
    A message that the cluster is already registered to the vCenter Server is displayed.
  3. Type the administrator user name and password of the vCenter Server in the Admin Username and Admin Password fields.
  4. Click Unregister .
    If the credentials are correct, the vCenter Server is unregistered from the cluster and a relevant message is displayed in the Tasks dashboard.

Creating a Nutanix Cluster in the vCenter Server

Before you begin

Nutanix recommends creating a storage container in the Prism Element running on the host or using the default container to mount NFS datastore on all ESXi hosts.

About this task

To enable the vCenter to discover the Nutanix clusters, perform the following steps in the vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Do one of the following.
    • If you want the Nutanix cluster to be in an existing datacenter, proceed to step 3.
    • If you want the Nutanix cluster to be in a new datacenter or if there is no datacenter, perform the following steps to create a datacenter.
      Note: Nutanix clusters must be in a datacenter.
    1. Go to the Hosts and Clusters view and right-click the IP address of the vCenter Server in the left pane.
    2. Click New Datacenter .
    3. Enter a meaningful name for the datacenter (for example, NTNX-DC ) and click OK .
  3. Right-click the datacenter node and click New Cluster .
    1. Enter a meaningful name for the cluster in the Name field (for example, NTNX-Cluster ).
    2. Turn on the vSphere DRS switch.
    3. Turn on the Turn on vSphere HA switch.
    4. Uncheck Manage all hosts in the cluster with a single image .
    Nutanix cluster ( NTNX-Cluster ) is created with the default settings for vSphere HA and vSphere DRS.

What to do next

Add all the Nutanix nodes to the Nutanix cluster inventory in vCenter. For more information, see Adding a Nutanix Node to the vCenter Server.

Adding a Nutanix Node to the vCenter Server

Before you begin

Configure the Nutanix cluster according to Nutanix specifications given in Creating a Nutanix Cluster in the vCenter Server and vSphere Cluster Settings Checklist.

About this task

Note: Lockdown mode forces all operations through the vCenter Server so that vCenter managed ESXi hosts are not directly accessible. However, Nutanix does not support lockdown mode; ensure that it remains disabled on all hosts (see step 3 of this procedure).
Tip: Refer to KB-1661 for the default credentials of all cluster components.

Procedure

  1. Log on to vCenter with the web client.
  2. Right-click the Nutanix cluster and then click Add Hosts... .
    1. Enter the IP address or fully qualified domain name (FQDN) of the host that you want to add in the IP address or FQDN field under New hosts .
    2. Enter the host logon credentials in the User name and Password fields, and click Next .
      If a security or duplicate management alert appears, click Yes .
    3. Review the Host Summary and click Next .
    4. Click Finish .
  3. Select the host under the Nutanix cluster from the left pane and go to Configure > System > Security Profile .
    Ensure that Lockdown Mode is Disabled because Nutanix does not support lockdown mode.
  4. Configure DNS servers.
    1. Go to Configure > Networking > TCP/IP configuration .
    2. Click Default under TCP/IP stack and go to TCP/IP .
    3. Click the pencil icon to configure DNS servers and perform the following.
        1. Select Enter settings manually .
        2. Type the domain name in the Domain field.
        3. Type DNS server addresses in the Preferred DNS Server and Alternate DNS Server fields and click OK .
  5. Configure NTP servers.
    1. Go to Configure > System > Time Configuration .
    2. Click Edit .
    3. Select Use Network Time Protocol (Enable NTP client) .
    4. Type the NTP server address in the NTP Servers text box.
    5. In the NTP Service Startup Policy, select Start and stop with host from the drop-down list.
      Add multiple NTP servers if necessary.
    6. Click OK .
  6. Click Configure > Storage and ensure that NFS datastores are mounted.
    Note: Nutanix recommends creating a storage container in Prism Element running on the host.
  7. If HA is not enabled, set the CVM to start automatically when the ESXi host starts.
    Note: Automatic VM start and stop is disabled in clusters where HA is enabled.
    1. Go to Configure > Virtual Machines > VM Startup/Shutdown .
    2. Click Edit .
    3. Ensure that Automatically start and stop the virtual machines with the system is checked.
    4. If the CVM is listed in Manual Startup , click the up arrow to move the CVM into the Automatic Startup section.
    5. Click OK .

What to do next

Configure HA and DRS settings. For more information, see vSphere HA Settings and vSphere DRS Settings.

Nutanix Cluster Settings

To ensure the optimal performance of your vSphere deployment running on Nutanix cluster, configure the following settings from the vCenter.

vSphere General Settings

About this task

Configure the following general settings from vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Configuration > General .
    1. Under General , set the Swap file location to Virtual machine directory .
      Setting the swap file location to the VM directory stores the VM swap files in the same directory as the VM.
    2. Under Default VM Compatibility , set the compatibility to Use datacenter setting and host version .
      Do not change the compatibility unless the cluster has to support previous versions of ESXi VMs.
      Figure. General Cluster Settings

vSphere HA Settings

If there is a node failure, vSphere HA (High Availability) settings ensure that there are sufficient compute resources available to restart all VMs that were running on the failed node.

About this task

Configure the following HA settings from vCenter.
Note: Nutanix recommends that you configure vSphere HA and DRS even if you do not use the features. The vSphere cluster configuration preserves the settings, so if you later decide to enable the features, the settings are in place and conform to Nutanix best practices.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Services > vSphere Availability .
  4. Click Edit next to the text showing vSphere HA status.
    Figure. vSphere Availability Settings: Failures and Responses

    1. Turn on the vSphere HA and Enable Host Monitoring switches.
    2. Specify the following information under the Failures and Responses tab.
        1. Host Failure Response : Select Restart VMs from the drop-down list.

          This option configures the cluster-wide host isolation response settings.

        2. Response for Host Isolation : Select Power off and restart VMs from the drop-down list.
        3. Datastore with PDL : Select Disabled from the drop-down list.
        4. Datastore with APD : Select Disabled from the drop-down list.
          Note: To enable the VM component protection in vCenter, refer to the VMware Documentation.
        5. VM Monitoring : Select Disabled from the drop-down list.
    3. Specify the following information under the Admission Control tab.
      Note: If you are using replication factor 2 with cluster sizes of up to 16 nodes, configure HA admission control settings to tolerate one node failure. For cluster sizes larger than 16 nodes, configure HA admission control to sustain two node failures and use replication factor 3. vSphere 6.7 and newer versions automatically calculate the percentage of resources required for admission control.
      Figure. vSphere Availability Settings: Admission Control

        1. Host failures cluster tolerates : Enter 1 or 2 based on the number of nodes in the Nutanix cluster and the replication factor.
        2. Define host failover capacity by : Select Cluster resource percentage from the drop-down list.
        3. Performance degradation VMs tolerate : Set the percentage to 100.

          For more information about settings of percentage of cluster resources reserved as failover spare capacity, see vSphere HA Admission Control Settings for Nutanix Environment.

    4. Specify the following information under the Heartbeat Datastores tab.
      Note: vSphere HA uses datastore heartbeating to distinguish between hosts that have failed and hosts that reside on a network partition. With datastore heartbeating, vSphere HA can monitor hosts when a management network partition occurs while continuing to respond to failures.
      Figure. vSphere Availability Settings: Heartbeat Datastores

        1. Select Use datastores only from the specified list .
        2. Select the named storage container mounted as the NFS datastore (Nutanix datastore).

          If you have more than one named storage container, select all that are applicable.

        3. If the cluster has only one datastore, click the Advanced Options tab and add das.ignoreInsufficientHbDatastore with the Value of true .
    5. Click OK .

vSphere HA Admission Control Settings for Nutanix Environment

Overview

If you are using redundancy factor 2 with cluster sizes of up to 16 nodes, you must configure HA admission control settings with the appropriate percentage of CPU/RAM to achieve at least N+1 availability. For cluster sizes larger than 16 nodes, you must configure HA admission control with the appropriate percentage of CPU/RAM to achieve at least N+2 availability.

N+2 Availability Configuration

The N+2 availability configuration can be achieved in the following two ways.

  • Redundancy factor 2 and N+2 vSphere HA admission control setting configured.

    Because the Nutanix distributed file system recovers in the event of a node failure, a second node failure does not make data unavailable if the Nutanix cluster has fully recovered before the subsequent failure. In this case, an N+2 vSphere HA admission control setting is required to ensure that sufficient compute resources are available to restart all the VMs.

  • Redundancy factor 3 and N+2 vSphere HA admission control setting configured.
    If you want two concurrent node failures to be tolerated and the cluster has insufficient blocks to use block awareness, redundancy factor 3 in a cluster of five or more nodes is required, configured at the storage container layer. In either of these two options, the Nutanix storage pool must have sufficient free capacity to restore the configured redundancy factor (2 or 3); the percentage of free space required is the same as the required HA admission control percentage setting. An N+2 vSphere HA admission control setting is also required to ensure that sufficient compute resources are available to restart all the VMs.
    Note: Redundancy factor 3 requires a minimum of five nodes and allows two concurrent node failures while ensuring that data remains online. In this case, the same N+2 level of availability is required for the vSphere cluster so that the VMs can restart following a failure.

For redundancy factor 2 deployments, the recommended minimum HA admission control setting percentage is marked with single asterisk (*) symbol in the following table. For redundancy factor 2 or redundancy factor 3 deployments configured for multiple non-concurrent node failures to be tolerated, the minimum required HA admission control setting percentage is marked with two asterisks (**) in the following table.

Table 1. Minimum Reservation Percentage for vSphere HA Admission Control Setting
Nodes Availability Level
N+1 N+2 N+3 N+4
1 N/A N/A N/A N/A
2 N/A N/A N/A N/A
3 33* N/A N/A N/A
4 25* 50 75 N/A
5 20* 40** 60 80
6 18* 33** 50 66
7 15* 29** 43 56
8 13* 25** 38 50
9 11* 23** 33 46
10 10* 20** 30 40
11 9* 18** 27 36
12 8* 17** 25 34
13 8* 15** 23 30
14 7* 14** 21 28
15 7* 13** 20 26
16 6* 13** 19 25
17 6 12* 18** 24
18 6 11* 17** 22
19 5 11* 16** 22
20 5 10* 15** 20
21 5 10* 14** 20
22 4 9* 14** 18
23 4 9* 13** 18
24 4 8* 13** 16
25 4 8* 12** 16
26 4 8* 12** 16
27 4 7* 11** 14
28 4 7* 11** 14
29 3 7* 10** 14
30 3 7* 10** 14
31 3 6* 10** 12
32 3 6* 9** 12

The table also represents the percentage of the Nutanix storage pool, which should remain free to ensure that the cluster can fully restore the redundancy factor in the event of one or more nodes, or even a block failure (where three or more blocks exist within a cluster).
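The percentages in Table 1 follow from reserving roughly k/N of cluster resources for N+k availability. The sketch below reproduces that arithmetic with ceiling division; a few published entries (for example, 6 nodes at N+1) are rounded up slightly more conservatively than this formula, so treat the table as authoritative.

```python
import math

def ha_reserve_pct(nodes: int, failures_tolerated: int) -> int:
    """Minimum percentage of cluster CPU/RAM to reserve so the surviving
    nodes can restart the VMs of `failures_tolerated` failed nodes (N+k)."""
    if not 0 < failures_tolerated < nodes:
        raise ValueError("failures tolerated must be between 1 and nodes - 1")
    # Reserve the share of capacity the failed nodes would have provided.
    return math.ceil(100 * failures_tolerated / nodes)
```

For example, ha_reserve_pct(8, 2) gives 25, matching the N+2 column for an 8-node cluster in Table 1.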

Block Awareness

For deployments of at least three blocks, block awareness automatically ensures data availability even when an entire block of up to four nodes configured with redundancy factor 2 becomes unavailable.

If block awareness levels of availability are required, the vSphere HA admission control setting must ensure sufficient compute resources are available to restart all virtual machines. In addition, the Nutanix storage pool must have sufficient space to restore redundancy factor 2 to all data.

The vSphere HA minimum availability level must be equal to the number of nodes per block.

Note: For block awareness, each block must be populated with a uniform number of nodes. In the event of a failure, a non-uniform node count might compromise block awareness or the ability to restore the redundancy factor, or both.
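Because HA must tolerate the loss of an entire block, the reservation follows the same arithmetic as Table 1 with k equal to the nodes per block. A minimal sketch, assuming uniformly populated blocks as the note above requires:

```python
import math

def block_aware_reserve_pct(total_nodes: int, nodes_per_block: int) -> int:
    """Reserve enough capacity to restart every VM from one failed block.

    Assumes each block holds the same number of nodes (see the note above)."""
    if total_nodes % nodes_per_block != 0:
        raise ValueError("block awareness expects a uniform node count per block")
    # Availability level N + (nodes per block): reserve that share of capacity.
    return math.ceil(100 * nodes_per_block / total_nodes)
```

For example, four blocks of four nodes (16 nodes total) call for a 25% admission control reservation, the same as the N+4 column for 16 nodes in Table 1.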

Rack Awareness

Rack fault tolerance is the ability to provide a rack-level availability domain. With rack fault tolerance, data is replicated to nodes that are not in the same rack. Rack failure can occur in the following situations.

  • All power supplies in a rack fail.
  • Top-of-rack (TOR) switch fails.
  • Network partition occurs: one of the racks becomes inaccessible from the other racks.

With rack fault tolerance enabled, the cluster has rack awareness and guest VMs can continue to run even during the failure of one rack (with replication factor 2) or two racks (with replication factor 3). The redundant copies of guest VM data and metadata persist on other racks when one rack fails.

Table 2. Minimum Requirements for Rack Awareness
Replication factor Minimum number of nodes Minimum number of Blocks Minimum number of racks Data resiliency
2 3 3 3 Failure of 1 node, block, or rack
3 5 5 5 Failure of 2 nodes, blocks, or racks
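Both rows of Table 2 fit the pattern minimum = 2 × RF − 1. The helper below encodes that pattern; note that the formula is an inference from the two table rows, not a documented rule.

```python
def rack_awareness_minimums(replication_factor: int) -> dict:
    """Minimum nodes/blocks/racks for rack awareness, per the 2*RF - 1
    pattern observed in Table 2 (RF 2 -> 3, RF 3 -> 5)."""
    if replication_factor not in (2, 3):
        raise ValueError("Table 2 covers replication factors 2 and 3 only")
    minimum = 2 * replication_factor - 1
    return {
        "nodes": minimum,
        "blocks": minimum,
        "racks": minimum,
        "tolerated_failures": replication_factor - 1,  # nodes, blocks, or racks
    }
```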

vSphere DRS Settings

About this task

Configure the following DRS settings from vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Services > vSphere DRS .
  4. Click Edit next to the text showing vSphere DRS status.
    Figure. vSphere DRS Settings: Automation

    1. Turn on the vSphere DRS switch.
    2. Specify the following information under the Automation tab.
        1. Automation Level : Select Fully Automated from the drop-down list.
        2. Migration Threshold : Set the bar between conservative and aggressive (value=3).

          This migration threshold provides optimal resource utilization while minimizing DRS migrations that offer little benefit. Data locality is managed automatically: whenever VMs move, one replica of each write is always placed on the local node to maximize subsequent read performance.

          Nutanix recommends the migration threshold at 3 in a fully automated configuration.

        3. Predictive DRS : Leave the option disabled.

          The value of predictive DRS depends on whether you use other VMware products such as vRealize operations. Unless you use vRealize operations, Nutanix recommends disabling predictive DRS.

        4. Virtual Machine Automation : Enable VM automation.
    3. Specifying anything under the Additional Options tab is optional.
    4. Specify the following information under the Power Management tab.
      Figure. vSphere DRS Settings: Power Management

        1. DPM : Leave the option disabled.

          Enabling DPM causes nodes in the Nutanix cluster to go offline, affecting cluster resources.

    5. Click OK .

vSphere EVC Settings

vSphere Enhanced vMotion Compatibility (EVC) ensures that workloads can be live-migrated, using vMotion, between ESXi hosts in a Nutanix cluster that are running different CPU generations. The general recommendation is to enable EVC because it helps in the future when you scale your Nutanix clusters with new hosts that might contain new CPU models.

About this task

Enabling EVC in a brownfield scenario can be challenging. Configure the following EVC settings from vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Shut down all the VMs on the hosts with feature sets greater than the EVC mode.
    Ensure that the Nutanix cluster contains hosts with CPUs from only one vendor, either Intel or AMD.
  4. Click Configure , and go to Configuration > VMware EVC .
  5. Click Edit next to the text showing VMware EVC.
  6. Enable EVC for the CPU vendor and feature set appropriate for the hosts in the Nutanix cluster, and click OK .
    If the Nutanix cluster contains nodes with different processor classes, enable EVC with the lower feature set as the baseline.
    Tip: To know the processor class of a node, perform the following steps.
      1. Log on to Prism Element running on the Nutanix cluster.
      2. Click Hardware from the menu and go to Diagram or Table view.
      3. Click the node and look for the Block Serial field in Host Details .
    Figure. VMware EVC

  7. Start the VMs in the Nutanix cluster to apply the EVC.
    If you try to enable EVC on a Nutanix cluster with mismatched host feature sets (mixed processor clusters), the lowest common feature set (lowest processor class) is selected. Hence, if VMs are already running on a new host and you want to enable EVC on that host, you must first shut down the VMs and then enable EVC.
    Note: Do not shut down more than one CVM at the same time.
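The steps above select the lowest common feature set as the cluster baseline. As an illustration, that selection can be modeled as taking the oldest EVC mode present across hosts; the Intel mode names below are common EVC mode keys, but the list and its ordering are a simplified assumption, not an exhaustive reference.

```python
# Simplified oldest-to-newest ordering of some Intel EVC modes (assumption).
EVC_ORDER = [
    "intel-merom", "intel-penryn", "intel-nehalem", "intel-westmere",
    "intel-sandybridge", "intel-ivybridge", "intel-haswell",
    "intel-broadwell", "intel-skylake",
]

def evc_baseline(host_modes: list) -> str:
    """Return the lowest (oldest) feature set among the hosts' highest
    supported EVC modes; this becomes the cluster baseline."""
    return min(host_modes, key=EVC_ORDER.index)
```

For example, a cluster mixing Skylake and Haswell hosts would take the Haswell feature set as its baseline.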

VM Override Settings

You must exclude Nutanix CVMs from vSphere availability and resource scheduling. To do so, configure the following VM override settings.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Configuration > VM Overrides .
  4. Select all the CVMs and click Next .
    If you do not have the CVMs listed, click Add to ensure that the CVMs are added to the VM Overrides dialog box.
    Figure. VM Override

  5. In the VM override section, configure override for the following parameters.
    • DRS Automation Level: Disabled
    • VM HA Restart Priority: Disabled
    • VM Monitoring: Disabled
  6. Click Finish .

Migrating a Nutanix Cluster from One vCenter Server to Another

About this task

Perform the following steps to migrate a Nutanix cluster from one vCenter Server to another vCenter Server.
Note: The following steps are for migrating a Nutanix cluster with vSphere Standard Switch (vSwitch). To migrate a Nutanix cluster with vSphere Distributed Switch (vDS), see the VMware Documentation.

Procedure

  1. Create a vSphere cluster in the vCenter Server where you want to migrate the Nutanix cluster. See Creating a Nutanix Cluster in the vCenter Server.
  2. Configure HA, DRS, and EVC on the created vSphere cluster. See Nutanix Cluster Settings.
  3. Unregister the Nutanix cluster from the source vCenter Server. See Unregistering a Cluster from the vCenter Server.
  4. Move the nodes from the source vCenter Server to the new vCenter Server.
    See the VMware Documentation for the procedure.
  5. Register the Nutanix cluster to the new vCenter Server. See Registering a Cluster to the vCenter Server.

Storage I/O Control (SIOC)

SIOC controls the I/O usage of a virtual machine and gradually enforces the predefined I/O share levels. Nutanix converged storage architecture does not require SIOC. Therefore, while mounting a storage container on an ESXi host, the system disables SIOC in the statistics mode automatically.

Caution: While mounting a storage container on ESXi hosts running older versions (6.5 or below), the system enables SIOC in the statistics mode by default. Nutanix recommends disabling SIOC because an enabled SIOC can cause the following issues.
  • The storage can become unavailable because the hosts repeatedly create and delete the access .lck-XXXXXXXX files under the .iorm.sf subdirectory, located in the root directory of the storage container.
  • Site Recovery Manager (SRM) failover and failback does not run efficiently.
  • If you are using Metro Availability disaster recovery feature, activate and restore operations do not work.
    Note: For using Metro Availability disaster recovery feature, Nutanix recommends using an empty storage container. Disable SIOC and delete all the files from the storage container that are related to SIOC. For more information, see KB-3501.
Run the NCC health check (see KB-3358) to verify if SIOC and SIOC in statistics mode are disabled on storage containers. If SIOC and SIOC in statistics mode are enabled on storage containers, disable them by performing the procedure described in Disabling Storage I/O Control (SIOC) on a Container.
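The lock-file churn described above can be spot-checked from the ESXi shell. The sketch below counts `.lck-` files under a container's `.iorm.sf` subdirectory; the helper name and the datastore path are illustrative (not Nutanix tooling), and a count that fluctuates across runs suggests SIOC is still active (see KB-3501).

```shell
#!/bin/sh
# Sketch: count SIOC lock files (.lck-XXXXXXXX) in a storage container's
# .iorm.sf subdirectory. A count that keeps changing indicates SIOC churn.
# The function name and the datastore path are illustrative, not Nutanix tooling.
count_sioc_locks() {
  dir="$1/.iorm.sf"
  [ -d "$dir" ] || { echo 0; return 0; }
  # -a is required because the lock files are hidden (dot-prefixed)
  ls -a "$dir" 2>/dev/null | grep -c '^\.lck-'
}
# Example (on the ESXi host):
#   count_sioc_locks /vmfs/volumes/my_container
```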

Disabling Storage I/O Control (SIOC) on a Container

About this task

Perform the following procedure to disable storage I/O statistics collection.

Procedure

  1. Log on to vCenter with the web client.
  2. Click the Storage view in the left pane.
  3. Right-click the storage container under the Nutanix cluster and select Configure Storage I/O Controller .
    The properties for the storage container are displayed. The Disable Storage I/O statistics collection option is unchecked by default, which means that SIOC is enabled. The dialog box offers two options: Disable Storage I/O Control and statistics collection, and Disable Storage I/O Control but enable statistics collection.
    1. Select the Disable Storage I/O Control and statistics collection option to disable SIOC.
    2. Uncheck the Include I/O Statistics for SDRS option.
    3. Click OK .

Node Management

This chapter describes the management tasks you can do on a Nutanix node.

Nonconfigurable ESXi Components

The Nutanix manufacturing and installation processes, performed by running Foundation on the Nutanix nodes, configure the following components. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts ( admin or nutanix ), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the use of third-party storage on hosts that are part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

ESXi

Modifying any of the following ESXi settings can inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • NFS datastore settings
  • VM swapfile location
  • VM startup/shutdown order
  • CVM name
  • CVM virtual hardware configuration file (.vmx file)
  • iSCSI software adapter settings
  • Hardware settings, including passthrough HBA settings.

  • vSwitchNutanix standard virtual switch
  • vmk0 interface in Management Network port group
  • SSH
    Note: An SSH connection is necessary in various scenarios, for example, to establish connectivity with the ESXi server through a control plane that does not depend on additional management systems or processes. An SSH connection is also required to modify networking and control paths during a host failure to maintain high availability. For example, CVM autopathing (ha.py) requires an SSH connection: if a local CVM becomes unavailable, another CVM in the cluster performs the I/O operations over the 10GbE interface.
  • Open host firewall ports
  • CPU resource settings such as CPU reservation, limit, and shares of the CVM.
    Caution: Do not use the Reset System Configuration option.
  • ProductLocker symlink setting to point at the default datastore.

    Do not change the /productLocker symlink to point at a non-local datastore.

    Do not change the ProductLockerLocation advanced setting.

Putting the CVM and ESXi Host in Maintenance Mode

About this task

Nutanix recommends placing the CVM and ESXi host into maintenance mode while the Nutanix cluster undergoes maintenance or patch installations.
Caution: Verify the data resiliency status of your Nutanix cluster. Ensure that the replication factor (RF) supports putting the node in maintenance mode.

Procedure

  1. Log on to vCenter with the web client.
  2. If vSphere DRS is enabled on the Nutanix cluster, skip this step. If vSphere DRS is disabled, perform one of the following.
    • Manually migrate all the VMs except the CVM to another host in the Nutanix cluster.
    • Shut down VMs other than the CVM that you do not want to migrate to another host.
  3. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
  4. In the Enter Maintenance Mode dialog box, check Move powered-off and suspended virtual machines to other hosts in the cluster and click OK .
    The host gets ready to go into maintenance mode, which prevents VMs from running on this host. DRS automatically attempts to migrate all the VMs to another host in the Nutanix cluster.
Note:

In certain rare conditions, even when DRS is enabled, some VMs do not automatically migrate due to user-defined affinity rules or VM configuration settings. The VMs that do not migrate appear under cluster DRS > Faults when a maintenance mode task is in progress. To address the faults, either manually shut down those VMs or ensure the VMs can be migrated.

  1. Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
    Note: Do not reset or shut down the CVM in any way other than with the cvm_shutdown command, which notifies the cluster that the CVM is unavailable.
  2. After the CVM shuts down, wait for the host to go into maintenance mode.
    The host enters maintenance mode after its CVM is shut down.
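After the CVM shuts down, the transition can also be watched from the ESXi shell instead of the web client. The following sketch polls `vim-cmd hostsvc/hostsummary`, whose output includes an `inMaintenanceMode` field; the polling wrapper and the exact field text are assumptions to verify on your ESXi build.

```shell
#!/bin/sh
# Sketch: poll the ESXi host until it reports maintenance mode. Assumes
# `vim-cmd hostsvc/hostsummary` output contains `inMaintenanceMode = true`
# once the transition completes (verify the field on your ESXi build).
# SUMMARY_CMD is overridable so the logic can be exercised without a host.
SUMMARY_CMD="${SUMMARY_CMD:-vim-cmd hostsvc/hostsummary}"

wait_for_maintenance() {
  tries="${1:-30}"
  while [ "$tries" -gt 0 ]; do
    # unquoted expansion is intentional: SUMMARY_CMD is a command plus args
    if $SUMMARY_CMD 2>/dev/null | grep -q 'inMaintenanceMode = true'; then
      echo "host is in maintenance mode"
      return 0
    fi
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] && sleep 10
  done
  echo "timed out waiting for maintenance mode" >&2
  return 1
}
# On the ESXi host: wait_for_maintenance 30
```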

Shutting Down an ESXi Node in a Nutanix Cluster

Before you begin

Verify the data resiliency status of your Nutanix cluster. If the Nutanix cluster has replication factor 2 (RF2), you can shut down only one node at a time. If you need to shut down more than one node in an RF2 cluster, shut down the entire cluster instead.

About this task

You can put the ESXi host into maintenance mode and shut it down either from the web client or from the command line. For more information about shutting down a node from the command line, see Shutting Down an ESXi Node in a Nutanix Cluster (vSphere Command Line).

Procedure

  1. Log on to vCenter with the web client.
  2. Put the Nutanix node in the maintenance mode. For more information, see Putting the CVM and ESXi Host in Maintenance Mode.
    Note: If DRS is not enabled, manually migrate or shut down all the VMs except the CVM. Even when DRS is enabled, some VMs might not migrate automatically because the VM has a configuration option that is not supported on the target host.
  3. Right-click the host and select Shut Down .
    Wait until vCenter displays that the host is not responding, which may take several minutes. If you are logged on to the ESXi host rather than to vCenter, the web client disconnects when the host shuts down.

Shutting Down an ESXi Node in a Nutanix Cluster (vSphere Command Line)

Before you begin

Verify the data resiliency status of your Nutanix cluster. If the Nutanix cluster has replication factor 2 (RF2), you can shut down only one node at a time. If you need to shut down more than one node in an RF2 cluster, shut down the entire cluster instead.

About this task

Procedure

  1. Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
  2. Log on to another CVM in the Nutanix cluster with SSH.
  3. Shut down the host.
    nutanix@cvm$ ~/serviceability/bin/esx-enter-maintenance-mode -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    If successful, this command returns no output. If it fails with a message like the following, VMs are probably still running on the host.

    CRITICAL esx-enter-maintenance-mode:42 Command vim-cmd hostsvc/maintenance_mode_enter failed with ret=-1

    Ensure that all VMs are shut down or moved to another host and try again before proceeding.

    nutanix@cvm$ ~/serviceability/bin/esx-shutdown -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    Alternatively, you can put the ESXi host into maintenance mode and shut it down using the vSphere web client. For more information, see Shutting Down an ESXi Node in a Nutanix Cluster.

    If the host shuts down, a message like the following is displayed.

    INFO esx-shutdown:67 Please verify if ESX was successfully shut down using ping hypervisor_ip_addr

    hypervisor_ip_addr is the IP address of the ESXi host.

  4. Confirm that the ESXi host has shut down.
    nutanix@cvm$ ping hypervisor_ip_addr

    Replace hypervisor_ip_addr with the IP address of the ESXi host.

    If no ping packets are answered, the ESXi host has shut down.
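The ping confirmation in the last step can be wrapped in a small helper. This is an illustrative sketch, not Nutanix tooling; it relies on ping exiting nonzero when no replies arrive, and PING_CMD is overridable for testing.

```shell
#!/bin/sh
# Sketch: confirm the ESXi host stopped answering pings after esx-shutdown.
# ping exits nonzero when no replies arrive. PING_CMD is overridable so the
# logic can be exercised without a live host; the helper name is illustrative.
PING_CMD="${PING_CMD:-ping -c 3}"

confirm_host_down() {
  # unquoted expansion is intentional: PING_CMD is a command plus args
  if $PING_CMD "$1" >/dev/null 2>&1; then
    echo "host $1 is still responding"
    return 1
  fi
  echo "host $1 appears to be down"
  return 0
}
# nutanix@cvm$ confirm_host_down hypervisor_ip_addr
```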

Starting an ESXi Node in a Nutanix Cluster

About this task

You can start an ESXi host either from the web client or from the command line. For more information about starting a node from the command line, see Starting an ESXi Node in a Nutanix Cluster (vSphere Command Line).

Procedure

  1. If the node is off, turn it on by pressing the power button on the front. Otherwise, proceed to the next step.
  2. Log on to vCenter (or to the node if vCenter is not running) with the web client.
  3. Right-click the ESXi host and select Exit Maintenance Mode .
  4. Right-click the CVM and select Power > Power on .
    Wait approximately 5 minutes for all services to start on the CVM.
  5. Log on to another CVM in the Nutanix cluster with SSH.
  6. Confirm that the Nutanix cluster services are running on the CVM.
    nutanix@cvm$ ncli cluster status | grep -A 15 cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    Output similar to the following is displayed.
        Name                      : 10.1.56.197
        Status                    : Up
        ... ... 
        StatsAggregator           : up
        SysStatCollector          : up

    Every service listed should be up .

  7. Right-click the ESXi host in the web client and select Rescan for Datastores . Confirm that all Nutanix datastores are available.
  8. Verify that the status of all services on all the CVMs are Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM: <host IP-Address> Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
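To avoid eyeballing the long listing above, you can scan a saved copy of the cluster status output for services that are not up. The helper below is a sketch; it assumes a down service prints DOWN in the second column, matching the ServiceName STATE [pids] layout shown, and the helper name is illustrative.

```shell
#!/bin/sh
# Sketch: scan saved `cluster status` output for services that do not report
# UP. Assumes the "ServiceName  STATE  [pids]" layout shown above; bare PID
# continuation lines and the "CVM: <ip> Up" header never have DOWN in field 2.
check_services() {
  # prints the name of any DOWN service; exit status 0 means nothing is DOWN
  awk '$2 == "DOWN" { print $1; bad = 1 } END { exit bad }' "$1"
}
# Usage: cluster status > /tmp/status.txt && check_services /tmp/status.txt
```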

Starting an ESXi Node in a Nutanix Cluster (vSphere Command Line)

About this task

You can start an ESXi host either from the command line or from the web client. For more information about starting a node from the web client, see Starting an ESXi Node in a Nutanix Cluster .

Procedure

  1. Log on to a running CVM in the Nutanix cluster with SSH.
  2. Start the CVM.
    nutanix@cvm$ ~/serviceability/bin/esx-exit-maintenance-mode -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    If successful, this command produces no output. If it fails, wait 5 minutes and try again.

    nutanix@cvm$ ~/serviceability/bin/esx-start-cvm -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.


    If the CVM starts, a message like the following is displayed.

    INFO esx-start-cvm:67 CVM started successfully. Please verify using ping cvm_ip_addr

    cvm_ip_addr is the IP address of the CVM on the ESXi host.

    After starting, the CVM restarts once. Wait three to four minutes before you ping the CVM.

    Alternatively, you can take the ESXi host out of maintenance mode and start the CVM using the web client. For more information, see Starting an ESXi Node in a Nutanix Cluster.

  3. Verify that the status of all services on all the CVMs are Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM: <host IP-Address> Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
  4. Verify the storage.
    1. Log on to the ESXi host with SSH.
    2. Rescan for datastores.
      root@esx# esxcli storage core adapter rescan --all
    3. Confirm that cluster VMFS datastores, if any, are available.
      root@esx# esxcfg-scsidevs -m | awk '{print $5}'
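After the rescan, you can confirm programmatically that each expected Nutanix datastore appears in the mounted-filesystem list. The wrapper below is an illustrative sketch; the datastore names in the usage line are placeholders for your storage container names.

```shell
#!/bin/sh
# Sketch: verify each expected datastore name appears in the list produced by
# `esxcfg-scsidevs -m | awk '{print $5}'`. The helper name is illustrative and
# the datastore names in the usage comment are placeholders.
check_datastores() {
  list="$1"; shift
  for ds in "$@"; do
    # -x requires an exact whole-line match against the mounted list
    echo "$list" | grep -qx "$ds" || { echo "missing datastore: $ds"; return 1; }
  done
  echo "all expected datastores present"
}
# root@esx# check_datastores "$(esxcfg-scsidevs -m | awk '{print $5}')" ctr1 ctr2
```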

Restarting an ESXi Node using CLI

Before you begin

Shut down the guest VMs, including vCenter, that are running on the node, or move them to other nodes in the Nutanix cluster.

About this task

Procedure

  1. Log on to vCenter (or to the ESXi host if the node is running the vCenter VM) with the web client.
  2. Right-click the host and select Maintenance mode > Enter Maintenance Mode .
    In the Confirm Maintenance Mode dialog box, click OK .
    The host is placed in maintenance mode, which prevents VMs from running on the host.
    Note: The host does not enter maintenance mode until the CVM is shut down.
  3. Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
    Note: Do not reset or shut down the CVM in any way other than with the cvm_shutdown command, which notifies the cluster that the CVM is unavailable.
  4. Right-click the node and select Power > Reboot .
    Wait until vCenter shows that the host is not responding and then is responding again, which takes several minutes.

    If you are logged on to the ESXi host rather than to vCenter, the web client disconnects when the host shuts down.

  5. Right-click the ESXi host and select Exit Maintenance Mode .
  6. Right-click the CVM and select Power > Power on .
    Wait approximately 5 minutes for all services to start on the CVM.
  7. Log on to the CVM with SSH.
  8. Confirm that the Nutanix cluster services are running on the CVM.
    nutanix@cvm$ ncli cluster status | grep -A 15 cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    Output similar to the following is displayed.
        Name                      : 10.1.56.197
        Status                    : Up
        ... ... 
        StatsAggregator           : up
        SysStatCollector          : up

    Every service listed should be up .

  9. Right-click the ESXi host in the web client and select Rescan for Datastores . Confirm that all Nutanix datastores are available.

Rebooting an ESXi Node in a Nutanix Cluster

About this task

The Request Reboot operation in the Prism web console gracefully restarts the selected nodes one after the other.

Perform the following procedure to restart the nodes in the cluster.

Procedure

  1. Click the gear icon in the main menu and then select Reboot in the Settings page.
  2. In the Request Reboot window, select the nodes you want to restart, and click Reboot .
    Figure. Request Reboot

    A progress bar is displayed that indicates the progress of the restart of each node.

Changing an ESXi Node Name

After running bare-metal Foundation, you can change the host (node) name from the command line or by using the vSphere web client.

To change the hostname, see the VMware Documentation.

Changing an ESXi Node Password

Although it is not required for the root user to have the same password on all hosts (nodes), doing so makes cluster management and support much easier. If you do select a different password for one or more hosts, make sure to note the password for each host.

To change the host password, see VMware Documentation .

Changing CVM Memory Configuration (ESXi)

About this task

You can increase the memory reserved for each CVM in your Nutanix cluster by using the 1-click CVM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. See Increasing the Controller VM Memory Size in the Prism Web Console Guide for CVM memory sizing recommendations and instructions about how to increase the CVM memory.

VM Management

For the list of supported VMs, see Compatibility and Interoperability Matrix.

VM Management Using Prism Central

You can create and manage a VM on your ESXi from Prism Central. For more information, see Creating a VM through Prism Central (ESXi) and Managing a VM (ESXi).

Creating a VM through Prism Central (ESXi)

In ESXi clusters, you can create a new virtual machine (VM) through Prism Central.

Before you begin

  • See the requirements and limitations section in vCenter Server Integration in the Prism Central Guide before proceeding.
  • Register the vCenter Server with your cluster. For more information, see Registering vCenter Server (Prism Central) in the Prism Central Guide .

About this task

To create a VM, do the following:

Procedure

  1. Go to the List tab of the VMs dashboard (see VM Summary View in the Prism Central Guide ) and click the Create VM button.
    The Create VM wizard appears.
  2. In the Cluster Selection window, select the target cluster from the pull-down list.

    A list of registered clusters appears in the window; you can select only a cluster running ESXi. Clicking a cluster name displays the Create VM dialog box for that cluster.

    Figure. Cluster Selection Window

  3. In the Create VM dialog box, do the following in the indicated fields:
    1. Name : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Guest OS : Type and select the guest operating system.

      The guest operating system that you select affects the supported devices and number of virtual CPUs available for the virtual machine. The Create VM wizard does not install the guest operating system. See the list of supported operating systems in vCenter Server Integration topic.

    4. vCPU(s) : Enter the number of virtual CPUs to allocate to this VM.
    5. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    6. Memory : Enter the amount of memory (in GiBs) to allocate to this VM.
  4. To attach a disk to the VM, click the Add New Disk button.
    The Add Disks dialog box appears. Do the following in the indicated fields:
    Figure. Add Disk Dialog Box

    1. Type : Select the type of storage device, DISK or CD-ROM , from the pull-down list.

      The following fields and options vary depending on whether you choose DISK or CD-ROM . You can use the CD-ROM type only to create a blank CD-ROM device for mounting NGT or VMware guest tools.

    2. Operation : Specify the device contents from the pull-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
    3. Bus Type : Select the bus type from the pull-down list. The choices are IDE or SCSI .
    4. ADSF Path : Enter the path to the desired system image.

      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /container_name/vmdk_name.vmdk. For example, to clone an image from myvm-flat.vmdk in a storage container named crt1, enter /crt1/myvm-flat.vmdk. When you type the storage container name (/container_name/), a list of the VMDK files in that storage container appears (assuming one or more VMDK files were previously copied to that storage container).

      Note: Make sure you are copying from a flat file.
    5. Storage Container : Select the storage container to use from the pull-down list.

      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.

    6. Size : Enter the disk size in GiBs.
    7. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    8. Repeat this step to attach more devices to the VM.
  5. To create a network interface for the VM, click the Add New NIC button.

    The Create NIC dialog box appears. Do the following in the indicated fields:

    1. VLAN Name : Select the target virtual LAN from the pull-down list.

      The list includes all defined networks (see Configuring Network Connections in the Prism Central Guide ).

    2. Network Adapter Type : Select the network adapter type from the pull-down list.

      For information about the list of supported adapter types, see vCenter Server Integration in the Prism Central Guide .

    3. Network UUID : This is a read-only field that displays the network UUID.
    4. Network Address/Prefix : This is a read-only field that displays the network IP address and prefix.
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create more network interfaces for the VM.
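The ADSF path format from the Add Disks step (/container_name/vmdk_name.vmdk, pointing at a flat file) can be sanity-checked before you enter it in the dialog box. The validator below is purely illustrative and is not part of any Nutanix or VMware tooling.

```shell
#!/bin/sh
# Sketch: validate an ADSF path of the form /container_name/vmdk_name.vmdk.
# Per the note in the Add Disks step, the image should be a flat file
# (e.g. myvm-flat.vmdk); this illustrative check warns on non-flat names.
is_valid_adsf_path() {
  case "$1" in
    /*/*/*)          echo "invalid: expected exactly /container/file.vmdk"; return 1 ;;
    /*/*-flat.vmdk)  echo "ok: $1"; return 0 ;;
    /*/*.vmdk)       echo "warning: $1 does not look like a flat file"; return 0 ;;
    *)               echo "invalid: expected /container/file.vmdk"; return 1 ;;
  esac
}
# Example: is_valid_adsf_path /crt1/myvm-flat.vmdk
```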

Managing a VM (ESXi)

You can manage virtual machines (VMs) in an ESXi cluster through Prism Central.

Before you begin

  • See the requirements and limitations section in vCenter Server Integration in the Prism Central Guide before proceeding.
  • Ensure that you have registered the vCenter Server with your cluster. For more information, see Registering vCenter Server (Prism Central) in the Prism Central Guide .

About this task

After creating a VM (see Creating a VM through Prism Central (ESXi)), you can use Prism Central to update the VM configuration, delete the VM, clone the VM, launch a console window, start (or shut down) the VM, pause (or resume) the VM, assign the VM to a protection policy, take a snapshot, add the VM to a recovery plan, run a playbook, manage categories, install and manage Nutanix Guest Tools (NGT), manage VM ownership, or configure QoS settings.

You can perform these tasks by using any of the following methods:

  • Select the target VM in the List tab of the VMs dashboard (see VMs Summary View in the Prism Central Guide ) and choose the required action from the Actions menu.
  • Right-click on the target VM in the List tab of the VMs dashboard and select the required action from the drop-down list.
  • Go to the details page of a selected VM (see VM Details View in the Prism Central Guide ) and select the desired action.
Note: Available actions appear in bold; unavailable actions are grayed out. Which actions are available depends on the current state of the VM and your permissions.

Procedure

  • To modify the VM configuration, select Update .

    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. You cannot change the name, number of assigned vCPUs, or memory size of the VM, but you can add or delete disks and NICs.

    Figure. Update VM Window Click to enlarge VM update window display

  • To delete the VM, select Delete . A window prompt appears; click the OK button to delete the VM.
  • To clone the VM, select Clone .

    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box but with all fields (except the name) filled in with the current VM settings. Enter a name for the clone and then click the Save button to create the clone. You can create a modified clone by changing some of the settings before clicking the Save button.

    Figure. Clone VM Window Click to enlarge clone VM window display

  • To launch a console window, select Launch Console .

    This opens a virtual network computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The VM power options that you access from the Power On Actions (or Power Off Actions ) action link below the VM table can also be accessed from the VNC console window. To access the VM power options, click the Power button at the top-right corner of the console window.

    Note: A VNC client may not function properly on all browsers. Some keys are not recognized when the browser is Google Chrome. (Firefox typically works best.)
    Figure. Console Window (VNC) Click to enlarge VNC console window display

  • To start (or shut down) the VM, select Power on (or Power off ).
  • To pause (or resume) the VM, select Pause/Suspend (or Resume ). The pause option is available only when the VM is powered on; the resume option is available only when the VM is paused.
  • To assign the VM to a protection policy, select Protect . This opens a page to specify the protection policy to which this VM should be assigned (see Policies Management). To remove the VM from a protection policy, select Unprotect .
  • To take a snapshot of the VM, select Take Snapshot .

    This displays the Take Snapshot dialog box. Enter a name for the snapshot and then click the Submit button to create the snapshot.

    Warning: The following are the restrictions for naming VM snapshots.
    • The maximum length is 80 characters.
    • Allowed characters are uppercase and lowercase standard Latin letters (A-Z and a-z), decimal digits (0-9), dots (.), hyphens (-), and underscores (_).

    Note: These snapshots (stored locally) cannot be replicated to other sites.
    Figure. Take Snapshot Window Click to enlarge take snapshot window display
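
    The naming restrictions above can be checked before you submit the snapshot. The following is a minimal sketch (the function name valid_snapshot_name is hypothetical; the pattern simply encodes the 80-character limit and allowed character set listed in the warning):

    ```shell
    # Validate a proposed VM snapshot name against the documented rules:
    # maximum 80 characters; only A-Z, a-z, 0-9, dot, hyphen, underscore.
    valid_snapshot_name() {
      name="$1"
      # Reject empty names and names longer than 80 characters.
      [ -n "$name" ] && [ "${#name}" -le 80 ] || return 1
      # Reject any character outside the allowed set.
      case "$name" in
        *[!A-Za-z0-9._-]*) return 1 ;;
      esac
      return 0
    }

    valid_snapshot_name "web01_pre-upgrade.2022-12-06" && echo valid
    valid_snapshot_name "bad name!" || echo invalid
    ```

    A name that fails this check is rejected by the Take Snapshot dialog box for the same reasons.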

  • To add this VM to a recovery plan you created previously, select Add to Recovery Plan . For more information, see Adding Guest VMs Individually to a Recovery Plan in the Leap Administration Guide .
  • To create a VM recovery point, select Create Recovery Point .
  • To run a playbook you created previously, select Run Playbook . For more information, see Running a Playbook (Manual Trigger) in the Prism Central Guide .
  • To assign the VM a category value, select Manage Categories .

    This displays the Manage VM Categories page. For more information, see Assigning a Category in the Prism Central Guide .

  • To install Nutanix Guest Tools (NGT), select Install NGT . For more information, see Installing NGT on Multiple VMs in the Prism Central Guide .
  • To enable (or disable) NGT, select Manage NGT Applications . For more information, see Managing NGT Applications in the Prism Central Guide .
    The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    Note: If you clone a VM, NGT is not enabled on the cloned VM by default. You must enable and mount NGT again on the cloned VM. If you want to enable NGT on multiple VMs simultaneously, see Enabling NGT and Mounting the NGT Installer on Cloned VMs in the Prism Web Console Guide .

    If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.

    ncli> ngt mount vm-id=virtual_machine_id

    For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

    ncli> ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
    Note:
    • The self-service restore feature is not enabled by default on a VM. You must enable it manually.
    • If you created the NGT ISO CD-ROMs on AOS 4.6 or earlier releases, the NGT functionality does not work even if you upgrade your cluster, because the associated REST APIs have been disabled. You must unmount the ISO, remount the ISO, install the NGT software again, and then upgrade to a later AOS version.
  • To upgrade NGT, select Upgrade NGT . For more information, see Upgrading NGT in the Prism Central Guide .
  • To configure quality of service (QoS) settings, select Set QoS Attributes . For more information, see Setting QoS for an Individual VM in the Prism Central Guide .

VM Management using Prism Element

You can create and manage a VM on your ESXi cluster from Prism Element. For more information, see Creating a VM (ESXi) and Managing a VM (ESXi).

Creating a VM (ESXi)

In ESXi clusters, you can create a new virtual machine (VM) through the web console.

Before you begin

  • See the requirements and limitations section in VM Management through Prism Element (ESXi) in the Prism Web Console Guide before proceeding.
  • Register the vCenter Server with your cluster. For more information, see Registering a Cluster to the vCenter Server.

About this task

When creating a VM, you can configure all of its components, such as number of vCPUs and memory, but you cannot attach a volume group to the VM.

To create a VM, do the following:

Procedure

  1. In the VM dashboard , click the Create VM button.
    The Create VM dialog box appears.
  2. Do the following in the indicated fields:
    1. Name : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Guest OS : Type and select the guest operating system.
      The guest operating system that you select affects the supported devices and number of virtual CPUs available for the virtual machine. The Create VM wizard does not install the guest operating system. For information about the list of supported operating systems, see VM Management through Prism Element (ESXi) in the Prism Web Console Guide .
    4. vCPU(s) : Enter the number of virtual CPUs to allocate to this VM.
    5. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    6. Memory : Enter the amount of memory (in GiBs) to allocate to this VM.
  3. To attach a disk to the VM, click the Add New Disk button.
    The Add Disks dialog box appears.
    Figure. Add Disk Dialog Box Click to enlarge configure a disk screen

    Do the following in the indicated fields:
    1. Type : Select the type of storage device, DISK or CD-ROM , from the pull-down list.
      The following fields and options vary depending on whether you choose DISK or CD-ROM .
    2. Operation : Specify the device contents from the pull-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
    3. Bus Type : Select the bus type from the pull-down list. The choices are IDE or SCSI .
    4. ADSF Path : Enter the path to the desired system image.
      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /storage_container_name/vmdk_name.vmdk . For example, to clone an image from myvm-flat.vmdk in a storage container named crt1 , enter /crt1/myvm-flat.vmdk . When you type the storage container name ( /storage_container_name/ ), a list appears of the VMDK files in that storage container (assuming one or more VMDK files were previously copied to that storage container).
      Note: Make sure you are copying from a flat file.
    5. Storage Container : Select the storage container to use from the pull-down list.
      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.
    6. Size : Enter the disk size in GiBs.
    7. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    8. Repeat this step to attach more devices to the VM.
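    The ADSF path convention described above can be sanity-checked before you enter it. The following is a sketch under stated assumptions: check_adsf_path is a hypothetical helper, and the glob only encodes the /storage_container_name/vmdk_name-flat.vmdk shape from the example and the flat-file note; it does not verify that the file actually exists on the cluster.

    ```shell
    # Check that an ADSF path has the /container/name.vmdk shape and
    # references a -flat.vmdk file, per the note in the step above.
    check_adsf_path() {
      path="$1"
      case "$path" in
        /*/*-flat.vmdk) return 0 ;;  # e.g. /crt1/myvm-flat.vmdk
        *) return 1 ;;
      esac
    }

    check_adsf_path /crt1/myvm-flat.vmdk && echo ok
    check_adsf_path /crt1/myvm.vmdk || echo "not a flat file"
    ```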
  4. To create a network interface for the VM, click the Add New NIC button.
    The Create NIC dialog box appears. Do the following in the indicated fields:
    1. VLAN Name : Select the target virtual LAN from the pull-down list.
      The list includes all defined networks. For more information, see Network Configuration for VM Interfaces in the Prism Web Console Guide .
    2. Network Adapter Type : Select the network adapter type from the pull-down list.

      For information about the list of supported adapter types, see VM Management through Prism Element (ESXi) in the Prism Web Console Guide .

    3. Network UUID : This is a read-only field that displays the network UUID.
    4. Network Address/Prefix : This is a read-only field that displays the network IP address and prefix.
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create more network interfaces for the VM.
  5. When all the field entries are correct, click the Save button to create the VM and close the Create VM dialog box.
    The new VM appears in the VM table view. For more information, see VM Table View in the Prism Web Console Guide .

Managing a VM (ESXi)

You can use the web console to manage virtual machines (VMs) in the ESXi clusters.

Before you begin

  • See the requirements and limitations section in VM Management through Prism Element (ESXi) in the Prism Web Console Guide before proceeding.
  • Ensure that you have registered the vCenter Server with your cluster. For more information, see Registering a Cluster to the vCenter Server.

About this task

After creating a VM, you can use the web console to manage guest tools, power operations, suspend, launch a VM console window, update the VM configuration, clone the VM, or delete the VM. To accomplish one or more of these tasks, do the following:

Note: Your available options depend on the VM status, type, and your permissions; options that do not apply are grayed out.

Procedure

  1. In the VM dashboard , click the Table view.
  2. Select the target VM in the table (top section of screen).
    The summary line (middle of screen) displays the VM name with a set of relevant action links on the right. You can also right-click on a VM to select a relevant action.

    The possible actions are Manage Guest Tools , Launch Console , Power on (or Power off actions ), Suspend (or Resume ), Clone , Update , and Delete . The following steps describe how to perform each action.

    Figure. VM Action Links Click to enlarge

  3. To manage guest tools, click Manage Guest Tools and do the following.
    You can also enable NGT applications (self-service restore, volume snapshot service, and application-consistent snapshots) as part of managing guest tools.
    1. Select Enable Nutanix Guest Tools check box to enable NGT on the selected VM.
    2. Select Mount Nutanix Guest Tools to mount NGT on the selected VM.
      Ensure that the VM has at least one empty IDE CD-ROM or SATA slot to attach the ISO.

      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    3. To enable the self-service restore feature for Windows VMs, select the Self Service Restore (SSR) check box.
      The self-service restore feature is enabled on the VM. The guest VM administrator can restore the desired file or files from the VM. For more information about the self-service restore feature, see Self-Service Restore in the Data Protection and Recovery with Prism Element guide.

    4. After you select the Enable Nutanix Guest Tools check box, the VSS and application-consistent snapshot feature is enabled by default.
      After this feature is enabled, Nutanix native in-guest VmQuiesced snapshot service (VSS) agent is used to take application-consistent snapshots for all the VMs that support VSS. This mechanism takes application-consistent snapshots without any VM stuns (temporary unresponsive VMs) and also enables third-party backup providers like Commvault and Rubrik to take application-consistent snapshots on Nutanix platform in a hypervisor-agnostic manner. For more information, see Conditions for Application-consistent Snapshots in the Data Protection and Recovery with Prism Element guide.

    5. To mount VMware guest tools, select the Mount VMware Guest Tools check box.
      The VMware guest tools are mounted on the VM.
      Note: You can mount both VMware guest tools and Nutanix Guest Tools at the same time on a particular VM provided the VM has sufficient empty CD-ROM slots.
    6. Click Submit .
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
      Note:
      • If you clone a VM, by default NGT is not enabled on the cloned VM. If the cloned VM is powered off, enable NGT from the UI and start the VM. If cloned VM is powered on, enable NGT from the UI and restart the Nutanix guest agent service.
      • If you want to enable NGT on multiple VMs simultaneously, see Enabling NGT and Mounting the NGT Installer on Cloned VMs in the Prism Web Console Guide .
      If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.
      ncli> ngt mount vm-id=virtual_machine_id

      For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

      ncli> ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
      Caution: In AOS 4.6, for powered-on Linux VMs on AHV, ensure that the NGT ISO is ejected or unmounted within the guest VM before disabling NGT by using the web console. This issue is specific to AOS 4.6 and does not occur in AOS 4.6.x or later releases.
      Note: If you created the NGT ISO CD-ROMs on a release earlier than AOS 4.6, the NGT functionality does not work even if you upgrade your cluster, because the associated REST APIs have been disabled. You must unmount the ISO, remount the ISO, install the NGT software again, and then upgrade to AOS 4.6 or later.
  4. To launch a VM console window, click the Launch Console action link.
    This opens a virtual network computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The VM power options that you access from the Power Off Actions action link below the VM table can also be accessed from the VNC console window. To access the VM power options, click the Power button at the top-right corner of the console window.
    Note: A VNC client may not function properly on all browsers. Some keys are not recognized when the browser is Google Chrome. (Firefox typically works best.)
  5. To start (or shut down) the VM, click the Power on (or Power off ) action link.

    Power on begins immediately. If you shut down the VM, you are prompted to select one of the following options:

    • Power Off . The hypervisor performs a hard shutdown of the VM.
    • Reset . The hypervisor performs an ACPI reset through the BIOS on the VM.
    • Guest Shutdown . The operating system of the VM performs a graceful shutdown.
    • Guest Reboot . The operating system of the VM performs a graceful restart.
    Note: The Guest Shutdown and Guest Reboot options are available only when VMware guest tools are installed.
  6. To pause (or resume) the VM, click the Suspend (or Resume ) action link. This option is available only when the VM is powered on.
  7. To clone the VM, click the Clone action link.
    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box but with all fields (except the name and the number of clones) filled in with the current VM settings. Enter a name for the clone and the number of clones required, and then click the Save button to create the clones.
    Figure. Clone VM Dialog Box Click to enlarge

    Note: In the Clone window, you cannot update the disks and network interfaces.
  8. To modify the VM configuration, click the Update action link.
    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Modify the configuration as needed (see Creating a VM (ESXi)). In addition, you can enable Flash Mode for the VM.
    Note: If you delete a vDisk attached to a VM and snapshots associated with this VM exist, space associated with that vDisk is not reclaimed unless you also delete the VM snapshots.
    1. Click the Enable Flash Mode check box.
      • After you enable this feature on the VM, the status is updated in the VM table view. To view the status of individual virtual disks (disks that are flashed to the SSD), go to the Virtual Disks tab in the VM table view.
      • You can disable the Flash Mode feature for individual virtual disks. To update the Flash Mode for individual virtual disks, click the update disk icon in the Disks pane and deselect the Enable Flash Mode check box.
  9. To delete the VM, click the Delete action link. A window prompt appears; click the OK button to delete the VM.
    The deleted VM disappears from the list of VMs in the table. You can also delete a VM that is already powered on.

VM Migration

You can migrate a VM to an ESXi host in a Nutanix cluster. Migration is typically done in the following cases:

  • Migrating VMs from an existing storage platform to Nutanix.
  • Keeping VMs running during a disruptive upgrade or other downtime of a Nutanix cluster.

When migrating VMs between Nutanix clusters running vSphere, the source host and NFS datastore are the ones presently running the VM. The target host and NFS datastore are the ones where the VM runs after migration. The target ESXi host and datastore must be part of a Nutanix cluster.

To accomplish this migration, you have to mount the NFS datastores from the target on the source. After the migration is complete, you must unmount the datastores and block access.

Migrating a VM to Another Nutanix Cluster

Before you begin

Before migrating a VM to another Nutanix cluster running vSphere, verify that you have provisioned the target Nutanix environment.

About this task

The shared storage feature in vSphere allows you to move both compute and storage resources from the source legacy environment to the target Nutanix environment at the same time without disruption. This feature also removes the need for any file system allow lists on Nutanix.

You can use the shared storage feature through the migration wizard in the web client.

Procedure

  1. Log on to vCenter with the web client.
  2. Select the VM that you want to migrate.
  3. Right-click the VM and select Migrate .
  4. Under Select Migration Type , select Change both compute resource and storage .
  5. Select Compute Resource and then Storage and click Next .
    If necessary, change the disk format to the one that you want to use during the migration process.
  6. Select a destination network for all VM network adapters and click Next .
  7. Click Finish .
    Wait for the migration process to complete. The process performs the storage vMotion first, and then creates a temporary storage network over vmk0 for the period where the disk files are on Nutanix.

Cloning a VM

About this task

To clone a VM, you must enable the Nutanix VAAI plug-in. For steps to enable and verify the Nutanix VAAI plug-in, see KB-1868.

Procedure

  1. Log on to vCenter with the web client.
  2. Right-click the VM and select Clone .
  3. Follow the wizard to enter a name for the clone, select a cluster, and select a host.
  4. Select the datastore that contains the source VM and click Next .
    Note: If you choose a datastore other than the one that contains the source VM, the clone operation uses the VMware implementation and not the Nutanix VAAI plug-in.
  5. If desired, set the guest customization parameters. Otherwise, proceed to the next step.
  6. Click Finish .

vStorage APIs for Array Integration

To improve the vSphere cloning process, Nutanix provides a vStorage APIs for Array Integration (VAAI) plug-in. This plug-in is installed by default during the Nutanix factory process.

Without the Nutanix VAAI plug-in, the process of creating a full clone takes a significant amount of time because all the data that comprises a VM is duplicated. This duplication also results in an increase in storage consumption.

The Nutanix VAAI plug-in efficiently makes full clones without reserving space for the clone. Read requests for blocks shared between parent and clone are sent to the original vDisk that was created for the parent VM. As the clone VM writes new blocks, the Nutanix file system allocates storage for those blocks. This data management occurs completely at the storage layer, so the ESXi host sees a single file with the full capacity that was allocated when the clone was created.

vSphere ESXi Hardening Settings

Configure the following settings in /etc/ssh/sshd_config to harden an ESXi hypervisor in a Nutanix cluster.
Caution: When hardening ESXi security, some settings may impact operations of a Nutanix cluster.
HostbasedAuthentication no
PermitTunnel no
AcceptEnv
GatewayPorts no
Compression no
StrictModes yes
KerberosAuthentication no
GSSAPIAuthentication no
PermitUserEnvironment no
PermitEmptyPasswords no
PermitRootLogin no

Match Address x.x.x.11,x.x.x.12,x.x.x.13,x.x.x.14,192.168.5.0/24
PermitRootLogin yes
PasswordAuthentication yes
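
A quick way to confirm the global defaults above are in effect is to grep the directives that appear before the first Match block. The following is a minimal sketch: check_sshd is a hypothetical helper, it inspects only a handful of the listed directives, and the demonstration runs against a throwaway sample file rather than the live /etc/ssh/sshd_config.

```shell
# Spot-check a few hardening directives in an sshd_config file.
check_sshd() {
  conf="$1"
  # Only inspect directives above the first Match block, where the
  # global defaults live (PermitRootLogin yes inside Match is expected).
  globals=$(sed '/^Match /q' "$conf")
  for directive in "PermitRootLogin no" "PermitEmptyPasswords no" \
                   "PermitTunnel no" "GatewayPorts no"; do
    if ! printf '%s\n' "$globals" | grep -qx "$directive"; then
      echo "missing: $directive"
      return 1
    fi
  done
  echo "hardening directives present"
}

# Demonstration against a minimal sample file.
cat > /tmp/sshd_config.sample <<'EOF'
PermitTunnel no
GatewayPorts no
PermitEmptyPasswords no
PermitRootLogin no
Match Address 192.168.5.0/24
PermitRootLogin yes
EOF
check_sshd /tmp/sshd_config.sample
```

On an ESXi host you would point the helper at /etc/ssh/sshd_config itself after applying the settings.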

ESXi Host 1-Click Upgrade

You can upgrade your host either automatically through Prism Element (1-click upgrade) or manually. For more information about automatic and manual upgrades, see ESXi Upgrade and ESXi Host Manual Upgrade respectively.

This section describes the Nutanix hypervisor support policy for vSphere and Hyper-V hypervisor releases. Nutanix provides hypervisor compatibility and support statements that you should review before planning an upgrade to a new release or applying a hypervisor update or patch:
  • Compatibility and Interoperability Matrix
  • Hypervisor Support Policy- See Support Policies and FAQs for the supported Acropolis hypervisors.

Review the Nutanix Field Advisory page also for critical issues that Nutanix may have uncovered with the hypervisor release being considered.

Note: You may need to log in to the Support Portal to view the links above.

The Acropolis Upgrade Guide provides steps that can be used to upgrade the hypervisor hosts. However, as noted in the documentation, the customer is responsible for reviewing the guidance from VMware or Microsoft, respectively, on other component compatibility and upgrade order (e.g. vCenter), which needs to be planned first.

ESXi Upgrade

This section describes how to upgrade your ESXi hypervisor host through the Prism Element web console Upgrade Software feature (also known as 1-click upgrade). To install or upgrade VMware vCenter Server or other third-party software, see your vendor documentation.

AOS supports ESXi hypervisor upgrades that you can apply through the web console Upgrade Software feature (also known as 1-click upgrade).

You can view the available upgrade options, start an upgrade, and monitor upgrade progress through the web console. In the main menu, click the gear icon, and then select Upgrade Software in the Settings panel that appears, to see the current status of your software versions (and start an upgrade if warranted).

VMware ESXi Hypervisor Upgrade Recommendations and Limitations

  • To install or upgrade VMware vCenter Server or other third-party software, see your vendor documentation.
  • Always consult the VMware web site for any vCenter and hypervisor installation dependencies. For example, a hypervisor version might require that you upgrade vCenter first.
  • If you have not enabled DRS in your environment and want to upgrade the ESXi host, you need to upgrade the ESXi host manually. For more information about upgrading ESXi hosts manually, see ESXi Host Manual Upgrade.
  • Disable Admission Control before upgrading ESXi on AOS; if it is enabled, the upgrade process fails. You can re-enable it for normal cluster operation after the upgrade.
Nutanix Support for ESXi Upgrades
Nutanix qualifies specific VMware ESXi hypervisor updates and provides a related JSON metadata upgrade file on the Nutanix Support Portal for one-click upgrade through the Prism web console Software Upgrade feature.

Nutanix does not provide ESXi binary files, only related JSON metadata upgrade files. Obtain ESXi offline bundles (not ISOs) from the VMware web site.

Nutanix supports the ability to patch upgrade ESXi hosts with versions that are greater than or released after the Nutanix qualified version, but Nutanix might not have qualified those releases. See the Nutanix hypervisor support statement in our Support FAQ. For updates that are made available by VMware that do not have a Nutanix-provided JSON metadata upgrade file, obtain the offline bundle and md5sum checksum available from VMware, then use the web console Software Upgrade feature to upgrade ESXi.

Mixing nodes with different processor (CPU) types in the same cluster
If you are mixing nodes with different processor (CPU) types in the same cluster, you must enable enhanced vMotion compatibility (EVC) to allow vMotion/live migration of VMs during the hypervisor upgrade. For example, if your cluster includes a node with a Haswell CPU and other nodes with Broadwell CPUs, open vCenter, enable the VMware enhanced vMotion compatibility (EVC) setting, and specifically enable EVC for Intel hosts.
Enhanced vMotion Compatibility (EVC)

AOS Controller VMs and Prism Central VMs require a minimum CPU micro-architecture version of Intel Sandy Bridge. For AOS clusters with ESXi hosts, or when deploying Prism Central VMs on any ESXi cluster: if you have set the vSphere cluster enhanced vMotion compatibility (EVC) level, the minimum level must be L4 - Sandy Bridge .

vCenter
Note: You might be unable to log in to vCenter Server as the /storage/seat partition for vCenter Server version 7.0 and later might become full due to a large number of SSH-related events. See KB 10830 at the Nutanix Support portal for symptoms and solutions to this issue.
  • If your cluster is running the ESXi hypervisor and is also managed by VMware vCenter, you must provide vCenter administrator credentials and vCenter IP address as an extra step before upgrading. Ensure that ports 80 / 443 are open between your cluster and your vCenter instance to successfully upgrade.
  • If You Have Just Registered Your Cluster in vCenter. Do not perform any cluster upgrades (AOS, Controller VM memory, hypervisor, and so on) if you have just registered your cluster in vCenter. Wait at least 1 hour before performing upgrades to allow cluster settings to propagate. Also, do not register the cluster in vCenter and perform any upgrades at the same time.
  • Cluster Mapped to Two vCenters. Upgrading software through the web console (1-click upgrade) does not support configurations where a cluster is mapped to two vCenters or where it includes host-affinity must rules for VMs.

    Ensure that enough cluster resources are available for live migration to occur and to allow hosts to enter maintenance mode.

  • Do not deploy ESXi 6.5 on Nutanix clusters running AOS 5.x versions if you require or want to configure VMware fault tolerance (FT). Nutanix engineering has discovered and is aware of VMware FT compatibility issues in the ESXi 6.5 release, which have been reported to VMware.
Mixing Different Hypervisor Versions
For ESXi hosts, mixing different hypervisor versions in the same cluster is temporarily allowed for deferring a hypervisor upgrade as part of an add-node/expand cluster operation, reimaging a node as part of a break-fix procedure, planned migrations, and similar temporary operations.

Upgrading ESXi Hosts by Uploading Binary and Metadata Files

About this task

Do the following steps to download Nutanix-qualified ESXi metadata .JSON files and upgrade the ESXi hosts through Upgrade Software in the Prism Element web console. Nutanix does not provide ESXi binary files, only related JSON metadata upgrade files.

Procedure

  1. Before performing any upgrade procedure, make sure you are running the latest version of the Nutanix Cluster Check (NCC) health checks and upgrade NCC if necessary.
  2. Run NCC as described in Run NCC Checks .
  3. Log on to the Nutanix support portal and navigate to the Hypervisors Support page from the Downloads menu, then download the Nutanix-qualified ESXi metadata .JSON files to your local machine or media.
    1. The default view is All . From the drop-down menu, select Nutanix - VMware ESXi , which shows all available JSON versions.
    2. From the release drop-down menu, select the available ESXi version. For example, 7.0.0 u2a .
    3. Click Download to download the Nutanix-qualified ESXi metadata .JSON file.
    Figure. Downloads Page for ESXi Metadata JSON Click to enlarge This picture shows the portal page for ESXi metadata JSON downloads
  4. Log on to the Prism Element web console for any node in the cluster.
  5. Click the gear icon in the main menu, select Upgrade Software in the Settings page, and then click the Hypervisor tab.
  6. Click the upload the Hypervisor binary link.
  7. Click Choose File for the metadata JSON (obtained from Nutanix) and binary files (obtained from VMware), respectively, browse to the file locations, select the file, and click Upload Now .
  8. When the file upload is completed, click Upgrade > Upgrade Now , then click Yes to confirm.
    [optional] To run the pre-upgrade installation checks only on the Controller VM where you are logged on without upgrading, click Upgrade > Pre-upgrade . These checks also run as part of the upgrade procedure.
  9. Type your vCenter IP address and credentials, then click Upgrade .
    Ensure that you are using your Active Directory or LDAP credentials in the form of domain\username or username@domain .
    Note: AOS can detect if you have uploaded software that is already installed or upgraded. In this case, the Upgrade option is not displayed, because the software is already installed.
    The Upgrade Software dialog box shows the progress of your selection, including status of pre-installation checks and uploads, through the Progress Monitor .

Upgrading ESXi by Uploading an Offline Bundle File and Checksum

About this task

  • Do the following steps to download a non-Nutanix-qualified (patch) ESXi upgrade offline bundle from VMware, then upgrade ESXi through Upgrade Software in the Prism Element web console.
  • Typically, you perform this procedure to apply an ESXi patch release that Nutanix has not yet officially qualified. Nutanix supports patch upgrades of ESXi hosts to versions that are later than, or released after, the Nutanix-qualified version, even though Nutanix might not have qualified those releases.

Procedure

  1. From the VMware web site, download the offline bundle (for example, update-from-esxi6.0-6.0_update02.zip ) and copy the associated MD5 checksum. Ensure that you copy this checksum from the VMware web site rather than generating it from the bundle yourself.
  2. Save the files to your local machine or media, such as a USB drive or other portable media.
  3. Log on to the Prism Element web console for any node in the cluster.
  4. Click the gear icon in the main menu of the Prism Element web console, select Upgrade Software in the Settings page, and then click the Hypervisor tab.
  5. Click the upload the Hypervisor binary link.
  6. Click enter md5 checksum and copy the MD5 checksum into the Hypervisor MD5 Checksum field.
  7. Scroll down and click Choose File for the binary file, browse to the offline bundle file location, select the file, and click Upload Now .
    Figure. ESXi 1-Click Upgrade, Unqualified Bundle
  8. When the file upload is completed, click Upgrade > Upgrade Now , then click Yes to confirm.
    [optional] To run the pre-upgrade installation checks only on the Controller VM where you are logged on without upgrading, click Upgrade > Pre-upgrade . These checks also run as part of the upgrade procedure.
  9. Type your vCenter IP address and credentials, then click Upgrade .
    Ensure that you are using your Active Directory or LDAP credentials in the form of domain\username or username@domain .
    Note: AOS can detect if you have uploaded software that is already installed or upgraded. In this case, the Upgrade option is not displayed, because the software is already installed.
    The Upgrade Software dialog box shows the progress of your selection, including status of pre-installation checks and uploads, through the Progress Monitor .
  10. After the upgrade is complete, click Inventory > Perform Inventory to enable LCM to check, update and display the inventory information.
    For more information, see Performing Inventory With LCM in the Acropolis Upgrade Guide .
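The checksum comparison implied by the procedure above can also be done locally before uploading the bundle. The following is a minimal shell sketch, not an official Nutanix tool; the file path and function name are illustrative, and the expected value would be the MD5 copied from the VMware download page (the demo uses an empty stand-in file, whose MD5 is well known).

```shell
# Sketch: verify a downloaded ESXi offline bundle against the MD5 checksum
# copied from the VMware download page, before uploading it in Prism.
verify_md5() {
    local bundle="$1" expected="$2"
    local actual
    actual=$(md5sum "$bundle" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $bundle matches the published checksum"
    else
        echo "MISMATCH: re-download $bundle" >&2
        return 1
    fi
}

# Demo with a stand-in file (an empty file has a well-known MD5):
: > /tmp/demo-bundle.zip
verify_md5 /tmp/demo-bundle.zip d41d8cd98f00b204e9800998ecf8427e
```

A mismatch typically means a corrupted or incomplete download rather than a problem with the cluster.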

ESXi Host Manual Upgrade

If you have not enabled DRS in your environment and want to upgrade the ESXi host, you must upgrade the ESXi host manually. This topic describes all the requirements that you must meet before manually upgrading the ESXi host.

Tip: If you have enabled DRS and want to upgrade the ESXi host, use the one-click upgrade procedure from the Prism web console. For more information on the one-click upgrade procedure, see the ESXi Upgrade.

Nutanix supports the ability to patch upgrade the ESXi hosts with the versions that are greater than or released after the Nutanix qualified version, but Nutanix might not have qualified those releases. See the Nutanix hypervisor support statement in our Support FAQ.

Because ESXi hosts with different versions can co-exist in a single Nutanix cluster, upgrading ESXi does not require cluster downtime.

  • If you want to avoid cluster interruption, you must complete upgrading a host and ensure that the CVM is running before upgrading any other host. When two hosts in a cluster are down at the same time, all the data is unavailable.
  • If you want to minimize the duration of the upgrade activities and cluster downtime is acceptable, you can stop the cluster and upgrade all hosts at the same time.
Warning: By default, Nutanix clusters have redundancy factor 2, which means they can tolerate the failure of a single node or drive. Nutanix clusters configured with redundancy factor 3 can withstand the failure of two nodes or drives in different blocks.
  • Never shut down or restart multiple Controller VMs or hosts simultaneously.
  • Always run the cluster status command to verify that all Controller VMs are up before performing a Controller VM or host shutdown or restart.

ESXi Host Upgrade Process

Perform the following process to upgrade ESXi hosts in your environment.

Prerequisites and Requirements

Note: Use the following process only if you do not have DRS enabled in your Nutanix cluster.
  • If you are upgrading all nodes in the cluster at once, shut down all guest VMs and stop the cluster with the cluster stop command.
    Caution: There is downtime if you upgrade all the nodes in the Nutanix cluster at once. If you do not want downtime in your environment, you must ensure that only one CVM is shut down at a time in a redundancy factor 2 configuration.
  • If you are upgrading the nodes while keeping the cluster running, ensure that all nodes are up by logging on to a CVM and running the cluster status command. If any nodes are not running, start them before proceeding with the upgrade. Shut down all guest VMs on the node or migrate them to other nodes in the Nutanix cluster.
  • Disable email alerts in the web console under Email Alert Services or with the nCLI command.
    ncli> alerts update-alert-config enable=false
  • Run the complete NCC health check by using the health check command.
    nutanix@cvm$ ncc health_checks run_all
  • Run the cluster status command to verify that all Controller VMs are up and running, before performing a Controller VM or host shutdown or restart.
    nutanix@cvm$ cluster status
  • Place the host in the maintenance mode by using the web client.
  • Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
    Note: Do not reset or shut down the CVM in any way other than with the cvm_shutdown command, which ensures that the cluster is aware that the CVM is unavailable.
  • Start the upgrade by following the vSphere Upgrade Guide or by using vCenter Update Manager (VUM).
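The per-node prerequisite commands above follow a fixed order. The following shell sketch is purely illustrative: it prints the sequence in dry-run form rather than executing anything, since the real commands must be run on a CVM (and the maintenance-mode step happens in the vSphere client).

```shell
# Dry-run of the per-node pre-upgrade sequence described above.
# Nothing is executed; each command is only printed so the order is clear.
run() { echo "WOULD RUN: $*"; }

run ncli alerts update-alert-config enable=false   # disable email alerts
run ncc health_checks run_all                      # full NCC health check
run cluster status                                 # verify all CVMs are up
# ...place the host in maintenance mode in the vSphere web client...
run cvm_shutdown -P now                            # shut down the local CVM
```

On a real cluster you would run each command interactively and confirm it succeeds before moving to the next step.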

Upgrading ESXi Host

  • See the VMware Documentation for information about the standard ESXi upgrade procedures. If any problem occurs with the upgrade process, an alert is raised in the Alert dashboard.

Post Upgrade

Run the complete NCC health check by using the following command.

nutanix@cvm$ ncc health_checks run_all

vSphere Cluster Settings Checklist

Review the following checklist of settings that you must configure to successfully deploy a vSphere virtual environment running the Nutanix enterprise cloud.

vSphere Availability Settings

  • Enable host monitoring.
  • Enable admission control and use the percentage-based policy with a value based on the number of nodes in the cluster.

    For more information about reserving a percentage of cluster resources as failover spare capacity, see vSphere HA Admission Control Settings for Nutanix Environment.

  • Set the VM Restart Priority of all CVMs to Disabled .
  • Set the Host Isolation Response of the cluster to Power Off & Restart VMs .
  • Set the VM Monitoring for all CVMs to Disabled .
  • Enable datastore heartbeats by clicking Use datastores only from the specified list and choosing the Nutanix NFS datastore.

    If the cluster has only one datastore, click Advanced Options tab and add das.ignoreInsufficientHbDatastore with Value of true .
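The percentage-based admission control value mentioned above is commonly derived from the cluster size. As a rough sketch (the node count and the ceiling rounding are illustrative assumptions, not a Nutanix-mandated formula): to tolerate one node failure in an N-node cluster, reserve about 100/N percent of CPU and memory.

```shell
# Illustrative computation of a percentage-based admission control value:
# to survive one node failure in an N-node cluster, reserve roughly
# 100/N percent of cluster resources, rounded up.
nodes=4
reserve=$(( (100 + nodes - 1) / nodes ))   # ceiling of 100/N
echo "Reserve ${reserve}% of cluster resources for failover"
```

For a 4-node cluster this yields 25%; always confirm the value against the sizing guidance in the linked Nutanix documentation.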

vSphere DRS Settings

  • Set the Automation Level on all CVMs to Disabled .
  • Select Automation Level to accept level 3 recommendations.
  • Leave power management disabled.

Other Cluster Settings

  • Configure advertised capacity for the Nutanix storage container (total usable capacity minus the capacity of one node for replication factor 2 or two nodes for replication factor 3).
  • Store VM swapfiles in the same directory as the VM.
  • Enable enhanced vMotion compatibility (EVC) in the cluster. For more information, see vSphere EVC Settings.
  • Configure Nutanix CVMs with the appropriate VM overrides. For more information, see VM Override Settings.
  • Check Nonconfigurable ESXi Components. Modifying the nonconfigurable components may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.
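The advertised-capacity rule in the first bullet above can be expressed as simple arithmetic. The capacities below are made-up example numbers, not sizing guidance.

```shell
# Sketch of the advertised-capacity rule: subtract one node's capacity
# for replication factor 2, or two nodes' capacity for replication factor 3.
total_tib=80          # total usable capacity of the cluster (example)
node_tib=20           # usable capacity of one node (example)
rf=2                  # replication factor
advertised=$(( total_tib - node_tib * (rf - 1) ))
echo "Advertised capacity: ${advertised} TiB"
```

With these example numbers, RF2 advertises 60 TiB and RF3 would advertise 40 TiB.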

Hyper-V Administration for Acropolis

AOS 6.5

Product Release Date: 2022-07-25

Last updated: 2022-09-20

Node Management

Logging on to a Controller VM

If you need to access a Controller VM on a host that has not been added to SCVMM or Hyper-V Manager, use this method.

Procedure

  1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  2. Log on to the Controller VM.
    > ssh nutanix@192.168.5.254

    Accept the host authenticity warning if prompted.

Placing the Controller VM and Hyper-V Host in Maintenance Mode

It is recommended that you place the Controller VM and Hyper-V host into maintenance mode when performing any maintenance or patch installation for the cluster.

Before you begin

Migrate the VMs that are running on the node to other nodes in the cluster.

About this task

Caution: Verify the data resiliency status of your cluster. You can only place one node in maintenance mode for each cluster.

To place the Controller VM and Hyper-V host in maintenance mode, do the following.

Procedure

  1. Log on to the Controller VM with SSH and get the CVM host ID.
    nutanix@cvm$ ncli host ls
  2. Run the following command to place the CVM in maintenance mode.
    nutanix@cvm$ ncli host edit id=host_id enable-maintenance-mode=true
    Replace host_id with the CVM host ID.
  3. Log on to the Hyper-V host with Remote Desktop Connection and pause the Hyper-V host in the failover cluster using PowerShell.
    > Suspend-ClusterNode

Shutting Down a Node in a Cluster (Hyper-V)

Shut down a node in a Hyper-V cluster.

Before you begin

Shut down guest VMs that are running on the node, or move them to other nodes in the cluster.

In a Hyper-V cluster, you do not need to put the node in maintenance mode before you shut it down. Shutting down the guest VMs running on the node (or moving them to another node) and then shutting down the CVM is sufficient.

About this task

Caution: Verify the data resiliency status of your cluster. If the cluster only has replication factor 2 (RF2), you can only shut down one node for each cluster. If an RF2 cluster would have more than one node shut down, shut down the entire cluster.

Perform the following procedure to shut down a node in a Hyper-V cluster.

Procedure

  1. Log on to the Controller VM with SSH and shut down the Controller VM.
    nutanix@cvm$ cvm_shutdown -P now
    Note:

    Always use the cvm_shutdown command to reset or shut down the Controller VM. The cvm_shutdown command notifies the cluster that the Controller VM is unavailable.

  2. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  3. Do one of the following to shut down the node.
    • > shutdown /s /t 0
    • > Stop-Computer -ComputerName localhost

    See the Microsoft documentation for up-to-date and additional details about how to shut down a Hyper-V node.

Starting a Node in a Cluster (Hyper-V)

After you start or restart a node in a Hyper-V cluster, verify if the Controller VM (CVM) is powered on and if the CVM is added to the metadata.

About this task

Perform the following steps to start a node in a Hyper-V cluster.

Procedure

  1. Power on the node. Do one of the following:
    • Press the power button on the front of the physical hardware server.
    • Use a remote tool such as iDRAC, iLO, or IPMI depending on your hardware.
  2. Log on to Hyper-V Manager and start PowerShell.
  3. Determine if the Controller VM is running.
    > Get-VM | Where {$_.Name -match 'NTNX.*CVM'}
    • If the Controller VM is off, a line similar to the following should be returned:
      NTNX-13SM35230026-C-CVM Stopped -           -             - Opera...

      Make a note of the Controller VM name in the second column.

    • If the Controller VM is on, a line similar to the following should be returned:
      NTNX-13SM35230026-C-CVM Running 2           16384             05:10:51 Opera...
  4. If the CVM is not powered on, power on the CVM by using Hyper-V Manager.
  5. Log on to the CVM with SSH and verify if the CVM is added back to the metadata.
    nutanix@cvm$ nodetool -h 0 ring

    The state of the IP address of the CVM you started must be Normal as shown in the following output.

    nutanix@cvm$ nodetool -h 0 ring
    Address         Status State      Load            Owns    Token                                                          
                                                              kV0000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     1.84 GB         25.00%  000000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     1.79 GB         25.00%  FV0000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     825.49 MB       25.00%  V00000000000000000000000000000000000000000000000000000000000   
    XX.XXX.XXX.XXX  Up     Normal     1.87 GB         25.00%  kV0000000000000000000000000000000000000000000000000000000000
  6. Power on or failback the guest VMs by using Hyper-V Manager or Failover Cluster Manager.
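Checking the nodetool output in step 5 can be automated when you have saved the output to a file. The following sketch parses a sample that mimics the output above (addresses and the file path are placeholders) and flags any CVM that is not Up/Normal.

```shell
# Sketch: scan saved `nodetool -h 0 ring` output and confirm every CVM
# reports Up/Normal. The sample data mirrors the output shown above.
cat > /tmp/ring.txt <<'EOF'
Address         Status State      Load            Owns    Token
10.0.0.1        Up     Normal     1.84 GB         25.00%  0000
10.0.0.2        Up     Normal     1.79 GB         25.00%  FV00
10.0.0.3        Up     Normal     825.49 MB       25.00%  V000
10.0.0.4        Up     Normal     1.87 GB         25.00%  kV00
EOF

# Column 2 is Status, column 3 is State; skip the header row.
not_normal=$(awk 'NR > 1 && ($2 != "Up" || $3 != "Normal")' /tmp/ring.txt)
if [ -z "$not_normal" ]; then
    echo "All CVMs are Up/Normal"
else
    echo "Attention needed:"
    echo "$not_normal"
fi
```

If any node shows a state other than Normal, wait for the CVM to finish starting and re-run nodetool before failing back guest VMs.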

Enabling 1 GbE Interfaces (Hyper-V)

If 10 GbE networking is specified during cluster setup, 1 GbE interfaces are disabled on Hyper-V nodes. Follow these steps if you need to enable the 1 GbE interfaces later.

About this task

To enable the 1 GbE interfaces, do the following on each host:

Procedure

  1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  2. List the network adapters.
    > Get-NetAdapter | Format-List Name,InterfaceDescription,LinkSpeed

    Output similar to the following is displayed.

    Name                 : vEthernet (InternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #3
    LinkSpeed            : 10 Gbps
    
    Name                 : vEthernet (ExternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #2
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet 3
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection #2
    LinkSpeed            : 10 Gbps
    
    Name                 : NetAdapterTeam
    InterfaceDescription : Microsoft Network Adapter Multiplexor Driver
    LinkSpeed            : 20 Gbps
    
    Name                 : Ethernet 4
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection #2
    LinkSpeed            : 0 bps
    
    Name                 : Ethernet 2
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection
    LinkSpeed            : 1 Gbps

    Make a note of the Name of the 1 GbE interfaces you want to enable.

  3. Configure the interface.

    Replace interface_name with the name of the 1 GbE interface as reported by Get-NetAdapter .

    1. Enable the interface.
      > Enable-NetAdapter -Name "interface_name"
    2. Add the interface to the NIC team.
      > Add-NetLBFOTeamMember -Team NetAdapterTeam -Name "interface_name"

      If you want to configure the interface as a standby for the 10 GbE interfaces, include the parameter -AdministrativeMode Standby .

    Perform these steps once for each 1 GbE interface you want to enable.
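Picking out the 1 GbE interfaces from the Get-NetAdapter listing in step 2 can be scripted if you save the output to a file. This is a hedged sketch: the file path is a placeholder, and it assumes the Format-List layout shown above, where a disabled 1 GbE port reports "0 bps".

```shell
# Sketch: given `Get-NetAdapter | Format-List Name,InterfaceDescription,LinkSpeed`
# output saved to a file, print the names of candidate 1 GbE interfaces
# (reported as "1 Gbps", or "0 bps" while still disabled).
cat > /tmp/adapters.txt <<'EOF'
Name                 : Ethernet 4
InterfaceDescription : Intel(R) I350 Gigabit Network Connection #2
LinkSpeed            : 0 bps

Name                 : Ethernet 2
InterfaceDescription : Intel(R) I350 Gigabit Network Connection
LinkSpeed            : 1 Gbps

Name                 : Ethernet
InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection
LinkSpeed            : 10 Gbps
EOF

awk -F' : ' '
    /^Name /      { name = $2 }
    /^LinkSpeed / { if ($2 == "1 Gbps" || $2 == "0 bps") print name }
' /tmp/adapters.txt
```

The printed names are the values to pass to Enable-NetAdapter and Add-NetLBFOTeamMember in step 3.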

Changing the Hyper-V Host Password

The cluster software must be able to log on to each host as Administrator to perform standard cluster operations, such as querying the status of VMs in the cluster. Therefore, after changing the Administrator password, it is critical to update the cluster configuration with the new password.

About this task

Tip: Although it is not required for the Administrator user to have the same password on all hosts, doing so makes cluster management and support much easier. If you do select a different password for one or more hosts, make sure to note the password for each host.

Procedure

  1. Change the Administrator password of all hosts.
    Perform these steps on every Hyper-V host in the cluster.
    1. Log on to the Hyper-V host with Remote Desktop Connection.
    2. Press Ctrl+Alt+End to display the management screen.
    3. Click Change a Password .
    4. Enter the old password and the new password in the specified fields and click the right arrow button.
    5. Click Ok to acknowledge the password change.
  2. Update the Administrator user password for all hosts in the cluster configuration.
    Warning: If you do not perform this step, the web console no longer shows correct statistics and alerts, and other cluster operations fail.
    1. Log on to any CVM in the cluster using SSH.
    2. Find the host IDs.

      On the clusters running the AOS release 4.5.x, type:

      nutanix@cvm$ ncli host list | grep -E 'ID|Hypervisor Address'

      On the clusters running the AOS release 4.6.x or later, type:

      nutanix@cvm$ ncli host list | grep -E 'Id|Hypervisor Address'

      Note the host ID for each hypervisor host.

    3. Update the hypervisor host password.
      nutanix@cvm$ ncli managementserver edit name=host_addr \
       password='host_password' 
      nutanix@cvm$ ncli host edit id=host_id \
       hypervisor-password='host_password'
      • Replace host_addr with the IP address of the hypervisor host.
      • Replace host_id with a host ID you determined in the preceding step.
      • Replace host_password with the Administrator password on the corresponding hypervisor host.

      Perform this step for every hypervisor host in the cluster.
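Because the two ncli commands above must be repeated for every host, a small loop helps avoid missing one. This sketch is shown in dry-run form (the commands are only printed, not executed); the host IDs and IP addresses are placeholders, and on a real cluster you would run the printed commands on a CVM.

```shell
# Dry-run sketch: emit the two ncli password-update commands for each host.
# The id:ip pairs below are placeholders from `ncli host list` output.
run() { echo "WOULD RUN: $*"; }

for pair in "7:10.0.0.11" "8:10.0.0.12" "9:10.0.0.13"; do
    id=${pair%%:*}
    ip=${pair#*:}
    run ncli managementserver edit name="$ip" password=host_password
    run ncli host edit id="$id" hypervisor-password=host_password
done
```

Substitute the real IDs, addresses, and password, then run each ncli command from an SSH session on any CVM.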

Changing a Host IP Address

Perform these steps once for every hypervisor host in the cluster. Complete the entire procedure on a host before proceeding to the next host.

Before you begin

Remove the host from the failover cluster and domain before changing the host IP address.

Procedure

  1. Configure networking on the node by following Configuring Host Networking for Hyper-V Manually.
  2. Log on to every Controller VM in the cluster and restart genesis.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed.

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

Changing the VLAN ID for Controller VM

About this task

Perform the following procedure to change the VLAN ID of the Controller VM.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console and run the following PowerShell command to get the VLAN settings configured.
    > Get-VMNetworkAdapterVlan
  2. Change the VLAN ID.
    > Set-VMNetworkAdapterVlan -VMName cvm_name -VMNetworkAdapterName External -Access -VlanID vlan_ID
    Replace cvm_name with the name of the Nutanix Controller VM.

    Replace vlan_ID with the new VLAN ID.

    Note: The VM name of the Nutanix Controller VM must begin with NTNX-

Configuring VLAN for Hyper-V Host

About this task

Perform the following procedure to configure Hyper-V host VLANs.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console.
  2. Start a PowerShell prompt and run the following command to create a variable for the ExternalSwitch.
    >$netAdapter = Get-VMNetworkAdapter -Name "ExternalSwitch" -ManagementOS
  3. To set a new VLAN ID for the ExternalSwitch.
    >Set-VMNetworkAdapterVlan -VMNetworkAdapter $netAdapter -Access -VlanId vlan_ID
    Replace vlan_ID with the new VLAN ID.
    You can now communicate to the Hyper-V host on the new subnet.

Configuring Host Networking for Hyper-V Manually

Perform the following procedure to manually configure the Hyper-V host networking.

About this task

Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console and start a Powershell prompt.
  2. List the network adapters.
    > Get-NetAdapter | Format-List Name,InterfaceDescription,LinkSpeed

    Output similar to the following is displayed.

    Name                 : vEthernet (InternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #3
    LinkSpeed            : 10 Gbps
    
    Name                 : vEthernet (ExternalSwitch)
    InterfaceDescription : Hyper-V Virtual Ethernet Adapter #2
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection
    LinkSpeed            : 10 Gbps
    
    Name                 : Ethernet 3
    InterfaceDescription : Intel(R) 82599 10 Gigabit Dual Port Network Connection #2
    LinkSpeed            : 10 Gbps
    
    Name                 : NetAdapterTeam
    InterfaceDescription : Microsoft Network Adapter Multiplexor Driver
    LinkSpeed            : 20 Gbps
    
    Name                 : Ethernet 4
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection #2
    LinkSpeed            : 0 bps
    
    Name                 : Ethernet 2
    InterfaceDescription : Intel(R) I350 Gigabit Network Connection
    LinkSpeed            : 0 bps

    Make a note of the InterfaceDescription for the vEthernet adapter that links to the physical interface you want to modify.

  3. Start the Server Configuration utility.
    > sconfig
  4. Select Networking Settings by typing 8 and pressing Enter .
  5. Change the IP settings.
    1. Select a network adapter by typing the Index number of the adapter you want to change (refer to the InterfaceDescription you found in step 2) and pressing Enter .
      Warning: Do not select the network adapter with the IP address 192.168.5.1 . This IP address is required for the Controller VM to communicate with the host.
    2. Select Set Network Adapter Address by typing 1 and pressing Enter .
    3. Select Static by typing S and pressing Enter .
    4. Enter the IP address for the host and press Enter .
    5. Enter the subnet mask and press Enter .
    6. Enter the IP address for the default gateway and press Enter .
      The host networking settings are changed.
  6. (Optional) Change the DNS servers.
    DNS servers must be configured for a host to be part of a domain. You can either change the DNS servers in the sconfig utility or with setup_hyperv.py .
    1. Select Set DNS Servers by typing 2 .
    2. Enter the primary and secondary DNS servers and press Enter .
      The DNS servers are updated.
  7. Exit the Server Configuration utility by typing 4 and pressing Enter then 15 and pressing Enter .

Joining a Host to a Domain Manually

About this task

For information about how to join a host to a domain by using utilities provided by Nutanix, see Joining the Cluster and Hosts to a Domain . Perform these steps for each Hyper-V host in the cluster to manually join a host to a domain.

Procedure

  1. Log on to the Hyper-V host with the IPMI remote console and start a Powershell prompt.
  2. Join the host to the domain and rename it.
    > Add-Computer -DomainName domain_name -NewName node_name `
     -Credential domain_name\domain_admin_user -Restart -Force
    • Replace domain_name with the name of the domain for the host to join.
    • Replace node_name with a new name for the host.
    • Replace domain_admin_user with the domain administrator username.
    The host restarts and joins the domain.

Changing CVM Memory Configuration (Hyper-V)

About this task

You can increase the memory reserved for each Controller VM in your cluster by using the 1-click Controller VM Memory Upgrade available from the Prism Element web console. Increase memory size depending on the workload type or to enable certain AOS features. For more information about CVM memory sizing recommendations and instructions about how to increase the CVM memory, see Increasing the Controller VM Memory Size in the Prism Web Console Guide .

Hyper-V Configuration

Before configuring Nutanix storage on Hyper-V, ensure that you meet the Hyper-V installation requirements. For more information, see Hyper-V Installation Requirements. After you configure all the prerequisites for installing and setting up Hyper-V, join the Hyper-V cluster and its constituent hosts to the domain and then create a failover cluster.

Hyper-V Installation Requirements

Ensure that the following requirements are met before installing Hyper-V.

Windows Active Directory Domain Controller

Requirements:

  • For a fresh installation, you need a version of Nutanix Foundation that is compatible with the version of Windows Server you want to install.
    Note: To install Windows Server 2016, you need Foundation 3.11.2 or later. For more information, see the Field Installation Guide.
  • The primary domain controller must run at least Windows Server 2008 R2.
    Note: If you have a Volume Shadow Copy Service (VSS)-based backup tool (for example, Veeam), the functional level of Active Directory must be 2008 or higher.
  • Active Directory Web Services (ADWS) must be installed and running. By default, connections are made over TCP port 9389, and firewall policies must enable an exception on this port for ADWS.

    To test that ADWS is installed and running on a domain controller, log on by using a domain administrator account in a Windows host other than the domain controller host that is joined to the same domain and has the RSAT-AD-Powershell feature installed, and run the following PowerShell command. If the command prints the primary name of the domain controller, then ADWS is installed and the port is open.

> (Get-ADDomainController).Name
  • The domain controller must run a DNS server.
    Note: If any of the above requirements are not met, you need to manually create an Active Directory computer object for the Nutanix storage in the Active Directory, and add a DNS entry for the name.
  • Ensure that the Active Directory domain is configured correctly for consistent time synchronization.
  • Place the AD server in a separate virtual or physical host residing in storage that is not dependent on the domains that the AD server manages.
    Note: Do not run a virtual Active Directory domain controller (DC) on a Nutanix Hyper-V cluster and join the cluster to the same domain.

Accounts and Privileges:

  • An Active Directory account with permission to create new computer objects in the container or Organizational Unit (OU) where the Nutanix nodes are placed. The credentials of this account are not stored anywhere.
  • An account that has sufficient privileges to join a Windows host to a domain. The credentials of this account are not stored anywhere. These credentials are only used to join the hosts to the domain.

Additional Information Required:

  • The IP address of the primary domain controller.
    Note: The primary domain controller IP address is set as the primary DNS server on all the Nutanix hosts. It is also set as the NTP server in the Nutanix storage cluster to keep the Controller VM, host, and Active Directory time synchronized.
  • The fully qualified domain name to which the Nutanix hosts and the storage cluster is going to be joined.

SCVMM

Note: Relevant only if you have SCVMM in your environment.

Requirements:

  • The SCVMM version must be at least 2016 and it must be installed on Windows Server 2016. If you have SCVMM on an earlier release, upgrade it to 2016 before you register a Nutanix cluster running Hyper-V.
  • Kerberos authentication for storage is optional for Windows Server 2012 R2 (see Enabling Kerberos for Hyper-V), but it is required for Windows Server 2016. However, for Kerberos authentication to work with Windows Server 2016, the Active Directory server must reside outside the Nutanix cluster.
  • The SCVMM server must allow PowerShell remoting.

    To test this scenario, log on with the SCVMM administrator account and run the following PowerShell command on a Windows host other than the SCVMM host (for example, from the domain controller). If it prints the name of the SCVMM server, PowerShell remoting to the SCVMM server is not blocked.

    > Invoke-Command -ComputerName scvmm_server -ScriptBlock {hostname} -Credential MYDOMAIN\username

    Replace scvmm_server with the SCVMM host name and MYDOMAIN with Active Directory domain name.

    Note: If the SCVMM server does not allow PowerShell remoting, you can perform the SCVMM setup manually by using the SCVMM user interface.
  • The ipconfig command must run in a PowerShell window on the SCVMM server. To verify, run the following command.

    > Invoke-Command -ComputerName scvmm_server_name -ScriptBlock {ipconfig} -Credential MYDOMAIN\username

    Replace scvmm_server_name with the SCVMM host name and MYDOMAIN with Active Directory domain name.

  • The SMB client configuration in the SCVMM server should have RequireSecuritySignature set to False. To verify, run the following command.

    > Invoke-Command -ComputerName scvmm_server_name -ScriptBlock {Get-SMBClientConfiguration | FL RequireSecuritySignature}

    Replace scvmm_server_name with the SCVMM host name.

    A domain policy can set this value to True. In that case, modify the domain policy to set it to False; otherwise, even if you change the value locally, a policy that enforces True reverts it. To change the value, run the following command in PowerShell on the SCVMM host after logging on as a domain administrator.

    Set-SMBClientConfiguration -RequireSecuritySignature $False -Force

    If you change the value from True to False, confirm that the policies applied to the SCVMM host agree with the new value. On the SCVMM host, run rsop.msc to review the resultant set of policy, and check Servername > Computer Configuration > Windows Settings > Security Settings > Local Policies > Security Options > Microsoft network client: Digitally sign communications (always). The value shown in RSOP must be Disabled or Not Defined for the change to persist. If RSOP shows the value as Enabled, update the group policies in the domain that apply to the SCVMM server to Disabled; otherwise, RequireSecuritySignature changes back to True at a later time. After setting the policy in Active Directory and propagating it to the domain controllers, refresh the SCVMM server policy by running gpupdate /force , and confirm in RSOP that the value is Disabled .
    Note: If security signing is mandatory, then you need to enable Kerberos in the Nutanix cluster. In this case, it is important to ensure that the time remains synchronized between the Active Directory server, the Nutanix hosts, and the Nutanix Controller VMs. The Nutanix hosts and the Controller VMs set their NTP server as the Active Directory server, so it should be sufficient to ensure that Active Directory domain is configured correctly for consistent time synchronization.

Accounts and Privileges:

  • When adding a host or a cluster to the SCVMM, the run-as account you are specifying for managing the host or cluster must be different from the service account that was used to install SCVMM.
  • Run-as account must be a domain account and must have local administrator privileges on the Nutanix hosts. This can be a domain administrator account. When the Nutanix hosts are joined to the domain, domain administrator accounts automatically take administrator privileges on the host. If the domain account used as the run-as account in SCVMM is not a domain administrator account, you need to manually add it to the list of local administrators on each host by running sconfig .
    • An SCVMM domain account with administrator privileges on SCVMM and PowerShell remote execution privileges.
  • If you want to install the SCVMM server, a service account with local administrator privileges on the SCVMM server.

IP Addresses

  • One IP address for each Nutanix host.
  • One IP address for each Nutanix Controller VM.
  • One IP address for each Nutanix host IPMI interface.
  • One IP address for the Nutanix storage cluster.
  • One IP address for the Hyper-V failover cluster.
Note: For N nodes, (3*N + 2) IP addresses are required. All IP addresses must be in the same subnet.
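The formula in the note can be sanity-checked directly. The snippet below is illustrative only (N=4 is an example node count):

```shell
# Illustrative check of the note's formula: an N-node cluster needs
# 3 addresses per node (host, Controller VM, IPMI) plus one for the
# storage cluster and one for the Hyper-V failover cluster.
N=4    # example node count
echo $((3 * N + 2))    # prints 14
```

For example, a 4-node cluster therefore needs 14 addresses, all in the same subnet.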

DNS Requirements

  • Each Nutanix host must be assigned a name of 15 characters or less, which gets automatically added to the DNS server during domain joining.
  • The Nutanix storage cluster needs to be assigned a name of 15 characters or less, which must be added to the DNS server when the storage cluster is joined to the domain.
  • The Hyper-V failover cluster must be assigned a name of 15 characters or less, which gets automatically added to the DNS server when the failover cluster is created.
  • After the Hyper-V configuration, all names must resolve to an IP address from the Nutanix hosts, the SCVMM server (if applicable), and any other host that needs access to the Nutanix storage, for example, a host running Hyper-V Manager.
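The 15-character limit above applies to host, storage cluster, and failover cluster names alike. A minimal sketch of the length check; the names below are hypothetical examples, not required values:

```shell
# Minimal sketch of the 15-character NetBIOS name-length check.
# The names passed in are hypothetical examples.
check_name() {
  name="$1"
  if [ "${#name}" -le 15 ]; then
    echo "ok: $name"
  else
    echo "too long: $name"
  fi
}
check_name "NTNX-Cluster-01"      # 15 characters -> ok
check_name "NTNX-LongClusterName" # 20 characters -> too long
```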

Storage Access Requirements

  • Virtual machine and virtual disk paths must always refer to the Nutanix storage cluster by name, not the external IP address. If you use the IP address, it directs all the I/O to a single node in the cluster and thereby compromises performance and scalability.
    Note: For external non-Nutanix hosts that need to access Nutanix SMB shares, see the Nutanix SMB Shares Connection Requirements from Outside the Cluster topic.
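As a hedged illustration of the by-name rule, the sketch below flags UNC paths that start with an IP address rather than the storage cluster name. The paths and the helper name is_ip_path are invented for this example:

```shell
# Hypothetical helper: flag UNC paths that reference storage by IP address.
# The paths below are examples only; always refer to the cluster by name.
is_ip_path() {
  case "$1" in
    '\\'[0-9]*) printf 'uses IP: %s\n' "$1" ;;   # I/O funnels to one node
    *)          printf 'uses name: %s\n' "$1" ;; # preferred form
  esac
}
is_ip_path '\\ntnx-cluster\container\vm1\disk.vhdx'   # uses name
is_ip_path '\\10.1.1.50\container\vm1\disk.vhdx'      # uses IP
```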

Host Maintenance Requirements

  • When applying Windows updates to the Nutanix hosts, restart the hosts one at a time, ensuring that Nutanix services come up fully in the Controller VM of the restarted host before updating the next host. You can accomplish this by using Cluster-Aware Updating with a Nutanix-provided script, which can be plugged into the Cluster-Aware Update Manager as a pre-update script. This pre-update script ensures that the Nutanix services go down on only one host at a time, ensuring availability of storage throughout the update procedure. For more information about Cluster-Aware Updating, see Installing Windows Updates with Cluster-Aware Updating.
    Note: Ensure that automatic Windows updates are not enabled for the Nutanix hosts in the domain policies.

General Host Requirements

  • Hyper-V hosts must have the remote script execution policy set to at least RemoteSigned . A Restricted setting might cause issues when you reboot the CVM.
Note: Nutanix supports the installation of language packs for Hyper-V hosts.

Limitations and Guidelines

Nutanix clusters running Hyper-V have the following limitations. Certain limitations might be attributable to other software or hardware vendors:

Guidelines

Hyper-V 2016 Clusters and Support for Windows Server 2016
  • VHD Set files (.vhds) are a new shared virtual disk model for guest clusters in Windows Server 2016 and are not supported. You can import existing shared .vhdx disks to Windows Server 2016 clusters. New VHDX format sharing is supported; only fixed-size VHDX sharing is supported.

    Use the PowerShell Add-VMHardDiskDrive command to attach any existing or new VHDX file in shared mode to VMs. For example: Add-VMHardDiskDrive -VMName Node1 -Path \\gogo\smbcontainer\TestDisk\Shared.vhdx -SupportPersistentReservations .

Upgrading Hyper-V Hypervisor Hosts
  • When upgrading hosts to Hyper-V 2016, 2019, and later versions, the local administrator user name and password are reset to the default administrator name Administrator and password nutanix/4u. Any previous changes to the administrator name or password are overwritten.
General Guidelines
  • Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.
  • If you are destroying a cluster and creating a new one and want to reuse the hostnames, failover cluster name, and storage object name of the previous cluster, remove their computer accounts and objects from AD and DNS first.

Limitations

  • Intel Advanced Network Services (ANS) is not compatible with Load Balancing and Failover (LBFO), the built-in NIC teaming feature in Hyper-V. For more information, see the Intel support article, Teaming with Intel® Advanced Network Services .
  • Nutanix does not support the online resizing of the shared virtual hard disks (VHDX files).

Configuration Scenarios

After using Foundation to create a cluster, you can use the Nutanix web console to join the Hyper-V cluster and its constituent hosts to the domain, create the Hyper-V failover cluster, and enable Kerberos.

Note: If you are installing Windows Server 2016, you do not have to enable Kerberos. Kerberos is enabled during cluster creation.

You can then use the setup_hyperv.py script to add host and storage to SCVMM, configure a Nutanix library share in SCVMM, and register Nutanix storage containers as file shares in SCVMM.

Note: You can use the setup_hyperv.py script only with a standalone SCVMM instance. The script does not work with an SCVMM cluster.

The usage of the setup_hyperv.py script is as follows.

nutanix@cvm$ setup_hyperv.py flags command
commands:
register_shares
setup_scvmm

Nonconfigurable Components

The components listed here are configured by the Nutanix manufacturing and installation processes. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts ( admin or nutanix ), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the use of third-party storage on hosts that are part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

Hyper-V Settings

  • Cluster name (using the web console)
  • Controller VM name
  • Controller VM virtual hardware configuration file (.xml file in Hyper-V version 2012 R2 and earlier and .vmcx file in Hyper-V version 2016 and later). Each AOS version and upgrade includes a specific Controller VM virtual hardware configuration. Therefore, do not edit or otherwise modify the Controller VM virtual hardware configuration file.
  • Host name (you can configure the host name only at the time of creating and expanding the cluster)
  • Internal switch settings (internal virtual switch and internal virtual network adapter) and external network adapter name

    Two virtual switches are created on the Nutanix host, ExternalSwitch and InternalSwitch. Two virtual network adapters are created on the host corresponding to these virtual switches, vEthernet (ExternalSwitch) and vEthernet (InternalSwitch).

    Note: Do not delete these switches and adapters. Do not change the names of the internal virtual switch, internal virtual network adapter, and external virtual network adapter. You can change the name of the external virtual switch. For more information about changing the name of the external virtual switch, see Updating the Cluster After Renaming the Hyper-V External Virtual Switch.
  • Windows roles and features

    Do not install any new Windows roles or features on the Nutanix hosts. This especially includes the Multipath IO feature, which can cause the Nutanix storage to become unavailable.

    Do not apply GPOs to the Nutanix nodes that impact the Log on as a service right. Nutanix recommends that you do not remove the default entries for the following services.

    NT Service\All Services

    NT Virtual Machine\Virtual Machines

  • Note: This best practice helps keep the host operating system free of roles, features, and applications that aren't required to run Hyper-V. For more information, see the Hyper-V should be the only enabled role document in the Microsoft documentation portal.
  • Controller VM pre-configured VM setting of Automatic Start Action
  • Controller VM high-availability setting
  • Controller VM operations: migrating, saving state, or taking checkpoints of the Controller VM

Adding the Cluster and Hosts to a Domain

After completing Foundation of the cluster, add the cluster and its constituent hosts to the Active Directory (AD) domain. Adding the cluster and hosts to the domain facilitates centralized administration and security through other Microsoft services such as Group Policy, and enables administrators to manage the distribution of updates and hotfixes.

Before you begin

  • If you have a VLAN segmented network, verify that you have assigned the VLAN tags to the Hyper-V hosts and Controller VMs. For information about how to configure VLANs for the Controller VM, see the Advanced Setup Guide.
  • Ensure that you have valid credentials of the domain account that has the privileges to create a new computer account or modify an existing computer account in the Active Directory domain. An Active Directory domain created by using non-ASCII text may not be supported. For more information about usage of ASCII or non-ASCII text in Active Directory configuration, see Internationalization (i18n) .

Procedure

  1. Log on to the Web Console by using one of the Controller VM IP addresses or the cluster virtual IP address.
  2. Click the gear icon in the main menu and select Join Cluster and Hosts to the Domain on the Settings page.
    Figure. Join Cluster and Hosts to the Domain
    Click to enlarge A sample image of the Join Cluster and Hosts to the Domain menu used to add a cluster and its constituent hosts to an AD domain.

  3. Enter the fully qualified name of the domain that you want to join the cluster and its constituent hosts to in the Full Domain Name box.
  4. Enter the IP address of the name server in the Name Server IP Address box that can resolve the domain name that you have entered in the Full Domain Name box.
  5. In the Base OU Path box, type the OU (organizational unit) path where the computer accounts must be stored after the host joins a domain. For example, if the organization is nutanix.com and the OU is Documentation, the Base OU Path can be specified as OU=Documentation,DC=nutanix,DC=com
    Specifying the Base OU Path is optional. When you specify the Base OU Path, the computer accounts are stored in the Base OU Path within the Active Directory after the hosts join a domain. If the Base OU Path is not specified, the computer accounts are stored in the default Computers OU.
  6. Enter a name for the cluster in the Nutanix Cluster Name box.
    The cluster name must not be more than 15 characters and must be a valid NetBIOS name.
  7. Enter the virtual IP address of the cluster in the Nutanix Cluster Virtual IP Address box.
    If you have not already configured the virtual IP address of the cluster, you can configure it by using this box.
  8. Enter the prefix that should be used to name the hosts (according to your convention) in the Prefix box.
    • The prefix must not end with a period.
    • The prefix must not be more than 11 characters.
    • The prefix must be a valid NetBIOS name.

      For example, if you enter prefix name as Tulip, the hosts are named as Tulip-1, Tulip-2, and so on, in the increasing order of the external IP address of the hosts.

    If you do not provide a prefix, the default name NTNX-block-number is used. Click Advanced View to see the expanded view of all the hosts in all the blocks of the cluster and to rename them individually.
  9. In the Credentials field, enter the logon name and password of the domain account that has the privileges to create new or modify existing computer accounts in the Active Directory domain.
    Ensure that the logon name is in the DOMAIN\USERNAME format. The cluster and its constituent hosts require these credentials to join the AD domain. Nutanix does not store the credentials.
  10. When all the information is correct, click Join .
    The cluster is added to the domain. Also, all the hosts are renamed, added to the domain, and restarted. Allow the hosts and Controller VMs a few minutes to start up. After the cluster is ready, the logon page is displayed.
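The host-naming convention from step 8 can be sketched as follows. "Tulip" is the example prefix from the procedure, and the loop is illustrative only:

```shell
# Illustrative only: "Tulip" is the example prefix from step 8; real host
# names follow the increasing order of the hosts' external IP addresses.
prefix="Tulip"
# The prefix must be at most 11 characters so that "<prefix>-NN" stays
# within the 15-character NetBIOS limit.
[ "${#prefix}" -le 11 ] || echo "prefix too long: $prefix"
for i in 1 2 3; do
  echo "${prefix}-${i}"
done
# prints Tulip-1, Tulip-2, Tulip-3
```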

What to do next

Create a Microsoft failover cluster. For more information, see Creating a Failover Cluster for Hyper-V.

Creating a Failover Cluster for Hyper-V

Before you begin

Perform the following tasks before you create a failover cluster:

Perform the following procedure to create a failover cluster that includes all the hosts in the cluster.

Procedure

  1. Log on to the Prism Element web console by using one of the Controller VM IP addresses or by using the cluster virtual IP address.
  2. Click the gear icon in the main menu and select Configure Failover Cluster from the Settings page.
    Figure. Configure Failover Cluster
    Click to enlarge Configuring Failover Cluster

  3. Type the failover cluster name in the Failover Cluster Name text box.
    The failover cluster name must not be more than 15 characters and must be a valid NetBIOS name.
  4. Type an IP address for the Hyper-V failover cluster in the Failover Cluster IP Address text box.
    This address is for the cluster of Hyper-V hosts that are currently being configured. It must be unique, different from the cluster virtual IP address and from all other IP addresses assigned to the hosts and Controller VMs. It must be in the same network range as the Hyper-V hosts.
  5. In the Credentials field, type the logon name and password of the domain account that has the privileges to create a new account or modify existing accounts in the Active Directory domain.
    The logon name must be in the format DOMAIN\USERNAME . The credentials are required to create a failover cluster. Nutanix does not store the credentials.
  6. Click Create Cluster .
    A failover cluster is created by the name that has been provided and it includes all the hosts in the cluster.
    For information on manually creating a failover cluster, see Manually Creating a Failover Cluster (SCVMM User Interface).

Manually Creating a Failover Cluster (SCVMM User Interface)

Join the hosts to the domain as described in Adding the Cluster and Hosts to a Domain in the Hyper-V Administration for Acropolis guide.

About this task

Perform the following procedure to manually create a failover cluster for Hyper-V by using System Center VM Manager (SCVMM).

If you are not using SCVMM or are using Hyper-V Manager, see Creating a Failover Cluster for Hyper-V.

Procedure

  1. Start the Failover Cluster Manager utility.
  2. Right-click and select Create Cluster , and click Next .
  3. Enter all the hosts that you want to add to the Failover cluster, and click Next .
  4. Select the No. I do not require support from Microsoft for this cluster, and therefore do not want to run the validation tests. When I click Next continue creating the cluster option, and click Next .
    Note:

    If you select Yes , two tests fail when you run the cluster validation tests. The tests fail because the internal network adapter on each host is configured with the same IP address (192.168.5.1). The network validation tests fail with the following error message:

    Duplicate IP address

    The failures are false positives: the internal network is reachable only within a host, so the internal adapter can safely have the same IP address on different hosts. The second test, Validate Network Communication, also fails due to the presence of the internal network adapter. Both failures are benign and can be ignored.

  5. Enter a name for the cluster, specify a static IP address, and click Next .
  6. Clear the Add all eligible storage to the cluster check box, and click Next .
  7. Wait until the cluster is created. After you receive the message that the cluster is successfully created, click Finish to exit the Cluster Creation wizard.
  8. Go to Networks in the cluster tree and select Cluster Network 1 and ensure it is in the internal network by verifying the IP address in the summary pane. The IP address must be 192.168.5.0/24 as shown in the following screen shot.
    Figure. Failover Cluster Manager Click to enlarge

  9. Click the Action tab on the toolbar and select Live Migration Settings .
  10. Remove Cluster Network 1 from Networks for Live Migration and click OK .
    Note: If you do not perform this step, live migrations fail because the internal network is added to the live migration network lists. Log on to SCVMM, add the cluster to SCVMM, check the host migration setting, and ensure that the internal network is not listed.

Changing the Failover Cluster IP Address

About this task

Perform the following procedure to change your Hyper-V failover cluster IP address.

Procedure

  1. Open Failover Cluster Manager and connect to your cluster.
  2. Enter the name of any one of the Hyper-V hosts and click OK .
  3. In the Failover Cluster Manager pane, select your cluster and expand Cluster Core Resources .
  4. Right-click the cluster, and select Properties > IP address .
  5. Change the IP address of your failover cluster using the Edit option and click OK .
  6. Click Apply .

Enabling Kerberos for Hyper-V

If you are running Windows Server 2012 R2, perform the following procedure to configure Kerberos to secure the storage. You do not have to perform this procedure for Windows Server 2016 because Kerberos is enabled automatically during failover cluster creation.

Before you begin

  • Join the hosts to the domain as described in Adding the Cluster and Hosts to a Domain.
  • Verify that you have configured a service account for delegation. For more information on enabling delegation, see the Microsoft documentation .

Procedure

  1. Log on to the web console by using one of the Controller VM IP addresses or by using the cluster virtual IP address.
  2. Click the gear icon in the main menu and select Kerberos Management from the Settings page.
    Figure. Configure Failover Cluster Click to enlarge Enabling Kerberos

  3. Set the Kerberos Required option to enabled.
  4. In the Credentials field, type the logon name and password of the domain account that has the privileges to create and modify the virtual computer object representing the cluster in Active Directory. The credentials are required for enabling Kerberos.
    The logon name must be in the format DOMAIN\USERNAME . Nutanix does not store the credentials.
  5. Click Save .

Configuring the Hyper-V Computer Object by Using Kerberos

About this task

Perform the following procedure to complete the configuration of the Hyper-V Computer Object by using Kerberos and SMB signing (for enhanced security).
Note: Nutanix recommends that you configure Kerberos during a maintenance window to ensure cluster stability and prevent loss of storage access for user VMs.

Procedure

  1. Log on to Domain Controller and perform the following for each Hyper-V host computer object.
    1. Right-click the host object, and go to Properties . In the Delegation tab, select the Trust this computer for delegation to specified services only option, and select Use any authentication protocol .
    2. Click Add to add the cifs of the Nutanix storage cluster object.
    Figure. Adding the cifs of the Nutanix storage cluster object Click to enlarge

  2. Check the Service Principal Name (SPN) of the Nutanix storage cluster object.
    > Setspn -l name_of_cluster_object

    Replace name_of_cluster_object with the name of the Nutanix storage cluster object.

    Output similar to the following is displayed.

    Figure. SPN Registration Click to enlarge

    If the SPN is not registered for the Nutanix storage cluster object, create the SPN by running the following commands.

    > Setspn -S cifs/name_of_cluster_object name_of_cluster_object
    > Setspn -S cifs/FQDN_of_the_cluster_object name_of_cluster_object

    Replace name_of_cluster_object with the name of the Nutanix storage cluster object and FQDN_of_the_cluster_object with the domain name of the Nutanix storage cluster object.

    Example

    > Setspn -S cifs/virat virat
    > Setspn -S cifs/virat.sre.local virat
    
  3. [Optional] To enable the SMB signing feature, log on to each Hyper-V host by using RDP and run the following PowerShell command to change the Require Security Signature setting to True .
    > Set-SMBClientConfiguration -RequireSecuritySignature $True -Force
    Caution: The SMB server communicates only with SMB clients that can perform SMB packet signing. Therefore, if you enable the SMB signing feature, you must enable it on all the Hyper-V hosts in the cluster.

Disabling Kerberos for Hyper-V

Perform the following procedure to disable Kerberos.

Procedure

  1. Disable SMB signing.
    Log on to each Hyper-V host by using RDP and run the following PowerShell command to change the Require Security Signature setting to False .
    Set-SMBClientConfiguration -RequireSecuritySignature $False -Force
  2. Disable Kerberos from the Prism web console.
    1. Log into the web console by using one of the Controller VM IP addresses or the cluster virtual IP address.
    2. From the gear icon, click Kerberos Management .
    3. Set the Kerberos Required button to disabled.
    4. In the Credentials field, type the logon name and password of the domain account that has the privileges to create and modify the virtual computer object representing the cluster in Active Directory. The credentials are required for disabling Kerberos.
      The logon name must be in the format DOMAIN\USERNAME . Nutanix does not store the credentials.
    5. Click Save .

Setting Up Hyper-V Manager

Perform the following steps to set up Hyper-V Manager.

Before you begin

  • Add the server running Hyper-V Manager to the allowlist by using the Prism user interface. For more information, see Configuring a Filesystem Whitelist in the Prism Web Console Guide .
  • If Kerberos is enabled for accessing storage (by default it is disabled), enable SMB delegation.

Procedure

  1. Log into the Hyper-V Manager.
  2. Right-click the Hyper-V Manager and select Connect to Server .
  3. Type the name of the host that you want to add and click OK .
  4. Right-click the host and select Hyper-V Settings .
  5. Click Virtual Hard Disks and verify that the location for storing virtual hard disk files is the same as the one you specified during storage container creation.
    For more information, see Creating a Storage Container in the Prism Web Console Guide .
  6. Click Virtual Machines and verify that the location for storing virtual machine configuration files is the same as the one you specified during storage container creation.
    For more information, see Creating a Storage Container in the Prism Web Console Guide .
    After performing these steps, you are ready to create and manage virtual machines by using Hyper-V Manager.
    Warning: Never define virtual machines created by using Hyper-V on storage specified by an IP-based SMB share location. Always refer to the storage cluster by name.

Cluster Management

Installing Windows Updates with Cluster-Aware Updating

With storage containers that are configured with a replication factor of 2, Nutanix clusters can tolerate only a single node being down at a time. For such clusters, you need a way to update nodes one node at a time.

If your Nutanix cluster runs Microsoft Hyper-V, you can use the Cluster-Aware Updating (CAU) utility, which ensures that only one node is down at a time when Windows updates are applied.

Note: Nutanix does not recommend performing a manual patch installation for a Hyper-V cluster running on the Nutanix platform.

The procedure for configuring CAU for a Hyper-V cluster running on the Nutanix platform is the same as that for a Hyper-V cluster running on any other platform. However, for a Hyper-V cluster running on Nutanix, you need to use a Nutanix pre-update script created specifically for Nutanix clusters. The pre-update script ensures that the CAU utility does not proceed to the next node until the Controller VM on the node that was updated is fully back up, preventing a condition in which multiple Controller VMs are down at the same time.

The CAU utility might not install all the recommended updates, and you might have to install some updates manually. For a complete list of recommended updates, see the following articles in the Microsoft documentation portal.

  • Recommended hotfixes, updates, and known solutions for Windows Server 2012 R2 Hyper-V environments
  • Recommended hotfixes and updates for Windows Server 2012 R2-based failover clusters

Revisit these articles periodically and install any updates that are added to the list.

Note: Ensure that the Nutanix Controller VM and the Hyper-V host are placed in maintenance mode before any maintenance or patch installation. For more information, see Placing the Controller VM and Hyper-V Host in Maintenance Mode.

Preparing to Configure Cluster-Aware Updating

Configure your environment to run the Nutanix pre-update script for Cluster-Aware Updating. The Nutanix pre-update script is named cau_preupdate.ps1 and is, by default, located on each Hyper-V host in C:\Program Files\Nutanix\Utils\ . To ensure smooth configuration, make sure you have everything you need before you begin to configure CAU.

Before you begin

  • Review the required and recommended Windows updates for your cluster.
  • See the Microsoft documentation for information about the Cluster-Aware Updating feature. In particular, see the requirements and best practices for Cluster-Aware Updating in the Microsoft documentation portal.
  • To enable the migration of virtual machines from one node to another, configure the virtual machines for high availability.

About this task

To configure your environment to run the Nutanix pre-update script, do the following:

Procedure

  1. If you plan to use self-updating mode, do the following:
    1. On each Hyper-V host and on the management workstation that you are using to configure CAU, create a directory such that the path to the directory and the directory name do not contain spaces (for example, C:\cau ).
      Note: The location of the directory must be the same on the hosts and the management workstation.
    2. From C:\Program Files\Nutanix\Utils\ on each host, copy the Nutanix pre-update file cau_preupdate.ps1 to the directory you created on the hosts and on the management workstation.

    A directory whose path does not contain spaces is necessary because Microsoft does not support the use of spaces in the PreUpdateScript field. The space in the default path ( C:\Program Files\Nutanix\Utils\ ) prevents the cluster from updating itself in the self-updating mode. However, that space does not cause issues if you update the cluster by using the remote-updating mode. If you plan to use only the remote-updating mode, you can use the pre-update script from its default location. If you plan to use the self-updating mode or both self-updating and remote-updating modes, use a directory whose path does not contain spaces.

  2. On each host, do the following.
    1. Unblock the script file.
      > powershell.exe Unblock-File -Path 'path-to-pre-update-script'

      Replace path-to-pre-update-script with the full path to the pre-update script (for example, C:\cau\cau_preupdate.ps1 ).

    2. Allow Windows PowerShell to run unsigned code.
      > powershell.exe  Set-ExecutionPolicy remoteSigned
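The no-spaces requirement called out in step 1 can be verified before configuring CAU. This is a minimal sketch; the paths are the examples from this procedure, and the helper name check_path is invented for illustration:

```shell
# Hypothetical helper check_path: verify that a CAU pre-update script path
# contains no spaces (spaces break the PreUpdateScript field in
# self-updating mode). The paths below are the examples from this procedure.
check_path() {
  case "$1" in
    *' '*) printf 'has spaces: %s\n' "$1" ;;  # unusable for self-updating
    *)     printf 'ok: %s\n' "$1" ;;
  esac
}
check_path 'C:\cau\cau_preupdate.ps1'
check_path 'C:\Program Files\Nutanix\Utils\cau_preupdate.ps1'
```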

Accessing the Cluster-Aware Updating Dialog Box

You configure CAU by using the Cluster-Aware Updating dialog box.

About this task

To access the Cluster-Aware Updating dialog box, do the following:

Procedure

  1. Open Failover Cluster Manager and connect to your cluster.
  2. In the Configure section, click Cluster-Aware Updating .
    Figure. Cluster-Aware Updating Dialog Box Click to enlarge "The Cluster-Aware Updating dialog box connects to a failover cluster. The dialog box displays the nodes in the cluster, a last update summary, logs of updates in progress, and links to CAU configuration options and wizards."

    The Cluster-Aware Updating dialog box appears. If the dialog box indicates that you are not connected to the cluster, in the Connect to a failover cluster field, enter the name of the cluster, and then click Connect .

Specifying the Nutanix Pre-Update Script in an Updating Run Profile

Specify the Nutanix pre-update script in an Updating Run and save the configuration to an Updating Run Profile in the XML format. This is a one-time task. The XML file contains the configuration for the cluster-update operation. You can reuse this file to drive cluster updates through both self-updating and remote-updating modes.

About this task

To specify the Nutanix pre-update script in an Updating Run Profile, do the following:

Procedure

  1. In the Cluster-Aware Updating dialog box, click Create or modify Updating Run Profile .
    You can see the current location of the XML file under the Updating Run profile to start from: field.
    Note: You cannot overwrite the default CAU configuration file, because non-local administrative users, including the AD administrative users, do not have permissions to modify files in the C:\Windows\System32\ directory.
  2. Click Save As .
  3. Select a new location for the file and rename the file. For example, you can rename the file to msfc_updating_run_profile.xml and save it to the following location: C:\Users\administrator\Documents .
  4. Click Save .
  5. In the Cluster-Aware Updating dialog box, under Cluster Actions , click Configure cluster self-updating options .
  6. Go to Input Settings > Advanced Options and, in the Updating Run options based on: field, click Browse to select the location to which you saved the XML file in an earlier step.
  7. In the Updating Run Profile Editor dialog box, in the PreUpdateScript field, specify the full path to the cau_preupdate.ps1 script. The default full path is C:\Program Files\Nutanix\Utils\cau_preupdate.ps1 . The default path is acceptable if you plan to use only the remote-updating mode. If you plan to use the self-updating mode, place cau_preupdate.ps1 in a directory such that the path does not include spaces. For more information, see Preparing to Configure Cluster-Aware Updating.
    Note: You can also place the script on the SMB file share if you can access the SMB file share from all your hosts and the workstation that you are configuring the CAU from.
  8. Click Save .
    Caution: Do not change the auto-populated ConfigurationName field value. Otherwise, the script fails.
    The CAU configuration is saved to an XML file in the following folder: C:\Windows\System32

What to do next

Save the Updating Run Profile to another location and use it for any other cluster updates.

Updating a Cluster by Using the Remote-Updating Mode

You can update the cluster by using the remote-updating mode to verify that CAU is configured and working correctly. You might need the remote-updating mode even when you have configured the self-updating mode, typically for updates that cannot wait until the next self-updating run.

About this task

Note: Do not turn off your workstation until all updates have been installed.
To update a cluster by using the remote-updating mode, do the following:

Procedure

  1. In the Cluster-Aware Updating dialog box, click Apply updates to this cluster .
    The Cluster-Aware Updating Wizard appears.
  2. Read the information on the Getting Started page, and then click Next .
  3. On the Advanced Options page, do the following.
    1. In the Updating Run options based on field, enter the full path to the CAU configuration file that you created in Specifying the Nutanix Pre-Update Script in an Updating Run Profile .
    2. Ensure that the full path to the downloaded script is shown in the PreUpdateScript field and that the value in the CauPluginName field is Microsoft.WindowsUpdatePlugin .
  4. On the Additional Update Options page, do the following.
    1. If you want to include recommended updates, select the Give me recommended updates the same way that I receive important updates check box.
    2. Click Next .
  5. On the Completion page, click Close .
    The update process begins.
  6. In the Cluster-Aware Updating dialog box, click the Log of Updates in Progress tab and monitor the update process.

Updating a Cluster by Using the Self-Updating Mode

The self-updating mode ensures that the cluster is up-to-date at all times.

About this task

To configure the self-updating mode, do the following:

Procedure

  1. In the Cluster-Aware Updating dialog box, click Configure cluster self-updating options .
    The Configure Self-Updating Options Wizard appears.
  2. Read the information on the Getting Started page, and then click Next .
  3. On the Add Clustered Role page, do the following.
    1. Select the Add the CAU clustered role, with self-updating mode enabled, to this cluster check box.
    2. If you have a prestaged computer account, select the I have a prestaged computer object for the CAU clustered role check box. Otherwise, leave the check box clear.
  4. On the Self-updating schedule page, specify details such as the self-updating frequency and start date.
  5. On the Advanced Options page, do the following.
    1. In the Updating Run options based on field, enter the full path to the CAU configuration file that you created in Specifying the Nutanix Pre-Update Script in an Updating Run Profile .
    2. Ensure that the full path to the Nutanix pre-update script is shown in the PreUpdateScript field and that the value in the CauPluginName field is Microsoft.WindowsUpdatePlugin .
  6. On the Additional Update Options page, do the following.
    1. If you want to include recommended updates, select the Give me recommended updates the same way that I receive important updates check box.
    2. Click Next .
  7. Click Close .

Moving a Hyper-V Cluster to a Different Domain

This topic describes the supported procedure for moving all the hosts in a Nutanix cluster running Hyper-V from one domain to another. For example, you might need to do this when you are ready to transition a test cluster to your production environment. Ensure that you merge all VM checkpoints before moving the VMs to another domain. VMs fail to start in the new domain if they have multiple checkpoints.

Before you begin

This method involves cluster downtime. Therefore, schedule a maintenance window to perform the following operations.

Procedure

  1. Note: If you are using System Center Virtual Machine Manager (SCVMM) to manage the cluster, remove the cluster from the SCVMM console. Right-click the cluster in the SCVMM console, and select Remove .
    Destroy the Hyper-V failover cluster by using Failover Cluster Manager or PowerShell.
    Note:

    • Remove all the roles from the cluster before destroying the cluster by doing either of the following:
      • Open Failover Cluster Manager, select Roles in the left navigation pane, select all the VMs, and then select Remove .
      • Log on to any Hyper-V host with domain administrator user credentials and remove the roles with the PowerShell command Get-ClusterGroup | Remove-ClusterGroup -RemoveResources -Force .
    • Destroying the cluster permanently removes any non-VM roles. The VMs themselves are not affected, but they remain visible only in Hyper-V Manager.
    To destroy the cluster, do either of the following:
    • Open Failover Cluster Manager, right-click the cluster, and select More Actions > Destroy Cluster .
    • Log on to any Hyper-V host with domain administrator user credentials and remove the cluster with the PowerShell command Remove-Cluster -Force -CleanupAD , which ensures that all Active Directory objects (all hosts in the Nutanix cluster, the Hyper-V failover cluster object, and the Nutanix storage cluster object) and any corresponding entries are deleted.
  2. Log on to any controller VM in the cluster and remove the Nutanix cluster from the domain by using nCLI; ensure that you also specify the Active Directory administrator user name.
    nutanix@cvm$ ncli cluster unjoin-domain logon-name=domain\username
  3. Log on to each host as the domain administrator user and remove the domain security identifiers from the virtual machines.
    > $d = (Get-WMIObject Win32_ComputerSystem).Domain.Split(".")[0]
    > Get-VMConnectAccess | Where {$_.username.StartsWith("$d\")} | `
      Foreach {Revoke-VMConnectAccess -VMName * -UserName $_.UserName} 
  4. Caution:

    Ensure that all user VMs are powered off before performing this step.
    Log on to any controller VM in the cluster and remove all hosts in the Nutanix cluster from the domain.
    nutanix@cvm$ allssh 'source /etc/profile > /dev/null 2>&1; winsh "\$x=hostname; netdom \
      remove \$x /domain /force"'
  5. Restart all hosts.
  6. If a Controller VM fails to restart, use the Nutanix Repair-CVM PowerShell cmdlet to recover it. Otherwise, skip this step and perform the next step.
    1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
    2. Start the controller VM repair process.
      > Repair-CVM
      The CVM will be shutdown. Proceed (Y/N)? Y

      Progress is displayed in the PowerShell command-line shell. When the process is complete, the controller VM configuration information is displayed:

      Using the following configuration:
      
      Name                           Value
      ----                           -----
      internal_adapter_name          Internal
      name                           cvm-host-name
      external_adapter_name          External
      processor_count                8
      memory_weight                  100
      svmboot_iso_path               C:\Program Files\Nutanix\Cvm\cvm_name\svmboot.iso
      nutanix_path                   C:\Program Files\Nutanix
      vm_repository                  C:\Users\Administrator\Virtual Machines
      internal_vswitch_name          InternalSwitch
      processor_weight               200
      external_vswitch_name          ExternalSwitch
      memory_size_bytes              12884901888
      pipe_name                      \\.\pipe\SVMPipe

What to do next

Add the hosts to the new domain as described in Adding the Cluster and Hosts to a Domain.

Recover a Controller VM by Using Repair-CVM

The Repair-CVM PowerShell cmdlet can repair an unusable or deleted Controller VM by removing the existing Controller VM (if present) and creating a new one. Because of the Nutanix enterprise cloud platform design, no data associated with the unusable or deleted Controller VM is lost.

About this task

If a Controller VM already exists and is running, Repair-CVM prompts you to shut down the Controller VM so it can be deleted and re-created. If the Controller VM has been deleted, the cmdlet creates a new one. In all cases, the new CVM automatically powers on and joins the cluster.

A Controller VM is considered unusable when:

  • The Controller VM is accidentally deleted.
  • The Controller VM configuration is accidentally or unintentionally changed and the original configuration parameters are unavailable.
  • The Controller VM fails to restart after unjoining the cluster from a Hyper-V domain as part of a domain move procedure.

To use the cmdlet, log on to the Hyper-V host, type Repair-CVM, and follow any prompts. The repair process creates a new Controller VM based on any available existing configuration information. If the process cannot find the information or the information does not exist, the cmdlet prompts you for:

  • Controller VM name
  • Controller VM memory size in GB
  • Number of processors to assign to the Controller VM
Note: After running this command, you must manually reapply any custom configuration that you had made, for example, an increased RAM size.

Procedure

  1. Log on to the Hyper-V host with Remote Desktop Connection and start PowerShell.
  2. Start the controller VM repair process.
    > Repair-CVM
    The CVM will be shutdown. Proceed (Y/N)? Y

    Progress is displayed in the PowerShell command-line shell. When the process is complete, the controller VM configuration information is displayed:

    Using the following configuration:

    Name                           Value
    ----                           -----
    internal_adapter_name          Internal
    name                           cvm-host-name
    external_adapter_name          External
    processor_count                8
    memory_weight                  100
    svmboot_iso_path               C:\Program Files\Nutanix\Cvm\cvm_name\svmboot.iso
    nutanix_path                   C:\Program Files\Nutanix
    vm_repository                  C:\Users\Administrator\Virtual Machines
    internal_vswitch_name          InternalSwitch
    processor_weight               200
    external_vswitch_name          ExternalSwitch
    memory_size_bytes              12884901888
    pipe_name                      \\.\pipe\SVMPipe
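As a quick check of the values in this output, memory_size_bytes is expressed in bytes; converting it shows that the example Controller VM is configured with 12 GiB of memory:

```python
# Convert the memory_size_bytes value from the Repair-CVM output to GiB.
memory_size_bytes = 12884901888
memory_gib = memory_size_bytes / 1024**3  # 1 GiB = 1024^3 bytes
print(memory_gib)  # 12.0
```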

Connect to a Controller VM by Using Connect-CVM

Nutanix installs Hyper-V utilities on each Hyper-V host for troubleshooting and Controller VM access. This procedure describes how to use Connect-CVM to launch the FreeRDP utility to access a Controller VM console when a secure shell (SSH) is not available or cannot be used.

About this task

FreeRDP launches when you run the Connect-CVM command.

Procedure

  1. Log on to a Hyper-V host in your environment and open a PowerShell command window.
  2. Start Connect-CVM.
    > Connect-CVM
  3. In the authentication dialog box, type the local administrator credentials and click OK .
  4. At the FreeRDP console window, log on to the Controller VM by using the Controller VM credentials.

Changing the Name of the Nutanix Storage Cluster

The name of a Nutanix storage cluster cannot be changed by using the Web console.

About this task

To change the name of the Nutanix storage cluster, do the following:

Procedure

  1. Log on to the CVM with SSH.
  2. Unjoin the existing Nutanix storage cluster object from the domain.
    ncli> cluster unjoin-domain logon-name=domain\username
  3. Change the cluster name.
    ncli> cluster edit-params new-name=cluster_name

    Replace cluster_name with the new cluster name.

  4. Create a new AD object corresponding to the new storage cluster name.
    nutanix@cvm$ ncli cluster join-domain cluster-name=new_name domain=domain_name \
    external-ip-address=external_ip_address name-server-ip=dns_ip logon-name=domain\username
  5. Restart genesis on each Controller VM in the cluster.
    nutanix@cvm$ allssh 'genesis restart'
    A new entry for the cluster is created in \Windows\System32\drivers\etc\hosts on the Hyper-V hosts.

Changing the Nutanix Cluster External IP Address

About this task

To change the external IP address of the Nutanix cluster, do the following.

Procedure

  1. Log on to the Controller VM with SSH.
  2. Run the following command to change the cluster external IP address.
    nutanix@cvm$ ncli cluster edit-params external-ip-address external_ip_address
    Replace external_ip_address with the new Nutanix cluster external IP address.

Fast Clone a VM Based on Nutanix SMB Shares by using New-VMClone

This cmdlet fast-clones virtual machines that are based on Nutanix SMB shares and provides options for creating one or more clones from a given virtual machine.

About this task

Run Get-Help New-VMClone -Full to get detailed help on using the cmdlet with all the options that are available.

Note: This cmdlet does not support creating clones of VMs that have Hyper-V checkpoints.

Procedure

Log on to the Hyper-V host with a Remote Desktop Connection and open a PowerShell command window.
  • The syntax to create single clone is as follows.
    > New-VMClone -VM vm_name -CloneName clone_name -ComputerName computer_name`
     -DestinationUncPath destination_unc_path -PowerOn`
    -Credential prism_credential common_parameters
  • The syntax to create multiple clones is as follows.
    > New-VMClone -VM vm_name -CloneNamePrefix  clone_name_prefix`
    -CloneNameSuffixBegin clone_name_suffix_begin -NCopies n_copies`
    -ComputerName computer_name -DestinationUncPath destination_unc_path -PowerOn`
    -Credential prism_credential -MaxConcurrency max_concurrency common_parameters
  • Replace vm_name with the name of the VM that you are cloning.
  • Replace clone_name with the name of the VM that you are creating.
  • Replace clone_name_prefix with the prefix that should be used for naming the clones.
  • Replace clone_name_suffix_begin with the starting number of the suffix.
  • Replace n_copies with the number of clones that you need to create.
  • Replace computer_name with the name of the computer on which you are creating the clone.
  • Replace destination_unc_path with the path on the Nutanix SMB share where the clone is to be stored.
  • Replace prism_credential with the credential for accessing Prism (the Nutanix management service).
  • Replace max_concurrency with the number of clones to create in parallel.
  • Replace common_parameters with any additional parameters that you want to define, for example, the -Verbose flag.
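For the multi-clone form, the clone names are derived from the prefix and the starting suffix number. The following Python sketch is only an illustration of that naming pattern (it does not call Nutanix code, and the exact format produced by New-VMClone may differ):

```python
def clone_names(prefix, suffix_begin, n_copies):
    """Illustrate the presumed naming pattern: prefix plus an incrementing suffix."""
    return [f"{prefix}{suffix_begin + i}" for i in range(n_copies)]

# Roughly what -CloneNamePrefix web- -CloneNameSuffixBegin 1 -NCopies 3 might yield:
print(clone_names("web-", 1, 3))  # ['web-1', 'web-2', 'web-3']
```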

Change the Path of a VM Based on Nutanix SMB shares by using Set-VMPath

This cmdlet repairs the UNC paths in the metadata of VMs that are based on Nutanix SMB shares. It has the following two forms.

About this task

  • Replaces the specified IP address with the supplied DNS name for every occurrence of the IP address in the UNC paths in the VM metadata or configuration file.
  • Replaces the specified SMB server name with the supplied alternative in the UNC paths in the VM metadata without taking the case into consideration.
Note: The Set-VMPath cmdlet is not available in the 4.5 release. You can use this cmdlet with 4.5.1 or later releases.

Procedure

Log on to the Hyper-V host with a Remote Desktop Connection and open a PowerShell command window.
  • The syntax to change the IP address to DNS name is as follows.
    > Set-VMPath -VMId vm_id -IPAddress ip_address -DNSName dns_name common_parameters
  • The syntax to change the SMB server name is as follows.
    > Set-VMPath -VMId vm_id -SmbServerName smb_server_name`
    -ReplacementSmbServerName replacement_smb_server_name common_parameters
  • Replace vm_id with the ID of the VM.
  • Replace ip_address with the IP address that you want to replace in the VM metadata or configuration file.
  • Replace dns_name with the DNS name that you want to replace the IP address with.
  • Replace smb_server_name with the SMB server name that you want to replace.
  • Replace replacement_smb_server_name with the SMB server name that you want as a replacement.
  • Replace common_parameters with any additional parameters that you want to define, for example, the -Verbose flag.
Note: The target VM must be powered off for the operation to complete.
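Conceptually, the two forms of Set-VMPath perform string rewrites on the UNC paths stored in the VM metadata: the first substitutes a DNS name for every occurrence of an IP address, and the second replaces the SMB server name without regard to case. The Python sketch below models that behavior for illustration only; it is not the cmdlet's implementation, and the server names used are hypothetical:

```python
import re

def replace_ip_with_dns(unc_path, ip_address, dns_name):
    """Replace every occurrence of the IP address in a UNC path with the DNS name."""
    return unc_path.replace(ip_address, dns_name)

def replace_smb_server(unc_path, old_server, new_server):
    """Replace the SMB server name in a UNC path, ignoring case."""
    return re.sub(re.escape(old_server), new_server, unc_path, flags=re.IGNORECASE)

print(replace_ip_with_dns(r"\\10.4.34.44\ctr1\vm.vhdx", "10.4.34.44", "ntnx-smb"))
# \\ntnx-smb\ctr1\vm.vhdx
print(replace_smb_server(r"\\NTNX-SMB\ctr1\vm.vhdx", "ntnx-smb", "ntnx-new"))
# \\ntnx-new\ctr1\vm.vhdx
```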

Nutanix SMB Shares Connection Requirements from Outside the Cluster

Any external non-Nutanix host that needs to access Nutanix SMB shares must conform to the following requirements.

  • Any external non-Nutanix host that needs to access Nutanix SMB shares must run Windows 8 or later if it is a desktop client, or Windows Server 2012 or later if it is a server. This requirement exists because SMB 3.0 support is required for accessing Nutanix SMB shares.
  • The IP address of the host must be allowed in the Nutanix storage cluster.
    Note: The SCVMM host IP address is automatically included in the allowlist during the setup. For other IP addresses, you can add those source addresses to the allowlist after the setup configuration is completed by using the Web Console or the nCLI cluster add-to-nfs-whitelist command.
  • For accessing a Nutanix SMB share from Windows 10 or Windows Server 2016, you must enable Kerberos on the Nutanix cluster.
  • If Kerberos is not enabled in the Nutanix storage cluster (the default configuration), the SMB client in the host must not have RequireSecuritySignature set to True. You can verify this setting by running Get-SmbClientConfiguration on the host. For more information about checking the policy, see System Center Virtual Machine Manager Configuration . If the SMB client is running on a Windows desktop instead of Windows Server, the account used to log on to the desktop must not be linked to an external Microsoft account.
  • If Kerberos is enabled in the Nutanix storage cluster, you can access the storage only by using the DNS name of the Nutanix storage cluster, and not by using the external IP address of the cluster.
Warning: Nutanix does not support using Hyper-V SMB shares for storing anything other than virtual machine disks (for example, VHD and VHDX files) and their associated configuration files. This includes, but is not limited to, using Nutanix SMB shares of Hyper-V for general file sharing, for virtual machine and configuration files of VMs running outside the Nutanix nodes, or for any other type of hosted repository not based on virtual machine disks.

Updating the Cluster After Renaming the Hyper-V External Virtual Switch

About this task

You can rename the external virtual switch on your Hyper-V cluster to a name of your choice. After you rename the external virtual switch, you must update the new name in AOS so that AOS upgrades and VM migrations do not fail.

Note: In releases earlier than AOS 5.11, the name of the external virtual switch in your Hyper-V cluster must be ExternalSwitch .

See the Microsoft documentation for instructions about how to rename the external virtual switch.

Perform the following steps after you rename the external virtual switch.

Procedure

  1. Log on to a CVM with SSH.
  2. Restart Genesis on all the CVMs in the cluster.
    nutanix@cvm$ genesis restart
  3. Refresh all the guest VMs.
    1. Log on to a Hyper-V host.
    2. Go to Hyper-V Manager, select the VM and, in Settings , click the Refresh icon.
    See the Microsoft documentation for the updated instructions about how to refresh the guest VMs.

Upgrade to Windows Server Version 2016, 2019, and 2022

The following procedures describe how to upgrade earlier releases of Windows Server to Windows Server 2016, 2019, and 2022. For information about fresh installation of Windows Server, see Hyper-V Configuration.
Note: If you are upgrading from Windows Server 2012 R2 and if the AOS version is less than 5.11, then upgrade to Windows Server 2016 first and then upgrade to AOS 5.17. Proceed with upgrading to Windows Server 2019 if necessary.

Hyper-V Hypervisor Upgrade Recommendations, Requirements, and Limitations

This section provides the recommendations, requirements, and limitations for upgrading Hyper-V.

Recommendations

Nutanix recommends that you schedule a sufficiently long maintenance window to upgrade your Hyper-V clusters.

Budget sufficient time to upgrade: Depending on the number of VMs running on a node before the upgrade, a node could take more than 1.5 hours to upgrade. For example, the total time to upgrade a Hyper-V cluster from Hyper-V 2016 to Hyper-V 2019 is approximately the time per node multiplied by the number of nodes. Upgrading can take longer if you also need to upgrade your AOS version.
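The maintenance-window arithmetic above can be sketched as follows; the 1.5-hour figure is the per-node example from this section, and the four-node count is hypothetical:

```python
def estimated_upgrade_hours(node_count, hours_per_node=1.5):
    """Rough Hyper-V upgrade estimate: per-node time multiplied by the number of nodes."""
    return node_count * hours_per_node

print(estimated_upgrade_hours(4))  # 6.0 hours for a hypothetical four-node cluster
```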

Requirements

Note:
  • You can upgrade to Windows Server 2022 Hyper-V only from a Hyper-V 2019 cluster.
  • Upgrade to Windows Server 2022 Hyper-V from an LACP enabled Hyper-V 2019 cluster is not supported.
  • Direct upgrade to Windows Server 2022 Hyper-V from Hyper-V 2016 or Windows Server 2012 R2 is not supported.
  • For Windows Server 2022 Hyper-V, only NX Series G6 and later models are supported.
  • For Windows Server 2022 Hyper-V, SET is the default teaming mode. LBFO teaming is not supported on Windows Server 2022 Hyper-V.
  • For Hyper-V 2019, if you do not choose LACP/LAG, SET is the default teaming mode. NX Series G5 and later models support Hyper-V 2019.
  • For Hyper-V 2016, if you do not choose LACP/LAG, the teaming mode is Switch Independent LBFO teaming.
  • For Hyper-V (2016 and 2019), if you choose LACP/LAG, the teaming mode is Switch Dependent LBFO teaming.
  • The platform must not be a light-compute platform.
  • Before upgrading, disable or uninstall third-party antivirus or security filter drivers that modify Windows firewall rules. Windows firewalls must accept inbound and outbound SSH traffic outside of the domain rules.
  • Enable Kerberos when upgrading from Windows Server 2012 R2 to Windows Server 2016. For more information, see Enabling Kerberos for Hyper-V .
    Note: Kerberos is enabled by default when upgrading from Windows Server 2016 to Windows Server 2019.
  • Enable virtual machine migration on the host. Upgrading reimages the hypervisor. Any custom or non-standard hypervisor configurations could be lost after the upgrade is completed.
  • If you are using System Center for Virtual Machine Management (SCVMM) 2012, upgrade to SCVMM 2016 first before upgrading to Hyper-V 2016. Similarly, upgrade to SCVMM 2019 before upgrading to Hyper-V 2019 and upgrade to SCVMM 2022 before upgrading to Windows Server 2022 Hyper-V.
  • Upgrade using ISOs and the Nutanix JSON file:
    • The Prism Element web console supports 1-click upgrade ( Upgrade Software dialog box) of Hyper-V 2016, 2019, or 2022 by using the metadata upgrade JSON file, which is available on the Hypervisor Details page of the Nutanix Support portal, together with the Microsoft Hyper-V ISO file.
    • The Hyper-V upgrade JSON file, when used on clusters where Foundation 4.0 or later is installed, is available for Nutanix NX series G4 and later, Dell EMC XC series, and Lenovo Converged HX series platforms. You can upgrade hosts on these platforms to Hyper-V 2016 or 2019 (except Hyper-V 2019 on NX series G4) by using this JSON file.

Limitations

  • When upgrading hosts to Hyper-V 2016, 2019, and later versions, the local administrator user name and password are reset to the default administrator name Administrator and the password nutanix/4u. Any previous changes to the administrator name or password are overwritten.
  • VMs with any associated files on local storage are lost.
  • Logical networks are not restored immediately after the upgrade. If you configure logical switches, the configuration is not retained and VMs could become unavailable.
  • Any VMs created during the hypervisor upgrade (including as part of disaster recovery operations) and not marked as HA (High Availability) experience unavailability.
  • Disaster recovery: VMs with the Automatic Stop Action property set to Save are marked as CBR Not Capable if they are upgraded to version 8.0 after upgrading the hypervisor. Change the value of Automatic Stop Action to ShutDown or TurnOff when the VM is upgraded so that it is not marked as CBR Not Capable.
  • Enabling Link Aggregation Control Protocol (LACP) for your cluster deployment is supported when upgrading hypervisor hosts from Windows Server 2016 to 2019.

Upgrading to Windows Server Version 2016, 2019, and 2022

About this task

Note:
  • It is possible that clusters running Windows Server 2012 R2 and AOS have time synchronization issues. Therefore, before you upgrade to Windows Server 2016 or Windows Server 2019 and AOS, make sure that the cluster is free from time synchronization issues.
  • Windows Server 2016 also implements Discrete Device Assignment (DDA) for passing through PCI Express devices to guest VMs. This feature is available in Windows Server 2019 too. Therefore, DiskMonitorService, which was used in earlier AOS releases for passing disks through to the CVM, no longer exists. For more information about DDA, see the Microsoft documentation.

Procedure

  1. Make sure that AOS, host, and hypervisor upgrade prerequisites are met.
    For more information, see Hyper-V Hypervisor Upgrade Recommendations, Requirements, and Limitations and the Acropolis Upgrade Guide.
  2. Upgrade AOS by either using the one-click upgrade procedure or by uploading the installation files manually. Both procedures are performed in the Prism web console.
    • After upgrading AOS and before upgrading your cluster hypervisor, perform a Life Cycle Manager (LCM) inventory, update LCM, and upgrade any recommended firmware. For more information, see the Life Cycle Manager documentation .
    • For more information, including recommended installation or upgrade order, see the Acropolis Upgrade Guide.
  3. Do one of the following if you want to manage your VMs with SCVMM:
    1. If the Hyper-V cluster is registered with an SCVMM installation whose version is earlier than 2016, do the following in any order:
      • Unregister the cluster from SCVMM.
      • Upgrade SCVMM to version 2016. See Microsoft documentation for this upgrade procedure.
        Note: Do the same when upgrading from Hyper-V 2016 to 2019. Upgrade SCVMM to version 2019 and register the cluster to SCVMM 2019. Similarly, when upgrading to any higher version, upgrade SCVMM to that version and register the cluster to the upgraded SCVMM.
    2. If you do not have SCVMM, deploy SCVMM 2016, 2019, or 2022. See the Microsoft documentation for this installation procedure.
    Regardless of whether you deploy a new instance of SCVMM 2016 or you upgrade an existing SCVMM installation, do not register the Hyper-V cluster with SCVMM now. To minimize the steps in the overall upgrade workflow, register the cluster with SCVMM 2016 after you upgrade the Hyper-V hosts.
  4. If you are upgrading from Windows Server 2012 R2 to Windows Server 2016, then enable Kerberos. For more information, see Enabling Kerberos for Hyper-V.
  5. Upgrade the Hyper-V hosts.
  6. After the cluster is up, add the cluster to SCVMM 2016. The procedure for adding the cluster to SCVMM 2016 is the same as for earlier versions of SCVMM. For more information, see Registering a Cluster with SCVMM.
  7. Any log redirection (for example, SCOM log redirection) configurations are lost during the hypervisor upgrade process. Reconfigure log redirection.

System Center Virtual Machine Manager Configuration

System Center Virtual Machine Manager (SCVMM) is a management platform for Hyper-V clusters. Nutanix provides a utility for joining Hyper-V hosts to a domain and adding Hyper-V hosts and storage to SCVMM. If you cannot or do not want to use this utility, you must join the hosts to the domain and add the hosts and storage to SCVMM manually.

Note: The Validate Cluster feature of the Microsoft System Center VM Manager (SCVMM) is not supported for Nutanix clusters managed by SCVMM.

SCVMM Configuration

After joining the cluster and its constituent hosts to the domain and creating a failover cluster, you can configure SCVMM.

Registering a Cluster with SCVMM

Perform the following procedure to register a cluster with SCVMM.

Before you begin

  • Join the hosts in the Nutanix cluster to a domain manually or by following Adding the Cluster and Hosts to a Domain.
  • Make sure that the hosts are not registered with SCVMM.

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Verify that the status of all services on all the CVMs is Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM:host IP-Address Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                            ClusterHealth   UP       [7102, 7103, 27995, 28209, 28495, 28496, 28503, 28510,
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
  3. Add the Nutanix hosts and storage to SCVMM.
    nutanix@cvm$ setup_hyperv.py setup_scvmm

    This script performs the following functions.

    • Adds the cluster to SCVMM.
    • Sets up the library share in SCVMM.
    • Unregisters the deleted storage containers from SCVMM.
    • Registers the new storage containers in SCVMM.

    Alternatively, you can specify all the parameters described in the following steps as command-line arguments. If you do so, enclose the values in single quotation marks; otherwise, the Controller VM shell does not correctly interpret the backslash (\).
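    As an illustration of the quoting requirement, the following plain bash sketch (using a hypothetical account name, not the setup_hyperv.py utility itself) shows how an unquoted backslash is consumed by the shell:

    ```shell
    # Hypothetical demonstration of shell quoting; the account name is an example.
    unquoted=hyperv.nutanix.com\Administrator   # shell strips the backslash
    quoted='hyperv.nutanix.com\Administrator'   # single quotes preserve it
    echo "$unquoted"    # hyperv.nutanix.comAdministrator
    echo "$quoted"      # hyperv.nutanix.com\Administrator
    ```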

    The utility prompts for the necessary parameters, for example:

    Getting the cluster configuration ... Done
    Getting information about each host ... Done
    The hosts are joined to domain hyperv.nutanix.com
    
    Please enter the domain account username that has local administrator rights on
    the hosts: hyperv.nutanix.com\Administrator
    Please enter the password for hyperv.nutanix.com\Administrator:
    Verifying credentials for accessing localhost ... Done
    
    Please enter the name of the SCVMM server: scvmmhyperv
    Getting the SCVMM server IP address ... 10.4.34.44
    Adding 10.4.34.44 to the IP address whitelist ... Done
    
    Please enter the domain account username (e.g. username@corp.contoso.com or
     CORP.CONTOSO.COM\username) that has administrator rights on the SCVMM server
    and is a member of the domain administrators group (press ENTER for hyperv.nutanix.com\Administrator):
    Verifying credentials for accessing scvmmhyperv ... Done
    
    Verifying SCVMM service account ... HYPERV\scvmm
    
    All nodes are already part of the Hyper-V failover cluster msfo-tulip.
    Preparing to join the Nutanix storage cluster to domain ... Already joined
    Creating an SCVMM run-as account ... hyperv-Administrator
    Verifying the DNS entry tulip.hyperv.nutanix.com -> 10.4.36.191 ... Done
    Verifying that the Hyper-V failover cluster IP address has been added to DNS ... 10.4.36.192
    Verifying SCVMM security settings ... Done
    Initiating adding the failover cluster to SCVMM ... Done
    Step 2 of adding the failover cluster to SCVMM ... Done
    Final step of adding the failover cluster to SCVMM ... Done
    Querying registered Nutanix library shares ... None
    Add a Nutanix share to the SCVMM library for storing VM templates, useful for deploying VMs using Fast File Copy ([Y]/N)? Y
    Querying the registered library servers ... Done
    Using library server scvmmhyperv.hyperv.nutanix.com.
    Please enter the name of the Nutanix library share to be created (press ENTER for "msfo-tulip-library"): 
    Creating container msfo-tulip-library ... Done
    Registering msfo-tulip-library as a library share with server scvmmhyperv.hyperv.nutanix.com in SCVMM ... Done
    Please enter the Prism password: 
    Registering the SMI-S provider with SCVMM ... Done
    Configuring storage in SCVMM ... Done
    Registered default-container-11962
    
    1. Type the domain account username and password.
      This username must include the fully-qualified domain name, for example hyperv.nutanix.com\Administrator .
    2. Type the SCVMM server name.
      The name must resolve to an IP address.
    3. Type the SCVMM username and password if they are different from the domain account; otherwise press Enter to use the domain account.
    4. Choose whether to create a library share.
      Add a Nutanix share to the SCVMM library for storing VM templates, useful for
       deploying VMs using Fast File Copy ([Y]/N)?

      If you choose to create a library share, output similar to the following is displayed.

      Querying the registered library servers ... Done
      Add a Nutanix share to the SCVMM library for storing VM templates, useful for deploying VMs using Fast File Copy ([Y]/N)? Y
      Querying the registered library servers ... Done
      Using library server scvmmhyperv.hyperv.nutanix.com.
      Please enter the name of the Nutanix library share to be created (press ENTER
       for "NTNX-HV-library"):
      Creating container NTNX-HV-library ... Done
      Registering NTNX-HV-library as a library share with server scvmmhyperv.hyperv.nutanix.com ... Done
      
      Finally, the following output is displayed.
      Registering the SMI-S provider with SCVMM ... Done
      Configuring storage in SCVMM ... Done
      Registered share ctr1
      
      Setup complete.
    Note: You can also register the Nutanix cluster by using the SCVMM user interface. For more information, see Adding Hosts and Storage to SCVMM Manually (SCVMM User Interface).
    Warning: If you change the Prism password, you must also change the Prism Run As account in SCVMM.

Adding Hosts and Storage to SCVMM Manually (SCVMM User Interface)

If you are unable to add hosts and storage to SCVMM by using the utility provided by Nutanix, you can add the hosts and storage to SCVMM by using the SCVMM user interface.

Before you begin

  • Verify that the SCVMM server IP address is on the cluster allowlist.
  • Verify that the SCVMM library server has a run-as account specified. Right-click the library server, click Properties , and ensure that Library management credential is populated.

Procedure

  1. Log into the SCVMM user interface and click VMs and Services .
  2. Right-click All Hosts and select Add Hyper-V Hosts and Clusters , and click Next .
    The Specify the Credentials to use for discovery screen appears.
  3. Click Browse and select an existing Run As Account or create a new Run As Account by clicking Create Run As Account . Click OK and then click Next .
    The Specify the search scope for virtual machine host candidates screen appears.
  4. Type the failover cluster name in the Computer names text box, and click Next .
  5. Select the failover cluster that you want to add, and click Next .
  6. Select the Reassociate this host with this VMM environment check box, and click Next .
    The Confirm the settings screen appears.
  7. Click Finish .
    Warning: If you are adding the cluster for the first time, the addition action fails with the following error message.
    Error (10400)
    Before Virtual Machine Manager can perform the current operation, the virtualization server must be restarted.

    Remove the cluster that you were adding and perform the same procedure again.

  8. Register a Nutanix SMB share as a library share in SCVMM by clicking Library and then adding the Nutanix SMB share.
    1. Right-click the Library Servers and click Add Library Shares .
    2. Click Add Unmanaged Share and type the SMB file share path, click OK , and click Next .
    3. Click Add Library Shares .
      If all the parameters are correct, the library share is added.
  9. Register the Nutanix SMI-S provider.
    1. Go to Settings > Security > Run As Accounts and click Create Run As Account .
    2. Enter the Prism user name and password, de-select Validate domain credentials , and click Finish .
      Note:

      Only local Prism accounts are supported. Even if AD authentication is configured in Prism, the SMI-S provider cannot use it for authentication.

    3. Go to Fabric > Storage > Providers .
    4. Right-click Providers and select Add Storage Devices .
    5. Select the SAN and NAS devices discovered and managed by a SMI-S provider check box, and click Next .
    6. Specify the protocol and address of the storage SMI-S provider.
      • In the Protocol drop-down menu, select SMI-S CIMXML .
      • In the Provider IP Address or FQDN text box, provide the Nutanix storage cluster name. For example, clus-smb .
        Note: The Nutanix storage cluster name is not the same as the Hyper-V cluster name. You should get the storage cluster name from the cluster details in the web console.
      • Select the Use Secure Sockets Layer (SSL) connection check box.
      • In the Run As Account field, click Browse and select the Prism Run As Account that you have created earlier, and click Next .
      Note: If you encounter the following error when attempting to add an SMI-S provider, see KB 5070:
      Could not retrieve a certificate from the <clustername> server because of the error:
      The request was aborted: Could not create SSL/TLS secure channel.
    7. Click Import to verify the identity of the storage provider.
      The discovery process starts and at the completion of the process, the storage is displayed.
    8. Click Next and select all the SMB shares exported by the Nutanix cluster except the library share and click Next .
    9. Click Finish .
      The newly added provider is displayed under Providers. Go to Storage > File Clusters to verify that the Managed column is Yes .
  10. Add the file shares to the Nutanix cluster by navigating to VMs and Services .
    1. Right-click the cluster name and select Properties .
    2. Go to File Share Storage , and click Add to add file shares to the cluster.
    3. From the File share path drop-down menu, select all the shares that you want to add, and click OK .
    4. Right-click the cluster and click Refresh . Wait for the refresh job to finish.
    5. Right-click the cluster name and select Properties > File Share Storage . You should see the access status with a green check mark, which means that the shares are successfully added.
    6. Select all the virtual machines in the cluster, right-click, and select Refresh .

SCVMM Operations

You can perform operational procedures on a Hyper-V node by using SCVMM, such as placing a host in maintenance mode.

Placing a Host in Maintenance Mode

If you try to place a host that is managed by SCVMM in maintenance mode, the Controller VM running on the host is, by default, placed in a saved state, which might create issues. Perform the following procedure to properly place a host in maintenance mode.

Procedure

  1. Log into the Controller VM of the host that you are planning to place in maintenance mode by using SSH and shut down the Controller VM.
    nutanix@cvm$ cvm_shutdown -P now

    Wait for the Controller VM to completely shut down.

  2. Select the host and place it in the maintenance mode by navigating to the Host tab in the Host group and clicking Start Maintenance Mode .
    Wait for the operation to complete before performing any maintenance activity on the host.
  3. After the maintenance activity is completed, bring the host out of maintenance mode by navigating to the Host tab in the Host group and clicking Stop Maintenance Mode .
  4. Start the Controller VM manually.

Migration Guide

AOS 6.5

Product Release Date: 2022-07-25

Last updated: 2022-07-25

This Document Has Been Removed

Nutanix Move is the Nutanix-recommended tool for migrating a VM. Please see the Move documentation at the Nutanix Support portal.


vSphere Administration Guide for Acropolis

AOS 6.5

Product Release Date: 2022-07-25

Last updated: 2022-12-08

Overview

Nutanix Enterprise Cloud delivers a resilient, web-scale hyperconverged infrastructure (HCI) solution built for supporting your virtual and hybrid cloud environments. The Nutanix architecture runs a storage controller called the Nutanix Controller VM (CVM) on every Nutanix node in a cluster to form a highly distributed, shared-nothing infrastructure.

All CVMs work together to aggregate storage resources into a single global pool that guest VMs running on the Nutanix nodes can consume. The Nutanix Distributed Storage Fabric manages storage resources to preserve data and system integrity if there is node, disk, application, or hypervisor software failure in a cluster. Nutanix storage also enables data protection and High Availability that keep critical data and guest VMs protected.

This guide describes the procedures and settings required to deploy a Nutanix cluster running in the VMware vSphere environment. To know more about the VMware terms referred to in this document, see the VMware Documentation.

Hardware Configuration

See the Field Installation Guide for information about how to deploy and create a Nutanix cluster running ESXi for your hardware. After you create the Nutanix cluster by using Foundation, use this guide to perform the management tasks.

Limitations

For information about ESXi configuration limitations, see Nutanix Configuration Maximums webpage.

Nutanix Software Configuration

The Nutanix Distributed Storage Fabric aggregates local SSD and HDD storage resources into a single global unit called a storage pool. In this storage pool, you can create several storage containers, which the system presents to the hypervisor and uses to host VMs. You can apply a different set of compression, deduplication, and replication factor policies to each storage container.

Storage Pools

A storage pool on Nutanix is a group of physical disks from one or more tiers. Nutanix recommends configuring only one storage pool for each Nutanix cluster.

Replication factor
Nutanix supports a replication factor of 2 or 3. Setting the replication factor to 3 instead of 2 adds an extra data protection layer at the cost of more storage space for the copy. For use cases where applications provide their own data protection or high availability, you can set a replication factor of 1 on a storage container.
Containers
The Nutanix storage fabric presents usable storage to the vSphere environment as an NFS datastore. The replication factor of a storage container determines its usable capacity. For example, replication factor 2 tolerates one component failure and replication factor 3 tolerates two component failures. When you create a Nutanix cluster, three storage containers are created by default. Nutanix recommends that you do not delete these storage containers. You can rename the storage container named default-xxx and use it as the main storage container for hosting VM data.
Note: The available capacity and the vSphere maximum of 2,048 VMs limits the number of VMs a datastore can host.
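As a rough sketch (example numbers, not from a real cluster), the usable capacity of a storage container is the raw capacity divided by the replication factor:

```shell
# Sketch: usable capacity given raw capacity and replication factor (RF).
# Example values only; real sizing must also account for metadata and overheads.
raw_tb=80
rf=2
usable_tb=$(( raw_tb / rf ))
echo "RF${rf}: ${usable_tb} TB usable"   # RF2: 40 TB usable
```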

Capacity Optimization

  • Nutanix recommends enabling inline compression unless otherwise advised.
  • Nutanix recommends disabling deduplication for all workloads except VDI.

    For mixed-workload Nutanix clusters, create a separate storage container for VDI workloads and enable deduplication on that storage container.

Nutanix CVM Settings

CPU
Keep the default settings as configured by the Foundation during the hardware configuration.

Change the CPU settings only if Nutanix Support recommends it.

Memory
Most workloads use less than 32 GB RAM memory per CVM. However, for mission-critical workloads with large working sets, Nutanix recommends more than 32 GB CVM RAM memory.
Tip: You can increase CVM RAM memory up to 64 GB using the Prism one-click memory upgrade procedure. For more information, see Increasing the Controller VM Memory Size in the Prism Web Console Guide .
Networking
The Nutanix CVM uses the standard Ethernet MTU (maximum transmission unit) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on a network interface of CVMs to higher values.
Caution: Do not use jumbo frames for the Nutanix CVM.
Caution: Do not change the vSwitchNutanix or the internal vmk (VMkernel) interface.
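A quick way to confirm the default MTU (a sketch assuming a Linux CVM; interface names vary by deployment) is to read the kernel's per-interface value:

```shell
# Sketch (assumes Linux): print each network interface and its MTU so the
# 1500-byte default can be confirmed. Run on the CVM in question.
for dev in /sys/class/net/*; do
  printf '%s: %s\n' "$(basename "$dev")" "$(cat "$dev/mtu")"
done
```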

Nutanix Cluster Settings

Nutanix recommends that you do the following.

  • Map a Nutanix cluster to only one vCenter Server.

    Due to the way the Nutanix architecture distributes data, there is limited support for mapping a Nutanix cluster to multiple vCenter Servers. Some Nutanix products (Move, Era, Calm, Files, Prism Central) and features (such as the disaster recovery solution) are unstable when a Nutanix cluster maps to multiple vCenter Servers.

  • Configure a Nutanix cluster with replication factor 2 or replication factor 3.
    Tip: Nutanix recommends using replication factor 3 for clusters with more than 16 nodes. Replication factor 3 requires at least five nodes so that the data remains online even if two nodes fail concurrently.
  • Use the advertised capacity feature to ensure that the resiliency capacity is equivalent to one node of usable storage for replication factor 2 or two nodes for replication factor 3.

    The advertised capacity of a storage container must equal the total usable cluster space minus the capacity of either one or two nodes. For example, in a 4-node cluster with 20 TB usable space per node with replication factor 2, the advertised capacity of the storage container must be 60 TB. That spares 20 TB capacity to sustain and rebuild one node for self-healing. Similarly, in a 5-node cluster with 20 TB usable space per node with replication factor 3, advertised capacity of the storage container must be 60 TB. That spares 40 TB capacity to sustain and rebuild two nodes for self-healing.

  • Use the default storage container and mount it on all the ESXi hosts in the Nutanix cluster.

    You can also create a single storage container. If you are creating multiple storage containers, ensure that all the storage containers follow the advertised capacity recommendation.

  • Configure the vSphere cluster according to settings listed in vSphere Cluster Settings Checklist.
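The advertised-capacity sizing described above reduces to a simple calculation. The sketch below uses the example figures from this section (20 TB usable per node) and is illustrative only:

```shell
# Sketch: advertised capacity = (nodes - (rf - 1)) * usable TB per node.
# RF2 reserves one node's capacity for rebuild; RF3 reserves two.
advertised_tb() {       # usage: advertised_tb <nodes> <tb_per_node> <rf>
  echo $(( ($1 - ($3 - 1)) * $2 ))
}
advertised_tb 4 20 2    # 4-node RF2 cluster -> 60
advertised_tb 5 20 3    # 5-node RF3 cluster -> 60
```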

Software Acceptance Level

The Foundation sets the software acceptance level of an ESXi image to CommunitySupported by default. If there is a requirement to upgrade the software acceptance level, run the following command to upgrade the software acceptance level to the maximum acceptance level of PartnerSupported .

root@esxi# esxcli software acceptance set --level=PartnerSupported

Scratch Partition Settings

ESXi uses the scratch partition (/scratch) to dump the logs when it encounters a purple screen of death (PSOD) or a kernel dump. The Foundation install automatically creates this partition on the SATA DOM or M.2 device with the ESXi installation. Moving the scratch partition to any location other than the SATA DOM or M.2 device can cause issues with LCM, 1-click hypervisor updates, and the general stability of the Nutanix node.

vSphere Networking

vSphere on the Nutanix platform enables you to dynamically configure, balance, or share logical networking components across various traffic types. To ensure availability, scalability, performance, management, and security of your infrastructure, configure virtual networking when designing a network solution for Nutanix clusters.

You can configure networks according to your requirements. For detailed information about vSphere virtual networking and different networking strategies, refer to the Nutanix vSphere Storage Solution Document and the VMware Documentation . This chapter describes the configuration elements required to run VMware vSphere on the Nutanix Enterprise infrastructure.

Virtual Networking Configuration Options

vSphere on Nutanix supports the following types of virtual switches.

vSphere Standard Switch (vSwitch)
vSphere Standard Switch (vSS) with Nutanix vSwitch is the default configuration for Nutanix deployments and suits most use cases. A vSwitch detects which VMs are connected to each virtual port and uses that information to forward traffic to the correct VMs. You can connect a vSwitch to physical switches by using physical Ethernet adapters (also referred to as uplink adapters) to join virtual networks with physical networks. This type of connection is similar to connecting physical switches together to create a larger network.
Tip: A vSwitch works like a physical Ethernet switch.
vSphere Distributed Switch (vDS)

Nutanix recommends vSphere Distributed Switch (vDS) coupled with network I/O control (NIOC version 2) and load-based teaming. This combination provides simplicity, ensures traffic prioritization if there is contention, and reduces operational management overhead. A vDS acts as a single virtual switch across all associated hosts on a datacenter. It enables VMs to maintain consistent network configuration as they migrate across multiple hosts. For more information about vDS, see NSX-T Support on Nutanix Platform.

Nutanix recommends setting all vNICs as active on the port group and dvPortGroup unless otherwise specified. The following table lists the naming convention, port groups, and the corresponding VLAN Nutanix uses for various traffic types.

Table 1. Port Groups and Corresponding VLAN
Port group        VLAN  Description
MGMT_10           10    VM kernel interface for host management traffic
VMOT_20           20    VM kernel interface for vMotion traffic
FT_30             30    Fault tolerance traffic
VM_40             40    VM traffic
VM_50             50    VM traffic
NTNX_10           10    Nutanix CVM to CVM cluster communication traffic (public interface)
Svm-iscsi-pg      N/A   Nutanix CVM to internal host traffic
VMK-svm-iscsi-pg  N/A   VM kernel port for CVM to hypervisor communication (internal)

All Nutanix configurations use an internal-only vSwitch for the NFS communication between the ESXi host and the Nutanix CVM. This vSwitch remains unmodified regardless of the virtual networking configuration for ESXi management, VM traffic, vMotion, and so on.

Caution: Do not modify the internal-only vSwitch (vSwitch-Nutanix). vSwitch-Nutanix facilitates communication between the CVM and the internal hypervisor.

VMware NSX Support

Running VMware NSX on Nutanix infrastructure ensures that VMs always have access to fast local storage and compute, consistent network addressing and security without the burden of physical infrastructure constraints. The supported scenario connects the Nutanix CVM to a traditional VLAN network, with guest VMs inside NSX virtual networks. For more information, see the Nutanix vSphere Storage Solution Document .

NSX-T Support on Nutanix Platform

The Nutanix platform relies on communication with vCenter to work with networks backed by vSphere Standard Switch (vSS) or vSphere Distributed Switch (vDS). NSX-T introduces a new management plane that makes network management agnostic to the compute manager (vCenter), so network configuration information is available only through the NSX-T Manager. The Nutanix infrastructure workflows (AOS upgrades, LCM upgrades, and so on) must therefore be modified to collect the network configuration information from the NSX-T Manager.

Figure. Nutanix and the NSX-T Workflow Overview

The Nutanix platform supports the following in the NSX-T configuration.

  • ESXi hypervisor only.
  • vSS and vDS virtual switch configurations.
  • Nutanix CVM connection to VLAN backed NSX-T segments only.
  • The NSX-T Manager credentials registration using the CLI.

The Nutanix platform does not support the following in the NSX-T configuration.

  • Network segmentation with N-VDS.
  • Nutanix CVM connection to overlay NSX-T segments.
  • Link Aggregation/LACP for the uplinks backing the NVDS host switch connecting Nutanix CVMs.
  • The NSX-T Manager credentials registration through Prism.

NSX-T Segments

Nutanix supports NSX-T logical segments to co-exist on Nutanix clusters running ESXi hypervisors. All infrastructure workflows, including Foundation, 1-click upgrades, and AOS upgrades, are validated to work in NSX-T configurations where the CVM is backed by an NSX-T VLAN logical segment.

NSX-T has the following types of segments.

VLAN backed
VLAN-backed segments operate similarly to the standard port group in a vSphere switch. A port group is created on the NVDS, and VMs that are connected to the port group have their network packets tagged with the configured VLAN ID.
Overlay backed
Overlay-backed segments use the Geneve overlay to create a logical L2 network over an L3 network. Encapsulation occurs at the transport layer (which is the NVDS module on the host).

Multicast Filtering

Enabling multicast snooping on a vDS with a Nutanix CVM attached affects the ability of the CVM to discover and add new nodes to the Nutanix cluster (the cluster expand option in Prism and in the Nutanix CLI).

Creating Segment for NVDS

This procedure provides details about creating a segment for nVDS.

About this task

To create a segment for the NVDS and verify that the NSX-T network supports the CVM port group, perform the following steps.

Procedure

  1. Log on to the vCenter Server and go to the NSX-T Manager.
  2. Click Networking , and go to Connectivity > Segments in the left pane.
  3. Click ADD SEGMENT under the SEGMENTS tab on the right pane and specify the following information.
    Figure. Create New Segment

    1. Segment Name : Enter a name for the segment.
    2. Transport Zone : Select the VLAN-based transport zone.
      This transport name is associated with the Transport Zone when configuring the NSX switch .
    3. VLAN : Enter 0 for the native VLAN.
  4. Click Save to create a segment for NVDS.
  5. Click Yes when the system prompts to continue with configuring the segment.
    The newly created segment appears below the prompt.
    Figure. New Segment Created

Creating NVDS Switch on the Host by Using NSX-T Manager

This procedure provides instructions to create an NVDS switch on the ESXi host. The management and CVM external interfaces of the host are migrated to the NVDS switch.

About this task

To create an NVDS switch and configure the NSX-T Manager, do the following.

Procedure

  1. Log on to NSX-T Manager.
  2. Click System , and go to Configuration > Fabric > Nodes in the left pane.
    Figure. Add New Node

  3. Click ADD HOST NODE under the HOST TRANSPORT NODES in the right pane.
    1. Specify the following information in the Host Details dialog box.
      Figure. Add Host Details

        1. Name : Enter an identifiable ESXi host name.
        2. Host IP : Enter the IP address of the ESXi host.
        3. Username : Enter the username used to log on to the ESXi host.
        4. Password : Enter the password used to log on to the ESXi host.
        5. Click Next to move to the NSX configuration.
    2. Specify the following information in the Configure NSX dialog box.
      Figure. Configure NSX

        1. Mode : Select Standard option.

          Nutanix recommends the Standard mode only.

        2. Name : Displays the default name of the virtual switch that appears on the host. You can edit the default name and provide an identifiable name as per your configuration requirements.
        3. Transport Zone : Select the transport zone that you selected in Creating Segment for NVDS.

          These segments operate similarly to the standard port group in a vSphere switch. A port group is created on the NVDS, and VMs that are connected to the port group have their network packets tagged with the configured VLAN ID.

        4. Uplink Profile : Select an uplink profile for the new nVDS switch.

          This selected uplink profile represents the NICs connected to the host. For more information about uplink profiles, see the VMware Documentation .

        5. LLDP Profile : Select the LLDP profile for the new nVDS switch.

          For more information about LLDP profiles, see the VMware Documentation .

        6. Teaming Policy Uplink Mapping : Map the uplinks with the physical NICs of the ESXi host.
          Note: To verify the active physical NICs on the host, select ESXi host > Configure > Networking > Physical Adapters .

          Click Edit icon and enter the name of the active physical NIC in the ESXi host selected for migration to the NVDS.

          Note: Always migrate one physical NIC at a time to avoid connectivity failure with the ESXi host.
        7. PNIC only Migration : Turn on the switch to Yes if there are no VMkernel adapters (vmks) associated with the PNIC selected for migration from the vSS switch to the NVDS switch.
        8. Network Mapping for Install . Click Add Mapping to migrate the VMkernels (vmks) to the NVDS switch.
        9. Network Mapping for Uninstall : To revert the migration of VMkernels.
  4. Click Finish to add the ESXi host to the NVDS switch.
    The newly created nVDS switch appears on the ESXi host.
    Figure. NVDS Switch Created

Registering NSX-T Manager with Nutanix

After migrating the external interface of the host and the CVM to the NVDS switch, it is mandatory to inform Genesis about the registration of the cluster with the NSX-T Manager. This registration helps Genesis communicate with the NSX-T Manager and avoid failures during LCM, 1-click, and AOS upgrades.

About this task

This procedure demonstrates an AOS upgrade error that occurs when the NSX-T Manager is not registered with Nutanix, and shows how to register the NSX-T Manager with Nutanix to resolve the issue.

To register the NSX-T Manager with Nutanix, do the following.

Procedure

  1. Log on to the Prism Element web console.
  2. Select VM > Settings > Upgrade Software > Upgrade > Pre-upgrade to upgrade AOS on the host.
    Figure. Upgrade AOS

  3. The upgrade process throws an error if the NSX-T Manager is not registered with Nutanix.
    Figure. AOS Upgrade Error for Unregistered NSX-T

    The AOS upgrade determines whether NSX-T networks back the CVM and then attempts to get the VLAN information of those networks. To get the VLAN information for the CVM, the NSX-T Manager information must be configured in the Nutanix cluster.

  4. To fix this upgrade issue, log on to a Controller VM in the cluster by using SSH.
  5. Access the cluster directory.
    nutanix@cvm$ cd ~/cluster/bin
  6. Verify if the NSX-T Manager was registered with the CVM earlier.
    nutanix@cvm:~/cluster/bin$ ./nsx_t_manager -l

    If the NSX-T Manager was not registered earlier, you get the following message.

    No NSX-T manager configured in the cluster
  7. Register the NSX-T Manager with the CVM if it was not registered earlier. Specify the credentials of the NSX-T Manager to the CVM.
    nutanix@cvm:~/cluster/bin$ ./nsx_t_manager -a
    IP address: 10.10.10.10
    Username: admin
    Password: 
    /usr/local/nutanix/cluster/lib/py/requests-2.12.0-py2.7.egg/requests/packages/urllib3/connectionpool.py:843:
     InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
     See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
    Successfully persisted NSX-T manager information
  8. Verify the registration of NSX-T Manager with the CVM.
    nutanix@cvm:~/cluster/bin$ ./nsx_t_manager -l

    If there are no errors, the system displays a similar output.

    IP address: 10.10.10.10
    Username: admin
  9. In the Prism Element web console, click Pre-upgrade to continue the AOS upgrade procedure.

    The AOS upgrade is completed successfully.

Networking Components

IP Addresses

All CVMs and ESXi hosts have two network interfaces.
Note: An empty interface eth2 is created on CVM during deployment by Foundation. The eth2 interface is used for backplane when backplane traffic isolation (Network Segmentation) is enabled in the cluster. For more information about backplane interface and traffic segmentation, see Security Guide.
Interface        IP address     vSwitch
ESXi host vmk0   User-defined   vSwitch0
CVM eth0         User-defined   vSwitch0
ESXi host vmk1   192.168.5.1    vSwitchNutanix
CVM eth1         192.168.5.2    vSwitchNutanix
CVM eth1:1       192.168.5.254  vSwitchNutanix
CVM eth2         User-defined   vSwitch0
Note: The ESXi and CVM interfaces on vSwitch0 cannot use IP addresses in any subnets that overlap with subnet 192.168.5.0/24.
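The note above can be turned into a quick sanity check. The helper below is hypothetical, not a Nutanix tool:

```shell
# Hypothetical check: flag vSwitch0 addresses that collide with the internal
# 192.168.5.0/24 subnet reserved for vSwitchNutanix.
check_ip() {
  case "$1" in
    192.168.5.*) echo "INVALID: $1" ;;   # overlaps the internal subnet
    *)           echo "OK: $1" ;;
  esac
}
check_ip 10.4.36.191    # OK: 10.4.36.191
check_ip 192.168.5.50   # INVALID: 192.168.5.50
```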

vSwitches

A Nutanix node is configured with the following two vSwitches.

  • vSwitchNutanix

    Local communications between the CVM and the ESXi host use vSwitchNutanix. vSwitchNutanix has no uplinks.

    Caution: To manage network traffic between VMs with greater control, create more port groups on vSwitch0. Do not modify vSwitchNutanix.
    Figure. vSwitchNutanix Configuration

  • vSwitch0

    All other external communications, such as CVM traffic to a different host (in case of HA redirection), use vSwitch0, which has uplinks to the physical network interfaces. Because network segmentation is disabled by default, backplane traffic also uses vSwitch0.

    vSwitch0 has the following two networks.

    • Management Network

      HA, vMotion, and vCenter communications use the Management Network.

    • VM Network

      All VMs use the VM Network.

    Caution:
    • The Nutanix CVM uses the standard Ethernet maximum transmission unit (MTU) of 1,500 bytes for all the network interfaces by default. The standard 1,500-byte MTU delivers excellent performance and stability. Nutanix does not support configuring the MTU on a network interface of CVMs to higher values.
    • You can enable jumbo frames (MTU of 9,000 bytes) on the physical network interfaces of ESXi hosts and guest VMs if the applications on your guest VMs require them. If you choose to use jumbo frames on hypervisor hosts, ensure that you enable them end-to-end in the desired network and consider both the physical and virtual network infrastructure impacted by the change.
    Figure. vSwitch0 Configuration

Configuring Host Networking (Management Network)

After you create the Nutanix cluster by using Foundation, configure networking for your ESXi hosts.

About this task

Figure. Configure Management Network

Procedure

  1. On the ESXi host console, press F2 and then provide the ESXi host logon credentials.
  2. Press the down arrow key until Configure Management Network highlights and then press Enter .
  3. Select Network Adapters and then press Enter .
  4. Ensure that the connected network adapters are selected.
    If they are not selected, press Space key to select them and press Enter key to return to the previous screen.
    Figure. Network Adapters
  5. If a VLAN ID needs to be configured on the Management Network, select VLAN (optional) and press Enter . In the dialog box, provide the VLAN ID and press Enter .
    Note: Do not add any other device (including guest VMs) to the VLAN to which the CVM and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.
  6. Select IP Configuration and press Enter .
    Figure. Configure Management Network
  7. If necessary, highlight the Set static IP address and network configuration option and press Space to update the setting.
  8. Provide values for the following: IP Address , Subnet Mask , and Default Gateway fields based on your environment and then press Enter .
  9. Select DNS Configuration and press Enter .
  10. If necessary, highlight the Use the following DNS server addresses and hostname option and press Space to update the setting.
  11. Provide values for the Primary DNS Server and Alternate DNS Server fields based on your environment and then press Enter .
  12. Press Esc and then Y to apply all changes and restart the management network.
  13. Select Test Management Network and press Enter .
  14. Press Enter to start the network ping test.
  15. Verify that the default gateway and DNS servers reported by the ping test match those that you specified earlier in the procedure and then press Enter .

    Ensure that the tested addresses pass the ping test. If they do not, confirm that the correct IP addresses are configured.

    Figure. Test Management Network

    Press Enter to close the test window.

  16. Press Esc to log off.

Changing a Host IP Address

About this task

To change a host IP address, perform the following steps once for each hypervisor host in the Nutanix cluster. Complete the entire procedure on a host before proceeding to the next host.
Caution: The cluster cannot tolerate duplicate host IP addresses. For example, when swapping IP addresses between two hosts, temporarily change one host IP address to an interim unused IP address. Changing this IP address avoids having two hosts with identical IP addresses on the cluster. Then complete the address change or swap on each host using the following steps.
Note: All CVMs and hypervisor hosts must be on the same subnet. The hypervisor can be multihomed provided that one interface is on the same subnet as the CVM.
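The same-subnet requirement in the note above can be verified with a short script before changing any addresses. This is a sketch using Python's ipaddress module; the addresses and prefix length are illustrative.

```python
import ipaddress

def on_same_subnet(ips, prefix_len):
    """Return True if every address falls in one subnet for the given prefix length."""
    networks = {ipaddress.ip_interface(f"{ip}/{prefix_len}").network for ip in ips}
    return len(networks) == 1

# Illustrative addresses: two hosts and their CVMs on a /24.
cvm_and_host_ips = ["10.1.1.11", "10.1.1.12", "10.1.1.21", "10.1.1.22"]
print(on_same_subnet(cvm_and_host_ips, 24))                 # True: all in 10.1.1.0/24
print(on_same_subnet(cvm_and_host_ips + ["10.1.2.5"], 24))  # False: one address is off-subnet
```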

Procedure

  1. Configure networking on the Nutanix node. For more information, see Configuring Host Networking (Management Network).
  2. Update the host IP addresses in vCenter. For more information, see Reconnecting a Host to vCenter.
  3. Log on to every CVM in the Nutanix cluster and restart Genesis service.
    nutanix@cvm$ genesis restart

    If the restart is successful, output similar to the following is displayed.

    Stopping Genesis pids [1933, 30217, 30218, 30219, 30241]
    Genesis started on pids [30378, 30379, 30380, 30381, 30403]

Reconnecting a Host to vCenter

About this task

If you modify the IP address of a host, you must reconnect the host to vCenter. To reconnect the host, perform the following procedure.

Procedure

  1. Log on to vCenter with the web client.
  2. Right-click the host with the changed IP address and select Disconnect .
  3. Right-click the host again and select Remove from Inventory .
  4. Right-click the Nutanix cluster and then click Add Hosts... .
    1. Enter the IP address or fully qualified domain name (FQDN) of the host you want to reconnect in the IP address or FQDN under New hosts .
    2. Enter the host logon credentials in the User name and Password fields, and click Next .
      If a security or duplicate management alert appears, click Yes .
    3. Review the Host Summary and click Next .
    4. Click Finish .
    You can see the host with the updated IP address in the left pane of vCenter.

Selecting a Management Interface

Nutanix tracks the management IP address for each host and uses that IP address to open an SSH session into the host to perform management activities. If the selected vmk interface is not accessible through SSH from the CVMs, activities that require interaction with the hypervisor fail.

If multiple vmk interfaces are present on a host, Nutanix uses the following rules to select a management interface.

  1. Assigns weight to each vmk interface.
    • If the vmk interface is configured for management traffic under the ESXi network settings, the assigned weight is 4. Otherwise, the assigned weight is 0.
    • If the IP address of the vmk interface is in the same IP subnet as the CVM eth0 interface, 2 is added to its weight.
    • If the IP address of the vmk interface is in the same IP subnet as the CVM eth2 interface, 1 is added to its weight.
  2. The vmk interface that has the highest weight is selected as the management interface.

Example of Selection of Management Network

Consider an ESXi host with following configuration.

  • vmk0 IP address and mask: 2.3.62.204, 255.255.255.0
  • vmk1 IP address and mask: 192.168.5.1, 255.255.255.0
  • vmk2 IP address and mask: 2.3.63.24, 255.255.255.0

Consider a CVM with following configuration.

  • eth0 inet address and mask: 2.3.63.31, 255.255.255.0
  • eth2 inet address and mask: 2.3.62.12, 255.255.255.0

According to the rules, the following weights are assigned to the vmk interfaces.

  • vmk0 = 4 + 0 + 1 = 5
  • vmk1 = 0 + 0 + 0 = 0
  • vmk2 = 0 + 2 + 0 = 2

Since vmk0 has the highest assigned weight, the vmk0 interface is used as the management interface for the ESXi host.

To verify that the vmk0 interface is selected as the management interface, use the following command.

root@esx# esxcli network ip interface tag get -i vmk0

You see the following output.

Tags: Management, VMotion

For the other two interfaces, no tags are displayed.

If you want any other interface to act as the management IP address, enable management traffic on that interface by following the procedure described in Selecting a New Management Interface.
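The selection rules and the example above can be expressed as a short sketch. The function name and the tuple layout for the interface data are illustrative; the weights and addresses come from the rules and example in this section.

```python
import ipaddress

def select_management_vmk(vmks, cvm_eth0_cidr, cvm_eth2_cidr):
    """Pick the vmk interface with the highest weight per the selection rules.

    vmks: list of (name, ip/prefix, management_traffic_enabled) tuples.
    """
    eth0_net = ipaddress.ip_interface(cvm_eth0_cidr).network
    eth2_net = ipaddress.ip_interface(cvm_eth2_cidr).network

    def weight(vmk):
        _, cidr, mgmt_enabled = vmk
        net = ipaddress.ip_interface(cidr).network
        w = 4 if mgmt_enabled else 0  # management traffic enabled: weight 4
        if net == eth0_net:
            w += 2                    # same subnet as CVM eth0: +2
        if net == eth2_net:
            w += 1                    # same subnet as CVM eth2: +1
        return w

    return max(vmks, key=weight)[0]

# Values from the example: vmk0 has management enabled and shares the eth2 subnet.
vmks = [
    ("vmk0", "2.3.62.204/24", True),
    ("vmk1", "192.168.5.1/24", False),
    ("vmk2", "2.3.63.24/24", False),
]
print(select_management_vmk(vmks, "2.3.63.31/24", "2.3.62.12/24"))  # vmk0
```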

Selecting a New Management Interface

You can mark a vmk interface as the management interface on an ESXi host by using the following method.

Procedure

  1. Log on to vCenter with the web client.
  2. Do the following on the ESXi host.
    1. Go to Configure > Networking > VMkernel adapters .
    2. Select the interface on which you want to enable the management traffic.
    3. Click Edit settings of the port group to which the vmk belongs.
    4. Select the Management check box in the Enabled services section to enable management traffic on the vmk interface.
  3. Open an SSH session to the ESXi host and enable the management traffic on the vmk interface.
    root@esx# esxcli network ip interface tag add -i vmkN --tagname=Management

    Replace vmkN with the vmk interface where you want to enable the management traffic.

Updating Network Settings

After you configure networking of your vSphere deployments on Nutanix Enterprise Cloud, you may want to update the network settings.

  • To know about the best practice of ESXi network teaming policy, see Network Teaming Policy.

  • To migrate an ESXi host networking from a vSphere Standard Switch (vSwitch) to a vSphere Distributed Switch (vDS) with LACP/LAG configuration, see Migrating to a New Distributed Switch with LACP/LAG.

  • To migrate an ESXi host networking from a vSphere standard switch (vSwitch) to a vSphere Distributed Switch (vDS) without LACP, see Migrating to a New Distributed Switch without LACP/LAG.


Network Teaming Policy

On an ESXi host, a NIC teaming policy allows you to bundle two or more physical NICs into a single logical link to provide bandwidth aggregation and link redundancy to a vSwitch. The NIC teaming policies in the ESXi networking configuration for a vSwitch consist of the following.

  • Route based on originating virtual port.
  • Route based on IP hash.
  • Route based on source MAC hash.
  • Explicit failover order.

In addition to the preceding NIC teaming policies, vDS offers one additional policy: Route based on physical NIC load.

When Foundation or Phoenix imaging is performed on a Nutanix cluster, the following two standard virtual switches are created on ESXi hosts:

  • vSwitch0
  • vSwitchNutanix

On vSwitch0, the Nutanix best practice guide (see Nutanix vSphere Networking Solution Document) provides the following recommendations for NIC teaming:

  • vSwitch. Route based on originating virtual port
  • vDS. Route based on physical NIC load

On vSwitchNutanix, there are no uplinks to the virtual switch, so there is no NIC teaming configuration required.
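The vSwitch0 recommendations above can be captured in a small lookup table, for example when scripting a configuration audit. This is an illustrative sketch; the policy strings come from the recommendations above, while the function and dictionary names are assumptions.

```python
# Recommended NIC teaming policy for vSwitch0, keyed by virtual switch type,
# per the Nutanix vSphere networking best practices cited above.
RECOMMENDED_TEAMING = {
    "vSwitch": "Route based on originating virtual port",
    "vDS": "Route based on physical NIC load",
}

def recommended_policy(switch_type: str) -> str:
    """Look up the recommended teaming policy; raises KeyError for unknown switch types."""
    return RECOMMENDED_TEAMING[switch_type]

print(recommended_policy("vDS"))  # Route based on physical NIC load
```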

Migrate from a Standard Switch to a Distributed Switch

This topic provides detailed information about how to migrate from a vSphere Standard Switch (vSS) to a vSphere Distributed Switch (vDS).

The following are the two types of virtual switches (vSwitch) in vSphere.

  • vSphere standard switch (vSwitch) (see vSphere Standard Switch (vSwitch) in vSphere Networking).
  • vSphere Distributed Switch (vDS) (see vSphere Distributed Switch (vDS) in vSphere Networking).
Tip: For more information about vSwitches and the associated network concepts, see the VMware Documentation .

For migrating from a vSS to a vDS with LACP/LAG configuration, see Migrating to a New Distributed Switch with LACP/LAG.

For migrating from a vSS to a vDS without LACP/LAG configuration, see Migrating to a New Distributed Switch without LACP/LAG.

Standard Switch Configuration

The standard switch configuration consists of the following.

vSwitchNutanix
vSwitchNutanix handles internal communication between the CVM and the ESXi host and has no uplink adapters. The only members of its port group must be the CVM and its host. Do not modify the settings of this virtual switch or its port groups, because doing so can disrupt communication between the host and the CVM.
vSwitch0
vSwitch0 consists of the vmk (VMkernel) management interface, vMotion interface, and VM port groups. This virtual switch connects to uplink network adapters that are plugged into a physical switch.

Planning the Migration

It is important to plan and understand the migration process. An incorrect configuration can disrupt communication, which can require downtime to resolve.

Consider the following while or before planning the migration.

  • Read Nutanix Best Practice Guide for VMware vSphere Networking available here.

  • Understand the various teaming and load-balancing algorithms on vSphere.

    For more information, see the VMware Documentation .

  • Confirm communication on the network through all the connected uplinks.
  • Confirm access to the host using IPMI when there are network connectivity issues during migration.

    Access the host to troubleshoot the network issue or move the management network back to the standard switch depending on the issue.

  • Confirm that the hypervisor external management IP address and the CVM IP address are in the same public subnet for the data path redundancy functionality to work.
  • When performing migration to the distributed switch, migrate one host at a time and verify that networking is working as desired.
  • Do not migrate the port groups and vmk (VMkernel) interfaces that are on vSwitchNutanix to the distributed switch (dvSwitch).

Unassigning Physical Uplink of the Host for Distributed Switch

By default, all the physical adapters connect to vSwitch0 on the host. A live distributed switch must have a physical uplink connected to it to work. Before you can assign a physical adapter of the host to the new distributed switch, unassign that adapter from vSwitch0.

About this task

To unassign the physical uplink of the host, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Networking > Virtual Switches .
  4. Click MANAGE PHYSICAL ADAPTERS tab and select the active adapters from the Assigned adapters that you want to unassign from the list of physical adapters of the host.
    Figure. Managing Physical Adapters

  5. Click X on the top.
    The selected adapter is unassigned from the list of physical adapters of the host.
    Tip: Ping the host to confirm that you can still communicate through the remaining active physical adapter. If you lose network connectivity to the ESXi host during this test, review your network configuration.

Migrating to a New Distributed Switch without LACP/LAG

Migrating to a new distributed switch without LACP/LAG consists of the following workflow.

  1. Creating a Distributed Switch
  2. Creating Port Groups on the Distributed Switch
  3. Configuring Port Group Policies

Creating a Distributed Switch

Connect to vCenter and create a distributed switch.

About this task

To create a distributed switch, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Distributed Switch Creation

  3. Right-click the host, select Distributed Switch > New Distributed Switch , and specify the following information in the New Distributed Switch dialog box.
    1. Name and Location : Enter a name for the distributed switch.
    2. Select Version : Select a distributed switch version that is compatible with all your hosts in that datacenter.
    3. Configure Settings : Select the number of uplinks you want to connect to the distributed switch.
      Select Create a default port group checkbox to create a port group. To configure a port group later, see Creating Port Groups on the Distributed Switch.
    4. Ready to complete : Review the configuration and click Finish .
    A new distributed switch is created with the default uplink port group. The uplink port group is the port group to which the uplinks connect. This uplink is different from the vmk (VMkernel) or the VM port groups.
    Figure. New Distributed Switch Created in the Host

Creating Port Groups on the Distributed Switch

Create one or more vmk (VMkernel) port groups and VM port groups depending on the vSphere features you plan to use and the physical network layout. The best practice is to have the vmk Management interface, vmk vMotion interface, and vmk iSCSI interface on separate port groups.

About this task

To create port groups on the distributed switch, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Creating Distributed Port Groups

  3. Right-click the host, select Distributed Switch > Distributed Port Group > New Distributed Port Group , and follow the wizard to create the remaining distributed port group (vMotion interface and VM port groups).
    You need the following port groups because you are migrating from the standard switch to the distributed switch.
    • VMkernel Management interface . Use this port group to connect to the host for all management operations.
    • VMNetwork . Use this port group to connect to the new VMs.
    • vMotion . This port group is an internal interface that the host uses for vMotion traffic during failover.
    Note: Nutanix recommends that you use static port binding instead of ephemeral port binding when you create a port group.
    Figure. Distributed Port Groups Created

    Note: The port group for vmk management interface is created during the distributed switch creation. See Creating a Distributed Switch for more information.

Configuring Port Group Policies

To configure port groups, you must configure VLANs, Teaming and failover, and other distributed port groups policies at the port group layer or at the distributed switch layer. Refer to the following topics to configure the port group policies.

  1. Configuring Policies on the Port Group Layer
  2. Configuring Policies on the Distributed Switch Layer
  3. Adding ESXi Host to the Distributed Switch

Configuring Policies on the Port Group Layer

Ensure that the distributed switch port groups have VLANs tagged if the physical adapters of the host have a VLAN tagged to them. Update the port group policies for VLANs and teaming algorithms to match the physical network switch configuration. Configure the load balancing policy according to the network configuration requirements on the physical switch.

About this task

To configure the port group policies, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Configure Port Group Policies on the Distributed Switch

  3. Right-click the host, select Distributed Switch > Distributed Port Group > Edit Settings , and follow the wizard to configure the VLAN, Teaming and failover, and other options.
    Note: For more information about configuring port group policies, see the VMware Documentation .
  4. Click OK to complete the configuration.
  5. Repeat steps 2–4 to configure the other port groups.
Configuring Policies on the Distributed Switch Layer

You can configure the same policy for all the port groups simultaneously.

About this task

To configure the same policy for all the port groups, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Manage Distributed Port Groups

  3. Right-click the host, select Distributed Switch > Distributed Port Group > Manage Distributed Port Groups , and specify the following information in Manage Distributed Port Group dialog box.
    1. In the Select port group policies tab, select the port group policies that you want to configure and click Next .
      Note: For more information about configuring port group policies, see the VMware Documentation .
    2. In the Select port groups tab, select the distributed port groups on which you want to configure the policy and click Next .
    3. In the Teaming and failover tab, configure the Load balancing policy, Active uplinks , and click Next .
    4. In the Ready to complete window, review the configuration and click Finish .
Adding ESXi Host to the Distributed Switch

Migrate the management interface and CVM of the host to the distributed switch.

About this task

To migrate the Management interface and CVM of the ESXi host, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Add ESXi Host to Distributed Switch

  3. Right-click the host, select Distributed Switch > Add and Manage Hosts , and specify the following information in Add and Manage Hosts dialog box.
    1. In the Select task tab, select Add hosts to add new host to the distributed switch and click Next .
    2. In the Select hosts tab, click New hosts to select the ESXi host and add it to the distributed switch.
      Note: Add one host at a time to the distributed switch and then migrate all the CVMs from the host to the distributed switch.
    3. In the Manage physical adapters tab, configure the physical NICs (PNICs) on the distributed switch.
      Tip: For consistent network configuration, you can connect the same physical NIC on every host to the same uplink on the distributed switch.
        1. Select a PNIC from the On other switches/unclaimed section and click Assign uplink .
          Figure. Select Physical Adapter for Uplinking

          Important: If you select physical NICs connected to other switches, those physical NICs migrate to the current distributed switch.
        2. Select the Uplink in the distributed switch to which you want to assign the PNIC of the host and click OK .
        3. Click Next .
    4. In the Manage VMkernel adapters tab, configure the vmk adapters.
        1. Select a VMkernel adapter from the On other switches/unclaimed section and click Assign port group .
        2. Select the port group in the distributed switch to which you want to assign the VMkernel of the host and click OK .
          Figure. Select a Port Group

        3. Click Next .
    5. (optional) In the Migrate VM networking tab, select Migrate virtual machine networking to connect all the network adapters of a VM to a distributed port group.
        1. Select the VM to connect all the network adapters of the VM to a distributed port group, or select an individual network adapter to connect with the distributed port group.
        2. Click Assign port group and select the distributed port group to which you want to migrate the VM or network adapter and click OK .
        3. Click Next .
    6. In the Ready to complete tab, review the configuration and click Finish .
  4. Go to the Hosts and Clusters view in the vCenter web client and go to Hosts > Configure to review the network configuration for the host.
    Note: Run a ping test to confirm that the networking on the host works as expected.
  5. Repeat steps 2–4 to add the remaining hosts to the distributed switch and migrate the adapters.

Migrating to a New Distributed Switch with LACP/LAG

Migrating to a new distributed switch with LACP/LAG consists of the following workflow.

  1. Creating a Distributed Switch
  2. Creating Port Groups on the Distributed Switch
  3. Creating Link Aggregation Group on Distributed Switch
  4. Creating Port Groups to use the LAG
  5. Adding ESXi Host to the Distributed Switch

Creating Link Aggregation Group on Distributed Switch

Using a Link Aggregation Group (LAG) on a distributed switch, you can connect the ESXi host to physical switches by using dynamic link aggregation. You can create multiple LAGs on a distributed switch to aggregate the bandwidth of physical NICs on ESXi hosts that are connected to LACP port channels.

About this task

To create a LAG, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
  3. Right-click the host, select Distributed Switch > Configure > LACP .
    Figure. Create LAG on Distributed Switch

  4. Click New and enter the following details in the New Link Aggregation Group dialog box.
    1. Name : Enter a name for the LAG.
    2. Number of Ports : Enter the number of ports.
      The number of ports must match the physical ports per host in the LACP LAG. For example, if the Number of Ports is two, you can attach two physical ports per ESXi host to the LAG.
    3. Mode : Specify the state of the physical switch.
      Based on the configuration requirements, you can set the mode to Active or Passive .
    4. Load balancing mode : Specify the load balancing mode for the physical switch.
      For more information about the various load balancing options, see the VMware Documentation .
    5. VLAN trunk range : Specify the VLANs if you have VLANs configured in your environment.
  5. Click OK .
    LAG is created on the distributed switch.

Creating Port Groups to Use LAG

To use the LAG as the uplink, edit the settings of the port groups created on the distributed switch.

About this task

To edit the settings on the port group to use LAG, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
  3. Right-click the host, select Management Port Group > Edit Settings .
  4. Go to the Teaming and failover tab in the Edit Settings dialog box and specify the following information.
    Figure. Configure the Management Port Group

    1. Load Balancing : Select Route based on IP hash .
    2. Active uplinks : Move the LAG under the Unused uplinks section to Active Uplinks section.
    3. Unused uplinks : Select the physical uplinks ( Uplink 1 and Uplink 2 ) and move them to the Unused uplinks section.
  5. Repeat steps 2–4 to configure the other port groups.

Adding ESXi Host to the Distributed Switch

Add the ESXi host to the distributed switch and migrate the network from the standard switch to the distributed switch. Migrate the management interface and CVM of the ESXi host to the distributed switch.

About this task

To migrate the Management interface and CVM of ESXi host, do the following.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Networking view and select the host from the left pane.
    Figure. Add ESXi Host to Distributed Switch

  3. Right-click the host, select Distributed Switch > Add and Manage Hosts , and specify the following information in Add and Manage Hosts dialog box.
    1. In the Select task tab, select Add hosts to add new host to the distributed switch and click Next .
    2. In the Select hosts tab, click New hosts to select the ESXi host and add it to the distributed switch.
      Note: Add one host at a time to the distributed switch and then migrate all the CVMs from the host to the distributed switch.
    3. In the Manage physical adapters tab, configure the physical NICs (PNICs) on the distributed switch.
      Tip: For consistent network configuration, you can connect the same physical NIC on every host to the same uplink on the distributed switch.
        1. Select a PNIC from the On other switches/unclaimed section and click Assign uplink .
          Important: If you select physical NICs connected to other switches, those physical NICs migrate to the current distributed switch.
        2. Select the LAG Uplink in the distributed switch to which you want to assign the PNIC of the host and click OK .
        3. Click Next .
    4. In the Manage VMkernel adapters tab, configure the vmk adapters.
      Select the VMkernel adapter that is associated with vSwitch0 as your management VMkernel adapter. Migrate this adapter to the corresponding port group on the distributed switch.
      Note: Do not migrate the VMkernel adapter associated with vSwitchNutanix.
      Note: If there are any VLANs associated with the port group on the standard switch, ensure that the corresponding distributed port group also has the correct VLAN. Verify the physical network configuration to ensure it is configured as required.
        1. Select a VMkernel adapter from the On other switches/unclaimed section and click Assign port group .
        2. Select the port group in the distributed switch to which you want to assign the VMkernel of the host and click OK .
        3. Click Next .
    5. (optional) In the Migrate VM networking tab, select Migrate virtual machine networking to connect all the network adapters of a VM to a distributed port group.
        1. Select the VM to connect all the network adapters of the VM to a distributed port group, or select an individual network adapter to connect with the distributed port group.
        2. Click Assign port group and select the distributed port group to which you want to migrate the VM or network adapter and click OK .
        3. Click Next .
    6. In the Ready to complete tab, review the configuration and click Finish .

vCenter Configuration

VMware vCenter enables the centralized management of multiple ESXi hosts. You can either create a vCenter Server or use an existing vCenter Server. To create a vCenter Server, refer to the VMware Documentation .

This section assumes that you already have a vCenter Server and therefore describes the operations you can perform on an existing vCenter Server. To deploy vSphere clusters running Nutanix Enterprise Cloud, perform the following steps in vCenter.

Tip: For single-window management of all your ESXi nodes, you can also integrate the vCenter Server with Prism Central. For more information, see Registering a Cluster to vCenter Server.

1. Create a cluster entity within the existing vCenter inventory and configure its settings according to Nutanix best practices. For more information, see Creating a Nutanix Cluster in vCenter.

2. Configure HA. For more information, see vSphere HA Settings.

3. Configure DRS. For more information, see vSphere DRS Settings.

4. Configure EVC. For more information, see vSphere EVC Settings.

5. Configure override. For more information, see VM Override Settings.

6. Add the Nutanix hosts to the new cluster. For more information, see Adding a Nutanix Node to vCenter.

Registering a Cluster to vCenter Server

To perform core VM management operations directly from Prism without switching to vCenter Server, you need to register your cluster with the vCenter Server.

Before you begin

Ensure that you have vCenter Server Extension privileges as these privileges provide permissions to perform vCenter registration for the Nutanix cluster.

About this task

Following are some of the important points about registering vCenter Server.

  • Nutanix does not store vCenter Server credentials.
  • Whenever a new node is added to a Nutanix cluster, vCenter Server registration for the new node is performed automatically.
  • Nutanix supports vCenter Enhanced Linked Mode.

    When registering a Nutanix cluster to a vCenter Enhanced Linked Mode (ELM) enabled ESXi environment, ensure that Prism is registered to the vCenter containing the vSphere Cluster and Nutanix nodes (often the local vCenter). For more information about vCenter Enhanced Linked Mode, see vCenter Enhanced Linked Mode in the vCenter Server Installation and Setup documentation.

Procedure

  1. Log into the Prism web console.
  2. Click the gear icon in the main menu and then select vCenter Registration in the Settings page.
    The vCenter Server that is managing the hosts in the cluster is auto-discovered and displayed.
  3. Click the Register link.
    The IP address is auto-populated in the Address field. The port number field is also auto-populated with 443. Do not change the port number. For the complete list of required ports, see Port Reference.
  4. Type the administrator user name and password of the vCenter Server in the Admin Username and Admin Password fields.
    Figure. vCenter Registration (Figure 1)

  5. Click Register .
    During the registration process, a certificate is generated to communicate with the vCenter Server. If the registration is successful, a relevant message is displayed in the Tasks dashboard. The Host Connection field displays Connected, indicating that all the hosts are managed by the registered vCenter Server.
    Figure. vCenter Registration (Figure 2)
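Before clicking Register, you can confirm that the vCenter Server is actually reachable on port 443 from the cluster network. The following Python sketch is an illustrative helper (not part of Prism or the vCenter API) that performs a plain TCP reachability check:

```python
import socket

def port_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # DNS failure, timeout, or connection refused
        return False

# Example (hypothetical vCenter address):
# port_reachable("vcenter.example.com", 443)
```

If this returns False, review firewall rules against the Port Reference before retrying the registration.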

Unregistering a Cluster from the vCenter Server

To unregister the vCenter Server from your cluster, perform the following procedure.

About this task

  • Ensure that you unregister the vCenter Server from the cluster before changing the IP address of the vCenter Server. After you change the IP address of the vCenter Server, register the vCenter Server with the cluster again using the new IP address.
  • The vCenter Server Registration page displays the registered vCenter Server. If the Host Connection field changes to Not Connected , the hosts are being managed by a different vCenter Server. In this case, a new vCenter Server entry appears with the host connection status Connected , and you need to register to this vCenter Server.

Procedure

  1. Log into the Prism web console.
  2. Click the gear icon in the main menu and then select vCenter Registration in the Settings page.
    A message that the cluster is already registered to the vCenter Server is displayed.
  3. Type the administrator user name and password of the vCenter Server in the Admin Username and Admin Password fields.
  4. Click Unregister .
    If the credentials are correct, the vCenter Server is unregistered from the cluster and a relevant message is displayed in the Tasks dashboard.

Creating a Nutanix Cluster in vCenter

Before you begin

Nutanix recommends creating a storage container in the Prism Element running on the host, or using the default container, and mounting it as an NFS datastore on all ESXi hosts.

About this task

To enable the vCenter to discover the Nutanix clusters, perform the following steps in the vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Do one of the following.
    • If you want the Nutanix cluster to be in an existing datacenter, proceed to step 3.
    • If you want the Nutanix cluster to be in a new datacenter or if there is no datacenter, perform the following steps to create a datacenter.
      Note: Nutanix clusters must be in a datacenter.
    1. Go to the Hosts and Clusters view and right-click the IP address of the vCenter Server in the left pane.
    2. Click New Datacenter .
    3. Enter a meaningful name for the datacenter (for example, NTNX-DC ) and click OK .
  3. Right-click the datacenter node and click New Cluster .
    1. Enter a meaningful name for the cluster in the Name field (for example, NTNX-Cluster ).
    2. Turn on the vSphere DRS switch.
    3. Turn on the vSphere HA switch.
    4. Uncheck Manage all hosts in the cluster with a single image .
    Nutanix cluster ( NTNX-Cluster ) is created with the default settings for vSphere HA and vSphere DRS.

What to do next

Add all the Nutanix nodes to the Nutanix cluster inventory in vCenter. For more information, see Adding a Nutanix Node to vCenter.

Adding a Nutanix Node to vCenter

Before you begin

Configure the Nutanix cluster according to Nutanix specifications given in Creating a Nutanix Cluster in vCenter and vSphere Cluster Settings Checklist.

About this task

Note: Lockdown mode forces all operations through the vCenter Server and makes ESXi hosts inaccessible directly. Nutanix does not support lockdown mode, so ensure that lockdown mode remains disabled on all vCenter-managed ESXi hosts (see step 3 of this procedure).
Tip: Refer to KB-1661 for the default credentials of all cluster components.

Procedure

  1. Log on to vCenter with the web client.
  2. Right-click the Nutanix cluster and then click Add Hosts... .
    1. Enter the IP address or fully qualified domain name (FQDN) of the host you want to add in the IP address or FQDN field under New hosts .
    2. Enter the host logon credentials in the User name and Password fields, and click Next .
      If a security or duplicate management alert appears, click Yes .
    3. Review the Host Summary and click Next .
    4. Click Finish .
  3. Select the host under the Nutanix cluster from the left pane and go to Configure > System > Security Profile .
    Ensure that Lockdown Mode is Disabled because Nutanix does not support lockdown mode.
  4. Configure DNS servers.
    1. Go to Configure > Networking > TCP/IP configuration .
    2. Click Default under TCP/IP stack and go to TCP/IP .
    3. Click the pencil icon to configure DNS servers and perform the following.
        1. Select Enter settings manually .
        2. Type the domain name in the Domain field.
        3. Type DNS server addresses in the Preferred DNS Server and Alternate DNS Server fields and click OK .
  5. Configure NTP servers.
    1. Go to Configure > System > Time Configuration .
    2. Click Edit .
    3. Select Use Network Time Protocol (Enable NTP client) .
    4. Type the NTP server address in the NTP Servers text box.
    5. In the NTP Service Startup Policy, select Start and stop with host from the drop-down list.
      Add multiple NTP servers if necessary.
    6. Click OK .
  6. Click Configure > Storage and ensure that NFS datastores are mounted.
    Note: Nutanix recommends creating a storage container in Prism Element running on the host.
  7. If HA is not enabled, set the CVM to start automatically when the ESXi host starts.
    Note: Automatic VM start and stop is disabled in clusters where HA is enabled.
    1. Go to Configure > Virtual Machines > VM Startup/Shutdown .
    2. Click Edit .
    3. Ensure that Automatically start and stop the virtual machines with the system is checked.
    4. If the CVM is listed in Manual Startup , click the up arrow to move the CVM into the Automatic Startup section.
    5. Click OK .
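The DNS and NTP values entered in steps 4 and 5 can be sanity-checked before you apply them in vCenter. The following is a minimal sketch; the function and field names are illustrative, not part of any Nutanix or VMware API:

```python
import ipaddress

def validate_host_network_settings(domain, dns_servers, ntp_servers):
    """Return a list of problems with the DNS/NTP settings; an empty list means OK."""
    problems = []
    if not domain:
        problems.append("domain name is empty")
    if not dns_servers:
        problems.append("at least one DNS server is required")
    for ip in dns_servers:
        try:
            ipaddress.ip_address(ip)  # preferred/alternate DNS must be IP addresses
        except ValueError:
            problems.append(f"invalid DNS server address: {ip}")
    if not ntp_servers:
        problems.append("at least one NTP server is required")
    return problems
```

For example, `validate_host_network_settings("corp.local", ["10.0.0.10"], ["pool.ntp.org"])` returns an empty list, while an empty DNS server list is flagged.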

What to do next

Configure HA and DRS settings. For more information, see vSphere HA Settings and vSphere DRS Settings.

Nutanix Cluster Settings

To ensure the optimal performance of your vSphere deployment running on Nutanix cluster, configure the following settings from the vCenter.

vSphere General Settings

About this task

Configure the following general settings from vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Configuration > General .
    1. Under General , set the Swap file location to Virtual machine directory .
      Setting the swap file location to the VM directory stores the VM swap files in the same directory as the VM.
    2. Under Default VM Compatibility , set the compatibility to Use datacenter setting and host version .
      Do not change the compatibility unless the cluster has to support previous versions of ESXi VMs.
      Figure. General Cluster Settings

vSphere HA Settings

If there is a node failure, vSphere HA (High Availability) settings ensure that there are sufficient compute resources available to restart all VMs that were running on the failed node.

About this task

Configure the following HA settings from vCenter.
Note: Nutanix recommends that you configure vSphere HA and DRS even if you do not use the features. The vSphere cluster configuration preserves the settings, so if you later decide to enable the features, the settings are in place and conform to Nutanix best practices.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Services > vSphere Availability .
  4. Click Edit next to the text showing vSphere HA status.
    Figure. vSphere Availability Settings: Failures and Responses

    1. Turn on the vSphere HA and Enable Host Monitoring switches.
    2. Specify the following information under the Failures and Responses tab.
        1. Host Failure Response : Select Restart VMs from the drop-down list.

          This option configures the cluster-wide response to host failures.

        2. Response for Host Isolation : Select Power off and restart VMs from the drop-down list.
        3. Datastore with PDL : Select Disabled from the drop-down list.
        4. Datastore with APD : Select Disabled from the drop-down list.
          Note: To enable the VM component protection in vCenter, refer to the VMware Documentation.
        5. VM Monitoring : Select Disabled from the drop-down list.
    3. Specify the following information under the Admission Control tab.
      Note: If you are using replication factor 2 with cluster sizes up to 16 nodes, configure HA admission control settings to tolerate one node failure. For cluster sizes larger than 16 nodes, configure HA admission control to sustain two node failures and use replication factor 3. vSphere 6.7, and newer versions automatically calculate the percentage of resources required for admission control.
      Figure. vSphere Availability Settings: Admission Control

        1. Host failures cluster tolerates : Enter 1 or 2 based on the number of nodes in the Nutanix cluster and the replication factor.
        2. Define host failover capacity by : Select Cluster resource Percentage from the drop-down list.
        3. Performance degradation VMs tolerate : Set the percentage to 100.

          For more information about settings of percentage of cluster resources reserved as failover spare capacity, see vSphere HA Admission Control Settings for Nutanix Environment.

    4. Specify the following information under the Heartbeat Datastores tab.
      Note: vSphere HA uses datastore heartbeating to distinguish between hosts that have failed and hosts that reside on a network partition. With datastore heartbeating, vSphere HA can monitor hosts when a management network partition occurs while continuing to respond to failures.
      Figure. vSphere Availability Settings: Heartbeat Datastores

        1. Select Use datastores only from the specified list .
        2. Select the named storage container mounted as the NFS datastore (Nutanix datastore).

          If you have more than one named storage container, select all that are applicable.

        3. If the cluster has only one datastore, click the Advanced Options tab and add das.ignoreInsufficientHbDatastore with a Value of true .
    5. Click OK .

vSphere HA Admission Control Settings for Nutanix Environment

Overview

If you are using redundancy factor 2 with cluster sizes of up to 16 nodes, you must configure HA admission control settings with the appropriate percentage of CPU/RAM to achieve at least N+1 availability. For cluster sizes larger than 16 nodes, you must configure HA admission control with the appropriate percentage of CPU/RAM to achieve at least N+2 availability.

N+2 Availability Configuration

The N+2 availability configuration can be achieved in the following two ways.

  • Redundancy factor 2 and N+2 vSphere HA admission control setting configured.

    Because the Nutanix distributed file system recovers after a node failure, a second node failure can occur without data becoming unavailable, provided the Nutanix cluster has fully recovered before the subsequent failure. In this case, an N+2 vSphere HA admission control setting is required to ensure that sufficient compute resources are available to restart all the VMs.

  • Redundancy factor 3 and N+2 vSphere HA admission control setting configured.
    If you want two concurrent node failures to be tolerated and the cluster has insufficient blocks to use block awareness, redundancy factor 3 is required in a cluster of five or more nodes. Redundancy factor 3 must be configured at the storage container layer. In either of these two options, the Nutanix storage pool must have sufficient free capacity to restore the configured redundancy factor (2 or 3); the percentage of free space required is the same as the required HA admission control percentage setting. An N+2 vSphere HA admission control setting is also required to ensure that sufficient compute resources are available to restart all the VMs.
    Note: For redundancy factor 3, a minimum of five nodes is required, which allows two nodes to fail concurrently while data remains online. In this case, the same N+2 level of availability is required for the vSphere cluster to enable the VMs to restart following a failure.
Table 1. Minimum Reservation Percentage for vSphere HA Admission Control Setting
For redundancy factor 2 deployments, the recommended minimum HA admission control setting percentage is marked with a single asterisk (*) in the following table. For redundancy factor 2 or redundancy factor 3 deployments configured to tolerate multiple non-concurrent node failures, the minimum required HA admission control setting percentage is marked with two asterisks (**).
Nodes Availability Level
N+1 N+2 N+3 N+4
1 N/A N/A N/A N/A
2 N/A N/A N/A N/A
3 33* N/A N/A N/A
4 25* 50 75 N/A
5 20* 40** 60 80
6 18* 33** 50 66
7 15* 29** 43 56
8 13* 25** 38 50
9 11* 23** 33 46
10 10* 20** 30 40
11 9* 18** 27 36
12 8* 17** 25 34
13 8* 15** 23 30
14 7* 14** 21 28
15 7* 13** 20 26
16 6* 13** 19 25
17 6 12* 18** 24
18 6 11* 17** 22
19 5 11* 16** 22
20 5 10* 15** 20
21 5 10* 14** 20
22 4 9* 14** 18
23 4 9* 13** 18
24 4 8* 13** 16
25 4 8* 12** 16
26 4 8* 12** 16
27 4 7* 11** 14
28 4 7* 11** 14
29 3 7* 10** 14
30 3 7* 10** 14
31 3 6* 10** 12
32 3 6* 9** 12

The table also represents the percentage of the Nutanix storage pool that should remain free to ensure that the cluster can fully restore the redundancy factor after the failure of one or more nodes, or even of a block (where three or more blocks exist within a cluster).
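The admission control guidance above (N+1 for RF2 clusters of up to 16 nodes, N+2 with RF3 beyond that) together with a subset of Table 1 can be captured in a small lookup helper. This is an illustrative sketch, not a Nutanix tool; always defer to the full table for node counts not listed here:

```python
# Subset of Table 1: minimum reservation percentage per node count
# for the N+1 and N+2 availability levels (None = not applicable).
MIN_RESERVATION_PCT = {
    3:  {"N+1": 33, "N+2": None},
    4:  {"N+1": 25, "N+2": 50},
    8:  {"N+1": 13, "N+2": 25},
    16: {"N+1": 6,  "N+2": 13},
    32: {"N+1": 3,  "N+2": 6},
}

def recommended_level(nodes: int) -> str:
    """RF2 clusters up to 16 nodes target N+1; larger clusters target N+2 (with RF3)."""
    return "N+1" if nodes <= 16 else "N+2"

def required_pct(nodes: int) -> int:
    """Minimum admission control percentage for the recommended availability level."""
    return MIN_RESERVATION_PCT[nodes][recommended_level(nodes)]
```

For example, `required_pct(8)` returns 13 (N+1), while `required_pct(32)` returns 6 (N+2), matching the asterisked values in Table 1.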

Block Awareness

For deployments of at least three blocks, block awareness automatically maintains data availability even when an entire block of up to four nodes, configured with redundancy factor 2, becomes unavailable.

If block awareness levels of availability are required, the vSphere HA admission control setting must ensure sufficient compute resources are available to restart all virtual machines. In addition, the Nutanix storage pool must have sufficient space to restore redundancy factor 2 to all data.

The vSphere HA minimum availability level must be equal to the number of nodes per block.

Note: For block awareness, each block must be populated with a uniform number of nodes. In the event of a failure, a non-uniform node count might compromise block awareness or the ability to restore the redundancy factor, or both.

Rack Awareness

Rack fault tolerance is the ability to provide a rack-level availability domain. With rack fault tolerance, data is replicated to nodes that are not in the same rack. Rack failure can occur in the following situations.

  • All power supplies in a rack fail.
  • Top-of-rack (TOR) switch fails.
  • Network partition occurs: one of the racks becomes inaccessible from the other racks.

With rack fault tolerance enabled, the cluster has rack awareness and guest VMs can continue to run even during the failure of one rack (with replication factor 2) or two racks (with replication factor 3). The redundant copies of guest VM data and metadata persist on other racks when one rack fails.

Table 2. Minimum Requirements for Rack Awareness
Replication factor Minimum number of nodes Minimum number of Blocks Minimum number of racks Data resiliency
2 3 3 3 Failure of 1 node, block, or rack
3 5 5 5 Failure of 2 nodes, blocks, or racks
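Table 2 can be expressed as a small checker, useful when planning a rack-aware layout. This is an illustrative sketch; the function and dictionary names are not part of any Nutanix tooling:

```python
# Minimum requirements for rack awareness, from Table 2.
RACK_AWARENESS_MINIMUMS = {
    2: {"nodes": 3, "blocks": 3, "racks": 3},  # tolerates 1 node, block, or rack failure
    3: {"nodes": 5, "blocks": 5, "racks": 5},  # tolerates 2 node, block, or rack failures
}

def meets_rack_awareness(replication_factor: int, nodes: int, blocks: int, racks: int) -> bool:
    """Return True if the cluster meets the documented minimums for rack awareness."""
    mins = RACK_AWARENESS_MINIMUMS.get(replication_factor)
    if mins is None:
        return False  # rack awareness is defined only for RF2 and RF3
    return nodes >= mins["nodes"] and blocks >= mins["blocks"] and racks >= mins["racks"]
```

For example, an RF3 cluster with five nodes spread across only four racks does not meet the minimums.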

vSphere DRS Settings

About this task

Configure the following DRS settings from vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Services > vSphere DRS .
  4. Click Edit next to the text showing vSphere DRS status.
    Figure. vSphere DRS Settings: Automation

    1. Turn on the vSphere DRS switch.
    2. Specify the following information under the Automation tab.
        1. Automation Level : Select Fully Automated from the drop-down list.
        2. Migration Threshold : Set the bar between conservative and aggressive (value=3).

          The migration threshold provides optimal resource utilization while minimizing DRS migrations that offer little benefit. This threshold automatically manages data locality: whenever VMs move, writes always go to one of the replicas locally to maximize subsequent read performance.

          Nutanix recommends the migration threshold at 3 in a fully automated configuration.

        3. Predictive DRS : Leave the option disabled.

          The value of predictive DRS depends on whether you use other VMware products such as vRealize operations. Unless you use vRealize operations, Nutanix recommends disabling predictive DRS.

        4. Virtual Machine Automation : Enable VM automation.
    3. Specifying anything under the Additional Options tab is optional.
    4. Specify the following information under the Power Management tab.
      Figure. vSphere DRS Settings: Power Management

        1. DPM : Leave the option disabled.

          Enabling DPM powers off nodes in the Nutanix cluster, taking their CVMs offline and reducing available cluster resources.

    5. Click OK .

vSphere EVC Settings

vSphere enhanced vMotion compatibility (EVC) ensures that workloads can live migrate, using vMotion, between ESXi hosts in a Nutanix cluster that are running different CPU generations. The general recommendation is to keep EVC enabled because it helps when you later scale your Nutanix cluster with new hosts that contain newer CPU models.

About this task

Enabling EVC in a brownfield scenario can be challenging. Configure the following EVC settings from vCenter.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Shut down all the VMs on the hosts with feature sets greater than the EVC mode.
    Ensure that the Nutanix cluster contains hosts with CPUs from only one vendor, either Intel or AMD.
  4. Click Configure , and go to Configuration > VMware EVC .
  5. Click Edit next to the text showing VMware EVC.
  6. Enable EVC for the CPU vendor and feature set appropriate for the hosts in the Nutanix cluster, and click OK .
    If the Nutanix cluster contains nodes with different processor classes, enable EVC with the lower feature set as the baseline.
    Tip: To know the processor class of a node, perform the following steps.
      1. Log on to Prism Element running on the Nutanix cluster.
      2. Click Hardware from the menu and go to Diagram or Table view.
      3. Click the node and look for the Block Serial field in Host Details .
    Figure. VMware EVC

  7. Start the VMs in the Nutanix cluster to apply the EVC.
    If you try to enable EVC on a Nutanix cluster with mismatched host feature sets (mixed processor clusters), the lowest common feature set (lowest processor class) is selected. Therefore, if VMs are already running on the new host and you want to enable EVC on the host, you must first shut down the VMs and then enable EVC.
    Note: Do not shut down more than one CVM at the same time.
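When hosts support different feature sets, the cluster baseline must be the lowest common EVC mode, as step 6 describes. The selection can be sketched as follows; the ordering of Intel EVC mode names below is illustrative, so verify the exact mode names and order against your vCenter version:

```python
# Illustrative ordering of Intel EVC modes, oldest (lowest feature set) first.
INTEL_EVC_ORDER = [
    "intel-merom", "intel-penryn", "intel-nehalem", "intel-westmere",
    "intel-sandybridge", "intel-ivybridge", "intel-haswell",
    "intel-broadwell", "intel-skylake", "intel-cascadelake",
]

def cluster_evc_baseline(host_modes: list[str]) -> str:
    """Given each host's highest supported EVC mode, return the lowest common mode,
    which is the baseline to enable for the whole cluster."""
    return min(host_modes, key=INTEL_EVC_ORDER.index)
```

For example, a cluster mixing Skylake, Haswell, and Broadwell hosts must use the Haswell baseline.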

VM Override Settings

You must exclude Nutanix CVMs from vSphere availability and resource scheduling, and therefore configure the following VM override settings.

Procedure

  1. Log on to vCenter with the web client.
  2. Go to the Hosts and Clusters view and select the Nutanix cluster from the left pane.
  3. Click Configure , and go to Configuration > VM Overrides .
  4. Select all the CVMs and click Next .
    If you do not have the CVMs listed, click Add to ensure that the CVMs are added to the VM Overrides dialog box.
    Figure. VM Override

  5. In the VM override section, configure override for the following parameters.
    • DRS Automation Level: Disabled
    • VM HA Restart Priority: Disabled
    • VM Monitoring: Disabled
  6. Click Finish .

Migrating a Nutanix Cluster from One vCenter Server to Another

About this task

Perform the following steps to migrate a Nutanix cluster from one vCenter Server to another vCenter Server.
Note: The following steps are to migrate a Nutanix cluster with vSphere Standard Switch (vSwitch). To migrate a Nutanix cluster with vSphere Distributed Switch (vDS), see the VMware Documentation.

Procedure

  1. Create a vSphere cluster in the vCenter Server where you want to migrate the Nutanix cluster. See Creating a Nutanix Cluster in vCenter.
  2. Configure HA, DRS, and EVC on the created vSphere cluster. See Nutanix Cluster Settings.
  3. Unregister the Nutanix cluster from the source vCenter Server. See Unregistering a Cluster from the vCenter Server.
  4. Move the nodes from the source vCenter Server to the new vCenter Server.
    See the VMware Documentation for the process.
  5. Register the Nutanix cluster to the new vCenter Server. See Registering a Cluster to vCenter Server.

Storage I/O Control (SIOC)

SIOC controls the I/O usage of a virtual machine and gradually enforces the predefined I/O share levels. Nutanix converged storage architecture does not require SIOC. Therefore, while mounting a storage container on an ESXi host, the system disables SIOC in the statistics mode automatically.

Caution: While mounting a storage container on ESXi hosts running older versions (6.5 or below), the system enables SIOC in the statistics mode by default. Nutanix recommends disabling SIOC because an enabled SIOC can cause the following issues.
  • The storage can become unavailable because the hosts repeatedly create and delete the .lck-XXXXXXXX access files under the .iorm.sf subdirectory, located in the root directory of the storage container.
  • Site Recovery Manager (SRM) failover and failback does not run efficiently.
  • If you are using the Metro Availability disaster recovery feature, activate and restore operations do not work.
    Note: For the Metro Availability disaster recovery feature, Nutanix recommends using an empty storage container. Disable SIOC and delete all SIOC-related files from the storage container. For more information, see KB-3501 .
Run the NCC health check (see KB-3358 ) to verify if SIOC and SIOC in statistics mode are disabled on storage containers. If SIOC and SIOC in statistics mode are enabled on storage containers, disable them by performing the procedure described in Disabling Storage I/O Control (SIOC) on a Container.
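The stale SIOC lock files described above can be spotted from the root of a mounted datastore. The following Python sketch is an illustrative helper (the mount point in the example is hypothetical), based on the file layout named in this section:

```python
from pathlib import Path

def find_sioc_lock_files(datastore_root: str) -> list[str]:
    """Return paths of .lck-* files under the .iorm.sf subdirectory
    in the datastore root; an empty list means no SIOC lock files."""
    iorm_dir = Path(datastore_root) / ".iorm.sf"
    if not iorm_dir.is_dir():
        return []
    return sorted(str(p) for p in iorm_dir.glob(".lck-*"))

# Example (illustrative mount point):
# find_sioc_lock_files("/vmfs/volumes/NTNX-datastore")
```

A non-empty result suggests SIOC is still active on the container, so run the NCC health check and disable SIOC as described in the next section.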

Disabling Storage I/O Control (SIOC) on a Container

About this task

Perform the following procedure to disable storage I/O statistics collection.

Procedure

  1. Log on to vCenter with the web client.
  2. Click the Storage view in the left pane.
  3. Right-click the storage container under the Nutanix cluster and select Configure Storage I/O Controller .
    The properties for the storage container are displayed. The Disable Storage I/O statistics collection option is unchecked, which means that SIOC is enabled by default. Two options are available: Disable Storage I/O Control and statistics collection, and Disable Storage I/O Control but enable statistics collection.
    1. Select the Disable Storage I/O Control and statistics collection option to disable SIOC.
    2. Uncheck the Include I/O Statistics for SDRS option.
    3. Click OK .

Node Management

This chapter describes the management tasks you can do on a Nutanix node.

Node Maintenance (ESXi)

Gracefully place a node into maintenance mode (a non-operational state) when you need to make changes to the network configuration of a node, perform manual firmware upgrades or replacements, perform CVM maintenance, or perform any other maintenance operations.

Entering and Exiting Maintenance Mode

With a minimum AOS release of 6.1.2, 6.5.1, or 6.6, you can place only one node at a time in maintenance mode for each cluster. When a host is in maintenance mode, the CVM is placed in maintenance mode as part of the node maintenance operation and any associated RF1 VMs are powered off. The cluster marks the host as unschedulable so that no new VM instances are created on it. When a node is placed in maintenance mode from the Prism web console, an attempt is made to evacuate VMs from the host. If the evacuation attempt fails, the host remains in the "entering maintenance mode" state, where it is marked unschedulable, waiting for user remediation.

When a host is placed in maintenance mode, the CVM is placed in maintenance mode as part of the node maintenance operation. The non-migratable VMs (for example, pinned or RF1 VMs that have affinity towards a specific node) are powered off, while live-migratable or high availability (HA) VMs are moved from the original host to other hosts in the cluster. After exiting maintenance mode, all non-migratable guest VMs are powered on again and the live-migrated VMs are automatically restored on the original host.
Note: VMs with CPU passthrough or PCI passthrough, pinned VMs (with host affinity policies), and RF1 VMs are not migrated to other hosts in the cluster when a node undergoes maintenance. Click the View these VMs link to view the list of VMs that cannot be live-migrated.

See Putting a Node into Maintenance Mode (vSphere) to place a node under maintenance.

You can also enter or exit a host under maintenance through the vCenter web client. See Putting the CVM and ESXi Host in Maintenance Mode Using vCenter.

Exiting a Node from Maintenance Mode

See Exiting a Node from the Maintenance Mode (vSphere) to remove a node from the maintenance mode.

Viewing a Node under Maintenance Mode

See Viewing a Node that is in Maintenance Mode to view the node under maintenance mode.

Guest VM Status when Node under Maintenance Mode

See Guest VM Status when Node is in Maintenance Mode to view the status of guest VMs when a node is undergoing maintenance operations.

Best Practices and Recommendations

Nutanix strongly recommends using the Enter Maintenance Mode option to place a node under maintenance.

Known Issues and Limitations ESXi

  • Maintenance operations from the Prism web console (entering and exiting node maintenance) are currently supported on ESXi.
  • Entering or exiting a node under maintenance using the vCenter for ESXi is not equivalent to entering or exiting the node under maintenance from the Prism Element web console.
  • You cannot exit the node from maintenance mode from Prism Element web console if the node is placed under maintenance mode using vCenter (ESXi node). However, you can enter the node maintenance through the Prism Element web console and exit the node maintenance using the vCenter (ESXi node).

Putting a Node into Maintenance Mode (vSphere)

Before you begin

Check the cluster status and resiliency before putting a node under maintenance. You can also verify the status of the guest VMs. See Guest VM Status when Node is in Maintenance Mode for more information.

About this task

As the node enters maintenance mode, the following high-level tasks are performed internally.
  • The host initiates entering the maintenance mode.
  • The HA VMs are live migrated.
  • The pinned and RF1 VMs are powered off.
  • The CVM enters the maintenance mode.
  • The CVM is shut down.
  • The host completes entering the maintenance mode.

For more information, see Guest VM Status when Node is in Maintenance Mode to view the status of the guest VMs.

Procedure

  1. Log in to the Prism Element web console.
  2. On the home page, select Hardware from the drop-down menu.
  3. Go to the Table > Host view.
  4. Select the node that you want to put under maintenance.
  5. Click the Enter Maintenance Mode option.
    Figure. Enter Maintenance Mode Option

  6. On the Host Maintenance window, provide the vCenter credentials for the ESXi host and click Next .
    Figure. Host Maintenance Window (vCenter Credentials)

  7. On the Host Maintenance window, select the Power off VMs that cannot migrate check box to enable the Enter Maintenance Mode button.
    Figure. Host Maintenance Window (Enter Maintenance Mode)

    Note: VMs with CPU passthrough, PCI passthrough, pinned VMs (with host affinity policies), and RF1 VMs are not migrated to other hosts in the cluster when a node undergoes maintenance. Click the View these VMs link to view the list of VMs that cannot be live-migrated.
  8. Click the Enter Maintenance Mode button.
    • A revolving icon appears as a tool tip beside the selected node and also in the Host Details view. This indicates that the host is entering the maintenance mode.
    • The revolving icon disappears and the Exit Maintenance Mode option is enabled after the node completely enters the maintenance mode.
      Figure. Enter Node Maintenance (On-going)

    • You can also monitor the progress of the node maintenance operation through the newly created Host enter maintenance and Enter maintenance mode tasks which appear in the task tray.
    Note: In case of a node maintenance failure, certain rollback operations are performed; for example, the CVM is rebooted. However, the live-migrated VMs are not restored to the original host.

What to do next

Once the maintenance activity is complete, you can perform any of the following.
  • View the nodes under maintenance. See Viewing a Node that is in Maintenance Mode.
  • View the status of the guest VMs. See Guest VM Status when Node is in Maintenance Mode.
  • Remove the node from maintenance mode. See Exiting a Node from the Maintenance Mode (vSphere).

Viewing a Node that is in Maintenance Mode

About this task

Note: This procedure is the same for AHV and ESXi nodes.

Perform the following steps to view a node under maintenance.

Procedure

  1. Log in to the Prism Element web console.
  2. On the home page, select Hardware from the drop-down menu.
  3. Go to the Table > Host view.
  4. Observe the icon along with a tool tip that appears beside the node which is under maintenance. You can also view this icon in the host details view.
    Figure. Example: Node under Maintenance (Table and Host Details View) in AHV Click to enlarge

  5. Alternatively, view the node under maintenance from the Hardware > Diagram view.
    Figure. Example: Node under Maintenance (Diagram and Host Details View) in AHV Click to enlarge

What to do next

You can:
  • To view the status of the guest VMs, see Guest VM Status when Node is in Maintenance Mode.
  • To remove the node from maintenance mode, see Exiting a Node from the Maintenance Mode (vSphere).

Exiting a Node from the Maintenance Mode (vSphere)

After you perform any maintenance activity, exit the node from the maintenance mode.

About this task

As the node exits the maintenance mode, the following high-level tasks are performed internally.
  • The host is taken out of maintenance.
  • The CVM is powered on.
  • The CVM is taken out of maintenance.
After the host exits the maintenance mode, the RF1 VMs are powered on again and the migrated VMs return to the host to restore locality.

To view the status of the guest VMs, see Guest VM Status when Node is in Maintenance Mode.

Procedure

  1. On the Prism web console home page, select Hardware from the drop-down menu.
  2. Go to the Table > Host view.
  3. Select the node which you intend to remove from the maintenance mode.
  4. Click the Exit Maintenance Mode option.
    Figure. Exit Maintenance Mode Option - Table View Click to enlarge

    Figure. Exit Maintenance Mode Option - Diagram View Click to enlarge

  5. On the Host Maintenance window, provide the vCenter credentials for the ESXi host and click Next.
    Figure. Host Maintenance Window (vCenter Credentials) Click to enlarge

  6. On the Host Maintenance window, click the Exit Maintenance Mode button.
    Figure. Host Maintenance Window (Exit Maintenance Mode) Click to enlarge

    • A revolving icon appears as a tool tip beside the selected node and also in the Host Details view. This indicates that the host is exiting the maintenance mode.
    • The revolving icon disappears and the Enter Maintenance Mode option is enabled after the node completely exits the maintenance mode.
    • You can also monitor the progress of the exit node maintenance operation through the newly created Host exit maintenance and Exit maintenance mode tasks which appear in the task tray.

What to do next

You can:
  • To view the node under maintenance, see Viewing a Node that is in Maintenance Mode.
  • To view the status of the guest VMs, see Guest VM Status when Node is in Maintenance Mode.

Guest VM Status when Node is in Maintenance Mode

The following scenarios demonstrate the behavior of three guest VM types: high availability (HA) VMs, pinned VMs, and RF1 VMs, when a node enters and exits maintenance. The HA VMs are live VMs that can migrate across nodes if the host server goes down or restarts. The pinned VMs have host affinity set to a specific node. The RF1 VMs have affinity to a specific node or CVM. To view the status of the guest VMs, go to VM > Table.

Note: The following scenarios are the same for AHV and ESXi nodes.

Scenario 1: Guest VMs before Node Entering Maintenance Mode

In this example, you can observe the status of the guest VMs on the node prior to the node entering the maintenance mode. All the guest VMs are powered-on and reside on the same host.

Figure. Example: Original State of VM and Hosts in AHV Click to enlarge

Scenario 2: Guest VMs during Node Maintenance Mode

  • As the node enters the maintenance mode, the following high-level tasks are performed internally.
    1. The host initiates entering the maintenance mode.
    2. The HA VMs are live-migrated.
    3. The pinned and RF1 VMs are powered off.
    4. The host completes entering the maintenance mode.
    5. The CVM enters the maintenance mode.
    6. The CVM is shut down.
Figure. Example: VM and Hosts during Maintenance Mode Click to enlarge

Scenario 3: Guest VMs after Node Exiting Maintenance Mode

  • As the node exits the maintenance mode, the following high-level tasks are performed internally.
    1. The CVM is powered on.
    2. The CVM is taken out of maintenance.
    3. The host is taken out of maintenance.
    After the host exits the maintenance mode, the RF1 VMs are powered on again and the migrated VMs return to the host to restore locality.
Figure. Example: Original State of VM and Hosts in AHV Click to enlarge

Nonconfigurable ESXi Components

The Nutanix manufacturing and installation processes, performed by running Foundation on the Nutanix nodes, configure the following components. Do not modify any of these components except under the direction of Nutanix Support.

Nutanix Software

Modifying any of the following Nutanix software settings may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • Local datastore name.
  • Configuration and contents of any CVM (except memory configuration to enable certain features).
Important: Note the following important considerations about CVMs.
  • Do not delete the Nutanix CVM.
  • Do not take a snapshot of the CVM for backup.
  • Do not rename, modify, or delete the admin and nutanix user accounts of the CVM.
  • Do not create additional CVM user accounts.

    Use the default accounts (admin or nutanix), or use sudo to elevate to the root account.

  • Do not decrease CVM memory below recommended minimum amounts required for cluster and add-in features.

    Nutanix Cluster Checks (NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor CVM memory.

  • Nutanix does not support the use of third-party storage on hosts that are part of Nutanix clusters.

    Normal cluster operations might be affected if there are connectivity issues with the third-party storage you attach to the hosts in a Nutanix cluster.

  • Do not run any commands on a CVM that are not in the Nutanix documentation.

ESXi

Modifying any of the following ESXi settings can inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.

  • NFS datastore settings
  • VM swapfile location
  • VM startup/shutdown order
  • CVM name
  • CVM virtual hardware configuration file (.vmx file)
  • iSCSI software adapter settings
  • Hardware settings, including passthrough HBA settings.

  • vSwitchNutanix standard virtual switch
  • vmk0 interface in Management Network port group
  • SSH
    Note: An SSH connection is necessary for various scenarios. For example, it establishes connectivity with the ESXi server through a control plane that does not depend on additional management systems or processes. The SSH connection is also required to modify the networking and control paths in the case of a host failure to maintain high availability. For example, CVM autopathing (Ha.py) requires an SSH connection. If a local CVM becomes unavailable, another CVM in the cluster performs the I/O operations over the 10GbE interface.
  • Open host firewall ports
  • CPU resource settings such as CPU reservation, limit, and shares of the CVM.
    Caution: Do not use the Reset System Configuration option.
  • ProductLocker symlink setting to point at the default datastore.

    Do not change the /productLocker symlink to point at a non-local datastore.

    Do not change the ProductLockerLocation advanced setting.

Putting the CVM and ESXi Host in Maintenance Mode Using vCenter

About this task

Nutanix recommends placing the CVM and ESXi host into maintenance mode while the Nutanix cluster undergoes maintenance or patch installations.
Caution: Verify the data resiliency status of your Nutanix cluster. Ensure that the replication factor (RF) supports putting the node in maintenance mode.

Procedure

  1. Log on to vCenter with the web client.
  2. If vSphere DRS is enabled on the Nutanix cluster, skip this step. If vSphere DRS is disabled, perform one of the following.
    • Manually migrate all the VMs except the CVM to another host in the Nutanix cluster.
    • Shut down VMs other than the CVM that you do not want to migrate to another host.
  3. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
  4. In the Enter Maintenance Mode dialog box, check Move powered-off and suspended virtual machines to other hosts in the cluster and click OK.
    Note:

    In certain rare conditions, even when DRS is enabled, some VMs do not automatically migrate due to user-defined affinity rules or VM configuration settings. The VMs that do not migrate appear under cluster DRS > Faults when a maintenance mode task is in progress. To address the faults, either manually shut down those VMs or ensure the VMs can be migrated.

    Caution: When you put the host in maintenance mode, the maintenance mode process powers down or migrates all the VMs that are running on the host.
    The host gets ready to go into maintenance mode, which prevents VMs from running on this host. DRS automatically attempts to migrate all the VMs to another host in the Nutanix cluster.

    The host enters maintenance mode after its CVM is shut down.

Shutting Down an ESXi Node in a Nutanix Cluster

Before you begin

Verify the data resiliency status of your Nutanix cluster. If the Nutanix cluster only has replication factor 2 (RF2), you can shut down only one node for each cluster. If you must shut down more than one node in an RF2 cluster, shut down the entire cluster instead.

About this task

You can put the ESXi host into maintenance mode and shut it down either from the web client or from the command line. For more information about shutting down a node from the command line, see Shutting Down an ESXi Node in a Nutanix Cluster (vSphere Command Line).

Procedure

  1. Log on to vCenter with the web client.
  2. Put the Nutanix node in the maintenance mode. For more information, see Putting the CVM and ESXi Host in Maintenance Mode Using vCenter.
    Note: If DRS is not enabled, manually migrate or shut down all the VMs excluding the CVM. Even when DRS is enabled, some VMs might not migrate automatically because of a VM configuration option that is not present on the target host.
  3. Right-click the host and select Shut Down .
    Wait until vCenter displays that the host is not responding, which may take several minutes. If you are logged on to the ESXi host rather than to vCenter, the web client disconnects when the host shuts down.

Shutting Down an ESXi Node in a Nutanix Cluster (vSphere Command Line)

Before you begin

Verify the data resiliency status of your Nutanix cluster. If the Nutanix cluster only has replication factor 2 (RF2), you can shut down only one node for each cluster. If you must shut down more than one node in an RF2 cluster, shut down the entire cluster instead.

About this task

Procedure

  1. Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
  2. Log on to another CVM in the Nutanix cluster with SSH.
  3. Shut down the host.
    nutanix@cvm$ ~/serviceability/bin/esx-enter-maintenance-mode -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    If successful, this command returns no output. If it fails with a message like the following, VMs are probably still running on the host.

    CRITICAL esx-enter-maintenance-mode:42 Command vim-cmd hostsvc/maintenance_mode_enter failed with ret=-1

    Ensure that all VMs are shut down or moved to another host and try again before proceeding.
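    The shut-down-and-retry advice above can be wrapped in a small helper. This is a minimal sketch, assuming a POSIX shell on the CVM; the demo invocation uses `true` as a stand-in command, and the real maintenance-mode command (taken from the step above) is shown commented out.

    ```shell
    # Sketch: generic retry helper for commands that fail transiently,
    # such as the maintenance-mode command while VMs are still shutting down.
    retry() {
      attempts="$1"; shift
      i=1
      until "$@"; do
        if [ "$i" -ge "$attempts" ]; then
          echo "giving up after $attempts attempts" >&2
          return 1
        fi
        i=$((i + 1))
        sleep 2
      done
    }

    # Demo with a command that always succeeds:
    retry 3 true && echo ok
    # On a CVM you might run:
    # retry 5 ~/serviceability/bin/esx-enter-maintenance-mode -s cvm_ip_addr
    ```

    The helper returns 0 as soon as the wrapped command succeeds, so it is safe to use with commands that produce no output on success.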

    nutanix@cvm$ ~/serviceability/bin/esx-shutdown -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    Alternatively, you can put the ESXi host into maintenance mode and shut it down using the vSphere web client. For more information, see Shutting Down an ESXi Node in a Nutanix Cluster.

    If the host shuts down, a message like the following is displayed.

    INFO esx-shutdown:67 Please verify if ESX was successfully shut down using ping hypervisor_ip_addr

    hypervisor_ip_addr is the IP address of the ESXi host.

  4. Confirm that the ESXi host has shut down.
    nutanix@cvm$ ping hypervisor_ip_addr

    Replace hypervisor_ip_addr with the IP address of the ESXi host.

    If no ping packets are answered, the ESXi host has shut down.
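The ping verification can also be scripted as a polling loop. A minimal sketch, assuming a POSIX shell; the probe command is injectable, so the demo passes `false` to stand in for an unreachable host, while in practice you would pass something like "ping -c 1 -W 2 hypervisor_ip_addr".

```shell
# Sketch: poll until a probe command stops succeeding (host is down),
# or give up after a timeout in seconds.
wait_for_down() {
  probe="$1"; timeout="${2:-300}"; waited=0
  while eval "$probe" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "still responding after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "down"
}

# Demo with a probe that always fails, standing in for an unreachable host:
wait_for_down false 5
```

With a real host, replace the probe string with the ping command and a generous timeout, since an ESXi shutdown can take several minutes.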

Starting an ESXi Node in a Nutanix Cluster

About this task

You can start an ESXi host either from the web client or from the command line. For more information about starting a node from the command line, see Starting an ESXi Node in a Nutanix Cluster (vSphere Command Line).

Procedure

  1. If the node is off, turn it on by pressing the power button on the front. Otherwise, proceed to the next step.
  2. Log on to vCenter (or to the node if vCenter is not running) with the web client.
  3. Right-click the ESXi host and select Exit Maintenance Mode .
  4. Right-click the CVM and select Power > Power on .
    Wait approximately 5 minutes for all services to start on the CVM.
  5. Log on to another CVM in the Nutanix cluster with SSH.
  6. Confirm that the Nutanix cluster services are running on the CVM.
    nutanix@cvm$ ncli cluster status | grep -A 15 cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    Output similar to the following is displayed.
        Name                      : 10.1.56.197
        Status                    : Up
        ... ... 
        StatsAggregator           : up
        SysStatCollector          : up

    Every service listed should be up .

  7. Right-click the ESXi host in the web client and select Rescan for Datastores . Confirm that all Nutanix datastores are available.
  8. Verify that the status of all services on all the CVMs is Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM:host IP-Address Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
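The per-service states in a listing like the one above can be checked mechanically rather than by eye. A minimal sketch, assuming a POSIX shell with awk; the sample file, its path, and the DOWN entry are fabricated for illustration, and on a live CVM you would pipe the `cluster status` output through the same filter.

```shell
# Sketch: scan saved "cluster status" output for services that are not UP.
# The sample below mimics the column layout shown above; the DOWN line is
# made up to demonstrate the filter.
cat > /tmp/cluster_status_sample.txt <<'EOF'
                                Zeus   UP       [9935, 9980]
                           Scavenger   UP       [25880, 26061]
                            Stargate   DOWN     []
EOF
# Print the name of any service whose state column is not UP.
awk '$2 != "UP" {print $1}' /tmp/cluster_status_sample.txt
# prints: Stargate
```

An empty result means every listed service is UP, which is the condition the procedure asks you to verify.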

Starting an ESXi Node in a Nutanix Cluster (vSphere Command Line)

About this task

You can start an ESXi host either from the command line or from the web client. For more information about starting a node from the web client, see Starting an ESXi Node in a Nutanix Cluster .

Procedure

  1. Log on to a running CVM in the Nutanix cluster with SSH.
  2. Take the ESXi host out of maintenance mode and start the CVM.
    nutanix@cvm$ ~/serviceability/bin/esx-exit-maintenance-mode -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    If successful, this command produces no output. If it fails, wait 5 minutes and try again.

    nutanix@cvm$ ~/serviceability/bin/esx-start-cvm -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.


    If the CVM starts, a message like the following is displayed.

    INFO esx-start-cvm:67 CVM started successfully. Please verify using ping cvm_ip_addr

    cvm_ip_addr is the IP address of the CVM on the ESXi host.

    After starting, the CVM restarts once. Wait three to four minutes before you ping the CVM.

    Alternatively, you can take the ESXi host out of maintenance mode and start the CVM using the web client. For more information, see Starting an ESXi Node in a Nutanix Cluster.

  3. Verify that the status of all services on all the CVMs is Up.
    nutanix@cvm$ cluster status
    If the Nutanix cluster is running properly, output similar to the following is displayed for each node in the Nutanix cluster.
    CVM:host IP-Address Up
                                    Zeus   UP       [9935, 9980, 9981, 9994, 10015, 10037]
                               Scavenger   UP       [25880, 26061, 26062]
                                  Xmount   UP       [21170, 21208]
                        SysStatCollector   UP       [22272, 22330, 22331]
                               IkatProxy   UP       [23213, 23262]
                        IkatControlPlane   UP       [23487, 23565]
                           SSLTerminator   UP       [23490, 23620]
                          SecureFileSync   UP       [23496, 23645, 23646]
                                  Medusa   UP       [23912, 23944, 23945, 23946, 24176]
                      DynamicRingChanger   UP       [24314, 24404, 24405, 24558]
                                  Pithos   UP       [24317, 24555, 24556, 24593]
                              InsightsDB   UP       [24322, 24472, 24473, 24583]
                                  Athena   UP       [24329, 24504, 24505]
                                 Mercury   UP       [24338, 24515, 24516, 24614]
                                  Mantle   UP       [24344, 24572, 24573, 24634]
                              VipMonitor   UP       [18387, 18464, 18465, 18466, 18474]
                                Stargate   UP       [24993, 25032]
                    InsightsDataTransfer   UP       [25258, 25348, 25349, 25388, 25391, 25393, 25396]
                                   Ergon   UP       [25263, 25414, 25415]
                                 Cerebro   UP       [25272, 25462, 25464, 25581]
                                 Chronos   UP       [25281, 25488, 25489, 25547]
                                 Curator   UP       [25294, 25528, 25529, 25585]
                                   Prism   UP       [25718, 25801, 25802, 25899, 25901, 25906, 25941, 25942]
                                     CIM   UP       [25721, 25829, 25830, 25856]
                            AlertManager   UP       [25727, 25862, 25863, 25990]
                                Arithmos   UP       [25737, 25896, 25897, 26040]
                                 Catalog   UP       [25749, 25989, 25991]
                               Acropolis   UP       [26011, 26118, 26119]
                                   Uhura   UP       [26037, 26165, 26166]
                                    Snmp   UP       [26057, 26214, 26215]
                       NutanixGuestTools   UP       [26105, 26282, 26283, 26299]
                              MinervaCVM   UP       [27343, 27465, 27466, 27730]
                           ClusterConfig   UP       [27358, 27509, 27510]
                                Aequitas   UP       [27368, 27567, 27568, 27600]
                             APLOSEngine   UP       [27399, 27580, 27581]
                                   APLOS   UP       [27853, 27946, 27947]
                                   Lazan   UP       [27865, 27997, 27999]
                                  Delphi   UP       [27880, 28058, 28060]
                                    Flow   UP       [27896, 28121, 28124]
                                 Anduril   UP       [27913, 28143, 28145]
                                   XTrim   UP       [27956, 28171, 28172]
                           ClusterHealth   UP       [7102, 7103, 27995, 28209,28495, 28496, 28503, 28510,	
    28573, 28574, 28577, 28594, 28595, 28597, 28598, 28602, 28603, 28604, 28607, 28645, 28646, 28648, 28792,	
    28793, 28837, 28838, 28840, 28841, 28858, 28859, 29123, 29124, 29127, 29133, 29135, 29142, 29146, 29150,	
    29161, 29162, 29163, 29179, 29187, 29219, 29268, 29273]
  4. Verify the storage.
    1. Log on to the ESXi host with SSH.
    2. Rescan for datastores.
      root@esx# esxcli storage core adapter rescan --all
    3. Confirm that cluster VMFS datastores, if any, are available.
      root@esx# esxcfg-scsidevs -m | awk '{print $5}'
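To confirm that a specific datastore appears in that output, the awk filter can be combined with grep. A minimal sketch against a saved sample, assuming a POSIX shell; the sample line layout and the datastore name NTNX-local-ds are illustrative assumptions, not actual esxcfg-scsidevs output.

```shell
# Sketch: check that an expected datastore name appears in saved
# `esxcfg-scsidevs -m` output. The fifth column holds the datastore
# path/name, matching the awk filter above; the sample values are made up.
cat > /tmp/scsidevs_sample.txt <<'EOF'
mpx.vmhba1:C0:T0:L0:3 /vmfs/devices/disks/mpx.vmhba1:C0:T0:L0:3 5f1e0000-00000000 0 NTNX-local-ds
EOF
awk '{print $5}' /tmp/scsidevs_sample.txt | grep -qx 'NTNX-local-ds' && echo present
# prints: present
```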

Restarting an ESXi Node using CLI

Before you begin

Shut down the guest VMs (including vCenter) that are running on the node, or move them to other nodes in the Nutanix cluster.

About this task

Procedure

  1. Log on to vCenter (or to the ESXi host if the node is running the vCenter VM) with the web client.
  2. Right-click the host and select Maintenance mode > Enter Maintenance Mode.
    In the Confirm Maintenance Mode dialog box, click OK.
    The host is placed in maintenance mode, which prevents VMs from running on the host.
    Note: The host does not enter maintenance mode until the CVM is shut down.
  3. Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
    Note: Do not reset or shut down the CVM in any way other than with the cvm_shutdown command, which ensures that the cluster is aware that the CVM is unavailable.
  4. Right-click the node and select Power > Reboot .
    Wait until vCenter shows that the host is not responding and then is responding again, which takes several minutes.

    If you are logged on to the ESXi host rather than to vCenter, the web client disconnects when the host shuts down.

  5. Right-click the ESXi host and select Exit Maintenance Mode .
  6. Right-click the CVM and select Power > Power on .
    Wait approximately 5 minutes for all services to start on the CVM.
  7. Log on to the CVM with SSH.
  8. Confirm that the Nutanix cluster services are running on the CVM.
    nutanix@cvm$ ncli cluster status | grep -A 15 cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the CVM on the ESXi host.

    Output similar to the following is displayed.
        Name                      : 10.1.56.197
        Status                    : Up
        ... ... 
        StatsAggregator           : up
        SysStatCollector          : up

    Every service listed should be up .

  9. Right-click the ESXi host in the web client and select Rescan for Datastores . Confirm that all Nutanix datastores are available.

Rebooting an ESXi Node in a Nutanix Cluster

About this task

The Request Reboot operation in the Prism web console gracefully restarts the selected nodes one after the other.

Perform the following procedure to restart the nodes in the cluster.

Procedure

  1. Click the gear icon in the main menu and then select Reboot in the Settings page.
  2. In the Request Reboot window, select the nodes you want to restart, and click Reboot.
    Figure. Request Reboot of ESXi Node Click to enlarge

    A progress bar is displayed that indicates the progress of the restart of each node.

Changing an ESXi Node Name

After running a bare-metal Foundation, you can change the host (node) name from the command line or by using the vSphere web client.

To change the hostname, see VMware Documentation.

Changing an ESXi Node Password

Although it is not required for the root user to have the same password on all hosts (nodes), doing so makes cluster management and support much easier. If you do select a different password for one or more hosts, make sure to note the password for each host.

To change the host password, see VMware Documentation.

Changing the CVM Memory Configuration (ESXi)

About this task

You can increase the memory reserved for each CVM in your Nutanix cluster by using the 1-click CVM Memory Upgrade option available from the Prism Element web console.

Increase memory size depending on the workload type or to enable certain AOS features. For more information about CVM memory sizing recommendations and instructions about how to increase the CVM memory, see Increasing the Controller VM Memory Size in the Prism Web Console Guide .

VM Management

For the list of supported VMs, see Compatibility and Interoperability Matrix.

VM Management Using Prism Central

You can create and manage a VM on your ESXi from Prism Central. For more information, see Creating a VM through Prism Central (ESXi) and Managing a VM (ESXi).

Creating a VM through Prism Central (ESXi)

In ESXi clusters, you can create a new virtual machine (VM) through Prism Central.

Before you begin

  • See the requirements and limitations section in vCenter Server Integration in the Prism Central Guide before proceeding.
  • Register the vCenter Server with your cluster. For more information, see Registering vCenter Server (Prism Central) in the Prism Central Guide .

About this task

To create a VM, do the following:

Procedure

  1. Go to the List tab of the VMs dashboard (see VM Summary View in the Prism Central Guide ) and click the Create VM button.
    The Create VM wizard appears.
  2. In the Configuration step, do the following in the indicated fields:
    1. Name : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Cluster : Select the target cluster from the pull-down list on which you intend to create the VM.
    4. Number of VMs : Enter the number of VMs you intend to create. The created VM names are suffixed sequentially.
    5. vCPU(s) : Enter the number of virtual CPUs to allocate to this VM.
    6. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    7. Memory : Enter the amount of memory (in GiBs) to allocate to this VM.
    Figure. Create VM Window (Configuration) Click to enlarge

  3. In the Resources step, do the following.

    Disks: To attach a disk to the VM, click the Attach Disk button. The Add Disks dialog box appears. Do the following in the indicated fields:

    Figure. Add Disk Dialog Box Click to enlarge

    1. Type : Select the type of storage device, Disk or CD-ROM , from the pull-down list.
    2. Operation : Specify the device contents from the pull-down list.
      • Select Clone from NDSF file to copy any file from the cluster that can be used as an image onto the disk.

      • [ CD-ROM only] Select Empty CD-ROM to create a blank CD-ROM device. A CD-ROM device is needed when you intend to provide a system image from CD-ROM.
      • [Disk only] Select Allocate on Storage Container to allocate space without specifying an image. Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
      • Select Clone from Image to copy an image that you have imported by using the image service feature onto the disk.
    3. Bus Type : Select the bus type from the pull-down list. The choices are IDE, SCSI, PCI, or SATA.
    4. Path : Enter the path to the desired system image.
    5. Clone from Image : Select the image that you have created by using the image service feature.

      This field appears only when Clone from Image is selected. It specifies the image to copy.

      Note: If the image you created does not appear in the list, see KB-4892.
    6. Storage Container : Select the storage container to use from the pull-down list.

      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.

    7. Capacity : Enter the disk size in GiB.
    8. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    9. Repeat this step to attach additional devices to the VM.
    Figure. Create VM Window (Resources) Click to enlarge

  4. In the Resources step, do the following.

    Networks: To create a network interface for the VM, click the Attach to Subnet button. The Attach to Subnet dialog box appears.

    Do the following in the indicated fields:

    Figure. Attach to Subnet Dialog Box Click to enlarge

    1. Subnet : Select the target virtual LAN from the pull-down list.

      The list includes all defined networks (see Configuring Network Connections in the Prism Central Guide ).

    2. Network Adapter Type : Select the network adapter type from the pull-down list.

      For information about the list of supported adapter types, see vCenter Server Integration in the Prism Central Guide .

    3. Network Connection State : Select the state in which you want the network to operate after VM creation. The options are Connected or Disconnected.
    4. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    5. Repeat this step to create more network interfaces for the VM.
  5. In the Management step, do the following.
    1. Categories : Search for the category to be assigned to the VM. The policies associated with the category value are assigned to the VM.
    2. Guest OS : Type and select the guest operating system.

      The guest operating system that you select affects the supported devices and number of virtual CPUs available for the virtual machine. The Create VM wizard does not install the guest operating system. For information about the list of supported operating systems, see vCenter Server Integration in the Prism Central Guide .

      Figure. Create VM Window (Management) Click to enlarge VM update resources display - VM Flash Mode

  6. In the Review step, when all the field entries are correct, click the Create VM button to create the VM and close the Create VM dialog box.

    The new VM appears in the VMs entity page list.

Managing a VM (ESXi)

You can manage virtual machines (VMs) in an ESXi cluster through Prism Central.

Before you begin

  • See the requirements and limitations section in vCenter Server Integration in the Prism Central Guide before proceeding.
  • Ensure that you have registered the vCenter Server with your cluster. For more information, see Registering vCenter Server (Prism Central) in the Prism Central Guide .

About this task

After creating a VM (see Creating a VM through Prism Central (ESXi)), you can use Prism Central to update the VM configuration, delete the VM, clone the VM, launch a console window, power on (or off) the VM, enable flash mode for a VM, assign the VM to a protection policy, create VM recovery point, add the VM to a recovery plan, run a playbook, manage categories, install and manage Nutanix Guest Tools (NGT), manage the VM ownership, or configure QoS settings.

You can perform these tasks by using any of the following methods:

  • Select the target VM in the List tab of the VMs dashboard (see VM Summary View in the Prism Central Guide ) and choose the required action from the Actions menu.
  • Right-click on the target VM in the List tab of the VMs dashboard and select the required action from the drop-down list.
  • Go to the details page of a selected VM (see VM Details View in the Prism Central Guide ) and select the desired action.
Note: The available actions appear in bold; the rest are unavailable. The available actions depend on the current state of the VM and your permissions.

Procedure

  • To modify the VM configuration, select Update .

    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Make the desired changes and then click the Save button in the Review step.

    Figure. Update VM Window (Resources) Click to enlarge VM update resources display - VM Flash Mode

  • Disks: You can add new disks to the VM using the Attach Disk option. You can also modify an existing disk attached to the VM using the controls under the Actions column. See Creating a VM through Prism Central (ESXi) before you create a new disk for a VM. You can enable or disable flash mode for the VM: to enable it, select the Enable Flash Mode check box. After you enable this feature on the VM, the status is updated in the VM table view.
  • Networks: You can attach a new network to the VM using the Attach to Subnet option. You can also modify an existing subnet attached to the VM. See Creating a VM through Prism Central (ESXi) before you modify a NIC network or create a new NIC for a VM.
  • To delete the VM, select Delete . A window prompt appears; click the OK button to delete the VM.
  • To clone the VM, select Clone .

    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box. A cloned VM inherits most of the configuration (except the name) of the source VM. Enter a name for the clone and then click the Save button to create the clone. You can optionally override some of the configuration before clicking the Save button. For example, you can override the number of vCPUs, memory size, boot priority, NICs, or the guest customization.

    Note:
    • You can clone up to 250 VMs at a time.
    • You cannot override the secure boot setting while cloning a VM unless the source VM already has secure boot enabled.
    Figure. Clone VM Window Click to enlarge clone VM window display

  • To launch a console window, select Launch Console .

    This opens a Virtual Network Computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The VM power options that you access from the Power On Actions (or Power Off Actions ) action link below the VM table can also be accessed from the VNC console window. To access the VM power options, click the Power button at the top-right corner of the console window.

    Note: A VNC client may not function properly on all browsers. Some keys are not recognized when the browser is Chrome. (Firefox typically works best.)
    Figure. Console Window (VNC) Click to enlarge VNC console window display

  • To power on (or off) the VM, select Power on (or Power off ).
  • To disable (or enable) efficiency measurement for the VM, select Disable Efficiency Measurement (or Enable Efficiency Measurement ).
  • To disable (or enable) anomaly detection for the VM, select Disable Anomaly Detection (or Enable Anomaly Detection ).
  • To assign the VM to a protection policy, select Protect . This opens a page to specify the protection policy to which this VM should be assigned. To remove the VM from a protection policy, select Unprotect .
    Note: You can create a protection policy for a VM or set of VMs that belong to one or more categories by enabling Leap and configuring the Availability Zone.
  • To migrate the VM to another host, select Migrate .

    This displays the Migrate VM dialog box. Select the target host from the pull-down list (or select the System will automatically select a host option to let the system choose the host) and then click the Migrate button to start the migration.

    Figure. Migrate VM Window Click to enlarge migrate VM window display

    Note: Nutanix recommends live migrating VMs when they are under light load. If they are migrated while heavily utilized, the migration may fail because of limited bandwidth.
  • To add this VM to a recovery plan you created previously, select Add to Recovery Plan . For more information, see Adding Guest VMs Individually to a Recovery Plan in the Leap Administration Guide .
  • To create VM recovery point, select Create Recovery Point .

    This displays the Create VM Recovery Point dialog box. Enter a name for the recovery point. You can optionally create an application-consistent recovery point by selecting the check box. The VM can later be restored or replicated, locally or remotely, to the state captured in a chosen recovery point.

    Figure. Create VM Recovery Point Window Click to enlarge Create VM Recovery Point Window display

  • To run a playbook you created previously, select Run Playbook . For more information, see Running a Playbook (Manual Trigger) in the Prism Central Guide .
  • To assign the VM a category value, select Manage Categories .

    This displays the Manage VM Categories page. For more information, see Assigning a Category in the Prism Central Guide .

  • To install Nutanix Guest Tools (NGT), select Install NGT . For more information, see Installing NGT on Multiple VMs in the Prism Central Guide .
  • To enable (or disable) NGT, select Manage NGT Applications . For more information, see Managing NGT Applications in the Prism Central Guide .
    The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    Note: If you clone a VM, by default NGT is not enabled on the cloned VM. You need to enable and mount NGT on the cloned VM again. If you want to enable NGT on multiple VMs simultaneously, see Enabling NGT and Mounting the NGT Installer on Cloned VMs in the Prism Web Console Guide .

    If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.

    ncli> ngt mount vm-id=virtual_machine_id

    For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

    ncli> ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
    Note:
    • Self-service restore feature is not enabled by default on a VM. You need to manually enable the self-service restore feature.
    • If you have created the NGT ISO CD-ROMs on AOS 4.6 or earlier releases, the NGT functionality will not work even if you upgrade your cluster because REST APIs have been disabled. You need to unmount the ISO, remount the ISO, install the NGT software again, and then upgrade to a later AOS version.
  • To upgrade NGT, select Upgrade NGT . For more information, see Upgrading NGT in the Prism Central Guide .
  • To establish VM host affinity, select Configure VM Host Affinity .

    A window appears with the available hosts. Select (click the icon for) one or more of the hosts and then click the Save button. This creates an affinity between the VM and the selected hosts. If possible, it is recommended that you create an affinity to multiple hosts (at least two) to protect against downtime due to a node failure. For more information about VM affinity policies, see Affinity Policies Defined in Prism Central in the Prism Central Guide .

    Figure. Set VM Host Affinity Window Click to enlarge

  • To add a VM to the catalog, select Add to Catalog . This displays the Add VM to Catalog page. For more information, see Adding a Catalog Item in the Prism Central Guide .
  • To specify a project and user who own this VM, select Manage Ownership .
    In the Manage VM Ownership window, do the following in the indicated fields:
    1. Project : Select the target project from the pull-down list.
    2. User : Enter a user name. A list of matches appears as you enter a string; select the user name from the list when it appears.
    3. Click the Save button.
    Figure. VM Ownership Window Click to enlarge

  • To configure quality of service (QoS) settings, select Set QoS Attributes . For more information, see Setting QoS for an Individual VM in the Prism Central Guide .

VM Management using Prism Element

You can create and manage a VM on your ESXi from Prism Element. For more information, see Creating a VM (ESXi) and Managing a VM (ESXi).

Creating a VM (ESXi)

In ESXi clusters, you can create a new virtual machine (VM) through the web console.

Before you begin

  • See the requirements and limitations section in VM Management through Prism Element (ESXi) in the Prism Web Console Guide before proceeding.
  • Register the vCenter Server with your cluster. For more information, see Registering a Cluster to vCenter Server.

About this task

When creating a VM, you can configure all of its components, such as number of vCPUs and memory, but you cannot attach a volume group to the VM.

To create a VM, do the following:

Procedure

  1. In the VM dashboard (see VM Dashboard), click the Create VM button.
    The Create VM dialog box appears.
  2. Do the following in the indicated fields:
    1. Name : Enter a name for the VM.
    2. Description (optional): Enter a description for the VM.
    3. Guest OS : Type and select the guest operating system.
      The guest operating system that you select affects the supported devices and number of virtual CPUs available for the virtual machine. The Create VM wizard does not install the guest operating system. For information about the list of supported operating systems, see VM Management through Prism Element (ESXi) in the Prism Web Console Guide .
    4. vCPU(s) : Enter the number of virtual CPUs to allocate to this VM.
    5. Number of Cores per vCPU : Enter the number of cores assigned to each virtual CPU.
    6. Memory : Enter the amount of memory (in GiBs) to allocate to this VM.
  3. To attach a disk to the VM, click the Add New Disk button.
    The Add Disks dialog box appears.
    Figure. Add Disk Dialog Box Click to enlarge configure a disk screen

    Do the following in the indicated fields:
    1. Type : Select the type of storage device, DISK or CD-ROM , from the pull-down list.
      The following fields and options vary depending on whether you choose DISK or CD-ROM .
    2. Operation : Specify the device contents from the pull-down list.
      • Select Clone from ADSF file to copy any file from the cluster that can be used as an image onto the disk.
      • Select Allocate on Storage Container to allocate space without specifying an image. (This option appears only when DISK is selected in the previous field.) Selecting this option means you are allocating space only. You have to provide a system image later from a CD-ROM or other source.
    3. Bus Type : Select the bus type from the pull-down list. The choices are IDE or SCSI .
    4. ADSF Path : Enter the path to the desired system image.
      This field appears only when Clone from ADSF file is selected. It specifies the image to copy. Enter the path name as /storage_container_name/vmdk_name.vmdk . For example, to clone an image from myvm-flat.vmdk in a storage container named crt1 , enter /crt1/myvm-flat.vmdk . When a user types the storage container name ( /storage_container_name/ ), a list appears of the VMDK files in that storage container (assuming one or more VMDK files have previously been copied to that storage container).
      Note: Make sure you are copying from a flat file.
    5. Storage Container : Select the storage container to use from the pull-down list.
      This field appears only when Allocate on Storage Container is selected. The list includes all storage containers created for this cluster.
    6. Size : Enter the disk size in GiBs.
    7. When all the field entries are correct, click the Add button to attach the disk to the VM and return to the Create VM dialog box.
    8. Repeat this step to attach more devices to the VM.
  4. To create a network interface for the VM, click the Add New NIC button.
    The Create NIC dialog box appears. Do the following in the indicated fields:
    1. VLAN Name : Select the target virtual LAN from the pull-down list.
      The list includes all defined networks. For more information, see Network Configuration for VM Interfaces in the Prism Web Console Guide .
    2. Network Adapter Type : Select the network adapter type from the pull-down list.

      For information about the list of supported adapter types, see VM Management through Prism Element (ESXi) in the Prism Web Console Guide .

    3. Network UUID : This is a read-only field that displays the network UUID.
    4. Network Address/Prefix : This is a read-only field that displays the network IP address and prefix.
    5. When all the field entries are correct, click the Add button to create a network interface for the VM and return to the Create VM dialog box.
    6. Repeat this step to create more network interfaces for the VM.
  5. When all the field entries are correct, click the Save button to create the VM and close the Create VM dialog box.
    The new VM appears in the VM table view. For more information, see VM Table View in the Prism Web Console Guide .

Managing a VM (ESXi)

You can use the web console to manage virtual machines (VMs) in the ESXi clusters.

Before you begin

  • See the requirements and limitations section in VM Management through Prism Element (ESXi) in the Prism Web Console Guide before proceeding.
  • Ensure that you have registered the vCenter Server with your cluster. For more information, see Registering a Cluster to vCenter Server.

About this task

After creating a VM, you can use the web console to manage guest tools, power operations, suspend, launch a VM console window, update the VM configuration, clone the VM, or delete the VM. To accomplish one or more of these tasks, do the following:

Note: Your available options depend on the VM status, type, and your permissions; options that do not apply are unavailable.

Procedure

  1. In the VM dashboard (see VM Dashboard), click the Table view.
  2. Select the target VM in the table (top section of screen).
    The summary line (middle of screen) displays the VM name with a set of relevant action links on the right. You can also right-click on a VM to select a relevant action.

    The possible actions are Manage Guest Tools , Launch Console , Power on (or Power off actions ), Suspend (or Resume ), Clone , Update , and Delete . The following steps describe how to perform each action.

    Figure. VM Action Links Click to enlarge

  3. To manage guest tools, click Manage Guest Tools and do the following.
    You can also enable NGT applications (self-service restore, Volume Snapshot Service, and application-consistent snapshots) as part of managing guest tools.
    1. Select the Enable Nutanix Guest Tools check box to enable NGT on the selected VM.
    2. Select Mount Nutanix Guest Tools to mount NGT on the selected VM.
      Ensure that the VM has at least one empty IDE CD-ROM or SATA slot to attach the ISO.

      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
    3. To enable the self-service restore feature for Windows VMs, select the Self Service Restore (SSR) check box.
      The self-service restore feature is enabled on the VM. The guest VM administrator can restore the desired file or files from the VM. For more information about the self-service restore feature, see Self-Service Restore in the Data Protection and Recovery with Prism Element guide.

    4. After you select the Enable Nutanix Guest Tools check box, the VSS and application-consistent snapshot feature is enabled by default.
      After this feature is enabled, Nutanix native in-guest VmQuiesced snapshot service (VSS) agent is used to take application-consistent snapshots for all the VMs that support VSS. This mechanism takes application-consistent snapshots without any VM stuns (temporary unresponsive VMs) and also enables third-party backup providers like Commvault and Rubrik to take application-consistent snapshots on Nutanix platform in a hypervisor-agnostic manner. For more information, see Conditions for Application-consistent Snapshots in the Data Protection and Recovery with Prism Element guide.

    5. To mount VMware guest tools, select the Mount VMware Guest Tools check box.
      The VMware guest tools are mounted on the VM.
      Note: You can mount both VMware guest tools and Nutanix Guest Tools at the same time on a particular VM provided the VM has sufficient empty CD-ROM slots.
    6. Click Submit .
      The VM is registered with the NGT service. NGT is enabled and mounted on the selected virtual machine. A CD with volume label NUTANIX_TOOLS gets attached to the VM.
      Note:
      • If you clone a VM, by default NGT is not enabled on the cloned VM. If the cloned VM is powered off, enable NGT from the UI and start the VM. If the cloned VM is powered on, enable NGT from the UI and restart the Nutanix guest agent service.
      • If you want to enable NGT on multiple VMs simultaneously, see Enabling NGT and Mounting the NGT Installer on Cloned VMs in the Prism Web Console Guide .
      If you eject the CD, you can mount the CD back again by logging into the Controller VM and running the following nCLI command.
      ncli> ngt mount vm-id=virtual_machine_id

      For example, to mount the NGT on the VM with VM_ID=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987, type the following command.

      ncli> ngt mount vm-id=00051a34-066f-72ed-0000-000000005400::38dc7bf2-a345-4e52-9af6-c1601e759987
      Caution: In AOS 4.6, for the powered-on Linux VMs on AHV, ensure that the NGT ISO is ejected or unmounted within the guest VM before disabling NGT by using the web console. This issue is specific for 4.6 version and does not occur from AOS 4.6.x or later releases.
      Note: If you created the NGT ISO CD-ROMs on a release prior to AOS 4.6, the NGT functionality will not work even if you upgrade your cluster because REST APIs have been disabled. You must unmount the ISO, remount the ISO, install the NGT software again, and then upgrade to AOS 4.6 or a later version.
  4. To launch a VM console window, click the Launch Console action link.
    This opens a virtual network computing (VNC) client and displays the console in a new tab or window. This option is available only when the VM is powered on. The VM power options that you access from the Power Off Actions action link below the VM table can also be accessed from the VNC console window. To access the VM power options, click the Power button at the top-right corner of the console window.
    Note: A VNC client may not function properly on all browsers. Some keys are not recognized when the browser is Google Chrome. (Firefox typically works best.)
  5. To start (or shut down) the VM, click the Power on (or Power off ) action link.

    Power on begins immediately. If you shut down the VM, you are prompted to select one of the following options:

    • Power Off . Hypervisor performs a hard shut down action on the VM.
    • Reset . Hypervisor performs an ACPI reset action through the BIOS on the VM.
    • Guest Shutdown . Operating system of the VM performs a graceful shutdown.
    • Guest Reboot . Operating system of the VM performs a graceful restart.
    Note: The Guest Shutdown and Guest Reboot options are available only when VMware guest tools are installed.
  6. To pause (or resume) the VM, click the Suspend (or Resume ) action link. This option is available only when the VM is powered on.
  7. To clone the VM, click the Clone action link.

    This displays the Clone VM dialog box, which includes the same fields as the Create VM dialog box. A cloned VM inherits most of the configuration (except the name) of the source VM. Enter a name for the clone and then click the Save button to create the clone. You can optionally override some of the configuration before clicking the Save button. For example, you can override the number of vCPUs, memory size, boot priority, NICs, or the guest customization.

    Note:
    • You can clone up to 250 VMs at a time.
    • In the Clone window, you cannot update the disks.
    Figure. Clone VM Window Click to enlarge clone VM window display

  8. To modify the VM configuration, click the Update action link.
    The Update VM dialog box appears, which includes the same fields as the Create VM dialog box. Modify the configuration as needed (see Creating a VM (ESXi)), and in addition you can enable Flash Mode for the VM.
    Note: If you delete a vDisk attached to a VM and snapshots associated with this VM exist, space associated with that vDisk is not reclaimed unless you also delete the VM snapshots.
    1. Click the Enable Flash Mode check box.
      • After you enable this feature on the VM, the status is updated in the VM table view. To view the status of individual virtual disks (disks that are flashed to the SSD), go to the Virtual Disks tab in the VM table view.
      • You can disable the Flash Mode feature for individual virtual disks. To update the Flash Mode for individual virtual disks, click the update disk icon in the Disks pane and deselect the Enable Flash Mode check box.
      Figure. Update VM Resources Click to enlarge VM update resources display - VM Flash Mode

      Figure. Update VM Resources - VM Disk Flash Mode Click to enlarge VM update resources display - VM Disk Flash Mode

  9. To delete the VM, click the Delete action link. A window prompt appears; click the OK button to delete the VM.
    The deleted VM disappears from the list of VMs in the table. You can also delete a VM that is already powered on.

vDisk Provisioning Types in VMware with Nutanix Storage

You can specify the vDisk provisioning policy when you perform certain VM management operations like creating a VM, migrating a VM, or cloning a VM.

Traditionally, a vDisk is provisioned either with all of its space allocated up front (thick disk) or with space allocated on an as-needed basis (thin disk). Thick disks provision the space using either the lazy zeroed or eager zeroed disk formatting method.

On traditional storage systems, thick eager zeroed disks provide the best performance of the three provisioning types, thick lazy zeroed disks the second best, and thin disks the least. However, this does not apply to the modern storage found in Nutanix systems.

Nutanix uses a thick Virtual Machine Disk (VMDK) to reserve the storage space using the vStorage APIs for Array Integration (VAAI) reserve space API.

On a Nutanix system, there is no performance difference between thin and thick disks. This means that a thick eager zeroed virtual disk has no performance benefits over a thin virtual disk.

Whether a VMDK is configured as thin or thick, the resulting disk behaves the same on a Nutanix system (despite the configuration differences).

Note: Provisioning a thick disk only reserves disk space; on Nutanix there is no performance reason to do so. Within a single Nutanix container, even when a thick disk is provisioned, no disk space is consumed writing zeroes. So there is no requirement to provision a thick disk.

When using the up-to-date VAAI for cloning operations, the following behavior is expected:

  • When cloning any type of disk format (thin, thick lazy zeroed or thick eager zeroed) to the same Nutanix datastore, the resulting VM will have a thin disk regardless of the explicit choice of a disk format in the vSphere client.

    Nutanix uses a thin provisioned disk because a thin disk performs the same as a thick disk on the system, and thin provisioning avoids wasting disk space. In the cloning scenario, Nutanix does not carry the reservation property from the source to the destination when creating a fast clone on the same datastore. This prevents space wastage due to unnecessary reservation.

  • When cloning a VM to a different datastore, the destination VM will have the disk format that you specified in the vSphere client.
    Important: A thick disk is shown as thick in ESXi, but within NDFS (Nutanix Distributed File System) it is stored as a thin disk with an extra configuration field.

Nutanix recommends using thin disks over any other disk type.
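
As an illustration of the recommendation above, a thin-provisioned VMDK can be created directly from the ESXi shell. This is a sketch only; the datastore, directory, and file names are placeholders, not values from this guide.

```
# Sketch: create a 20 GiB thin-provisioned VMDK on a Nutanix NFS datastore
# from the ESXi shell. NTNX-ctr1 and myvm are placeholder names.
vmkfstools -c 20G -d thin /vmfs/volumes/NTNX-ctr1/myvm/myvm.vmdk

# Compare the allocated size with the space actually consumed by the flat file:
du -h /vmfs/volumes/NTNX-ctr1/myvm/myvm-flat.vmdk
```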

VM Migration

You can migrate a VM to an ESXi host in a Nutanix cluster. Migration is usually done in the following cases.

  • Migrating VMs from an existing storage platform to Nutanix.
  • Keeping VMs running during a disruptive upgrade or other downtime of a Nutanix cluster.

When migrating VMs between Nutanix clusters running vSphere, the source host and NFS datastore are the ones currently running the VM. The target host and NFS datastore are the ones where the VM runs after migration. The target ESXi host and datastore must be part of a Nutanix cluster.

To accomplish this migration, you have to mount the NFS datastores from the target on the source. After the migration is complete, you must unmount the datastores and block access.
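
The mount-and-unmount sequence above can be performed from each source ESXi host's shell. This is a hedged sketch; the host IP, container path, and datastore name are placeholders for your environment.

```
# Expose the target Nutanix cluster's NFS container on a source ESXi host.
# 10.0.0.50, /ctr1, and NTNX-ctr1 are placeholder values.
esxcli storage nfs add -H 10.0.0.50 -s /ctr1 -v NTNX-ctr1

# ...perform the VM migrations...

# Unmount the datastore once migration completes.
esxcli storage nfs remove -v NTNX-ctr1
```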

Migrating a VM to Another Nutanix Cluster

Before you begin

Before migrating a VM to another Nutanix cluster running vSphere, verify that you have provisioned the target Nutanix environment.

About this task

The shared storage feature in vSphere allows you to move both compute and storage resources from the source legacy environment to the target Nutanix environment at the same time without disruption. This feature also removes the need for any filesystem allowlisting on Nutanix.

You can use the shared storage feature through the migration wizard in the web client.

Procedure

  1. Log on to vCenter with the web client.
  2. Select the VM that you want to migrate.
  3. Right-click the VM and select Migrate .
  4. Under Select Migration Type , select Change both compute resource and storage .
  5. Select Compute Resource and then Storage and click Next .
    If necessary, change the disk format to the one that you want to use during the migration process.
  6. Select a destination network for all VM network adapters and click Next .
  7. Click Finish .
    Wait for the migration process to complete. The process performs the storage vMotion first, and then creates a temporary storage network over vmk0 for the period where the disk files are on Nutanix.

Cloning a VM

About this task

To clone a VM, you must enable the Nutanix VAAI plug-in. For steps to enable and verify the Nutanix VAAI plug-in, see KB-1868 .
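
Before cloning, you can confirm from the ESXi shell that the VAAI plug-in is installed. The exact VIB name may vary by release, so treat this as a sketch and follow KB-1868 for the authoritative verification steps.

```
# List installed VIBs and look for the Nutanix VAAI plug-in entry.
esxcli software vib list | grep -i vaai
```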

Procedure

  1. Log on to vCenter with the web client.
  2. Right-click the VM and select Clone .
  3. Follow the wizard to enter a name for the clone, select a cluster, and select a host.
  4. Select the datastore that contains the source VM and click Next .
    Note: If you choose a datastore other than the one that contains the source VM, the clone operation uses the VMware implementation and not the Nutanix VAAI plug-in.
  5. If desired, set the guest customization parameters. Otherwise, proceed to the next step.
  6. Click Finish .

vStorage APIs for Array Integration

To improve the vSphere cloning process, Nutanix provides a vStorage APIs for Array Integration (VAAI) plug-in. This plug-in is installed by default during the Nutanix factory process.

Without the Nutanix VAAI plug-in, the process of creating a full clone takes a significant amount of time because all the data that comprises a VM is duplicated. This duplication also results in an increase in storage consumption.

The Nutanix VAAI plug-in efficiently makes full clones without reserving space for the clone. Read requests for blocks shared between parent and clone are sent to the original vDisk that was created for the parent VM. As the clone VM writes new blocks, the Nutanix file system allocates storage for those blocks. This data management occurs completely at the storage layer, so the ESXi host sees a single file with the full capacity that was allocated when the clone was created.

vSphere ESXi Hardening Settings

Configure the following settings in /etc/ssh/sshd_config to harden an ESXi hypervisor in a Nutanix cluster.
Caution: When hardening ESXi security, some settings may impact operations of a Nutanix cluster.
HostbasedAuthentication no
PermitTunnel no
AcceptEnv
GatewayPorts no
Compression no
StrictModes yes
KerberosAuthentication no
GSSAPIAuthentication no
PermitUserEnvironment no
PermitEmptyPasswords no
PermitRootLogin no

Match Address x.x.x.11,x.x.x.12,x.x.x.13,x.x.x.14,192.168.5.0/24
PermitRootLogin yes
PasswordAuthentication yes
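
After applying the settings above, it is prudent to validate the configuration before restarting SSH so a typo cannot cut off remote access. This is a sketch; the sshd binary path shown is typical for ESXi but may differ by version.

```
# Validate sshd_config syntax (test mode), then restart SSH on the ESXi host.
/usr/lib/vmware/openssh/bin/sshd -t -f /etc/ssh/sshd_config \
  && /etc/init.d/SSH restart
```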

ESXi Host Upgrade

You can upgrade your host either automatically through Prism Element (1-click upgrade) or manually. For more information about automatic and manual upgrades, see ESXi Upgrade and ESXi Host Manual Upgrade respectively.

This section describes the Nutanix hypervisor support policy for vSphere and Hyper-V hypervisor releases. Nutanix provides hypervisor compatibility and support statements that should be reviewed before planning an upgrade to a new release or applying a hypervisor update or patch:
  • Compatibility and Interoperability Matrix
  • Hypervisor Support Policy- See Support Policies and FAQs for the supported Acropolis hypervisors.

Also review the Nutanix Field Advisory page for critical issues that Nutanix may have uncovered with the hypervisor release being considered.

Note: You may need to log in to the Support Portal to view the links above.

The Acropolis Upgrade Guide provides steps that can be used to upgrade the hypervisor hosts. However, as noted in the documentation, the customer is responsible for reviewing the guidance from VMware or Microsoft, respectively, on other component compatibility and upgrade order (e.g. vCenter), which needs to be planned first.

ESXi Upgrade

These topics describe how to upgrade your ESXi hypervisor host through the Prism Element web console Upgrade Software feature (also known as 1-click upgrade). To install or upgrade VMware vCenter server or other third-party software, see your vendor documentation for this information.

AOS supports ESXi hypervisor upgrades applied through the web console Upgrade Software feature.

You can view the available upgrade options, start an upgrade, and monitor upgrade progress through the web console. In the main menu, click the gear icon, and then select Upgrade Software in the Settings panel. You can see the current status of your software versions and start an upgrade.

VMware ESXi Hypervisor Upgrade Recommendations and Limitations

  • To install or upgrade VMware vCenter Server or other third-party software, see your vendor documentation.
  • Always consult the VMware web site for any vCenter and hypervisor installation dependencies. For example, a hypervisor version might require that you upgrade vCenter first.
  • If you have not enabled DRS in your environment and want to upgrade the ESXi host, you need to upgrade the ESXi host manually. For more information about upgrading ESXi hosts manually, see ESXi Host Manual Upgrade in the vSphere Administration Guide.
  • Disable Admission Control before upgrading ESXi on AOS; the upgrade process fails if it is enabled. You can re-enable it for normal cluster operation after the upgrade.
Nutanix Support for ESXi Upgrades
Nutanix qualifies specific VMware ESXi hypervisor updates and provides a related JSON metadata upgrade file on the Nutanix Support Portal for one-click upgrade through the Prism web console Software Upgrade feature.

Nutanix does not provide ESXi binary files, only related JSON metadata upgrade files. Obtain ESXi offline bundles (not ISOs) from the VMware web site.

Nutanix supports the ability to patch upgrade ESXi hosts with versions that are greater than or released after the Nutanix qualified version, but Nutanix might not have qualified those releases. See the Nutanix hypervisor support statement in our Support FAQ. For updates that are made available by VMware that do not have a Nutanix-provided JSON metadata upgrade file, obtain the offline bundle and md5sum checksum available from VMware, then use the web console Software Upgrade feature to upgrade ESXi.
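Before uploading an offline bundle through the web console, you can verify it against the MD5 checksum published on the VMware download page. A minimal sketch (the bundle file and its contents here are placeholders; in practice `published_md5` is the checksum string you copy from the VMware site):

```shell
# Placeholder standing in for the real offline bundle download
bundle="update-from-esxi6.0-6.0_update02.zip"
printf 'placeholder bundle contents' > "$bundle"

# In practice, paste the checksum copied from the VMware download page here
published_md5=$(md5sum "$bundle" | awk '{print $1}')   # placeholder value

# Recompute locally and compare before uploading via Upgrade Software
actual_md5=$(md5sum "$bundle" | awk '{print $1}')
if [ "$published_md5" = "$actual_md5" ]; then
  echo "checksum OK - safe to upload"
else
  echo "checksum MISMATCH - re-download the bundle" >&2
fi
```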

Mixing nodes with different processor (CPU) types in the same cluster
If you are mixing nodes with different processor (CPU) types in the same cluster, you must enable enhanced vMotion compatibility (EVC) to allow vMotion (live migration) of VMs during the hypervisor upgrade. For example, if your cluster includes a node with a Haswell CPU and other nodes with Broadwell CPUs, open vCenter and enable the VMware EVC setting for Intel hosts.
CPU Level for Enhanced vMotion Compatibility (EVC)

AOS Controller VMs and Prism Central VMs require a minimum CPU micro-architecture version of Intel Sandy Bridge. For AOS clusters with ESXi hosts, or when deploying Prism Central VMs on any ESXi cluster: if you have set the vSphere cluster enhanced vMotion compatibility (EVC) level, the minimum level must be L4 - Sandy Bridge .

vCenter Requirements and Limitations
Note: ENG-358564 You might be unable to log in to vCenter Server as the /storage/seat partition for vCenter Server version 7.0 and later might become full due to a large number of SSH-related events. See KB 10830 at the Nutanix Support portal for symptoms and solutions to this issue.
  • If your cluster is running the ESXi hypervisor and is also managed by VMware vCenter, you must provide vCenter administrator credentials and vCenter IP address as an extra step before upgrading. Ensure that ports 80 / 443 are open between your cluster and your vCenter instance to successfully upgrade.
  • Newly registered cluster. Do not perform any cluster upgrades (AOS, Controller VM memory, hypervisor, and so on) if you have just registered your cluster in vCenter. Wait at least one hour before performing upgrades to allow cluster settings to update. Also, do not register the cluster in vCenter and perform upgrades at the same time.
  • Cluster mapped to two vCenters. Upgrading software through the web console (1-click upgrade) does not support configurations where a cluster is mapped to two vCenters or where it includes host-affinity must rules for VMs.

    Ensure that enough cluster resources are available for live migration to occur and to allow hosts to enter maintenance mode.

Mixing Different Hypervisor Versions
For ESXi hosts, mixing different hypervisor versions in the same cluster is temporarily allowed for deferring a hypervisor upgrade as part of an add-node/expand cluster operation, reimaging a node as part of a break-fix procedure, planned migrations, and similar temporary operations.

Upgrading ESXi Hosts by Uploading Binary and Metadata Files

Before you begin

About this task

Do the following steps to download Nutanix-qualified ESXi metadata .JSON files and upgrade the ESXi hosts through Upgrade Software in the Prism Element web console. Nutanix does not provide ESXi binary files, only related JSON metadata upgrade files.

Procedure

  1. Before performing any upgrade procedure, make sure you are running the latest version of the Nutanix Cluster Check (NCC) health checks and upgrade NCC if necessary.
  2. Run NCC as described in Run NCC Checks .
  3. Log on to the Nutanix support portal and navigate to the Hypervisors Support page from the Downloads menu, then download the Nutanix-qualified ESXi metadata .JSON files to your local machine or media.
    1. The default view is All . From the drop-down menu, select Nutanix - VMware ESXi , which shows all available JSON versions.
    2. From the release drop-down menu, select the available ESXi version. For example, 7.0.0 u2a .
    3. Click Download to download the Nutanix-qualified ESXi metadata .JSON file.
    Figure. Downloads Page for ESXi Metadata JSON
  4. Log on to the Prism Element web console for any node in the cluster.
  5. Click the gear icon in the main menu, select Upgrade Software in the Settings page, and then click the Hypervisor tab.
  6. Click the upload the Hypervisor binary link.
  7. Click Choose File for the metadata JSON (obtained from Nutanix) and binary files (obtained from VMware), respectively, browse to the file locations, select the file, and click Upload Now .
  8. When the file upload is completed, click Upgrade > Upgrade Now , then click Yes to confirm.
    [Optional] To run the pre-upgrade installation checks only on the Controller VM where you are logged on without upgrading, click Upgrade > Pre-upgrade . These checks also run as part of the upgrade procedure.
  9. Type your vCenter IP address and credentials, then click Upgrade .
    Ensure that you are using your Active Directory or LDAP credentials in the form of domain\username or username@domain .
    Note: AOS can detect if you have uploaded software that is already installed or upgraded. In this case, the Upgrade option is not displayed, because the software is already installed.
    The Upgrade Software dialog box shows the progress of your selection, including status of pre-installation checks and uploads, through the Progress Monitor .
  10. On the LCM page, click Inventory > Perform Inventory to enable LCM to check, update and display the inventory information.
    For more information, see Performing Inventory With LCM in the Acropolis Upgrade Guide .

Upgrading ESXi by Uploading An Offline Bundle File and Checksum

About this task

  • Do the following steps to download a non-Nutanix-qualified (patch) ESXi upgrade offline bundle from VMware, then upgrade ESXi through Upgrade Software in the Prism Element web console.
  • Typically, you perform this procedure to apply an ESXi patch that Nutanix has not yet officially qualified. Nutanix supports the ability to patch upgrade ESXi hosts with versions that are greater than or released after the Nutanix qualified version, but Nutanix might not have qualified those releases.

Procedure

  1. From the VMware web site, download the offline bundle (for example, update-from-esxi6.0-6.0_update02.zip ) and copy the associated MD5 checksum. Ensure that you obtain this checksum from the VMware web site rather than generating it from the downloaded bundle yourself.
  2. Save the files to your local machine or media, such as a USB drive or other portable media.
  3. Log on to the Prism Element web console for any node in the cluster.
  4. Click the gear icon in the main menu of the Prism Element web console, select Upgrade Software in the Settings page, and then click the Hypervisor tab.
  5. Click the upload the Hypervisor binary link.
  6. Click enter md5 checksum and copy the MD5 checksum into the Hypervisor MD5 Checksum field.
  7. Scroll down and click Choose File for the binary file, browse to the offline bundle file location, select the file, and click Upload Now .
    Figure. ESXi 1-Click Upgrade, Unqualified Bundle
  8. When the file upload is completed, click Upgrade > Upgrade Now , then click Yes to confirm.
    [Optional] To run the pre-upgrade installation checks only on the Controller VM where you are logged on without upgrading, click Upgrade > Pre-upgrade . These checks also run as part of the upgrade procedure.
  9. Type your vCenter IP address and credentials, then click Upgrade .
    Ensure that you are using your Active Directory or LDAP credentials in the form of domain\username or username@domain .
    Note: AOS can detect if you have uploaded software that is already installed or upgraded. In this case, the Upgrade option is not displayed, because the software is already installed.
    The Upgrade Software dialog box shows the progress of your selection, including status of pre-installation checks and uploads, through the Progress Monitor .

ESXi Host Manual Upgrade

If you have not enabled DRS in your environment and want to upgrade the ESXi host, you must upgrade the ESXi host manually. This topic describes all the requirements that you must meet before manually upgrading the ESXi host.

Tip: If you have enabled DRS and want to upgrade the ESXi host, use the one-click upgrade procedure from the Prism web console. For more information on the one-click upgrade procedure, see the ESXi Upgrade.

Nutanix supports the ability to patch upgrade the ESXi hosts with the versions that are greater than or released after the Nutanix qualified version, but Nutanix might not have qualified those releases. See the Nutanix hypervisor support statement in our Support FAQ.

Because ESXi hosts with different versions can co-exist in a single Nutanix cluster, upgrading ESXi does not require cluster downtime.

  • If you want to avoid cluster interruption, you must complete upgrading a host and ensure that the CVM is running before upgrading any other host. When two hosts in a cluster are down at the same time, all the data is unavailable.
  • If you want to minimize the duration of the upgrade activities and cluster downtime is acceptable, you can stop the cluster and upgrade all hosts at the same time.
Warning: By default, Nutanix clusters have redundancy factor 2, which means they can tolerate the failure of a single node or drive. Nutanix clusters with a configured option of redundancy factor 3 allow the Nutanix cluster to withstand the failure of two nodes or drives in different blocks.
  • Never shut down or restart multiple Controller VMs or hosts simultaneously.
  • Always run the cluster status command to verify that all Controller VMs are up before performing a Controller VM or host shutdown or restart.

ESXi Host Upgrade Process

Perform the following process to upgrade ESXi hosts in your environment.

Prerequisites and Requirements

Note: Use the following process only if you do not have DRS enabled in your Nutanix cluster.
  • If you are upgrading all nodes in the cluster at once, shut down all guest VMs and stop the cluster with the cluster stop command.
    Caution: There is downtime if you upgrade all the nodes in the Nutanix cluster at once. If you do not want downtime in your environment, you must ensure that only one CVM is shut down at a time in a redundancy factor 2 configuration.
  • If you are upgrading the nodes while keeping the cluster running, ensure that all nodes are up by logging on to a CVM and running the cluster status command. If any nodes are not running, start them before proceeding with the upgrade. Shut down all guest VMs on the node or migrate them to other nodes in the Nutanix cluster.
  • Disable email alerts in the web console under Email Alert Services or with the nCLI command.
    ncli> alerts update-alert-config enable=false
  • Run the complete NCC health check by using the health check command.
    nutanix@cvm$ ncc health_checks run_all
  • Run the cluster status command to verify that all Controller VMs are up and running, before performing a Controller VM or host shutdown or restart.
    nutanix@cvm$ cluster status
  • Place the host in the maintenance mode by using the web client.
  • Log on to the CVM with SSH and shut down the CVM.
    nutanix@cvm$ cvm_shutdown -P now
    Note: Do not reset or shutdown the CVM in any way other than the cvm_shutdown command to ensure that the cluster is aware that the CVM is unavailable.
  • Start the upgrade by following the vSphere Upgrade Guide or by using vCenter Update Manager (VUM).
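Before taking down each host in a rolling upgrade, confirm that every Controller VM is up. A minimal sketch of that guard (the sample text below is a stand-in for live `cluster status` output, whose real format differs; on a CVM you would capture the command's actual output):

```shell
# Stand-in for `cluster status` output; every CVM must report Up before
# you shut down the next host
status_output='CVM: 10.0.0.11 Up
CVM: 10.0.0.12 Up
CVM: 10.0.0.13 Up'

# Count lines that do not end in " Up"
down=$(printf '%s\n' "$status_output" | grep -cv ' Up$')
if [ "$down" -eq 0 ]; then
  echo "all CVMs up - safe to proceed with the next host"
else
  echo "a CVM is down - do not continue" >&2
fi
```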

Upgrading ESXi Host

  • See the VMware Documentation for information about the standard ESXi upgrade procedures. If any problem occurs with the upgrade process, an alert is raised in the Alert dashboard.

Post Upgrade

Run the complete NCC health check by using the following command.

nutanix@cvm$ ncc health_checks run_all

vSphere Cluster Settings Checklist

Review the following checklist of the settings that you have to configure to successfully deploy vSphere virtual environment running Nutanix Enterprise cloud.

vSphere Availability Settings

  • Enable host monitoring.
  • Enable admission control and use the percentage-based policy with a value based on the number of nodes in the cluster.

    For more information about setting the percentage of cluster resources reserved as failover spare capacity, see vSphere HA Admission Control Settings for Nutanix Environment.

  • Set the VM Restart Priority of all CVMs to Disabled .
  • Set the Host Isolation Response of the cluster to Power Off & Restart VMs .
  • Set the VM Monitoring for all CVMs to Disabled .
  • Enable datastore heartbeats by clicking Use datastores only from the specified list and choosing the Nutanix NFS datastore.

    If the cluster has only one datastore, click Advanced Options tab and add das.ignoreInsufficientHbDatastore with Value of true .
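The percentage-based admission control value above can be derived from the node count. A minimal sketch (the four-node cluster and the reserve-one-node rule for redundancy factor 2 are illustrative assumptions; see vSphere HA Admission Control Settings for Nutanix Environment for the authoritative values):

```shell
nodes=4          # cluster size (illustrative)
reserve_nodes=1  # hold back one node's capacity for RF2; use 2 for RF3
pct=$(( 100 * reserve_nodes / nodes ))
echo "Reserve ${pct}% of CPU and memory for HA admission control"
```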

vSphere DRS Settings

  • Set the Automation Level on all CVMs to Disabled .
  • Select Automation Level to accept level 3 recommendations.
  • Leave power management disabled.

Other Cluster Settings

  • Configure advertised capacity for the Nutanix storage container (total usable capacity minus the capacity of one node for replication factor 2 or two nodes for replication factor 3).
  • Store VM swapfiles in the same directory as the VM.
  • Enable enhanced vMotion compatibility (EVC) in the cluster. For more information, see vSphere EVC Settings.
  • Configure Nutanix CVMs with the appropriate VM overrides. For more information, see VM Override Settings.
  • Check Nonconfigurable ESXi Components. Modifying the nonconfigurable components may inadvertently constrain performance of your Nutanix cluster or render the Nutanix cluster inoperable.
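The advertised-capacity rule in the first bullet above (total usable capacity minus one node for replication factor 2, or two nodes for replication factor 3) can be sketched with illustrative numbers; the node count and per-node capacity below are assumptions, not recommendations:

```shell
nodes=4          # illustrative cluster
per_node_tib=20  # usable capacity per node, in TiB (illustrative)
rf=2             # replication factor

total=$(( nodes * per_node_tib ))
holdback=$(( (rf - 1) * per_node_tib ))   # one node for RF2, two for RF3
advertised=$(( total - holdback ))
echo "Advertise ${advertised} TiB of ${total} TiB usable"
```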

Security Guide

AOS Security 5.20

Product Release Date: 2021-05-17

Last updated: 2022-12-14

Audience & Purpose

This Security Guide is intended for security-minded people responsible for architecting, managing, and supporting infrastructures, especially those who want to address security without adding more human resources or additional processes to their datacenters.

This guide offers an overview of the security development life cycle (SecDL) and the host of security features supported by Nutanix. It also demonstrates how Nutanix complies with security regulations to streamline infrastructure security management. In addition, this guide addresses site-specific technical requirements and compliance standards that must be adhered to but are not enabled by default.

Note:

Hardening of the guest OS or any applications running on top of the Nutanix infrastructure is beyond the scope of this guide. We recommend that you refer to the documentation of the products that you have deployed in your Nutanix environment.

Nutanix Security Infrastructure

Nutanix takes a holistic approach to security with a secure platform, extensive automation, and a robust partner ecosystem. The Nutanix security development life cycle (SecDL) integrates security into every step of product development, rather than applying it as an afterthought. The SecDL is a foundational part of product design. The strong pervasive culture and processes built around security harden the Enterprise Cloud Platform and eliminate zero-day vulnerabilities. Efficient one-click operations and self-healing security models easily enable automation to maintain security in an always-on hyperconverged solution.

Since traditional manual configuration and checks cannot keep up with the ever-growing list of security requirements, Nutanix conforms to RHEL 7 Security Technical Implementation Guides (STIGs) that use machine-readable code to automate compliance against rigorous common standards. With Nutanix Security Configuration Management Automation (SCMA), you can quickly and continually assess and remediate your platform to ensure that it meets or exceeds all regulatory requirements.

Nutanix has standardized the security profile of the Controller VM to a security compliance baseline that meets or exceeds the standard high-governance requirements.

The most commonly used references in United States to guide vendors to build products according to the set of technical requirements are as follows.

  • The National Institute of Standards and Technology Special Publications Security and Privacy Controls for Federal Information Systems and Organizations (NIST 800.53)
  • The US Department of Defense Information Systems Agency (DISA) Security Technical Implementation Guides (STIG)

SCMA Implementation

The Nutanix platform and all products leverage the Security Configuration Management Automation (SCMA) framework to ensure that services are constantly inspected for variance to the security policy.

Nutanix has implemented security configuration management automation (SCMA) to check multiple security entities for both Nutanix storage and AHV. Nutanix automatically reports log inconsistencies and reverts them to the baseline.

With SCMA, you can schedule the STIG to run hourly, daily, weekly, or monthly. STIG has the lowest system priority within the virtual storage controller, ensuring that security checks do not interfere with platform performance.
Note: Only the SCMA schedule can be modified; AIDE runs on a fixed weekly schedule. To change the SCMA schedule for AHV or the Controller VM, see Hardening Instructions (nCLI).

RHEL 7 STIG Implementation in Nutanix Controller VM

Nutanix leverages SaltStack and SCMA to self-heal any deviation from the security baseline configuration of the operating system and hypervisor to remain in compliance. If any component is found as non-compliant, then the component is set back to the supported security settings without any intervention. To achieve this objective, Nutanix has implemented the Controller VM to support STIG compliance with the RHEL 7 STIG as published by DISA.

The STIG rules are capable of securing the boot loader, packages, file system, booting and service control, file ownership, authentication, kernel, and logging.

Example: STIG rules for Authentication

Prohibit direct root login, lock system accounts other than root , enforce several password maintenance details, cautiously configure SSH, enable screen-locking, configure user shell defaults, and display warning banners.

Security Updates

Nutanix provides continuous fixes and updates to address threats and vulnerabilities. Nutanix Security Advisories provide detailed information on the available security fixes and updates, including the vulnerability description and affected product/version.

To see the list of security advisories or search for a specific advisory, log on to the Support Portal and select Documentation , and then Security Advisories .

Nutanix Security Landscape

This topic highlights the Nutanix security landscape. The following list identifies the security features offered out of the box in the Nutanix infrastructure.

  • Authentication and Authorization
  • Network segmentation: VLAN-based, data-driven segmentation
  • Security Policy Management: Implement security policies using microsegmentation.
  • Data security and integrity
  • Hardening Instructions
  • Log monitoring and analysis
  • Flow Networking: See the Flow Networking Guide.
  • UEFI: See UEFI Support for VMs in the AHV Administration Guide.
  • Secure Boot: See Secure Boot Support for VMs in the AHV Administration Guide.
  • Windows Credential Guard support: See Windows Defender Credential Guard Support in AHV in the AHV Administration Guide.
  • RBAC: See Controlling User Access (RBAC).

Hardening Instructions (nCLI)

This chapter describes how to implement security hardening features for Nutanix AHV and Controller VM.

Hardening AHV

You can use the Nutanix command-line interface (nCLI) to customize the various configuration settings related to AHV, as described below.

Table 1. Configuration Settings to Harden the AHV
Description Command or Settings Output
Getting the cluster-wide configuration of the SCMA policy. Run the following command:
nutanix@cvm$ ncli cluster get-hypervisor-security-config
Enable Aide : false
Enable Core : false
Enable High Strength P... : false
Enable Banner : false
Schedule : DAILY
Enabling the Advanced Intrusion Detection Environment (AIDE) to run on a weekly basis. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-aide=true
Enable Aide : true
Enable Core : false
Enable High Strength P... : false
Enable Banner : false
Schedule : DAILY 
Enabling the high-strength password policies (minlen=15, difok=8, maxclassrepeat=4). Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params \
enable-high-strength-password=true
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : false
Schedule : DAILY
Enabling the US Department of Defense (DoD) consent banner. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-banner=true
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : true
Schedule : DAILY
Changing the default schedule of running the SCMA. The schedule can be hourly, daily, weekly, or monthly. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params schedule=hourly
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : true
Schedule : HOURLY
Enabling the settings so that AHV can generate stack traces for any cluster issue. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-core=true
Note: Nutanix recommends that Core should not be set to true unless instructed by the Nutanix support team.
Enable Aide : true
Enable Core : true
Enable High Strength P... : true
Enable Banner : true
Schedule : HOURLY
When a high governance official needs to run the hardened configuration. The settings should be as follows:
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : false
Schedule : HOURLY
When a federal official needs to run the hardened configuration. The settings should be as follows:
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : true
Schedule : HOURLY
Note: A banner file can be modified to support non-DoD customer banners.
Backing up the DoD banner file. Run the following command on the AHV host:
[root@AHV-host ~]# sudo cp -a /srv/salt/security/KVM/sshd/DODbanner \
/srv/salt/security/KVM/sshd/DODbannerbak
Modifying the DoD banner file. Run the following command on the AHV host:
[root@AHV-host ~]# sudo vi /srv/salt/security/KVM/sshd/DODbanner
Note: Repeat all the above steps on every AHV in a cluster.
Setting the banner for all nodes through nCLI. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-banner=true
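The flags shown in the output column of the table can be checked programmatically from saved command output. A minimal sketch (the sample text below mirrors the table's `ncli cluster get-hypervisor-security-config` output; the helper name `flag` is ours, and on a cluster you would capture the live command output instead):

```shell
# Sample of the ncli output shown in the table above
scma='Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : true
Schedule : HOURLY'

# Return the value for a given SCMA flag from the sample output
flag() { printf '%s\n' "$scma" | awk -F' : ' -v k="$1" '$1 ~ k {print $2}'; }

[ "$(flag "Enable Aide")" = "true" ] && echo "AIDE enabled"
[ "$(flag "Enable Core")" = "false" ] && echo "core dumps disabled (recommended)"
```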

The following options are configured or customized to harden the AHV:

  • Enable AIDE : Advanced Intrusion Detection Environment (AIDE) is a Linux utility that monitors a given node. After you install the AIDE package, initialize its database as the root user with the aide --init command; the database records all the files selected in your configuration file. You can move the database to a secure location on read-only media or on another machine. After you create the database, run the aide --check command to check the integrity of files and directories by comparing them against the snapshot in the database. If there are unexpected changes, AIDE generates a report that you can review. If the changed or added files are valid, run the aide --update command to update the database.
  • Enable high strength password : You can run the command as shown in the table in this section to enable high-strength password policies (minlen=15, difok=8, maxclassrepeat=4).
    Note:
    • minlen is the minimum required length for a password.
    • difok is the minimum number of characters that must be different from the old password.
    • maxclassrepeat is the number of consecutive characters of same class that you can use in a password.
  • Enable Core : A core dump records the state of a program's working memory at a specific time, generally when the program crashes or terminates abnormally. Core dumps are used to assist in diagnosing or debugging errors in computer programs. You can enable core dumps for troubleshooting purposes.
  • Enable Banner : You can set a banner to display a specific message. For example, set a banner to display a warning message that the system is available to authorized users only.

Hardening Controller VM

You can use the Nutanix command-line interface (nCLI) to customize the various configuration settings related to the Controller VM, as described below.

  • Run the following command to support cluster-wide configuration of the SCMA policy.

    nutanix@cvm$ ncli cluster get-cvm-security-config

    The current cluster configuration is displayed.

    Enable Aide : false
    Enable Core : false
    Enable High Strength P...: false
    Enable Banner : false
    Enable SNMPv3 Only : false
    Schedule : DAILY
  • Run the following command to schedule weekly execution of Advanced Intrusion Detection Environment (AIDE).

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-aide=true

    The following output is displayed.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : false
    Enable Banner : false
    Enable SNMPv3 Only : false
    Schedule : DAILY
  • Run the following command to enable the strong password policy.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-high-strength-password=true

    The following output is displayed.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : true
    Enable Banner : false
    Enable SNMPv3 Only : false
    Schedule : DAILY
  • Run the following command to enable the US Department of Defense (DoD) consent banner.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-banner=true

    The following output is displayed.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : true
    Enable Banner : true
    Enable SNMPv3 Only : false
    Schedule : DAILY
  • Run the following command to enable the settings to allow only SNMP version 3.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-snmpv3-only=true

    The following output is displayed.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : true
    Enable Banner : true
    Enable SNMPv3 Only : true
    Schedule : DAILY
  • Run the following command to change the default schedule of running the SCMA. The schedule can be hourly, daily, weekly, or monthly.

    nutanix@cvm$ ncli cluster edit-cvm-security-params schedule=hourly

    The following output is displayed.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : true
    Enable Banner : true
    Enable SNMPv3 Only : true
    Schedule : HOURLY
  • Run the following command to enable the settings so that Controller VM can generate stack traces for any cluster issue.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-core=true

    The following output is displayed.

    Enable Aide : true
    Enable Core : true
    Enable High Strength P... : true
    Enable Banner : true
    Enable SNMPv3 Only : true
    Schedule : HOURLY
    Note: Nutanix recommends that Core should not be set to true unless instructed by the Nutanix support team.
  • When a high governance official needs to run the hardened configuration then the settings should be as follows.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : true
    Enable Banner : false
    Enable SNMPv3 Only : true
    Schedule : HOURLY
  • When a federal official needs to run the hardened configuration then the settings should be as follows.

    Enable Aide : true
    Enable Core : false
    Enable High Strength P... : true
    Enable Banner : true
    Enable SNMPv3 Only : true
    Schedule : HOURLY
    Note: A banner file can be modified to support non-DoD customer banners.
  • Run the following command to back up the DoD banner file.

    nutanix@cvm$ sudo cp -a /srv/salt/security/CVM/sshd/DODbanner \
    /srv/salt/security/CVM/sshd/DODbannerbak
  • Run the following command to modify the DoD banner file.

    nutanix@cvm$ sudo vi /srv/salt/security/CVM/sshd/DODbanner
    Note: Repeat all the above steps on every CVM in a cluster.
  • Run the following command to back up the DoD banner file of the PCVM.

    nutanix@pcvm$ sudo cp -a /srv/salt/security/PC/sshd/DODbanner \
    /srv/salt/security/PC/sshd/DODbannerbak
  • Run the following command to modify the DoD banner file of the PCVM.

    nutanix@pcvm$ sudo vi /srv/salt/security/PC/sshd/DODbanner
  • Run the following command to set the banner for all nodes through nCLI.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-banner=true

TCP Wrapper Integration

The Nutanix Controller VM uses the tcp_wrappers package to let TCP-supported daemons control which network subnets can access the libwrapped daemons. By default, SCMA controls the /etc/hosts.allow file from /srv/salt/security/CVM/network/hosts.allow, which contains a generic entry to allow access to NFS, secure shell, and SNMP.

sshd: ALL : ALLOW
rpcbind: ALL : ALLOW
snmpd: ALL : ALLOW
snmptrapd: ALL : ALLOW

Nutanix recommends changing the above configuration to include only the localhost entries and the management network subnet for the restricted operations, in both production and high-governance compliance environments. Ensure that every subnet used to communicate with the CVMs is included in the /etc/hosts.allow file.
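For example, a restricted configuration might look like the following. The 10.10.0.0/24 management subnet is a placeholder for your site's network; because SCMA controls the file, make the change in the /srv/salt/security/CVM/network/hosts.allow copy so it is not reverted.

```
# Example hosts.allow restricted to localhost and a hypothetical
# management subnet (10.10.0.0/255.255.255.0 is a placeholder)
sshd: 127.0.0.1 10.10.0.0/255.255.255.0 : ALLOW
rpcbind: 127.0.0.1 10.10.0.0/255.255.255.0 : ALLOW
snmpd: 10.10.0.0/255.255.255.0 : ALLOW
snmptrapd: 10.10.0.0/255.255.255.0 : ALLOW
ALL: ALL : DENY
```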

Common Criteria

Common Criteria is an international security certification that is recognized by many countries around the world. Nutanix AOS and AHV are Common Criteria certified by default and no additional configuration is required to enable the Common Criteria mode. For more information, see the Nutanix Trust website.
Note: Nutanix uses FIPS-validated cryptography by default.

Security Management Using Prism Element (PE)

Nutanix provides several mechanisms to maintain security in a cluster using Prism Element.

Configuring Authentication

About this task

Nutanix supports user authentication. To configure authentication types and directories, and to enable client authentication, do the following:
Caution: The web console (and nCLI) does not allow the use of the insecure SSLv2 and SSLv3 protocols. In some browsers, an SSL fallback situation can deny access to the web console. To eliminate this possibility, disable (uncheck) SSLv2 and SSLv3 in any browser used for access; however, TLS must remain enabled (checked).

Procedure

  1. Click the gear icon in the main menu and then select Authentication in the Settings page.
    The Authentication Configuration window appears.
    Note: The following steps combine three distinct procedures, enabling authentication (step 2), configuring one or more directories for LDAP/S authentication (steps 3-5), and enabling client authentication (step 6). Perform the steps for the procedures you need. For example, perform step 6 only if you intend to enforce client authentication.
  2. To enable server authentication, click the Authentication Types tab and then check the box for either Local or Directory Service (or both). After selecting the authentication types, click the Save button.
    The Local setting uses the local authentication provided by Nutanix (see User Management). This method is employed when a user enters just a login name without specifying a domain (for example, user1 instead of user1@nutanix.com). The Directory Service setting validates user@domain entries against the directory specified in the Directory List tab. Therefore, you need to configure an authentication directory if you select Directory Service in this field.
    Figure. Authentication Types Tab
    Note: The Nutanix admin user can log on to the management interfaces, including the web console, even if the Local authentication type is disabled.
  3. To add an authentication directory, click the Directory List tab and then click the New Directory option.
    A set of fields is displayed. Do the following in the indicated fields:
    1. Directory Type : Select one of the following from the pull-down list.
      • Active Directory : Active Directory (AD) is a directory service implemented by Microsoft for Windows domain networks.
        Note:
        • Users with the "User must change password at next logon" attribute enabled cannot authenticate to the web console (or nCLI). Ensure that users with this attribute first log in to a domain workstation and change their password before accessing the web console. Also, if SSL is enabled on the Active Directory server, make sure that Nutanix has access to that port (open it in the firewall).
        • An Active Directory user name or group name containing spaces is not supported for Prism Element authentication.
        • Active Directory domain created by using non-ASCII text may not be supported. For more information about usage of ASCII or non-ASCII text in Active Directory configuration, see the Internationalization (i18n) section.
        • Use of the "Protected Users" group is currently unsupported for Prism authentication. For more details on the "Protected Users" group, see “Guidance about how to configure protected accounts” on Microsoft documentation website.
        • The Microsoft AD is LDAP v2 and LDAP v3 compliant.
        • The Microsoft AD servers supported are Windows Server 2012 R2, Windows Server 2016, and Windows Server 2019.
      • OpenLDAP : OpenLDAP is a free, open source directory service, which uses the Lightweight Directory Access Protocol (LDAP), developed by the OpenLDAP project. Nutanix currently supports the OpenLDAP 2.4 release running on CentOS distributions only.
    2. Name : Enter a directory name.
      This is a name you choose to identify this entry; it need not be the name of an actual directory.
    3. Domain : Enter the domain name.
      Enter the domain name in DNS format, for example, nutanix.com .
    4. Directory URL : Enter the URL address to the directory.
      The URL format is as follows for an LDAP entry: ldap://host:ldap_port_num. The host value is either the IP address or fully qualified domain name. (In some environments, a simple domain name is sufficient.) The default LDAP port number is 389. Nutanix also supports LDAPS (port 636) and LDAP/S Global Catalog (ports 3268 and 3269). The following are example configurations appropriate for each port option:
      Note: LDAPS support does not require custom certificates or certificate trust import.
      • Port 389 (LDAP). Use this port number (in the following URL form) when the configuration is single domain, single forest, and not using SSL.
        ldap://ad_server.mycompany.com:389
      • Port 636 (LDAPS). Use this port number (in the following URL form) when the configuration is single domain, single forest, and using SSL. This requires all Active Directory Domain Controllers have properly installed SSL certificates.
        ldaps://ad_server.mycompany.com:636
        Note: The LDAP server SSL certificate must include a Subject Alternative Name (SAN) that matches the URL provided during the LDAPS setup.
      • Port 3268 (LDAP - GC). Use this port number when the configuration is multiple domain, single forest, and not using SSL.
      • Port 3269 (LDAPS - GC). Use this port number when the configuration is multiple domain, single forest, and using SSL.
        Note: When constructing your LDAP/S URL to use a Global Catalog server, ensure that the Domain Control IP address or name being used is a global catalog server within the domain being configured. If not, queries over 3268/3269 may fail.
        Note: When querying the global catalog, the user's sAMAccountName field must be unique across the AD forest. If the sAMAccountName field is not unique across the subdomains, authentication may fail intermittently or consistently.
      Note: For the complete list of required ports, see Port Reference .
    5. (OpenLDAP only) Configure the following additional fields:
      1. User Object Class : Enter the value that uniquely identifies the object class of a user.
      2. User Search Base : Enter the base domain name in which the users are configured.
      3. Username Attribute : Enter the attribute to uniquely identify a user.
      4. Group Object Class : Enter the value that uniquely identifies the object class of a group.
      5. Group Search Base : Enter the base domain name in which the groups are configured.
      6. Group Member Attribute : Enter the attribute that identifies users in a group.
      7. Group Member Attribute Value : Enter the attribute that identifies the users provided as value for Group Member Attribute .
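As a point of reference, an OpenLDAP deployment using the common inetOrgPerson/groupOfNames schema might use values like the following; these are illustrative assumptions, so adjust them to match your own schema:

```
User Object Class:            inetOrgPerson
User Search Base:             ou=users,dc=mycompany,dc=com
Username Attribute:           uid
Group Object Class:           groupOfNames
Group Search Base:            ou=groups,dc=mycompany,dc=com
Group Member Attribute:       member
Group Member Attribute Value: dn
```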
    6. Search Type : Select how to search your directory when authenticating. Choose Non Recursive if you experience slow directory logon performance. For this option, ensure that users listed in Role Mapping are listed flatly in the group (that is, not nested). Otherwise, choose the default Recursive option.
    7. Service Account Username : Enter the service account user name in the user_name@domain.com format that you want the web console to use to log in to the Active Directory.

      A service account is created to run only a particular service or application with the credentials specified for the account. According to the requirement of the service or application, the administrator can limit access to the service account.

      A service account is under the Managed Service Accounts in the Active Directory server. An application or service uses the service account to interact with the operating system. Enter your Active Directory service account credentials in this (username) and the following (password) field.

      Note: Be sure to update the service account credentials here whenever the service account password changes or when a different service account is used.
    8. Service Account Password : Enter the service account password.
    9. When all the fields are correct, click the Save button (lower right).
      This saves the configuration and redisplays the Authentication Configuration dialog box. The configured directory now appears in the Directory List tab.
    10. Repeat this step for each authentication directory you want to add.
    Note:
    • The Controller VMs need access to the Active Directory server, so open the standard Active Directory ports to each Controller VM in the cluster (and the virtual IP if one is configured).
    • No permissions are granted to the directory users by default. To grant permissions to the directory users, you must specify roles for the users in that directory (see Assigning Role Permissions).
    • The service account for both Active Directory and OpenLDAP must have full read permission on the directory service. Additionally, for successful Prism Element authentication, the users must also have search or read privileges.
    Figure. Directory List Tab
  4. To edit a directory entry, click the Directory List tab and then click the pencil icon for that entry.
    After clicking the pencil icon, the Directory List fields reappear (see step 3). Enter the new information in the appropriate fields and then click the Save button.
  5. To delete a directory entry, click the Directory List tab and then click the X icon for that entry.
    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.
  6. To enable client authentication, do the following:
    1. Click the Client tab.
    2. Select the Configure Client Chain Certificate check box.
      Client Chain Certificate is a list of certificates that includes all intermediate CA and root-CA certificates.
      Note: To authenticate on the PE with a Client Chain Certificate, the 'Subject name' field must be present and must match the userPrincipalName (UPN) in the AD. The UPN is a user name with a domain address, for example, user1@nutanix.com.
      Figure. Client Tab (1)
    3. Click the Choose File button, browse to and select a client chain certificate to upload, and then click the Open button to upload the certificate.
      Note: Uploaded certificate files must be PEM encoded. The web console restarts after the upload step.
      Figure. Client Tab (2)
    4. To enable client authentication, click Enable Client Authentication .
    5. To modify client authentication, do one of the following:
      Note: The web console restarts when you change these settings.
      • Click Enable Client Authentication to disable client authentication.
      • Click Remove to delete the current certificate. (This also disables client authentication.)
      • To enable OCSP or CRL based certificate revocation checking, see Certificate Revocation Checking.
      Figure. Authentication Window: Client Tab (3)

    Client authentication allows you to securely access Prism by exchanging a digital certificate. Prism validates that the certificate is signed by your organization's trusted signing certificate.

    Client authentication ensures that the Nutanix cluster gets a valid certificate from the user. Normally, a one-way authentication process occurs where the server provides a certificate so the user can verify the authenticity of the server (see Installing an SSL Certificate). When client authentication is enabled, this becomes a two-way authentication where the server also verifies the authenticity of the user. A user must provide a valid certificate when accessing the console, either by installing the certificate on their local machine or by providing it through a smart card reader. Providing a valid certificate enables login from a client machine with the relevant user certificate without using a user name and password. If the user must log in from a client machine that does not have the certificate installed, authentication using a user name and password is still available.
    Note: The CA must be the same for both the client chain certificate and the certificate on the local machine or smart card.
  7. To specify a service account that the web console can use to log in to Active Directory and authenticate Common Access Card (CAC) users, select the Configure Service Account check box, and then do the following in the indicated fields:
    Figure. Common Access Card Authentication
    1. Directory : Select the authentication directory that contains the CAC users that you want to authenticate.
      This list includes the directories that are configured on the Directory List tab.
    2. Service Username : Enter the user name in the user name@domain.com format that you want the web console to use to log in to the Active Directory.
    3. Service Password : Enter the password for the service user name.
    4. Click Enable CAC Authentication .
      Note: For federal customers only.
      Note: The web console restarts after you change this setting.

    The Common Access Card (CAC) is a smart card about the size of a credit card, which some organizations use to access their systems. After you insert the CAC into the CAC reader connected to your system, the software in the reader prompts you to enter a PIN. After you enter a valid PIN, the software extracts your personal certificate that represents you and forwards the certificate to the server using the HTTP protocol.

    Nutanix Prism verifies the certificate as follows:
    • Validates that the certificate has been signed by your organization’s trusted signing certificate.
    • Extracts the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and uses the EDIPI to check the validity of an account within the Active Directory. The security context from the EDIPI is used for your PRISM session.
    • Prism Element supports both certificate authentication and basic authentication in order to handle Prism Element login using a certificate while still allowing the REST API to use basic authentication. The REST API cannot use CAC certificates. With this behavior, if a certificate is present during Prism Element login, certificate authentication is used; if the certificate is not present, basic authentication is enforced.
    Note: Nutanix Prism does not support OpenLDAP as directory service for CAC.
    If you map a Prism role to a CAC user and not to an Active Directory group or organizational unit to which the user belongs, specify the EDIPI (User Principal Name, or UPN) of that user in the role mapping. A user who presents a CAC with a valid certificate is mapped to a role and taken directly to the web console home page. The web console login page is not displayed.
    Note: If you have logged on to Prism by using CAC authentication, to successfully log out of Prism, close the browser after you click Log Out .
  8. Click the Close button to close the Authentication Configuration dialog box.

Assigning Role Permissions

About this task

When user authentication is enabled for a directory service (see Configuring Authentication), the directory users do not have any permissions by default. To grant permissions to the directory users, you must specify roles for the users (with associated permissions) to organizational units (OUs), groups, or individuals within a directory.

If you are using Active Directory, you must also assign roles to entities or users, especially before upgrading from a previous AOS version.

To assign roles, do the following:

Procedure

  1. In the web console, click the gear icon in the main menu and then select Role Mapping in the Settings page.
    The Role Mapping window appears.
    Figure. Role Mapping Window
  2. To create a role mapping, click the New Mapping button.

    The Create Role Mapping window appears. Do the following in the indicated fields:

    1. Directory : Select the target directory from the pull-down list.

      Only directories previously defined when configuring authentication appear in this list. If the desired directory does not appear, add that directory to the directory list (see Configuring Authentication) and then return to this procedure.

    2. LDAP Type : Select the desired LDAP entity type from the pull-down list.

      The entity types are GROUP , USER , and OU .

    3. Role : Select the user role from the pull-down list.
      There are three roles from which to choose:
      • Viewer : This role allows a user to view information only. It does not provide permission to perform any administrative tasks.
      • Cluster Admin : This role allows a user to view information and perform any administrative task (but not create or modify user accounts).
      • User Admin : This role allows the user to view information, perform any administrative task, and create or modify user accounts.
    4. Values : Enter the case-sensitive entity names (in a comma separated list with no spaces) that should be assigned this role.
      The values are the actual names of the organizational units (meaning it applies to all users in those OUs), groups (all users in those groups), or users (each named user) assigned this role. For example, entering value " admin-gp,support-gp " when the LDAP type is GROUP and the role is Cluster Admin means all users in the admin-gp and support-gp groups should be assigned the cluster administrator role.
      Note:
      • Do not include a domain in the value, for example enter just admin-gp , not admin-gp@nutanix.com . However, when users log into the web console, they need to include the domain in their user name.
      • The AD user UPN must be in the user@domain_name format.
      • When an admin defines a user role mapping using an AD forest setup, the mapping can match a user with the same name from any domain in the forest. To avoid this, set up the user role mapping with an AD that has a specific domain setup.
    5. When all the fields are correct, click Save .
      This saves the configuration and redisplays the Role Mapping window. The new role map now appears in the list.
      Note: All users in an authorized service directory have full administrator permissions when role mapping is not defined for that directory. However, after creating a role map, any users in that directory that are not explicitly granted permissions through the role mapping are denied access (no permissions).
    6. Repeat this step for each role map you want to add.
      You can create a role map for each authorized directory. You can also create multiple maps that apply to a single directory. When there are multiple maps for a directory, the most specific rule for a user applies. For example, adding a GROUP map set to Cluster Admin and a USER map set to Viewer for select users in that group means all users in the group have administrator permission except those specified users who have viewing permission only.
    Figure. Create Role Mapping Window
  3. To edit a role map entry, click the pencil icon for that entry.
    After clicking the pencil icon, the Edit Role Mapping window appears, which contains the same fields as the Create Role Mapping window (see step 2). Enter the new information in the appropriate fields and then click the Save button.
  4. To delete a role map entry, click the "X" icon for that entry.
    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.
  5. Click the Close button to close the Role Mapping window.

Certificate Revocation Checking

Enabling Certificate Revocation Checking using Online Certificate Status Protocol (nCLI)

About this task

OCSP is the recommended method for checking certificate revocation in client authentication. You can enable certificate revocation checking using the OCSP method through the command line interface (nCLI).

To enable certificate revocation checking using OCSP for client authentication, do the following.

Procedure

  1. Set the OCSP responder URL.
    ncli authconfig set-certificate-revocation set-ocsp-responder=<ocsp url>
    Here, <ocsp url> indicates the location of the OCSP responder.
  2. Verify if OCSP checking is enabled.
    ncli authconfig get-client-authentication-config

    The expected output if certificate revocation checking is enabled successfully is as follows.

    Auth Config Status: true
    File Name: ca.cert.pem
    OCSP Responder URI: http://<ocsp-responder-url>

Enabling Certificate Revocation Checking using Certificate Revocation Lists (nCLI)

About this task

Note: OCSP is the recommended method for checking certificate revocation in client authentication.

You can use the CRL certificate revocation checking method if required, as described in this section.

To enable certificate revocation checking using CRL for client authentication, do the following.

Procedure

Specify all the CRLs that are required for certificate validation.
ncli authconfig set-certificate-revocation set-crl-uri=<uri 1>,<uri 2> set-crl-refresh-interval=<refresh interval in seconds>
  • The above command resets any previous OCSP or CRL configurations.
  • The URIs must be percent-encoded and comma separated.
  • The CRLs are updated periodically as specified by the crl-refresh-interval value. This interval is common for the entire list of CRL distribution points. The default value for this is 86400 seconds (1 day).
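Because the URIs must be percent-encoded, you can generate the encoded form before passing it to nCLI. The following is a sketch; the CRL URL shown is a placeholder:

```shell
# Percent-encode a hypothetical CRL distribution point URL so it can be
# passed to "ncli authconfig set-certificate-revocation set-crl-uri=...".
python3 -c 'from urllib.parse import quote; print(quote("http://crl.example.com/root.crl", safe=""))'
# Prints: http%3A%2F%2Fcrl.example.com%2Froot.crl
```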

Authentication Best Practices

The authentication best practices listed here are guidance to secure the Nutanix platform by using the most common authentication security measures.

Emergency Local Account Usage

You must use the admin account as a local emergency account. The admin account ensures that both the Prism Web Console and the Controller VM remain accessible when external services such as Active Directory are unavailable.

Note: Local emergency account usage does not support any external access mechanisms, specifically external application authentication or external REST API authentication.

For all external authentication, you must configure the cluster to use an external IAM service such as Active Directory. You must create service accounts on the IAM service, and the accounts must be granted access to the cluster through the Prism web console user account management configuration for authentication.

Modifying Default Passwords

You must change the default Controller VM password for the nutanix user account, adhering to the password complexity requirements.

Procedure

  1. SSH to the Controller VM.
  2. Change the "nutanix" user account password.
    nutanix@cvm$ passwd nutanix
  3. Respond to the prompts and provide the current and new password for the nutanix user.
    Changing password for nutanix.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    Note:
    • Changing the user account password on one of the Controller VMs is applied to all Controller VMs in the cluster.
    • Ensure that you preserve the modified nutanix user password, since the local authentication (PAM) module requires the previous password of the nutanix user to successfully start the password reset process.
    • For the root account, both the console and SSH direct login is disabled.
    • It is recommended to use the admin user as the administrative emergency account.

Controlling Cluster Access

About this task

Nutanix supports the cluster lockdown feature. This feature enables key-based SSH access to the Controller VM and AHV on the host (only for the nutanix/admin users).

Enabling cluster lockdown mode disables password authentication, so only the keys you have provided can be used to access cluster resources, making the cluster more secure.

You can create a key pair (or multiple key pairs) and add the public keys to enable key-based SSH access. However, when site security requirements do not allow such access, you can remove all public keys to prevent SSH access.

To control key-based SSH access to the cluster, do the following:
Note: Use this procedure to lock down access to the Controller VM and the hypervisor host.

Procedure

  1. Click the gear icon in the main menu and then select Cluster Lockdown in the Settings page.
    The Cluster Lockdown dialog box appears. Enabled public keys (if any) are listed in this window.
    Figure. Cluster Lockdown Window
  2. To disable (or enable) remote login access, uncheck (check) the Enable Remote Login with Password box.
    Remote login access is enabled by default.
  3. To add a new public key, click the New Public Key button and then do the following in the displayed fields:
    1. Name : Enter a key name.
    2. Key : Enter (paste) the key value into the field.
    Note: Prism supports the following key types.
    • RSA
    • ECDSA
    3. Click the Save button (lower right) to save the key and return to the main Cluster Lockdown window.
    There are no public keys available by default, but you can add any number of public keys.
  4. To delete a public key, click the X on the right of that key line.
    Note: Deleting all the public keys and disabling remote login access locks down the cluster from SSH access.
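The key pair added in step 3 can be generated with standard OpenSSH tooling. This is a sketch; the lockdown_key file name is a placeholder, and your site policy may require a passphrase instead of the empty one used here:

```shell
# Generate an ECDSA key pair (one of the key types Prism supports).
# -N '' creates the key without a passphrase; -q suppresses output.
ssh-keygen -t ecdsa -b 256 -N '' -f ./lockdown_key -q

# The contents of the .pub file are what you paste into the Key field.
cat ./lockdown_key.pub
```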

Setup Admin Session Timeout

By default, users are logged out automatically after being idle for 15 minutes. You can change the session timeout for users and configure a session timeout override by following the steps below.

Procedure

  1. Click the gear icon in the main menu and then select UI Settings in the Settings page.
  2. Select the session timeout for the current user from the Session Timeout For Current User drop-down list.
    Figure. Session Timeout Settings

  3. Select the appropriate option from the Session Timeout Override drop-down list to override the session timeout.

Password Retry Lockout

For enhanced security, Prism Element locks out the admin account for 15 minutes after a default number of unsuccessful login attempts. Once the account is locked out, the following message is displayed on the logon screen.

Account locked due to too many failed attempts

You can attempt entering the password after the 15-minute lockout period, or contact Nutanix Support if you have forgotten your password.

Internationalization (i18n)

The following table lists all the supported and unsupported entities in UTF-8 encoding.

Table 1. Internationalization Support
Supported Entities       Unsupported Entities
Cluster name             Acropolis file server
Storage Container name   Share path
Storage pool             Internationalized domain names
VM name                  E-mail IDs
Snapshot name            Hostnames
Volume group name        Integers
Protection domain name   Password fields
Remote site name         Any hardware-related names (for example, vSwitch, iSCSI initiator, VLAN name)
User management
Chart name
Caution: None of the above entities can be created on Hyper-V because of the DR limitations.

Entities Support (ASCII or non-ASCII) for the Active Directory Server

  • In the New Directory Configuration, the Name field supports non-ASCII text.
  • In the New Directory Configuration, the Domain field does not support non-ASCII text.
  • In role mapping, the Values field supports non-ASCII text.
  • User names and group names support non-ASCII text.

User Management

Nutanix user accounts can be created or updated as needed using the Prism web console.

  • The web console allows you to add (see Creating a User Account), edit (see Updating a User Account), or delete (see Deleting a User Account) local user accounts at any time.
  • You can reset the local user account password using nCLI if you are locked out and cannot log in to the Prism Element or Prism Central web console (see Resetting Password (CLI)).
  • You can also configure user accounts through Active Directory and LDAP (see Configuring Authentication). Active Directory domain created by using non-ASCII text may not be supported.
Note: In addition to the Nutanix user account, there are IPMI, Controller VM, and hypervisor host users. Passwords for these accounts cannot be changed through the web console.

Creating a User Account

About this task

The admin user is created automatically when you get a Nutanix system, but you can add more users as needed. Note that you cannot delete the admin user. To create a user, do the following:
Note: You can also configure user accounts through Active Directory (AD) and LDAP (see Configuring Authentication).

Procedure

  1. Click the gear icon in the main menu and then select Local User Management in the Settings page.
    The User Management dialog box appears.
    Figure. User Management Window
  2. To add a user, click the New User button and do the following in the displayed fields:
    1. Username : Enter a user name.
    2. First Name : Enter a first name.
    3. Last Name : Enter a last name.
    4. Email : Enter a valid user email address.
      Note: AOS uses the email address for client authentication and logging when the local user performs user and cluster tasks in the web console.
    5. Password : Enter a password (maximum of 255 characters).
      There is no second field to verify the password, so be sure to enter the password correctly in this field.
    6. Language : Select the language setting for the user.
      By default, English is selected. You can select Simplified Chinese or Japanese . Depending on the language that you select here, the cluster locale is updated for the new user. For example, if you select Simplified Chinese , the next time that the new user logs on to the web console, the user interface is displayed in Simplified Chinese.
    7. Roles : Assign a role to this user.
      • Select the User Admin box to allow the user to view information, perform any administrative task, and create or modify user accounts. (Checking this box automatically selects the Cluster Admin box to indicate that this user has full permissions. However, a user administrator has full permissions regardless of whether the cluster administrator box is checked.)
      • Select the Cluster Admin box to allow the user to view information and perform any administrative task (but not create or modify user accounts).
      • Select the Backup Admin box to allow the user to perform backup-related administrative tasks. This role does not have permission to perform cluster or user tasks.

        Note: Backup admin user is designed for Nutanix Mine integrations as of AOS version 5.19 and has minimal functionality in cluster management. This role has restricted access to the Nutanix Mine cluster.
        • Health , Analysis , and Tasks features are available in read-only mode.
        • The File server and Data Protection options in the web console are not available for this user.
        • The following features are available for Backup Admin users with limited functionality.
            • Home - The user cannot register a cluster with Prism Central. The registration widget is disabled. Other read-only data is displayed and available.
            • Alerts - Alerts and events are displayed. However, the user cannot resolve or acknowledge any alert or event. The user cannot configure Alert Policy or Email configuration .
            • Hardware - The user cannot expand the cluster or remove hosts from the cluster. Read-only data is displayed and available.
            • Network - Networking data or configuration is displayed but configuration options are not available.
            • Settings - The user can only upload a new image using the Settings page.
            • VM - The user cannot configure options like Create VM and Network Configuration in the VM page. The following options are available for the user in the VM page:
              • Launch console
              • Power On
              • Power Off
      • Leaving all the boxes unchecked allows the user to view information, but it does not provide permission to perform cluster or user tasks.
    8. When all the fields are correct, click Save .
      This saves the configuration and the web console redisplays the dialog box with the new user appearing in the list.
    Figure. Create User Window Click to enlarge

Updating a User Account

About this task

Update credentials and change the role for an existing user by using this procedure.
Note: To update your account credentials (that is, the user you are currently logged on as), see Updating My Account. Changing the password for a different user is not supported; you must log in as that user to change the password.

Procedure

  1. Click the gear icon in the main menu and then select Local User Management in the Settings page.
    The User Management dialog box appears.
  2. Enable or disable the login access for a user by clicking the toggle text Yes (enabled) or No (disabled) in the Enabled column.
    A Yes value in the Enabled column means that the login is enabled; a No value in the Enabled column means it is disabled.
    Note: A user account is enabled (login access activated) by default.
  3. To edit the user credentials, click the pencil icon for that user and update one or more of the values in the displayed fields:
    1. Username : The username is fixed when the account is created and cannot be changed.
    2. First Name : Enter a different first name.
    3. Last Name : Enter a different last name.
    4. Email : Enter a different valid email address.
      Note: AOS Prism uses the email address for client authentication and logging when the local user performs user and cluster tasks in the web console.
    5. Roles : Change the role assigned to this user.
      • Select the User Admin box to allow the user to view information, perform any administrative task, and create or modify user accounts. (Checking this box automatically selects the Cluster Admin box to indicate that this user has full permissions. However, a user administrator has full permissions regardless of whether the cluster administrator box is checked.)
      • Select the Cluster Admin box to allow the user to view information and perform any administrative task (but not create or modify user accounts).
      • Select the Backup Admin box to allow the user to perform backup-related administrative tasks. This role does not have permission to perform cluster or user administrative tasks.
      • Leaving all the boxes unchecked allows the user to view information, but it does not provide permission to perform cluster or user administrative tasks.
    6. Reset Password : Change the password of this user.
      Enter the new password for Password and Confirm Password fields. Click the info icon to view the password complexity requirements.
    7. When all the fields are correct, click Save .
      This saves the configuration and redisplays the dialog box with the new user appearing in the list.
    Figure. Update User Window Click to enlarge

Updating My Account

About this task

To update your account credentials (that is, credentials for the user you are currently logged in as), do the following:

Procedure

  1. To update your password, select Change Password from the user icon pull-down list in the web console.
    The Change Password dialog box appears. Do the following in the indicated fields:
    1. Current Password : Enter the current password.
    2. New Password : Enter a new password.
    3. Confirm Password : Re-enter the new password.
    4. When the fields are correct, click the Save button (lower right). This saves the new password and closes the window.
    Figure. Change Password Window Click to enlarge
    Note: You can change the password for the "admin" account only once per day. Contact Nutanix Support if you need to update the password multiple times in one day.
  2. To update other details of your account, select Update Profile from the user icon pull-down list.
    The Update Profile dialog box appears. Update (as desired) one or more of the following fields:
    1. First Name : Enter a different first name.
    2. Last Name : Enter a different last name.
    3. Email : Enter a different valid user email address.
    4. Language : Select a language for your account.
    5. API Key : Enter the key value to use a new API key.
    6. Public Key : Click the Choose File button to upload a new public key file.
    7. When all the fields are correct, click the Save button (lower right). This saves the changes and closes the window.
    Figure. Update Profile Window Click to enlarge

Deleting a User Account

About this task

To delete an existing user, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Local User Management in the Settings page.
    The User Management dialog box appears.
    Figure. User Management Window Click to enlarge
  2. Click the X icon for that user. Note that you cannot delete the admin user.
    A window prompt appears to verify the action; click the OK button. The user account is removed and the user no longer appears in the list.

Certificate Management

This chapter describes how to install and replace an SSL certificate for configuration and use on the Nutanix Controller VM.

Note: Nutanix recommends that you check for the validity of the certificate periodically, and replace the certificate if it is invalid.

Installing an SSL Certificate

About this task

Nutanix supports SSL certificate-based authentication for console access. To install a self-signed or custom SSL certificate, do the following:
Important: Ensure that SSL certificates are not password protected.
Note:
  • Nutanix recommends that customers replace the default self-signed certificate with a CA signed certificate.
  • An SSL certificate (self-signed or signed by a CA) can be installed only cluster-wide from Prism. SSL certificates cannot be customized for individual Controller VMs.

Procedure

  1. Click the gear icon in the main menu and then select SSL Certificate in the Settings page.
    The SSL Certificate dialog box appears.
    Figure. SSL Certificate Window Click to enlarge
  2. To replace (or install) a certificate, click the Replace Certificate button.
  3. To create a new self-signed certificate, click the Regenerate Self Signed Certificate option and then click the Apply button.
    A dialog box appears to verify the action; click the OK button. This generates and applies a new RSA 2048-bit self-signed certificate for the Prism user interface.
    Figure. SSL Certificate Window: Regenerate Click to enlarge
  4. To apply a custom certificate that you provide, do the following:
    1. Click the Import Key and Certificate option and then click the Next button.
      Figure. SSL Certificate Window: Import Click to enlarge
    2. Do the following in the indicated fields, and then click the Import Files button.
      Note:
      • All three imported files for the custom certificate must be PEM encoded.
      • Ensure that the private key does not have any extra data (or custom attributes) before the beginning (-----BEGIN PRIVATE KEY-----) or after the end (-----END PRIVATE KEY-----) of the private key block.
      • Private Key Type : Select the appropriate type for the signed certificate from the pull-down list (RSA 4096 bit, RSA 2048 bit, EC DSA 256 bit, or EC DSA 384 bit).
      • Private Key : Click the Browse button and select the private key associated with the certificate to be imported.
      • Public Certificate : Click the Browse button and select the signed public portion of the server certificate corresponding to the private key.
      • CA Certificate/Chain : Click the Browse button and select the certificate or chain of the signing authority for the public certificate.
      Figure. SSL Certificate Window: Select Files Click to enlarge
      To meet the high security standards of NIST SP800-131a compliance and the requirements of RFC 6460 for NSA Suite B, and to provide optimal encryption performance, the certificate import process validates that the correct signature algorithm is used for a given key/certificate pair. Refer to the following table to ensure the proper set of key types, sizes/curves, and signature algorithms. The CA must sign all public certificates with the proper type, size/curve, and signature algorithm for the import process to validate successfully.
      Note: There is no specific requirement for the subject name of the certificates (subject alternative names (SAN) or wildcard certificates are supported in Prism).
      Table 1. Recommended Key Configurations
      Key Type Size/Curve Signature Algorithm
      RSA 4096 SHA256-with-RSAEncryption
      RSA 2048 SHA256-with-RSAEncryption
      EC DSA 256 prime256v1 ecdsa-with-sha256
      EC DSA 384 secp384r1 ecdsa-with-sha384
      EC DSA 521 secp521r1 ecdsa-with-sha512
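      Expressed as a lookup, the table above maps each key type and size to the signature algorithm the import validation expects. A minimal sketch (the dictionary and function names are illustrative, not part of any Nutanix API):

```python
# Signature algorithm expected for each key type/size, from the table above.
RECOMMENDED_KEY_CONFIGS = {
    ("RSA", 4096): "SHA256-with-RSAEncryption",
    ("RSA", 2048): "SHA256-with-RSAEncryption",
    ("EC DSA", 256): "ecdsa-with-sha256",  # curve prime256v1
    ("EC DSA", 384): "ecdsa-with-sha384",  # curve secp384r1
    ("EC DSA", 521): "ecdsa-with-sha512",  # curve secp521r1
}

def expected_signature_algorithm(key_type, size):
    """Return the signature algorithm the import process expects, or None."""
    return RECOMMENDED_KEY_CONFIGS.get((key_type, size))
```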
      Note: RSA 4096 bit certificates might not work with certain AOS and Prism Central releases. Please see the release notes for your AOS and Prism Central versions. Specifying an RSA 4096 bit certificate might cause multiple cluster services to restart frequently. To work around the issue, see KB 12775.
      You can use the cat command to concatenate a list of CA certificates into a chain file.
      $ cat signer.crt inter.crt root.crt > server.cert
      Order is essential. The total chain should begin with the certificate of the signer and end with the root CA certificate as the final entry.
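      Because the chain file is plain concatenated PEM text, you can sanity-check it before importing. A minimal sketch (the helper name is illustrative) that splits a chain into its individual certificate blocks so you can confirm the count and the signer-first, root-last order:

```python
def split_pem_certificates(pem_text):
    """Split concatenated PEM text into individual CERTIFICATE blocks."""
    blocks, current = [], None
    for line in pem_text.splitlines():
        stripped = line.strip()
        if stripped == "-----BEGIN CERTIFICATE-----":
            current = [stripped]
        elif stripped == "-----END CERTIFICATE-----" and current is not None:
            current.append(stripped)
            blocks.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(stripped)
    return blocks

# Usage: the first block should be the signer certificate and the
# last block the root CA certificate.
# with open("server.cert") as f:
#     chain = split_pem_certificates(f.read())
```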

Results

After generating or uploading the new certificate, the interface gateway restarts. If the certificate and credentials are valid, the interface gateway uses the new certificate immediately, which means your browser session (and all other open browser sessions) will be invalid until you reload the page and accept the new certificate. If anything is wrong with the certificate (such as a corrupted file or wrong certificate type), the new certificate is discarded, and the system reverts to the original default certificate provided by Nutanix.
Note: The system holds only one custom SSL certificate. If a new certificate is uploaded, it replaces the existing certificate. The previous certificate is discarded.

Replacing a Certificate

Nutanix simplifies the process of certificate replacement to support Certificate Authority (CA) based chains of trust. Nutanix recommends that you replace the default self-signed certificate with a CA-signed certificate.

Procedure

  1. Log in to Prism and click the gear icon.
  2. Click SSL Certificate .
  3. Select Replace Certificate to replace the certificate.
  4. Do one of the following.
    • Select Regenerate self signed certificate to generate a new self-signed certificate.
      Note: This option automatically generates and applies a certificate.
    • Select Import key and certificate to import the custom key and certificate. RSA 4096 bit, RSA 2048 bit, Elliptic Curve DSA 256 bit, and Elliptic Curve DSA 384 bit types of key and certificate are supported.

    The following files are required and should be PEM encoded to import the keys and certificate.

    • The private key associated with the certificate. The following section describes generating a private key in detail.
    • The signed public portion of the server certificate corresponding to the private key.
    • The CA certificate or chain of the signing authority for the certificate.
    Note:

    You must obtain the Public Certificate and CA Certificate/Chain from the certificate authority.

    Figure. Importing Certificate Click to enlarge

    Generating an RSA 4096 and RSA 2048 private key

    Tip: You can run the OpenSSL commands for generating private key and CSR on a Linux client with OpenSSL installed.
    Note: Some OpenSSL command parameters might not be supported on older OpenSSL versions and require OpenSSL version 1.1.1 or above to work.
    • Run the following OpenSSL command to generate a RSA 4096 private key and the Certificate Signing Request (CSR).
      openssl req -out server.csr -new -newkey rsa:4096 \
              -nodes -sha256 -keyout server.key
    • Run the following OpenSSL command to generate an RSA 2048 private key and the Certificate Signing Request (CSR).
      openssl req -out server.csr -new -newkey rsa:2048 \
              -nodes -sha256 -keyout server.key

      After executing the openssl command, the system prompts you to provide more details that are incorporated into your certificate. The mandatory fields are Country Name, State or Province Name, and Organization Name. The optional fields are Locality Name, Organizational Unit Name, Email Address, and Challenge Password.

    Nutanix recommends including a DNS name for all CVMs in the certificate using the Subject Alternative Name (SAN) extension. This avoids SSL certificate errors when you access a CVM by direct DNS instead of the shared cluster IP. This example shows how to include a DNS name while generating an RSA 4096 private key:

    openssl req -out server.csr -new -newkey rsa:4096 -sha256 -nodes \
    -addext "subjectAltName = DNS:example.com" \
    -keyout server.key

    For a 3-node cluster, you can provide the DNS names for all three nodes in a single command. For example:

    openssl req -out server.csr -new -newkey rsa:4096 -sha256 -nodes \
    -addext "subjectAltName = DNS:example1.com,DNS:example2.com,DNS:example3.com" \
    -keyout server.key

    If you have added a SAN ( subjectAltName ) extension to your certificate, then every time you add or remove a node from the cluster, you must add the DNS name when you generate or sign a new certificate.
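    For a larger cluster, building the SAN list by hand is error-prone. A small sketch (the helper name is illustrative) that assembles the -addext value from a list of CVM DNS names:

```python
def san_extension(dns_names):
    """Build the subjectAltName value passed to 'openssl req -addext'."""
    return "subjectAltName = " + ",".join("DNS:" + name for name in dns_names)

# For a 3-node cluster:
print(san_extension(["example1.com", "example2.com", "example3.com"]))
# prints: subjectAltName = DNS:example1.com,DNS:example2.com,DNS:example3.com
```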

    Generating an EC DSA 256 and EC DSA 384 private key

    • Run the following OpenSSL command to generate an EC DSA 256 private key and the Certificate Signing Request (CSR).
      openssl ecparam -out dsakey.pem -name prime256v1 -genkey
      openssl req -out dsacert.csr -new -key dsakey.pem -nodes -sha256
    • Run the following OpenSSL command to generate an EC DSA 384 private key and the Certificate Signing Request (CSR).
      openssl ecparam -out dsakey.pem -name secp384r1 -genkey
      openssl req -out dsacert.csr -new -key dsakey.pem -nodes -sha384
      
    Note: To adhere to the high security standards of NIST SP800-131a compliance and the requirements of RFC 6460 for NSA Suite B, and to provide optimal encryption performance, the certificate import process validates the correct signature algorithm used for a given key or certificate pair.
  5. If the CA chain certificate provided by the certificate authority is not in a single file, then run the following command to concatenate the list of CA certificates into a chain file.
    cat signer.crt inter.crt root.crt > server.cert
    Note: The chain should start with the certificate of the signer and end with the root CA certificate.
  6. Browse and add the Private Key, Public Certificate, and CA Certificate/Chain.
  7. Click Import Files .

What to do next

Prism restarts and you must log in to use the application.

Exporting an SSL Certificate for Third-party Backup Applications

Nutanix allows you to export an SSL certificate for Prism Element on a Nutanix cluster and use it with third-party backup applications.

Procedure

  1. Log on to a Controller VM in the cluster using SSH.
  2. Run the following command to obtain the virtual IP address of the cluster:
    nutanix@cvm$ ncli cluster info

    The current cluster configuration is displayed.

        Cluster Id           : 0001ab12-abcd-efgh-0123-012345678m89::123456
        Cluster Uuid         : 0001ab12-abcd-efgh-0123-012345678m89
        Cluster Name         : three
        Cluster Version      : 6.0
        Cluster Full Version : el7.3-release-fraser-6.0-a0b1c2345d6789ie123456fg789h1212i34jk5lm6
        External IP address  : 10.10.10.10
        Node Count           : 3
        Block Count          : 1
        . . . . 
    Note: The external IP address in the output is the virtual IP address of the cluster.
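    If you are scripting this step, the virtual IP can be extracted from the ncli output. A minimal sketch (the function name is illustrative; it assumes the "External IP address" label shown above):

```python
def external_ip(ncli_output):
    """Extract the cluster virtual IP from 'ncli cluster info' output."""
    for line in ncli_output.splitlines():
        label, _, value = line.partition(":")
        if label.strip() == "External IP address":
            return value.strip()
    return None
```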
  3. Run the following command to enter the Python prompt:
    nutanix@cvm$ python

    The Python prompt appears.

  4. Run the following command to import the SSL library.
    >>> import ssl
  5. From the Python console, run the following command to print the SSL certificate.
    >>> print(ssl.get_server_certificate(('virtual_IP_address', 9440), ssl_version=ssl.PROTOCOL_TLSv1_2))
    Example: Refer to the following example where the virtual_IP_address value is replaced by 10.10.10.10.
    >>> print(ssl.get_server_certificate(('10.10.10.10', 9440), ssl_version=ssl.PROTOCOL_TLSv1_2))
    The SSL certificate is displayed on the console.
    -----BEGIN CERTIFICATE-----
    0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01
    23456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123
    456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz012345
    6789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567
    89ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789AB
    CDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCD
    EFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEF
    GHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGH
    IJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJ
    KLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKL
    MNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMN
    OPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOP
    QRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQR
    STUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRST
    UVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUV
    WXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWX
    YZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
    abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZab
    cdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcd
    efghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef
    ghij
    -----END CERTIFICATE-----
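The interactive session above can be collected into a short script. A hedged Python 3 sketch (function names are illustrative) that fetches the certificate and checks that the result looks like PEM before saving it:

```python
import ssl

def fetch_cluster_certificate(virtual_ip, port=9440):
    """Fetch the PEM-encoded Prism certificate from the cluster virtual IP."""
    return ssl.get_server_certificate((virtual_ip, port))

def is_pem_certificate(text):
    """Cheap sanity check that the fetched text is a PEM certificate."""
    text = text.strip()
    return (text.startswith("-----BEGIN CERTIFICATE-----")
            and text.endswith("-----END CERTIFICATE-----"))

# Usage (requires network access to the cluster):
# cert = fetch_cluster_certificate("10.10.10.10")
# if is_pem_certificate(cert):
#     with open("prism.pem", "w") as f:
#         f.write(cert)
```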

Controlling Cluster Access

About this task

Nutanix supports the Cluster lockdown feature. This feature enables key-based SSH access to the Controller VM and AHV on the Host (only for nutanix/admin users).

Enabling cluster lockdown mode ensures that password authentication is disabled and only the keys you have provided can be used to access the cluster resources, making the cluster more secure.

You can create a key pair (or multiple key pairs) and add the public keys to enable key-based SSH access. However, when site security requirements do not allow such access, you can remove all public keys to prevent SSH access.

To control key-based SSH access to the cluster, do the following:
Note: Use this procedure to lock down access to the Controller VM and hypervisor host. In addition, it is possible to lock down access to the hypervisor.

Procedure

  1. Click the gear icon in the main menu and then select Cluster Lockdown in the Settings page.
    The Cluster Lockdown dialog box appears. Enabled public keys (if any) are listed in this window.
    Figure. Cluster Lockdown Window Click to enlarge
  2. To disable (or enable) remote login access, uncheck (check) the Enable Remote Login with Password box.
    Remote login access is enabled by default.
  3. To add a new public key, click the New Public Key button and then do the following in the displayed fields:
    1. Name : Enter a key name.
    2. Key : Enter (paste) the key value into the field.
    Note: Prism supports the following key types.
    • RSA
    • ECDSA
    3. Click the Save button (lower right) to save the key and return to the main Cluster Lockdown window.
    There are no public keys available by default, but you can add any number of public keys.
  4. To delete a public key, click the X on the right of that key line.
    Note: Deleting all the public keys and disabling remote login access locks down the cluster from SSH access.

Data-at-Rest Encryption

Nutanix provides an option to secure data while it is at rest using either self-encrypting drives or software-only encryption and key-based access management (the cluster's native KMS or an external KMS for software-only encryption).

Encryption Methods

Nutanix provides you with the following options to secure your data.

  • Self Encrypting Drives (SED) Encryption - You can use a combination of SEDs and an external KMS to secure your data while it is at rest.
  • Software-only Encryption - Nutanix AOS uses the AES-256 encryption standard to encrypt your data. Once enabled, software-only data-at-rest encryption cannot be disabled, thus protecting against accidental data leaks due to human errors. Software-only encryption supports both Nutanix Native Key Manager (local and remote) and External KMS to secure your keys.

Note the following points regarding data-at-rest encryption.

  • Encryption is supported for AHV, ESXi, and Hyper-V.
    • For ESXi and Hyper-V, software-only encryption can be implemented at a cluster level or container level. For AHV, encryption can be implemented at the cluster level only.
  • Nutanix recommends using cluster-level encryption. With the cluster-level encryption, the administrative overhead of selecting different containers for the data storage gets eliminated.
  • Encryption cannot be disabled once it is enabled at a cluster level or container level.
  • Encryption can be implemented on an existing cluster with data that exists. If encryption is enabled on an existing cluster (AHV, ESXi, or Hyper-V), the unencrypted data is transformed into an encrypted format in a low-priority background task that is designed not to interfere with other workloads running in the cluster.
  • Data can be encrypted using either self-encrypted drives (SEDs) or software-only encryption. You can change the encryption method from SEDs to software-only. You can perform the following configurations.
    • For ESXi and Hyper-V clusters, you can switch from SEDs and External Key Management (EKM) combination to software-only encryption and EKM combination. First, you must disable the encryption in the cluster where you want to change the encryption method. Then, select the cluster and enable encryption to transform the unencrypted data into an encrypted format in the background.
    • For AHV, background encryption is supported.
  • Once the task to encrypt a cluster begins, you cannot cancel the operation. Even if you stop and restart the cluster, the system resumes the operation.
  • In the case of mixed clusters with ESXi and AHV nodes, where the AHV nodes are used for storage only, the encryption policies consider the cluster as an ESXi cluster. So, the cluster-level and container-level encryption are available.
  • You can use a combination of SED and non-SED drives in a cluster. After you encrypt a cluster using the software-only encryption, all the drives are considered as unencrypted drives. In case you switch from the SED encryption to the software-only encryption, you can add SED or non-SED drives to the cluster.
  • Data is not encrypted when it is replicated to another cluster. You must enable the encryption for each cluster. Data is encrypted as a part of the write operation and decrypted as a part of the read operation. During the replication process, the system reads, decrypts, and then sends the data over to the other cluster. You can use a third-party network solution if there is a requirement to encrypt the data during transmission to another cluster.
  • Software-only encryption does not impact the data efficiency features such as deduplication, compression, erasure coding, zero block suppression, and so on. The software encryption is the last data transformation performed. For example, during the write operation, compression is performed first, followed by encryption.

Key Management

Nutanix supports a Native Key Management Server, also called Local Key Manager (LKM), thus avoiding the dependency on an External Key Manager (EKM). Cluster-localized Key Management Service support requires a minimum of three nodes in a cluster and is supported only for software-only encryption. So, 1-node and 2-node clusters can use either the Native KMS (remote) option or an EKM.

The following types of keys are used for encryption.

  • Data Encryption Key (DEK) - A symmetric key, such as AES-256, that is used to encrypt the data.
  • Key Encryption Key (KEK) - This key is used to encrypt or decrypt the DEK.
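The DEK/KEK relationship is a standard envelope-encryption pattern: the DEK encrypts the data, and the KEK wraps (encrypts) the DEK so that only the wrapped form needs to be stored or transmitted. A toy sketch of the pattern; the XOR keystream below is a stand-in for AES-256, not real encryption, and none of the names are Nutanix code:

```python
import hashlib
import secrets

def toy_cipher(key, data):
    """Symmetric toy keystream cipher; a stand-in for AES-256, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR with the same keystream both encrypts and decrypts.
    return bytes(b ^ s for b, s in zip(data, stream))

dek = secrets.token_bytes(32)          # data encryption key (e.g. AES-256)
kek = secrets.token_bytes(32)          # key encryption key

ciphertext = toy_cipher(dek, b"vm disk block")   # DEK encrypts the data
wrapped_dek = toy_cipher(kek, dek)               # KEK wraps the DEK

# To read data back: unwrap the DEK with the KEK, then decrypt the data.
assert toy_cipher(kek, wrapped_dek) == dek
assert toy_cipher(toy_cipher(kek, wrapped_dek), ciphertext) == b"vm disk block"
```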

Note the following points regarding the key management.

  • Nutanix does not support the use of the Local Key Manager with a third party External Key Manager.
  • Dual encryption (both SED and software-only encryption) requires an EKM. For more information, see Configuring Dual Encryption.
  • You can switch from an EKM to the LKM, and vice versa. For more information, see Switching between Native Key Manager and External Key Manager.
  • Rekey of keys stored in the Native KMS is supported for the Leader Keys. For more information, see Changing Key Encryption Keys (SEDs) and Changing Key Encryption Keys (Software Only).
  • You must back up the keys stored in the Native KMS. For more information, see Backing up Keys.
  • You must back up the encryption keys whenever you create a new container or remove an existing container. Nutanix Cluster Check (NCC) checks the status of the backup and sends an alert if you do not take a backup at the time of creating or removing a container.

Data-at-Rest Encryption (SEDs)

For customers who require enhanced data security, Nutanix provides a data-at-rest security option using Self Encrypting Drives (SEDs) included in the Ultimate license.

Note: If you are running the AOS Pro License on G6 platforms and above, you can use SED encryption by installing an add-on license.

The following features are supported:

  • Data is encrypted on all drives at all times.
  • Data is inaccessible in the event of drive or node theft.
  • Data on a drive can be securely destroyed.
  • A key authorization method allows password rotation at arbitrary times.
  • Protection can be enabled or disabled at any time.
  • No performance penalty is incurred despite encrypting all data.
  • Re-key of the leader encryption key (MEK) at arbitrary times is supported.
Note: If an SED cluster is present, then while configuring data-at-rest encryption, you get an option to select either data-at-rest encryption using SEDs or data-at-rest encryption using AOS.
Figure. SED and AOS Options Click to enlarge

Note: This solution provides enhanced security for data on a drive, but it does not secure data in transit.

Data Encryption Model

To accomplish these goals, Nutanix implements a data security configuration that uses SEDs with keys maintained through a separate key management device. Nutanix uses open standards (TCG and KMIP protocols) and FIPS validated SED drives for interoperability and strong security.

Figure. Cluster Protection Overview Click to enlarge Graphical overview of the Nutanix data encryption methodology

This configuration involves the following workflow:

  1. The security implementation begins by installing SEDs for all data drives in a cluster.

    The drives are FIPS 140-2 validated and use FIPS 140-2 validated cryptographic modules.

    Creating a new cluster that includes only SEDs is straightforward, but an existing cluster can be converted to support data-at-rest encryption by replacing the existing drives with SEDs (after migrating all the VMs/vDisks off the cluster while the drives are being replaced).

    Note: Contact Nutanix customer support for assistance before attempting to convert an existing cluster. A non-protected cluster can contain both SED and standard drives, but Nutanix does not support a mixed cluster when protection is enabled. All the disks in a protected cluster must be SED drives.
  2. Data on the drives is always encrypted but read or write access to that data is open. By default, the access to data on the drives is protected by the built-in manufacturer key. However, when data protection for the cluster is enabled, the Controller VM must provide the proper key to access data on a SED. The Controller VM communicates with the SEDs through a Trusted Computing Group (TCG) Security Subsystem Class (SSC) Enterprise protocol.

    A symmetric data encryption key (DEK) such as AES 256 is applied to all data being written to or read from the disk. The key is known only to the drive controller and never leaves the physical subsystem, so there is no way to access the data directly from the drive.

    Another key, known as a key encryption key (KEK), is used to encrypt/decrypt the DEK and authenticate to the drive. (Some vendors call this the authentication key or PIN.)

    Each drive has a separate KEK that is generated through the FIPS compliant random number generator present in the drive controller. The KEK is 32 bytes long to resist brute force attacks. The KEKs are sent to the key management server for secure storage and later retrieval; they are not stored locally on the node (even though they are generated locally).

    In addition to the above, the leader encryption key (MEK) is used to encrypt the KEKs.

    Each node maintains a set of certificates and keys in order to establish a secure connection with the external key management server.

  3. Keys are stored in a key management server that is outside the cluster, and the Controller VM communicates with the key management server using the Key Management Interoperability Protocol (KMIP) to upload and retrieve drive keys.

    Only one key management server device is required, but it is recommended that you employ multiple devices so the key management server is not a potential single point of failure. Configure the key management server devices to work in clustered mode so they can be added to the cluster configuration as a single entity that is resilient to a single failure.

  4. When a node experiences a full power off and power on (and cluster protection is enabled), the controller VM retrieves the drive keys from the key management server and uses them to unlock the drives.

    If the Controller VM cannot get the correct keys from the key management server, it cannot access data on the drives.

    If a drive is re-seated, it becomes locked.

    If a drive is stolen, the data is inaccessible without the KEK (which cannot be obtained from the drive). If a node is stolen, the key management server can revoke the node certificates to ensure they cannot be used to access data on any of the drives.

Preparing for Data-at-Rest Encryption (External KMS for SEDs and Software Only)

About this task

Caution: DO NOT HOST A KEY MANAGEMENT SERVER VM ON THE ENCRYPTED CLUSTER THAT IS USING IT!

Doing so could result in complete data loss if there is a problem with the VM while it is hosted in that cluster.

If you are using an external KMS for encryption using AOS, preparation steps outside the web console are required. The information in this section is applicable if you choose to use an external KMS for configuring encryption.

You must install the license of the external key manager for all nodes in the cluster. See Compatibility and Interoperability Matrix for a complete list of the supported key management servers. For instructions on how to configure a key management server, refer to the documentation from the appropriate vendor.

The system accesses the EKM under the following conditions:

  • Starting a cluster

  • Regenerating a key (key regeneration occurs automatically every year by default)

  • Adding or removing a node (only when self-encrypting drives are used for encryption)

  • Switching from Native KMS to EKM or from EKM to Native KMS

  • Starting and restarting a service (only if software-based encryption is used)

  • Upgrading AOS (only if Software-based encryption is used)

  • NCC heartbeat checks that verify the EKM is alive

Procedure

  1. Configure a key management server.

    The key management server devices must be configured into the network so the cluster has access to those devices. For redundant protection, it is recommended that you employ at least two key management server devices, either in active-active cluster mode or stand-alone.

    Note: The key management server must support KMIP version 1.0 or later.
    • SafeNet

      Ensure that Security > High Security > Key Security > Disable Creation and Use of Global Keys is checked.

    • Vormetric

      Set the appliance to compatibility mode. Suite B mode causes the SSL handshake to fail.

  2. Generate a certificate signing request (CSR) for each node in the cluster.
    • The Common Name field of the CSR is populated automatically with unique_node_identifier.nutanix.com to identify the node associated with the certificate.
      Tip: After generating the certificate from Prism, you can update the custom common name (CN) setting, if required, by running the following command using nCLI.
      ncli data-at-rest-encryption-certificate update-csr-information domain-name=abcd.test.com

      In the above command example, replace "abcd.test.com" with the actual domain name.

    • A UID field is populated with a value of Nutanix . This can be useful when configuring a Nutanix group for access control within a key management server, since access control is based on fields within the client certificates.
    Note: Some vendors that perform client certificate authentication expect the client username to be a field in the CSR. Although the CN and UID are pre-generated, many of the user-populated fields can be used instead if desired. If a node-unique field such as CN is chosen, users must be created on a per-node basis for access control. If a cluster-unique field is chosen, customers must create a user for each cluster.
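As a generic illustration of the CSR subject fields discussed above, the following OpenSSL sketch generates a throwaway key and CSR carrying a CN and UID (the identifiers, organization, and file names are examples, not the values Prism generates):

```shell
# Illustrative only: create a throwaway key pair and a CSR whose subject
# mirrors the CN and UID fields described above (example values).
openssl req -new -newkey rsa:2048 -nodes \
  -keyout node1.key -out node1.csr \
  -subj "/CN=unique-node-id.nutanix.com/UID=Nutanix/O=ExampleOrg"

# Inspect the CSR to confirm the subject fields the KMS authenticates on
openssl req -in node1.csr -noout -subject
```

Inspecting the subject this way is a quick check that the field your key management server uses for the client username (CN or UID) is present before the CSR is sent for signing.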
  3. Send the CSRs to a certificate authority (CA) and get them signed.
    • SafeNet

      The SafeNet KeySecure key management server includes a local CA option to generate signed certificates, or you can use other third-party vendors to create the signed certificates.

      To enable FIPS compliance, add user nutanix to the CA that signed the CSR. Under Security > High Security > FIPS Compliance , click Set FIPS Compliant .

    Note: Some CAs strip the UID field when returning a signed certificate.
    To comply with FIPS, Nutanix does not support the creation of global keys.

    In the SafeNet KeySecure management console, go to Device > Key Server > Key Server > KMIP Properties > Authentication Settings .

    Then do the following:

    • Set the Username Field in Client Certificate option to UID (User ID) .
    • Set the Client Certificate Authentication option to Used for SSL session and username .

    If you do not configure these settings, the KMS creates global keys and fails to encrypt the clusters or containers using the software-only method.

  4. Upload the signed SSL certificates (one for each node) and the certificate for the CA to the cluster. These certificates are used to authenticate with the key management server.
  5. Generate keys (KEKs) for the SED drives and upload those keys to the key management server.

Configuring Data-at-Rest Encryption (SEDs)

Nutanix offers an option to use self-encrypting drives (SEDs) to store data in a cluster. When SEDs are used, there are several configuration steps that must be performed to support data-at-rest encryption in the cluster.

Before you begin

A separate key management server is required to store the keys outside of the cluster. Each key management server device must be configured and addressable through the network. It is recommended that multiple key management server devices be configured to work in clustered mode so they can be added to the cluster configuration as a single entity (see step 5) that is resilient to a single failure.

About this task

To configure cluster encryption, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
    The Data at Rest Encryption dialog box appears. Initially, encryption is not configured, and a message to that effect appears.
    Figure. Data at Rest Encryption Screen (initial) Click to enlarge initial screen of the data-at-rest encryption window

  2. Click the Create Configuration button.
    Clicking the Continue Configuration button, configure it link, or Edit Config button performs the same action: it displays the Data-at-Rest Encryption configuration page.
  3. Select the Encryption Type as Drive-based Encryption . This option is displayed only when SEDs are detected.
  4. In the Certificate Signing Request Information section, do the following:
    Figure. Certificate Signing Request Section Click to enlarge section of the data-at-rest encryption window for configuring a certificate signing request

    1. Enter appropriate credentials for your organization in the Email , Organization , Organizational Unit , Country Code , City , and State fields and then click the Save CSR Info button.
      The entered information is saved and is used when creating a certificate signing request (CSR). To specify more than one Organization Unit name, enter a comma separated list.
      Note: You can update this information until an SSL certificate for a node is uploaded to the cluster, at which point the information cannot be changed (the fields become read only) without first deleting the uploaded certificates.
    2. Click the Download CSRs button, and then in the new screen click the Download CSRs for all nodes to download a file with CSRs for all the nodes or click a Download link to download a file with the CSR for that node.
      Figure. Download CSRs Screen Click to enlarge screen to download a certificate signing request

    3. Send the files with the CSRs to the desired certificate authority.
      The certificate authority creates the signed certificates and returns them to you. Store the returned SSL certificates and the CA certificate where you can retrieve them in steps 6 and 7.
      • The certificates must be in PEM-encoded X.509 format. (DER, PKCS, and PFX formats are not supported.)
      • The certificate and the private key should be in separate files.
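If a CA returns certificates in an unsupported encoding, OpenSSL can convert them before upload. The sketch below is self-contained for illustration: it fabricates a stand-in certificate (in practice the signed certificate comes back from your CA), simulates a DER copy, and converts it to PEM. File names are examples.

```shell
# Stand-in certificate; in practice this is the file returned by your CA
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout node1.key -out node1.pem \
  -subj "/CN=example-node.nutanix.com"

# Simulate receiving a DER-encoded certificate, then convert it to PEM
openssl x509 -in node1.pem -outform der -out node1.der
openssl x509 -inform der -in node1.der -out node1-upload.pem

# Confirm the upload candidate is a readable X.509 certificate
openssl x509 -in node1-upload.pem -noout -subject -dates
```

Keeping the private key (node1.key) in its own file also satisfies the requirement above that the certificate and key be separate.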
  5. In the Key Management Server section, do the following:
    Figure. Key Management Server Section Click to enlarge section of the data-at-rest encryption window for configuring a key management server

    1. Click the Add New Key Management Server button.
    2. In the Add a New Key Management Server screen, enter a name, IP address, and port number for the key management server in the appropriate fields.
      The port is where the key management server is configured to listen for the KMIP protocol. The default port number is 5696. For the complete list of required ports, see Port Reference.
      • If you have configured multiple key management servers in cluster mode, click the Add Address button to provide the addresses for each key management server device in the cluster.
      • If you have stand-alone key management servers, click the Save button. Repeat this step ( Add New Key Management Server button) for each key management server device to add.
        Note: If your key management servers are configured into a leader/follower (active/passive) relationship and the architecture is such that the follower cannot accept write requests, do not add the follower into this configuration. The system sends requests (read or write) to any configured key management server, so both read and write access is needed for key management servers added here.
        Note: To prevent potential configuration problems, always use the Add Address button for key management servers configured into cluster mode. Only a stand-alone key management server should be added as a new server.
      Figure. Add Key Management Server Screen Click to enlarge screen to provide an address for a key management server

    3. To edit any settings, click the pencil icon for that entry in the key management server list to redisplay the add page and then click the Save button after making the change. To delete an entry, click the X icon.
  6. In the Add a New Certificate Authority section, enter a name for the CA, click the Upload CA Certificate button, and select the certificate for the CA used to sign your node certificates (see step 4c). Repeat this step for all CAs that were used in the signing process.
    Figure. Certificate Authority Section Click to enlarge screen to identify and upload a certificate authority certificate

  7. Go to the Key Management Server section (see step 5) and do the following:
    1. Click the Manage Certificates button for a key management server.
    2. In the Manage Signed Certificates screen, upload the node certificates either by clicking the Upload Files button to upload all the certificates in one step or by clicking the Upload link (not shown in the figure) for each node individually.
    3. Test that the certificates are correct either by clicking the Test all nodes button to test the certificates for all nodes in one step or by clicking the Test CS (or Re-Test CS ) link for each node individually. A status of Verified indicates the test was successful for that node.
    4. Repeat this step for each key management server.
    Note: Before removing a drive or node from an SED cluster, ensure that the testing is successful and the status is Verified . Otherwise, the drive or node will be locked.
    Figure. Upload Signed Certificates Screen Click to enlarge screen to upload and test signed certificates

  8. When the configuration is complete, click the Protect button on the opening page to enable encryption protection for the cluster.
    A clear key icon appears on the page.
    Figure. Data-at-Rest Encryption Screen (unprotected) Click to enlarge

    The key turns gold when cluster encryption is enabled.
    Note: If changes are made to the configuration after protection has been enabled, such as adding a new key management server, you must rekey the disks for the modification to take full effect (see Changing Key Encryption Keys (SEDs)).
    Figure. Data-at-Rest Encryption Screen (protected) Click to enlarge

Enabling/Disabling Encryption (SEDs)

Data on a self-encrypting drive (SED) is always encrypted, but enabling or disabling data-at-rest encryption for the cluster determines whether a separate (and secured) key is required to access that data.

About this task

To enable or disable data-at-rest encryption after it has been configured for the cluster (see Configuring Data-at-Rest Encryption (SEDs)), do the following:
Note: The key management server must be accessible to disable encryption.

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, do one of the following:
    • If cluster encryption is enabled currently, click the Unprotect button to disable it.
    • If cluster encryption is disabled currently, click the Protect button to enable it.
    Enabling cluster encryption enforces the use of secured keys to access data on the SEDs in the cluster; disabling cluster encryption means the data can be accessed without providing a key.

Changing Key Encryption Keys (SEDs)

The key encryption key (KEK) can be changed at any time. This can be useful as a periodic key rotation security precaution or when a key management server or node becomes compromised. If the key management server is compromised, only the KEK needs to be changed, because the KEK is independent of the data encryption key (DEK). There is no need to re-encrypt any data; only the DEK needs to be re-encrypted.
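The KEK/DEK relationship can be sketched with a toy envelope-encryption example using OpenSSL (all key values and file names are illustrative; this is not how AOS or the drive controller stores keys internally): the data is encrypted once with a DEK, the DEK is wrapped with a KEK, and a rekey re-wraps only the DEK.

```shell
# Encrypt data once with a DEK (illustrative values)
DEK=$(openssl rand -hex 32)
echo "secret payload" > data.txt
openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$DEK" -in data.txt -out data.enc

# Wrap the DEK with the current KEK
OLD_KEK="old-kek-passphrase"
printf '%s' "$DEK" | openssl enc -aes-256-cbc -pbkdf2 \
  -pass "pass:$OLD_KEK" -out dek.wrapped

# Rekey: unwrap with the old KEK, re-wrap with the new one.
# data.enc is never touched.
NEW_KEK="new-kek-passphrase"
openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$OLD_KEK" -in dek.wrapped |
  openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$NEW_KEK" -out dek.wrapped.new

# The unchanged DEK, recovered via the new KEK, still decrypts the data
DEK2=$(openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$NEW_KEK" -in dek.wrapped.new)
openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$DEK2" -in data.enc   # prints: secret payload
```

Because only the wrapped DEK is re-encrypted, a rekey completes quickly regardless of how much data sits behind it.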

About this task

To change the KEKs for a cluster, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, select Manage Keys and click the Rekey All Disks button under Hardware Encryption .
    Rekeying a cluster under heavy workloads may result in higher-than-normal IO latency, and some data may become temporarily unavailable. To continue with the rekey operation, click Confirm Rekey .
    This step resets the KEKs for all the self encrypting disks in the cluster.
    Note:
    • The Rekey All Disks button appears only when cluster protection is active.
    • If the cluster is already protected and a new key management server is added, you must click the Rekey All Disks button to use this new key management server for storing secrets.
    Figure. Cluster Encryption Screen Click to enlarge

Destroying Data (SEDs)

Data on a self-encrypting drive (SED) is always encrypted, and the data encryption key (DEK) used to read the encrypted data is known only to the drive controller. All data on the drive can effectively be destroyed (that is, become permanently unreadable) by having the controller change the DEK. This is known as a crypto-erase.
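The effect of a crypto-erase can be sketched with a toy OpenSSL example (illustrative only; it does not reproduce the SED controller's actual mechanism): once the only copy of the DEK is discarded, the ciphertext is permanently unreadable even though the encrypted bytes still exist.

```shell
# Encrypt some data with a one-off DEK (illustrative values)
DEK=$(openssl rand -hex 32)
echo "payload" | openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$DEK" -out blob.enc

unset DEK   # analogous to the controller cycling the DEK away

# Without the original DEK there is no way to recover the plaintext;
# decryption with any other key fails or produces garbage.
openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:wrong-key" -in blob.enc \
  -out /dev/null 2>/dev/null || echo "unreadable"
```

This is why cycling the DEK is sufficient to destroy the data: no amount of access to the raw encrypted bytes recovers the plaintext.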

About this task

To crypto-erase a SED, do the following:

Procedure

  1. In the web console, go to the Hardware dashboard and select the Diagram tab.
  2. Select the target disk in the diagram (upper section of screen) and then click the Remove Disk button (at the bottom right of the following diagram).

    As part of the disk removal process, the DEK for that disk is automatically cycled on the drive controller. The previous DEK is lost and all new disk reads are indecipherable. The key encryption key (KEK) is unchanged, and the new DEK is protected using the current KEK.

    Note: When a node is removed, all SEDs in that node are crypto-erased automatically as part of the node removal process.
    Figure. Removing a Disk Click to enlarge screen shot of the diagram tab of the hardware dashboard demonstrating how to remove a disk

Data-at-Rest Encryption (Software Only)

For customers who require enhanced data security, Nutanix provides a software-only encryption option for data-at-rest security (SEDs not required) included in the Ultimate license.
Note: On G6 platforms running the AOS Pro license, you can use software encryption by installing an add-on license.
Software encryption using a local key manager (LKM) supports the following features:
  • For AHV, the data can be encrypted on a cluster level. This is applicable to an empty cluster or a cluster with existing data.
  • For ESXi and Hyper-V, the data can be encrypted on a cluster or container level. The cluster or container can be empty or contain existing data. Consider the following points for container level encryption.
    • Once you enable container level encryption, you cannot change the encryption type to cluster level encryption later.
    • After the encryption is enabled, the administrator needs to enable encryption for every new container.
  • Data is encrypted at all times.
  • Data is inaccessible in the event of drive or node theft.
  • Data on a drive can be securely destroyed.
  • Re-keying the leader encryption key at arbitrary times is supported.
  • The cluster’s native KMS is supported.
Note: In case of mixed hypervisors, only the following combinations are supported.
  • ESXi and AHV
  • Hyper-V and AHV
Note: This solution provides enhanced security for data on a drive, but it does not secure data in transit.

Data Encryption Model

To accomplish these goals, Nutanix implements a data security configuration that uses AOS functionality along with the cluster’s native key management server or an external key management server. Nutanix uses open standards (KMIP protocols) for interoperability and strong security.

Figure. Cluster Protection Overview Click to enlarge graphical overview of the Nutanix data encryption methodology

This configuration involves the following workflow:

  • For software encryption, data protection must be enabled for the cluster before any data is encrypted. Also, the Controller VM must provide the proper key to access the data.
  • A symmetric data encryption key (DEK) such as AES 256 is applied to all data being written to or read from the disk. The key is known only to AOS, so there is no way to access the data directly from the drive.
  • In case of an external KMS:

    Each node maintains a set of certificates and keys in order to establish a secure connection with the key management server.

    Only one key management server device is required, but it is recommended that you employ multiple devices so the key management server is not a potential single point of failure. Configure the key management server devices to work in clustered mode so they can be added to the cluster configuration as a single entity that is resilient to a single failure.

Configuring Data-at-Rest Encryption (Software Only)

Nutanix offers a software-only option to perform data-at-rest encryption in a cluster or container.

Before you begin

  • Nutanix provides the option to choose the KMS type as the Native KMS (local), Native KMS (remote), or External KMS.
  • Cluster Localised Key Management Service (Native KMS (local)) requires a minimum of a 3-node cluster. 1-node and 2-node clusters are not supported.
  • Software encryption for remote office/branch office (ROBO) deployments is supported through the Native KMS (remote) KMS type.
  • For external KMS, a separate key management server is required to store the keys outside of the cluster. Each key management server device must be configured and addressable through the network. It is recommended that multiple key management server devices be configured to work in clustered mode so they can be added to the cluster configuration as a single entity that is resilient to a single failure.
    Caution: DO NOT HOST A KEY MANAGEMENT SERVER VM ON THE ENCRYPTED CLUSTER THAT IS USING IT!

    Doing so could result in complete data loss if there is a problem with the VM while it is hosted in that cluster.

    Note: You must install the license of the external key manager for all nodes in the cluster. See Compatibility and Interoperability Matrix for a complete list of the supported key management servers. For instructions on how to configure a key management server, refer to the documentation from the appropriate vendor.
  • This feature requires an Ultimate license, or an add-on to the Pro license (for the latest generation of products). Ensure that you have procured the add-on license key before using data-at-rest encryption with AOS; contact the Nutanix Sales team to procure the license.
  • Caution: For security, you cannot disable software-only data-at-rest encryption once it is enabled.

About this task

To configure cluster or container encryption, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
    The Data at Rest Encryption dialog box appears. Initially, encryption is not configured, and a message to that effect appears.
    Figure. Data at Rest Encryption Screen (initial) Click to enlarge initial screen of the data-at-rest encryption window

  2. Click the Create Configuration button.
    Clicking the Continue Configuration button, configure it link, or Edit Config button performs the same action: it displays the Data-at-Rest Encryption configuration page.
  3. Select the Encryption Type as Encrypt the entire cluster or Encrypt storage containers . Then click Save Encryption Type .
    Caution: You can enable encryption for the entire cluster or just a container. However, if you enable encryption on a container and an encryption key issue occurs (such as loss of an encryption key), you can encounter the following:
    • Data for the entire cluster is affected, not just the encrypted container.
    • None of the user VMs in the cluster are able to access the data.
    The hardware option is displayed only when SEDs are detected. Otherwise, software-based encryption is used by default.
    Figure. Select encryption type Click to enlarge select KMS type

    Note: For ESXi and Hyper-V, the data can be encrypted on a cluster or container level. The cluster or container can be empty or contain existing data. Consider the following points for container level encryption.
    • Once you enable container level encryption, you cannot change the encryption type to cluster level encryption later.
    • After the encryption is enabled, the administrator needs to enable encryption for every new container.
    To enable encryption for every new storage container, do the following:
    1. In the web console, select Storage from the pull-down main menu (upper left of screen) and then select the Table and Storage Container tabs.
    2. To enable encryption, select the target storage container and then click the Update link.
      The Update Storage Container window appears.
    3. In the Advanced Settings area, select the Enable check box to enable encryption for the storage container you selected.
      Figure. Update storage container Click to enlarge Selecting encryption check box

    4. Click Save to complete.
  4. Select the Key Management Service.
    To keep the keys safe with the native KMS, select Native KMS (local) or Native KMS (remote) and click Save KMS type . If you select this option, skip to step 9 to complete the configuration.
    Note:
    • Cluster Localised Key Management Service ( Native KMS (local) ) requires a minimum of a 3-node cluster. 1-node and 2-node clusters are not supported.
    • For enhanced security of ROBO environments (typically, 1 or 2 node clusters), select the Native KMS (remote) for software based encryption of ROBO clusters managed by Prism Central.
      Note: This option is available only if the cluster is registered to Prism Central.
    For external KMS type, select the External KMS option and click Save KMS type . Continue to step 5 for further configuration.
    Figure. Select KMS Type Click to enlarge section of the data-at-rest encryption window for selecting KMS type

    Note: You can switch between the KMS types at a later stage if the specific KMS prerequisites are met, see Switching between Native Key Manager and External Key Manager.
  5. In the Certificate Signing Request Information section, do the following:
    Figure. Certificate Signing Request Section Click to enlarge section of the data-at-rest encryption window for configuring a certificate signing request

    1. Enter appropriate credentials for your organization in the Email , Organization , Organizational Unit , Country Code , City , and State fields and then click the Save CSR Info button.
      The entered information is saved and is used when creating a certificate signing request (CSR). To specify more than one Organization Unit name, enter a comma separated list.
      Note: You can update this information until an SSL certificate for a node is uploaded to the cluster, at which point the information cannot be changed (the fields become read only) without first deleting the uploaded certificates.
    2. Click the Download CSRs button, and then in the new screen click the Download CSRs for all nodes to download a file with CSRs for all the nodes or click a Download link to download a file with the CSR for that node.
      Figure. Download CSRs Screen Click to enlarge screen to download a certificate signing request

    3. Send the files with the CSRs to the desired certificate authority.
      The certificate authority creates the signed certificates and returns them to you. Store the returned SSL certificates and the CA certificate where you can retrieve them in steps 7 and 8.
      • The certificates must be in PEM-encoded X.509 format. (DER, PKCS, and PFX formats are not supported.)
      • The certificate and the private key should be in separate files.
  6. In the Key Management Server section, do the following:
    Figure. Key Management Server Section Click to enlarge section of the data-at-rest encryption window for configuring a key management server

    1. Click the Add New Key Management Server button.
    2. In the Add a New Key Management Server screen, enter a name, IP address, and port number for the key management server in the appropriate fields.
      The port is where the key management server is configured to listen for the KMIP protocol. The default port number is 5696. For the complete list of required ports, see Port Reference.
      • If you have configured multiple key management servers in cluster mode, click the Add Address button to provide the addresses for each key management server device in the cluster.
      • If you have stand-alone key management servers, click the Save button. Repeat this step ( Add New Key Management Server button) for each key management server device to add.
        Note: If your key management servers are configured into a leader/follower (active/passive) relationship and the architecture is such that the follower cannot accept write requests, do not add the follower into this configuration. The system sends requests (read or write) to any configured key management server, so both read and write access is needed for key management servers added here.
        Note: To prevent potential configuration problems, always use the Add Address button for key management servers configured into cluster mode. Only a stand-alone key management server should be added as a new server.
      Figure. Add Key Management Server Screen Click to enlarge screen to provide an address for a key management server

    3. To edit any settings, click the pencil icon for that entry in the key management server list to redisplay the add page and then click the Save button after making the change. To delete an entry, click the X icon.
  7. In the Add a New Certificate Authority section, enter a name for the CA, click the Upload CA Certificate button, and select the certificate for the CA used to sign your node certificates (see step 5c). Repeat this step for all CAs that were used in the signing process.
    Figure. Certificate Authority Section Click to enlarge screen to identify and upload a certificate authority certificate

  8. Go to the Key Management Server section (see step 6) and do the following:
    1. Click the Manage Certificates button for a key management server.
    2. In the Manage Signed Certificates screen, upload the node certificates either by clicking the Upload Files button to upload all the certificates in one step or by clicking the Upload link (not shown in the figure) for each node individually.
    3. Test that the certificates are correct either by clicking the Test all nodes button to test the certificates for all nodes in one step or by clicking the Test CS (or Re-Test CS ) link for each node individually. A status of Verified indicates the test was successful for that node.
    4. Repeat this step for each key management server.
    Note: Before removing a drive or node from an SED cluster, ensure that the testing is successful and the status is Verified . Otherwise, the drive or node will be locked.
    Figure. Upload Signed Certificates Screen Click to enlarge screen to upload and test signed certificates

  9. When the configuration is complete, click the Enable Encryption button.
    The Enable Encryption window is displayed.
    Figure. Data-at-Rest Encryption Screen (unprotected) Click to enlarge

    Caution: To help ensure that your data is secure, you cannot disable software-only data-at-rest encryption once it is enabled. Nutanix recommends regularly backing up your data, encryption keys, and key management server.
  10. Enter ENCRYPT .
  11. Click the Encrypt button.
    The data-at-rest encryption is enabled. To view the status of the encrypted cluster or container, go to Data at Rest Encryption in the Settings menu.

    When you enable encryption, a low-priority background task runs to encrypt all the unencrypted data. This task is designed to take advantage of available CPU capacity to encrypt the unencrypted data within a reasonable time. If the system is occupied with other workloads, the background task consumes less CPU capacity. Depending on the amount of data in the cluster, the background task can take 24 to 36 hours to complete.

    Note: If changes are made to the configuration after protection has been enabled, such as adding a new key management server, you must perform the rekey operation for the modification to take full effect. For an EKM, rekey to change the KEKs stored in the EKM. For an LKM, rekey to change the leader key used by the native key manager. See Changing Key Encryption Keys (Software Only) for details.
    Note: Once the task to encrypt a cluster begins, you cannot cancel the operation. Even if you stop and restart the cluster, the system resumes the operation.
    Figure. Data-at-Rest Encryption Screen (protected) Click to enlarge

Switching between Native Key Manager and External Key Manager

After software encryption has been established, Nutanix supports switching the KMS type from the External Key Manager to the Native Key Manager, or from the Native Key Manager to an External Key Manager, without any downtime.

Note:
  • The Native KMS requires a minimum of 3-node cluster.
  • For external KMS, a separate key management server is required to store the keys outside of the cluster. Each key management server device must be configured and addressable through the network. It is recommended that multiple key management server devices be configured to work in clustered mode so they can be added to the cluster configuration as a single entity that is resilient to a single failure.
  • It is recommended that you back up and save the encryption keys with identifiable names before and after changing the KMS type. For backing up keys, see Backing up Keys.
To change the KMS type, change the KMS selection by editing the encryption configuration. For details, see step 4 in the Configuring Data-at-Rest Encryption (Software Only) section.
Figure. Select KMS type Click to enlarge select KMS type

After you change the KMS type and save the configuration, the encryption keys are re-generated on the selected KMS storage medium and data is re-encrypted with the new keys. The old keys are destroyed.
Note: This operation completes in a few minutes, depending on the number of encrypted objects and network speed.

Changing Key Encryption Keys (Software Only)

The key encryption key (KEK) can be changed at any time. This can be useful as a periodic key rotation security precaution or when a key management server or node becomes compromised. If the key management server is compromised, only the KEK needs to be changed, because the KEK is independent of the data encryption key (DEK). There is no need to re-encrypt any data; only the DEK needs to be re-encrypted.

About this task

To change the KEKs for a cluster, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, select Manage Keys and click the Rekey button under Software Encryption .
    Note: The Rekey button appears only when cluster protection is active.
    Note: If the cluster is already protected and a new key management server is added, you must click the Rekey button to use this new key management server for storing secrets.
    Figure. Cluster Encryption Screen Click to enlarge

    Note: The system automatically regenerates the leader key yearly.

Destroying Data (Software Only)

Data on the AOS cluster is always encrypted, and the data encryption key (DEK) used to read the encrypted data is known only to AOS. All data on the drives can effectively be destroyed (that is, made permanently unreadable) by deleting the container or destroying the cluster. This is known as a crypto-erase.

About this task

Note: To help ensure that your data is secure, you cannot disable software-only data-at-rest encryption once it is enabled. Nutanix recommends regularly backing up your data, encryption keys, and key management server.

To crypto-erase the container or cluster, do the following:

Procedure

  1. Delete the storage container or destroy the cluster.
    • For information on how to delete a storage container, see Modifying a Storage Container in the Prism Web Console Guide.
    • For information on how to destroy a cluster, see Destroying a Cluster in the Acropolis Advanced Administration Guide.
    Note:

    When you delete a storage container, Curator automatically scans for and deletes the associated DEKs and KEKs.

    When you destroy a cluster:

    • the Native Key Manager (local) destroys the master key shares and the encrypted DEKs/KEKs.
    • the Native Key Manager (remote) retains the root key on the PC if the cluster is still registered to a PC when it is destroyed. You must unregister a cluster from the PC and then destroy the cluster to delete the root key.
    • the External Key Manager deletes the encrypted DEKs. However, the KEKs remain on the EKM. You must use an external key manager UI to delete the KEKs.
  2. Delete the key backup files, if any.

Switching from SED-EKM to Software-LKM

This section describes how to switch from the SED and external KMS combination to the software-only encryption and local key manager (LKM) combination.

About this task

To switch from SED-EKM to Software-LKM, do the following.

Procedure

  1. Perform the steps for the software-only encryption with External KMS. For more information, see Configuring Data-at-Rest Encryption (Software Only).
    After the background task completes, all the data is encrypted in software. The time taken to complete the task depends on the amount of data and foreground I/O operations in the cluster.
  2. Disable the SED encryption. Ensure that all the disks are unprotected.
    For more information, see Enabling/Disabling Encryption (SEDs).
  3. Switch the key management server from the External KMS to Local Key Manager. For more information, see Switching between Native Key Manager and External Key Manager.

Configuring Dual Encryption

About this task

Dual encryption protects the data on the cluster using both SED and software-only encryption. An external key manager is required to store the keys for dual encryption; the native KMS is not supported.

To configure dual encryption, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, select the check boxes to enable both Drive-based and Software-based encryption.
  3. Click Save Encryption Type.
    Figure. Dual Encryption

  4. Continue with the rest of the encryption configuration, see:
    • Configuring Data-at-Rest Encryption (Software Only)
    • Configuring Data-at-Rest Encryption (SEDs)

Backing up Keys

About this task

You can take a backup of encryption keys:

  • when you enable Software-only Encryption for the first time
  • after you regenerate the keys

Backing up encryption keys is critical in the unlikely event that the keys become corrupted.

You can download the key backup file for a single cluster from Prism Element (PE) or for all clusters from Prism Central (PC). To download the key backup file for all clusters, see Taking a Consolidated Backup of Keys (Prism Central).

To download the key backup file for a cluster, do the following:

Procedure

  1. Log on to the Prism Element web console.
  2. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  3. In the Cluster Encryption page, select Manage Keys.
  4. Enter and confirm the password.
  5. Click the Download Key Backup button.

    The backup file is saved in the default download location on your local machine.

    Note: Ensure you move the backup key file to a safe location.

Taking a Consolidated Backup of Keys (Prism Central)

If you are using the Native KMS option with software encryption for your clusters, you can take a consolidated backup of all the keys from Prism Central.

About this task

To take a consolidated backup of keys for software encryption-enabled clusters (Native KMS-only), do the following:

Procedure

  1. Log on to the Prism Central web console.
  2. Click the hamburger icon, then select Clusters > List view.
  3. Select a cluster, go to Actions, then select Manage & Backup Keys.
  4. Download the backup keys:
    1. In Password, enter your password.
    2. In Confirm Password, reenter your password.
    3. To change the encryption key, select the Rekey Encryption Key (KEK) box.
    4. To download the backup key, click Backup Key.
    Note: Ensure that you move the backup key file to a safe location.

Importing Keys

You can import the encryption keys from a backup. If you backed up your keys to an external key manager (EKM), note the EKM-specific command in this topic.

About this task

Note: Nutanix recommends that you contact Nutanix Support for this operation. Extended cluster downtime might result if you perform this task incorrectly.

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. Retrieve the encryption keys stored on the cluster and verify that all the keys you want to retrieve are listed.
    In this example, the password is Nutanix.123, and date is the timestamp portion of the backup file name.
    mantle_recovery_util --backup_file_path=/home/nutanix/encryption_key_backup_date \
    --password=Nutanix.123 --list_key_ids=true 
  3. Import the keys into the cluster.
    mantle_recovery_util --backup_file_path=/home/nutanix/key_backup \
    --password=Nutanix.123 --interactive_mode 
  4. If you are using an external key manager such as IBM Security Key Lifecycle Manager, Gemalto Safenet, or Vormetric Data Security Manager, use the --store_kek_remotely option to import the keys into the cluster.
    In this example, date is the timestamp portion of the backup file name.
    mantle_recovery_util --backup_file_path path/encryption_key_backup_date \
     --password key_password --store_kek_remotely

Securing Traffic Through Network Segmentation

Network segmentation enhances security, resilience, and cluster performance by isolating a subset of traffic to its own network.

You can achieve traffic isolation in one or more of the following ways:

Isolating Backplane Traffic by using VLANs (Logical Segmentation)
You can separate management traffic from storage replication (or backplane) traffic by creating a separate network segment (LAN) for storage replication. For more information about the types of traffic seen on the management plane and the backplane, see Traffic Types In a Segmented Network.

To enable the CVMs in a cluster to communicate over these separated networks, the CVMs are multihomed. Multihoming is achieved by adding a virtual network interface card (vNIC) to the Controller VM and placing the new interface on the backplane network. Additionally, the hypervisor is assigned an interface on the backplane network.

The traffic associated with the CVM interfaces and host interfaces on the backplane network can be secured further by placing those interfaces on a separate VLAN.

In this type of segmentation, both network segments continue to use the same external bridge and therefore use the same set of physical uplinks. For physical separation, see Isolating the Backplane Traffic Physically on an Existing Cluster.

Isolating backplane traffic from management traffic requires minimal configuration through the Prism web console. No manual host (hypervisor) configuration steps are required.

For information about isolating backplane traffic, see Isolating the Backplane Traffic Logically on an Existing Cluster (VLAN-Based Segmentation Only).

Isolating Backplane Traffic Physically (Physical Segmentation)

You can physically isolate the backplane traffic (intra-cluster traffic) from the management traffic (Prism, SSH, SNMP) by confining the backplane traffic to a separate vNIC on the CVM and using a dedicated virtual network that has its own physical NICs. This type of segmentation therefore offers true physical separation of the backplane traffic from the management traffic.

You can use Prism to configure the vNIC on the CVM and configure the backplane traffic to communicate over the dedicated virtual network. However, you must first manually configure the virtual network on the hosts and associate it with the physical NICs that it requires for true traffic isolation.

For more information about physically isolating backplane traffic, see Isolating the Backplane Traffic Physically on an Existing Cluster.

Isolating service-specific traffic
You can also secure traffic associated with a service (for example, Nutanix Volumes) by confining its traffic to a separate vNIC on the CVM and using a dedicated virtual network that has its own physical NICs. This type of segmentation therefore offers true physical separation for service-specific traffic.

You can use Prism to create the vNIC on the CVM and configure the service to communicate over the dedicated virtual network. However, you must first manually configure the virtual network on the hosts and associate it with the physical NICs that it requires for true traffic isolation. You need one virtual network for each service you want to isolate. For a list of the services whose traffic you can isolate in the current release, see Cluster Services That Support Traffic Isolation.

For information about isolating service-specific traffic, see Isolating Service-Specific Traffic.

Isolating Stargate-to-Stargate traffic over RDMA
Some Nutanix platforms support remote direct memory access (RDMA) for Stargate-to-Stargate service communication. You can create a separate virtual network for RDMA-enabled network interface cards. If a node has RDMA-enabled NICs, Foundation passes the NICs through to the CVMs during imaging. The CVMs use only the first of the two RDMA-enabled NICs for Stargate-to-Stargate communications. The virtual NIC on the CVM is named rdma0. Foundation does not configure the RDMA LAN. After creating a cluster, you need to enable RDMA by creating an RDMA LAN from the Prism web console. For more information about RDMA support, see Remote Direct Memory Access in the NX Series Hardware Administration Guide .

For information about isolating backplane traffic on an RDMA cluster, see Isolating the Backplane Traffic on an Existing RDMA Cluster.

Traffic Types In a Segmented Network

The traffic entering and leaving a Nutanix cluster can be broadly classified into the following types:

Backplane traffic
Backplane traffic is intra-cluster traffic that is necessary for the cluster to function, and it comprises traffic between CVMs and traffic between CVMs and hosts for functions such as storage RF replication, host management, high availability, and so on. This traffic uses eth2 on the CVM. In AHV, VM live migration traffic is also backplane, and uses the AHV backplane interface, VLAN, and virtual switch when configured. For nodes that have RDMA-enabled NICs, the CVMs use a separate RDMA LAN for Stargate-to-Stargate communications.
Management traffic
Management traffic is administrative traffic, or traffic associated with Prism and SSH connections, remote logging, SNMP, and so on. The current implementation simplifies the definition of management traffic to be any traffic that is not on the backplane network, and therefore also includes communications between user VMs and CVMs. This traffic uses eth0 on the CVM.

Traffic on the management plane can be further isolated per service or feature. An example of this type of traffic is the traffic that the cluster receives from external iSCSI initiators (Nutanix Volumes iSCSI traffic). For a list of services supported in the current release, see Cluster Services That Support Traffic Isolation.

Segmented and Unsegmented Networks

In the default unsegmented network in a Nutanix cluster (ESXi and AHV), the Controller VM has two virtual network interfaces—eth0 and eth1.

Interface eth0 is connected to the default external virtual switch, which is in turn connected to the external network through a bond or NIC team that contains the host physical uplinks.

Interface eth1 is connected to an internal network that enables the CVM to communicate with the hypervisor.

In the unsegmented networks shown in the following figures (Unsegmented Network - ESXi Cluster and Unsegmented Network - AHV Cluster), all external CVM traffic, whether backplane or management traffic, uses interface eth0. These interfaces are on the default VLAN on the default virtual switch.

Figure. Unsegmented Network - ESXi Cluster

In AHV, VM live migration traffic is also backplane traffic, and uses the AHV backplane interface, VLAN, and virtual switch when configured.

Figure. Unsegmented Network - AHV Cluster

If you further isolate service-specific traffic, additional vNICs are created on the CVM. Each service requiring isolation is assigned a dedicated virtual NIC on the CVM. The NICs are named ntnx0, ntnx1, and so on. Each service-specific NIC is placed on a configurable existing or new virtual network (vSwitch or bridge) and a VLAN and IP subnet are specified.

Network with Segmentation

In a segmented network, management traffic uses CVM interface eth0, and additional services can be isolated to different VLANs or virtual switches. In backplane segmentation, the backplane traffic uses interface eth2. The backplane network uses either the default VLAN or, optionally, a separate VLAN that you specify when segmenting the network. In ESXi, you must select a port group for the new vmkernel interface. In AHV, this internal interface is created automatically in the selected virtual switch. For physical separation of the backplane network, create the new port group on a separate virtual switch in ESXi, or select the desired virtual switch in the AHV GUI.

If you want to isolate service-specific traffic such as Volumes or Disaster Recovery as well as backplane traffic, then additional vNICs are needed on the CVM, but no new vmkernel adapters or internal interfaces are required. AOS creates additional vNICs on the CVM. Each service that requires isolation is assigned a dedicated vNIC on the CVM. The NICs are named ntnx0, ntnx1, and so on. Each service-specific NIC is placed on a configurable existing or new virtual network (vSwitch or bridge) and a VLAN and IP subnet are specified.

You can choose to perform backplane segmentation alone, with no other forms of segmentation. You can also choose to use one or more types of service specific segmentation with or without backplane segmentation. In all of these cases, you can choose to segment any service to either the existing, or a new virtual switch for further physical traffic isolation. The combination selected is driven by the security and networking requirements of the deployment. In most cases, the default configuration with no segmentation of any kind is recommended due to simplicity and ease of deployment.

The following figure shows an implementation scenario where backplane and service-specific segmentation are configured with two vSwitches on ESXi hypervisors.

Figure. Backplane and Service Specific Segmentation Configured with two vSwitches on an ESXi Cluster

Here are the CVM to ESXi hypervisor connection details:

  • The eth0 vNIC on the CVM and vmk0 on the host are carrying management traffic and connected to the hypervisor through the existing PGm (portgroup) on vSwitch0.
  • The eth2 vNIC on the CVM and vmk2 on the host are carrying backplane traffic and connected to the hypervisor through a new user created PGb on the existing vSwitch.
  • The ntnx0 vNIC on the CVM is carrying iSCSI traffic and connected to the hypervisor through PGi on the vSwitch1. No new vmkernel adapter is required.
  • The ntnx1 vNIC on the CVM is carrying DR traffic and connected to the hypervisor through PGd on the vSwitch2. Here as well, there is no new vmkernel adapter required.

The following figure shows an implementation scenario where backplane and service-specific segmentation are configured with two vSwitches on AHV hypervisors.

Figure. Backplane and Service Specific Segmentation Configured with two vSwitches on an AHV Cluster

Here are the CVM to AHV hypervisor connection details:

  • The eth0 vNIC on the CVM is carrying management traffic and connected to the hypervisor through the existing vnet0.
  • Other vNICs such as eth2, ntnx0, and ntnx1 are connected to the hypervisor through the auto created interfaces on either the existing or new vSwitch.
Note: In the above figure the interface name 'br0-bp' is read as 'br0-backplane'.

The following table describes the vNIC, port group (PG), VMkernel (vmk), virtual network (vnet), and virtual switch connections between the CVM and the hypervisor in different implementation scenarios, for both ESXi and AHV hypervisors:

Table 1. Implementation scenarios: vNICs on the CVM and their connections to the ESXi and AHV hypervisors

  • Backplane Segmentation with 1 vSwitch
    • eth0 (DR, iSCSI, and management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • eth2 (backplane traffic): ESXi - new vmk2 via PGb on vSwitch0, CVM vNIC via PGb on vSwitch0; AHV - auto-created interfaces on bridge br0
  • Backplane Segmentation with 2 vSwitches
    • eth0 (management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • eth2 (backplane traffic): ESXi - new vmk2 via PGb on new vSwitch, CVM vNIC via PGb on new vSwitch; AHV - auto-created interfaces on new virtual switch
  • Service Specific Segmentation for Volumes with 1 vSwitch
    • eth0 (DR, backplane, and management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • ntnx0 (iSCSI (Volumes) traffic): ESXi - CVM vNIC via PGi on vSwitch0; AHV - auto-created interface on existing br0
  • Service Specific Segmentation for Volumes with 2 vSwitches
    • eth0 (DR, backplane, and management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • ntnx0 (iSCSI (Volumes) traffic): ESXi - CVM vNIC via PGi on new vSwitch; AHV - auto-created interface on new virtual switch
  • Service Specific Segmentation for DR with 1 vSwitch
    • eth0 (iSCSI, backplane, and management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • ntnx1 (DR traffic): ESXi - CVM vNIC via PGd on vSwitch0; AHV - auto-created interface on existing br0
  • Service Specific Segmentation for DR with 2 vSwitches
    • eth0 (iSCSI, backplane, and management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • ntnx1 (DR traffic): ESXi - CVM vNIC via PGd on new vSwitch; AHV - auto-created interface on new virtual switch
  • Backplane and Service Specific Segmentation with 1 vSwitch
    • eth0 (management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • eth2 (backplane traffic): ESXi - new vmk2 via PGb on vSwitch0, CVM vNIC via PGb on vSwitch0; AHV - auto-created interfaces on br0
    • ntnx0 (iSCSI traffic): ESXi - CVM vNIC via PGi on vSwitch0; AHV - auto-created interface on br0
    • ntnx1 (DR traffic): ESXi - CVM vNIC via PGd on vSwitch0; AHV - auto-created interface on br0
  • Backplane and Service Specific Segmentation with 2 vSwitches
    • eth0 (management traffic): ESXi - vmk0 via existing PGm on vSwitch0; AHV - existing vnet0
    • eth2 (backplane traffic): ESXi - new vmk2 via PGb on new vSwitch, CVM vNIC via PGb on new vSwitch; AHV - auto-created interfaces on new virtual switch
    • ntnx0 (iSCSI traffic): ESXi - CVM vNIC via PGi on vSwitch1 (no new user-defined vmkernel adapter is required); AHV - auto-created interface on new virtual switch
    • ntnx1 (DR traffic): ESXi - CVM vNIC via PGd on vSwitch2 (no new user-defined vmkernel adapter is required); AHV - auto-created interface on new virtual switch

Implementation Considerations

Supported Environment

Network segmentation is supported in the following environment:

  • The hypervisor must be one of the following:
    • For network segmentation by traffic type (separating backplane traffic from management traffic):
      • AHV
      • ESXi
      • Hyper-V
        Note: Only logical segmentation (or VLAN-based segmentation) is supported on Hyper-V. Physical network segmentation is not supported on Hyper-V.
    • For service-specific traffic isolation:
      • AHV
      • ESXi
  • For logical network segmentation, AOS version must be 5.5 or later. For physical segmentation and service-specific traffic isolation, the AOS version must be 5.11 or later.
  • From the 5.11.1 release, you cannot enable network segmentation on mixed-hypervisor clusters. However, if you enable network segmentation on a mixed-hypervisor cluster running a release earlier than 5.11.1 and you upgrade that cluster to 5.11.1 or later, network segmentation continues to work seamlessly.
  • RDMA requirements:
    • Network segmentation is supported with RDMA for AHV and ESXi hypervisors only.
    • For more information about RDMA, see Remote Direct Memory Access in the NX Series Hardware Administration Guide .

Prerequisites

For Nutanix Volumes

Stargate does not monitor the health of a segmented network. If physical network segmentation is configured, network failures or connectivity issues on the segmented network are not tolerated. To overcome this issue, configure redundancy in the network; that is, use two or more uplinks in a fault-tolerant configuration, connected to two separate physical switches.

For Disaster Recovery

  • Ensure that the VLAN and subnet that you plan to use for the network segment are routable.
  • Make sure that you have a pool of IP addresses to specify when configuring segmentation. For each cluster, you need n+1 IP addresses, where n is the number of nodes in the cluster. The additional IP address is for the virtual IP address requirement.
  • Enable network segmentation for disaster recovery at both sites (local and remote) before configuring remote sites at those sites.
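The n+1 sizing rule above can be sanity-checked with a short sketch using Python's standard ipaddress module. The subnet values below are assumptions for illustration only, not recommended addresses:

```python
import ipaddress

def dr_ips_required(num_nodes: int) -> int:
    # One IP address per node plus one for the virtual IP address.
    return num_nodes + 1

def pool_is_sufficient(cidr: str, num_nodes: int) -> bool:
    # Compare the usable host addresses in the subnet (excluding the
    # network and broadcast addresses) against the n+1 requirement.
    subnet = ipaddress.ip_network(cidr)
    return subnet.num_addresses - 2 >= dr_ips_required(num_nodes)

# Example: a 4-node cluster needs 5 addresses; a /29 provides 6 usable hosts.
print(dr_ips_required(4))                     # 5
print(pool_is_sufficient("10.10.8.0/29", 4))  # True
print(pool_is_sufficient("10.10.8.0/30", 4))  # False
```

Reserving a pool slightly larger than n+1 leaves room for future cluster expansion without reconfiguration.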

Limitations

For Nutanix Volumes

  • If network segmentation is enabled for Volumes, volume group attachments are not recovered during VM recovery.
  • Nutanix service VMs such as Files and Buckets continue to communicate with the CVM eth0 interface when using Volumes for iSCSI traffic. Other external clients use the new service-specific CVM interface.

For Disaster Recovery

The system does not support configuring Leap DR together with DR service-specific traffic isolation.

Cluster Services That Support Traffic Isolation

In this release, you can isolate traffic associated with the following services to its own virtual network:

  • Management (The default network that cannot be moved from CVM eth0)

  • Backplane

  • RDMA

  • Service Specific Disaster Recovery

  • Service Specific Volumes

Configurations in Which Network Segmentation Is Not Supported

Network segmentation is not supported in the following configurations:

  • Clusters on which the CVMs have a manually created eth2 interface.
  • Clusters on which the eth2 interface on one or more CVMs have been assigned an IP address manually. During an upgrade to an AOS release that supports network segmentation, an eth2 interface is created on each CVM in the cluster. Even though the cluster does not use these interfaces until you configure network segmentation, you must not manually configure these interfaces in any way.
Caution:

Nutanix has deprecated support for manual multi-homed CVM network interfaces from AOS version 5.15 and later. Such a manual configuration can lead to unexpected issues on these releases. If you have configured an eth2 interface on the CVM manually, refer to the KB-9479 and Nutanix Field Advisory #78 for details on how to remove the eth2 interface.

Configuring the Network on an AHV Host

These steps describe how to configure host networking for physical and service-specific network segmentation on an AHV host, and you must complete them before configuring physical or service-specific traffic isolation. If you are configuring networking on an ESXi host, perform the equivalent steps by referring to the ESXi documentation. On ESXi, you create vSwitches and port groups to achieve the same results.

About this task

For information about the procedures to create, update and delete a virtual switch in Prism Element Web Console, see Configuring a Virtual Network for Guest VMs in the Prism Web Console Guide .

Note: The term unconfigured node in this procedure refers to a node that is not part of a cluster and is being prepared for cluster expansion.

To configure host networking for physical and service-specific network segmentation, do the following:

Note: If you are segmenting traffic on nodes that are already part of a cluster, perform the first step. If you are segmenting traffic on an unconfigured node that is not part of a cluster, perform the second step directly.

Procedure

  1. If you are segmenting traffic on nodes that are already part of a cluster, do the following:
    1. Update the default virtual switch vs0 to remove the uplinks that you plan to add to the new virtual switch.

      For information about updating the default virtual switch vs0 to remove the uplinks, see Creating or Updating a Virtual Switch in the Prism Web Console Guide.

    2. Create a virtual switch for the backplane traffic or service whose traffic you want to isolate.
      Add the uplinks to the new virtual switch.

      For information about creating a new virtual switch, see Creating or Updating a Virtual Switch in the Prism Web Console Guide .

  2. If you are segmenting traffic on an unconfigured node (new host) that is not part of a cluster, do the following:
    1. Log on to the new host and create a bridge for the backplane traffic or the service whose traffic you want to isolate.
      ovs-vsctl add-br br1
    2. Log on to the CVM of the host and keep only eth0 and eth1 in the default bridge br0.
      manage_ovs --bridge_name br0 --interfaces eth0,eth1 --bond_name br0-up --bond_mode active-backup update_uplinks
    3. Log on to the CVM of the host and then add eth2 and eth3 to the uplink bond of br1.
      manage_ovs --bridge_name br1 --interfaces eth2,eth3 --bond_name br1-up --bond_mode active-backup update_uplinks
      Note: If this step is not done correctly, a network loop can be created that causes a network outage. Ensure that no other uplink interfaces exist on this bridge before adding the new interfaces, and always add interfaces into a bond.

What to do next

Prism can configure a VLAN only on AHV hosts. Therefore, if the hypervisor is ESXi, in addition to configuring the VLAN on the physical switch, make sure to configure the VLAN on the port group.

If you are performing physical network segmentation, see Isolating the Backplane Traffic Physically on an Existing Cluster.

If you are performing service-specific traffic isolation, see Service-Specific Traffic Isolation.

Network Segmentation for Traffic Types (Backplane, Management, and RDMA)

You can segment the network on a Nutanix cluster in the following ways:

  • You can segment the network on an existing cluster by using the Prism web console.
  • You can segment the network when creating a cluster by using Nutanix Foundation 3.11.2 or later.

The following topics describe network segmentation procedures for existing clusters and changes during AOS upgrade and cluster expansion. For more information about segmenting the network when creating a cluster, see the Field Installation Guide.

Isolating the Backplane Traffic Logically on an Existing Cluster (VLAN-Based Segmentation Only)

You can segment the network on an existing cluster by using the Prism web console. You must configure a separate VLAN for the backplane network to achieve logical segmentation. The network segmentation process creates a separate network for backplane communications on the existing default virtual switch. The process then places the eth2 interfaces (which the process creates on the CVMs during upgrade) and the host interfaces on the newly created network. This method allows you to achieve logical segmentation of traffic over the selected VLAN. From the subnet that you specify, the process assigns an IP address to each new interface; you therefore need two IP addresses per node. When you specify the VLAN ID, AHV places the newly created interfaces on the specified VLAN.

Before you begin

If your cluster has RDMA-enabled NICs, follow the procedure in Isolating the Backplane Traffic on an Existing RDMA Cluster.

  • For ESXi clusters, it is mandatory to create and manage port groups that networking uses for CVM and backplane networking. Therefore, ensure that you create port groups on the default virtual switch vs0 for the ESXi hosts and CVMs.

    Since backplane traffic segmentation is logical, it is based on the VLAN that is tagged for the port groups. Therefore, while creating the port groups ensure that you tag the new port groups created for the ESXi hosts and CVMs with the appropriate VLAN ID. Consult your networking team to acquire the necessary VLANs for use with Nutanix nodes.

  • For new backplane networks, you must specify a non-routable subnet. The interfaces on the backplane network are automatically assigned IP addresses from this subnet, so reserve the entire subnet for the backplane network segmentation.

About this task

You need separate VLANs for Management network and Backplane network. For example, configure VLAN 100 as Management network VLAN and VLAN 200 as Backplane network VLAN on the Ethernet links that connect the Nutanix nodes to the physical switch.
Note: Nutanix does not control these VLAN IDs. Consult your networking team to acquire VLANs for the Management and Backplane networks.
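As a hypothetical illustration of the physical switch side (Cisco IOS-style syntax; the interface name and VLAN IDs here are assumptions for illustration, so substitute the values provided by your networking team), the switch port facing each Nutanix node could be configured as a trunk that carries both VLANs:

```
interface GigabitEthernet1/0/1
 description Uplink to Nutanix AHV node (assumed port)
 switchport mode trunk
 switchport trunk allowed vlan 100,200
```

Repeat the equivalent configuration on every switch port that connects to a node in the cluster so that both the Management and Backplane VLANs are available on all uplinks.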

To segment the network on an existing cluster for a backplane LAN, do the following:

Note:

In this method, for AHV nodes, logical segmentation (VLAN-based segmentation) is done on the default bridge. The process creates the host backplane interface on the Backplane Network port group on ESXi, or as the br0-backplane interface on the br0 bridge on AHV. The eth2 interface on the CVM is on the CVM Backplane Network by default.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
    The Network Configuration dialog box appears.
  2. In the Network Configuration > Controller VM Interfaces > Backplane LAN row, click Configure .
    The Create Interface dialog box appears.
  3. In the Create Interface dialog box, provide the necessary information.
    • In Subnet IP , specify a non-routable subnet.

      Ensure that the subnet has sufficient IP addresses. The segmentation process requires two IP addresses per node. Reconfiguring the backplane to increase the size of the subnet involves cluster downtime, so you might also want to make sure that the subnet can accommodate new nodes in the future.

    • In Netmask , specify the netmask.
    • If you want to assign the interfaces on the network to a VLAN, specify the VLAN ID in the VLAN ID field.

      Nutanix recommends that you use a VLAN. If you do not specify a VLAN ID, the default VLAN on the virtual switch is used.

  4. Click Verify and Save .
    The network segmentation process creates the backplane network if the network settings that you specified pass validation.
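A candidate backplane subnet can be checked against the rules above (non-routable, two addresses per node, plus headroom for expansion, since resizing later involves cluster downtime). The subnets and node counts below are assumptions for illustration:

```python
import ipaddress

def backplane_subnet_ok(cidr: str, num_nodes: int, future_nodes: int = 0) -> bool:
    subnet = ipaddress.ip_network(cidr)
    usable = subnet.num_addresses - 2          # exclude network/broadcast addresses
    required = 2 * (num_nodes + future_nodes)  # CVM eth2 + host interface per node
    # The backplane subnet must be non-routable (RFC 1918 private range).
    return subnet.is_private and usable >= required

# A /27 gives 30 usable addresses: enough for a 4-node cluster
# plus 4 future nodes (16 addresses required).
print(backplane_subnet_ok("192.168.5.0/27", num_nodes=4, future_nodes=4))  # True
# A public subnet fails the non-routable check regardless of size.
print(backplane_subnet_ok("8.8.8.0/27", num_nodes=4))                      # False
```

Sizing the subnet with future nodes in mind avoids the downtime involved in reconfiguring the backplane later.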

Isolating the Backplane Traffic on an Existing RDMA Cluster

Segment the network on an existing RDMA cluster by using the Prism web console.

About this task

The network segmentation process creates a separate network for RDMA communications on the existing default virtual switch and places the rdma0 interface (created on the CVMs during upgrade) and the host interfaces on the newly created network. From the specified subnet, IP addresses are assigned to each new interface. Two IP addresses are therefore required per node. If you specify the optional VLAN ID, the newly created interfaces are placed on the VLAN. A separate VLAN is highly recommended for the RDMA network to achieve true segmentation.

Before you begin

  • For new RDMA networks, you must specify a non-routable subnet. The interfaces on the backplane network are automatically assigned IP addresses from this subnet, so reserve the entire subnet for the backplane network alone.
  • If you plan to specify a VLAN for the RDMA network, make sure that the VLAN is configured on the physical switch ports to which the nodes are connected.
  • Configure the switch interface as a Trunk port.
  • Ensure that the cluster was configured to support RDMA during installation with Foundation.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
    The Network Configuration dialog box is displayed.
  2. Click the Internal Interfaces tab.
  3. Click Configure in the RDMA row.
    Ensure that you have configured the switch interface as a trunk port.
    The Create Interface dialog box is displayed.
    Figure. Create Interface Dialog Box

  4. In the Create Interface dialog box, do the following:
    1. In Subnet IP and Netmask , specify a non-routable subnet and netmask, respectively. Make sure that the subnet can accommodate cluster expansion in the future.
    2. In VLAN , specify a VLAN ID for the RDMA LAN.
      A VLAN ID is optional but highly recommended for true network segmentation and enhanced security.
    3. From the PFC list, select the priority flow control value configured on the physical switch port.
  5. Click Verify and Save .
  6. Click Close .

Isolating the Backplane Traffic Physically on an Existing Cluster

By using the Prism web console, you can configure the eth2 interface on a separate vSwitch (ESXi) or bridge (AHV). The network segmentation process creates a separate network for backplane communications on the new bridge or vSwitch and places the eth2 interfaces (that are created on the CVMs during upgrade) and the host interfaces on the newly created network. From the specified subnet, IP addresses are assigned to each new interface. Two IP addresses are therefore required per node. If you specify the optional VLAN ID, the newly created interfaces are placed on the VLAN. A separate VLAN is highly recommended for the backplane network to achieve true segmentation.

Before you begin

The physical isolation of backplane traffic is supported from the AOS 5.11.1 release.

Before you enable physical isolation of the backplane traffic, configure the network (port groups or bridges) on the hosts and associate the network with the required physical NICs.

Note: Nutanix does not support physical network segmentation on Hyper-V.

On the AHV hosts, do the following:

  1. Create a bridge for the backplane traffic.
  2. From the default bridge br0, remove the uplinks (physical NICs) that you want to add to the bridge you created for the backplane traffic.
  3. Add the uplinks to a new bond.
  4. Add the bond to the new bridge.

See Configuring the Network on an AHV Host for instructions about how to perform these tasks on a host.

On the ESXi hosts, do the following:

  1. Create a vSwitch for the backplane traffic.
  2. From vSwitch0, remove the uplinks (physical NICs) that you want to add to the vSwitch you created for the backplane traffic.
  3. On the backplane vSwitch, create one port group for the CVM and another for the host. Ensure that at least one uplink is present in the Active Adaptors list for each port group if you have overridden the failover order.

See the ESXi documentation for instructions about how to perform these tasks.

Note: Before you perform the following procedure, ensure that the uplinks you added to the vSwitch or bridge are in the UP state.

About this task

Perform the following procedure to physically segment the backplane traffic.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
  2. On the Controller VM Interfaces tab, in the Backplane LAN row, click Configure .
  3. In the Backplane LAN dialog box, do the following:
    1. In Subnet IP , specify a non-routable subnet.
      Make sure that the subnet has a sufficient number of IP addresses. Two IP addresses are required per node. Reconfiguring the backplane to increase the size of the subnet involves cluster downtime, so you might also want to make sure that the subnet can accommodate new nodes in the future.
    2. In Netmask , specify the netmask.
    3. If you want to assign the interfaces on the network to a VLAN, specify the VLAN ID in the VLAN ID field.
      A VLAN is strongly recommended. If you do not specify a VLAN ID, the default VLAN on the virtual switch is used.
    4. (AHV only) In the Host Node list, select the bridge you created for the backplane traffic.
    5. (ESXi only) In the Host Port Group list, select the port group you created for the host.
    6. (ESXi only) In the CVM Port Group list, select the port group you created for the CVM.
    Note:

    Nutanix clusters support both vSphere Standard Switches and vSphere Distributed Switches. However, you must configure only one type of virtual switch in a cluster: place all backplane and management traffic on either vSphere Standard Switches or vSphere Distributed Switches, and do not mix the two on a single cluster.

  4. Click Verify and Save .
    If the network settings you specified pass validation, the backplane network is created and the CVMs perform a reboot in a rolling fashion (one at a time), after which the services use the new backplane network. The progress of this operation can be tracked on the Prism tasks page.
    Note: Segmenting backplane traffic can involve up to two rolling reboots of the CVMs. The first rolling reboot is done to move the backplane interface (eth2) of the CVM to the selected port group or bridge. This is done only for CVM(s) whose backplane interface is not already connected to the selected port group or bridge. The second rolling reboot is done to migrate the cluster services to the newly configured backplane interface.
  5. Restart the Acropolis service on all the nodes in the cluster.
    Note: Perform this step only if your AOS version is 5.17.0.x. This step is not required if your AOS version is 5.17.1 or later.
    1. Log on to any CVM in the cluster with SSH.
    2. Stop the Acropolis service.
      nutanix@cvm$ allssh genesis stop acropolis
      Note: You cannot manage your guest VMs after the Acropolis service is stopped.
    3. Verify that the Acropolis service is DOWN on all the CVMs.
      nutanix@cvm$ cluster status | grep -v UP

      An output similar to the following is displayed:

      nutanix@cvm$ cluster status | grep -v UP
      2019-09-04 14:43:18 INFO zookeeper_session.py:143 cluster is attempting to connect to Zookeeper
      2019-09-04 14:43:18 INFO cluster:2774 Executing action status on SVMs X.X.X.1, X.X.X.2, X.X.X.3
      The state of the cluster: start
      Lockdown mode: Disabled
              CVM: X.X.X.1 Up
                                 Acropolis DOWN       []
              CVM: X.X.X.2 Up, ZeusLeader
                                 Acropolis DOWN       []
              CVM: X.X.X.3 Maintenance
    4. From any CVM in the cluster, start the Acropolis service.
      nutanix@cvm$ cluster start 

Reconfiguring the Backplane Network

Backplane network reconfiguration is a CLI-driven procedure that you perform on any one of the CVMs in the cluster. The change is propagated to the remaining CVMs.

About this task

Caution: At the end of this procedure, the cluster stops and restarts, even if only the VLAN is changed, and therefore involves cluster downtime.

To reconfigure the cluster, do the following:

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Reconfigure the backplane network.
    nutanix@cvm$ backplane_ip_reconfig [--backplane_vlan=vlan-id] \
    [--backplane_subnet=subnet_ip_address --backplane_netmask=netmask]

    Replace vlan-id with the new VLAN ID, subnet_ip_address with the new subnet IP address, and netmask with the new netmask.

    For example, reconfigure the backplane network to use VLAN ID 10 and subnet 172.30.25.0 with netmask 255.255.255.0 .

    nutanix@cvm$ backplane_ip_reconfig --backplane_vlan=10 \
    --backplane_subnet=172.30.25.0 --backplane_netmask=255.255.255.0

    Output similar to the following is displayed:

    This operation will do a 'cluster stop', resulting in disruption of 
    cluster services. Do you still want to continue? (Type "yes" (without quotes) 
    to continue)
    Caution: During the reconfiguration process, you might receive an error message similar to the following.
    Failed to reach a node.
    You can safely ignore this error message; do not stop the script manually.
    Note: The backplane_ip_reconfig command is not supported on ESXi clusters with vSphere Distributed Switches. To reconfigure the backplane network on a vSphere Distributed Switch setup, disable the backplane network (see Disabling Network Segmentation on an ESXi and Hyper-V Cluster) and enable again with a different subnet or VLAN.
  3. Type yes to confirm that you want to reconfigure the backplane network.
    The reconfiguration procedure takes a few minutes and includes a cluster restart. If you type anything other than yes , network reconfiguration is aborted.
  4. After the process completes, verify that the backplane was reconfigured.
    1. Verify that the IP addresses of the eth2 interfaces on the CVM are set correctly.
      nutanix@cvm$ svmips -b
      Output similar to the following is displayed:
      172.30.25.1 172.30.25.3 172.30.25.5
    2. Verify that the IP addresses of the backplane interfaces of the hosts are set correctly.
      nutanix@cvm$ hostips -b
      Output similar to the following is displayed:
      172.30.25.2 172.30.25.4 172.30.25.6
    The svmips and hostips commands, when used with the -b option, display the IP addresses assigned to the interfaces on the backplane.

Disabling Network Segmentation on an ESXi and Hyper-V Cluster

Disabling the backplane network is a CLI-driven procedure that you perform on any one of the CVMs in the cluster. The change is propagated to the remaining CVMs.

About this task

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Disable the backplane network.
    • Use this CLI to disable network segmentation on an ESXi and Hyper-V cluster:
      nutanix@cvm$ network_segmentation --backplane_network --disable

      Output similar to the following appears:

      Operation type : Disable
      Network type : kBackplane
      Params : {}
      Please enter [Y/y] to confirm or any other key to cancel the operation

      Type Y/y to confirm that you want to disable the backplane network.

      If you type Y/y, network segmentation is disabled and the cluster restarts in a rolling manner, one CVM at a time. If you type anything other than Y/y, network segmentation is not disabled.

  3. Verify that network segmentation was successfully disabled. You can verify this in one of two ways:
    • Verify that the backplane is disabled.
      nutanix@cvm$ network_segment_status

      Output similar to the following is displayed:

      2017-11-23 06:18:23 INFO zookeeper_session.py:110 network_segment_status is attempting to connect to Zookeeper

      Network segmentation is disabled

    • Verify that the commands that show the backplane IP addresses of the CVMs and hosts list the management IP addresses (run the svmips and hostips commands once without the -b option and once with it, and then compare the IP addresses shown in the output).
      nutanix@cvm$ svmips
      192.127.3.2 192.127.3.3 192.127.3.4
      nutanix@cvm$ svmips -b
      192.127.3.2 192.127.3.3 192.127.3.4
      nutanix@cvm$ hostips
      192.127.3.5 192.127.3.6 192.127.3.7
      nutanix@cvm$ hostips -b
      192.127.3.5 192.127.3.6 192.127.3.7

      In the example above, the outputs of the svmips and hostips commands with and without the -b option are the same, indicating that backplane network segmentation is disabled.
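The verification step above amounts to checking that both commands report the same address list. A minimal sketch (the function name is illustrative):

```python
def segmentation_disabled(mgmt_output: str, backplane_output: str) -> bool:
    """Backplane segmentation is off when svmips/hostips report the
    same addresses with and without the -b option."""
    return sorted(mgmt_output.split()) == sorted(backplane_output.split())

print(segmentation_disabled(
    "192.127.3.2 192.127.3.3 192.127.3.4",   # svmips
    "192.127.3.2 192.127.3.3 192.127.3.4",   # svmips -b
))  # -> True
```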

Disabling Network Segmentation on an AHV Cluster

About this task

You perform backplane network reconfiguration procedure on any one of the CVMs in the cluster. The change propagates to the remaining CVMs.

Procedure

  1. Shut down all the guest VMs in the cluster from within the guest OS or by using the Prism Element web console.
  2. Place all nodes of a cluster into the maintenance mode.
    1. Use SSH to log on to a Controller VM in the cluster.
    2. Determine the IP address of the node you want to put into the maintenance mode:
      nutanix@cvm$ acli host.list
      Note the value of Hypervisor IP for the node you want to put in the maintenance mode.
    3. Put the node into the maintenance mode:
      nutanix@cvm$ acli host.enter_maintenance_mode hypervisor-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
      Note: Never put the Controller VM and AHV host into maintenance mode on single-node clusters. Shut down user VMs before proceeding with disruptive changes.

      Replace hypervisor-IP-address with either the IP address or host name of the AHV host that you want to put into the maintenance mode.

      The following are optional parameters for running the acli host.enter_maintenance_mode command:

      • wait
      • non_migratable_vm_action

      Do not continue if the host has failed to enter the maintenance mode.

    4. Verify that the host is in the maintenance mode:
      nutanix@cvm$ acli host.get host-ip

      In the output that is displayed, ensure that node_state equals EnteredMaintenanceMode and schedulable equals False.

  3. Disable backplane network segmentation from the Prism Web Console.
    1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
    2. In the Internal Interfaces tab, in the Backplane LAN row, click Disable .
      Figure. Disable Network Configuration

    3. Click Yes to disable Backplane LAN.

      This involves a rolling reboot of CVMs to migrate the cluster services back to the external interface.

  4. Log on to a CVM in the cluster with SSH and stop Acropolis cluster-wide:
    nutanix@cvm$ allssh genesis stop acropolis 
  5. Restart Acropolis cluster-wide:
    nutanix@cvm$ cluster start 
  6. Remove all nodes from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode:
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip

      Replace host-ip with the IP address of the host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify that the host has exited the maintenance mode:
      nutanix@cvm$ acli host.get host-ip

      In the output that is displayed, ensure that node_state equals kAcropolisNormal or AcropolisNormal and schedulable equals True.

  7. Power on the guest VMs from the Prism Element web console.

Service-Specific Traffic Isolation

Isolating the traffic associated with a specific service is a two-step process. The process is as follows:

  • Configure the networks and uplinks on each host manually. Prism only creates the VNIC that the service requires, and it places that VNIC on the bridge or port group that you specify. Therefore, you must manually create the bridge or port group on each host and add the required physical NICs as uplinks to that bridge or port group.
  • Configure network segmentation for the service by using Prism. Create the extra VNIC for the service, specify any additional parameters that are required (for example, IP address pools), and select the bridge or port group that you want to dedicate to the service.

Isolating Service-Specific Traffic

Before you begin

  • Ensure that you configure each host as described in Configuring the Network on an AHV Host.
  • Review Prerequisites.

About this task

To isolate a service to a separate virtual network, do the following:

Procedure

  1. Log on to the Prism web console and click the gear icon at the top-right corner of the page.
  2. In the left pane, click Network Configuration .
  3. In the details pane, on the Internal Interfaces tab, click Create New Interface .
    The Create New Interface dialog box is displayed.
  4. On the Interface Details tab, do the following:
    1. Specify a descriptive name for the network segment.
    2. (On AHV) Optionally, in VLAN ID , specify a VLAN ID.
      Make sure that the VLAN ID is configured on the physical switch.
    3. In Bridge (on AHV) or CVM Port Group (on ESXi), select the bridge or port group that you created for the network segment.
    4. To specify an IP address pool for the network segment, click Create New IP Pool , and then, in the IP Pool dialog box, do the following:
      • In Name , specify a name for the pool.
      • In Netmask , specify the network mask for the pool.
      • Click Add an IP Range , specify the start and end IP addresses in the IP Range dialog box that is displayed.
      • Use Add an IP Range to add as many IP address ranges as you need.
        Note: Add at least n+1 IP addresses across the IP ranges, where n is the number of nodes in the cluster.
      • Click Save .
      • Use Add an IP Pool to add more IP address pools. You can use only one IP address pool at any given time.
      • Select the IP address pool that you want to use, and then click Next .
        Note: You can also use an existing unused IP address pool.
  5. On the Feature Selection tab, do the following:
    You cannot enable network segmentation for multiple services at the same time. Complete the configuration for one service before you enable network segmentation for another service.
    1. Select the service whose traffic you want to isolate.
    2. Configure the settings for the selected service.
      The settings on this page depend on the service you select. For information about service-specific settings, see Service-Specific Settings and Configurations.
    3. Click Save .
  6. In the Create Interface dialog box, click Save .
    The CVMs are rebooted multiple times, one after another. This procedure might trigger more tasks on the cluster. For example, if you configure network segmentation for disaster recovery, the firewall rules are added on the CVM to allow traffic on the specified ports through the new CVM interface and updated when a new recovery cluster is added or an existing cluster is modified.
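The pool-sizing note in step 4 (at least n+1 addresses for an n-node cluster) can be checked with a short sketch. The helper below is illustrative and assumes IPv4 ranges given as (start, end) pairs:

```python
import ipaddress

def pool_covers_cluster(ranges, nodes):
    """Return True when the combined IP ranges contain at least
    nodes + 1 addresses, per the pool-sizing note in step 4."""
    total = sum(
        int(ipaddress.IPv4Address(end)) - int(ipaddress.IPv4Address(start)) + 1
        for start, end in ranges
    )
    return total >= nodes + 1

# A 4-node cluster needs at least 5 addresses in the pool.
print(pool_covers_cluster([("10.10.10.1", "10.10.10.4")], 4))  # -> False (4 addresses)
print(pool_covers_cluster([("10.10.10.1", "10.10.10.5")], 4))  # -> True (5 addresses)
```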

What to do next

See Service-Specific Settings and Configurations for any additional tasks that are required after you segment the network for a service.

Modifying Network Segmentation Configured for a Service

To modify network segmentation configured for a service, you must first disable network segmentation for that service and then create the network interface again for that service with the new IP address pool and VLAN.

About this task

For example, if the interface of the service you want to modify is ntnx0, after the reconfiguration, the same interface (ntnx0) is assigned to that service if that interface is not assigned to any other service. If ntnx0 is assigned to another service, a new interface (for example ntnx1) is created and assigned to that service.

Perform the following to reconfigure network segmentation configured for a service.

Procedure

  1. Disable the network segmentation configured for a service by following the instructions in Disabling Network Segmentation Configured for a Service.
  2. Create the network again by following the instructions in Isolating Service-Specific Traffic.

Disabling Network Segmentation Configured for a Service

To disable network segmentation configured for a service, you must disable the dedicated vNIC. Disabling network segmentation frees up the vNIC’s name, which is reused in a subsequent network segmentation configuration.

About this task

At the end of this procedure, the cluster performs a rolling restart. Disabling network segmentation might also disrupt the functioning of the associated service. To restore normal operations, you might have to perform other tasks immediately after the cluster has completed the rolling restart. For information about the follow-up tasks, see Service-Specific Settings and Configurations.

To disable the network segmentation configured for a service, do the following:

Procedure

  1. Log on to the Prism web console and click the gear icon at the top-right corner of the page.
  2. In the left pane, click Network Configuration .
  3. On the Internal Interfaces tab, for the interface that you want to disable, click Disable .
    Note: The defined IP address pool is available even after disabling the network segmentation.

Deleting a vNIC Configured for a Service

If you disable network segmentation for a service, the vNIC for that service is not deleted. AOS reuses the vNIC if you enable network segmentation again. However, you can manually delete a vNIC by logging into any CVM in the cluster with SSH.

Before you begin

Ensure that the following prerequisites are met before you delete the vNIC configured for a service:
  • Disable the network segmentation configured for a service by following the instructions in Disabling Network Segmentation Configured for a Service.
  • Observe the limitation specified in the Limitation for vNIC Hot-Unplugging topic in the AHV Administration Guide.

About this task

Perform the following to delete a vNIC.

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Delete the vNIC.
    nutanix@cvm$ network_segmentation --service_network --interface="interface-name" --delete

    Replace interface-name with the name of the interface you want to delete. For example, ntnx0.

Service-Specific Settings and Configurations

The following sections describe the settings required by the services that support network segmentation.

Nutanix Volumes

Network segmentation for Volumes also requires you to migrate iSCSI client connections to the new segmented network. If you no longer require segmentation for Volumes traffic, you must also migrate connections back to eth0 after disabling the vNIC used for Volumes traffic.

You can create two different networks for Nutanix Volumes with different IP pools, VLANs, and data services IP addresses. For example, you can create two iSCSI networks for production and non-production traffic on the same Nutanix cluster.

Follow the instructions in Isolating Service-Specific Traffic again to create the second network for Volumes after you create the first network.

Table 1. Settings to be Specified When Configuring Traffic Isolation
Parameter or Setting Description
Virtual IP (Optional) Virtual IP address for the service. If specified, the IP address must be picked from the specified IP address pool. If not specified, an IP address from the specified IP address pool is selected for you.
Client Subnet The network (in CIDR notation) that hosts the iSCSI clients. Required if the vNIC created for the service on the CVM is not on the same network as the clients.
Gateway Gateway to the subnetwork that hosts the iSCSI clients. Required if you specify the client subnet.
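Whether Client Subnet and Gateway are required reduces to a containment check: does the client address fall inside the segmented service network? A sketch, with illustrative names and addresses:

```python
import ipaddress

def needs_client_subnet(service_cidr: str, client_ip: str) -> bool:
    """Client Subnet (and therefore Gateway) must be specified only
    when the iSCSI client sits outside the segmented service network."""
    return ipaddress.ip_address(client_ip) not in ipaddress.ip_network(service_cidr)

print(needs_client_subnet("172.30.40.0/24", "172.30.40.25"))  # -> False (same subnet)
print(needs_client_subnet("172.30.40.0/24", "10.20.0.9"))     # -> True (routed client)
```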
Migrating iSCSI Connections to the Segmented Network

After you enable network segmentation for Volumes, you must manually migrate connections from existing iSCSI clients to the newly segmented network.

Before you begin

Make sure that the task for enabling network segmentation for the service succeeds.

About this task

Note: Even though support is available to run iSCSI traffic on both the segmented and management networks at the same time, Nutanix recommends that you move the iSCSI traffic for guest VMs to the segmented network to achieve true isolation.

To migrate iSCSI connections to the segmented network, do the following:

Procedure

  1. Log out from all the clients connected to iSCSI targets that are using CVM eth0 or the Data Service IP address.
  2. Optionally, remove all the discovery records for the Data Services IP address (DSIP) on eth0.
  3. If the clients are allowlisted by their IP address, remove the client IP address that is on the management network from the allowlist, and then add the client IP address on the new network to the allowlist.
    nutanix@cvm$ acli vg.detach_external vg_name initiator_network_id=old_vm_IP
    nutanix@cvm$ acli vg.attach_external vg_name initiator_network_id=new_vm_IP
    

    Replace vg_name with the name of the volume group and old_vm_IP and new_vm_IP with the old and new client IP addresses, respectively.

  4. Discover the virtual IP address specified for Volumes.
  5. Connect to the iSCSI targets from the client.
Migrating Existing iSCSI Connections to the Management Network (Controller VM eth0)

About this task

To migrate existing iSCSI connections to eth0, do the following:

Procedure

  1. Log out from all the clients connected to iSCSI targets using the CVM vNIC dedicated to Volumes.
  2. Remove all the discovery records for the DSIP on the new interface.
  3. Discover the DSIP for eth0.
  4. Connect the clients to the iSCSI targets.
Disaster Recovery with Protection Domains

The settings for configuring network segmentation for disaster recovery apply to all Asynchronous, NearSync, and Metro Availability replication schedules. You can use disaster recovery with Asynchronous, NearSync, and Metro Availability replications only if both the primary site and the recovery site are configured with network segmentation. Before enabling or disabling network segmentation on a host, disable all the disaster recovery replication schedules running on that host.

Note: Network segmentation does not support disaster recovery with Leap.
Table 1. Settings to be Specified When Configuring Traffic Isolation
Parameter or Setting Description
Virtual IP (Optional) Virtual IP address for the service. If specified, the IP address must be picked from the specified IP address pool. If not specified, an IP address from the specified IP address pool is selected for you.
Note: Virtual IP address is different from the external IP address and the data services IP address of the cluster.
Gateway Gateway to the subnetwork.
Remote Site Configuration

After configuring network segmentation for disaster recovery, configure remote sites at both locations. You also need to reconfigure remote sites if you disable network segmentation.

For information about configuring remote sites, see Remote Site Configuration in the Data Protection and Recovery with Prism Element Guide.

Segmenting a Stretched Layer 2 Network for Disaster Recovery

A stretched Layer 2 network configuration allows the source and remote metro clusters to be in the same broadcast domain and communicate without a gateway.

About this task

You can enable network segmentation for disaster recovery on a stretched Layer 2 network that does not have a gateway. Such a network is usually configured across physically remote clusters, as in a metro availability deployment, and places the source and remote clusters in the same broadcast domain.

See AOS 5.19.2 Release Notes for minimum AOS version required to configure a stretched Layer 2 network between metro clusters.

To configure a network segment as a stretched L2 network, do the following.

Procedure

Run the following command:
network_segmentation --service_network --service_name=kDR --ip_pool="DR-ip-pool-name" --service_vlan="DR-vlan-id" --desc_name="Description" --host_physical_network='"portgroup/bridge"' --stretched_metro

Replace the following (see Isolating Service-Specific Traffic for more information):

  • DR-ip-pool-name with the name of the IP Pool created for the DR service or any existing unused IP address pool.
  • DR-vlan-id with the VLAN ID being used for the DR service.
  • Description with a suitable description of this stretched L2 network segment.
  • portgroup/bridge with the details of Bridge or CVM Port Group used for the DR service.

For more information about the network_segmentation command, see the Command Reference guide.

Network Segmentation During Cluster Expansion

When you expand a cluster on which service-specific traffic isolation is configured, ensure that the following prerequisites are met before you add new (unconfigured) nodes to the cluster:

  • Manually configure bridges for segmented networks on the hypervisor host of the new nodes. The bridges used for the segmented networks must be identical to those on the other nodes. For information about the steps to perform on an unconfigured node, see Configuring the Network on a Host. The steps involve logging on to the host by using SSH and running the ovs-vsctl commands. For instructions about how to add nodes to your Nutanix cluster, see Expanding a Cluster in the Prism Web Console Guide.
  • Ensure that the network settings on the physical switch ports to which the new nodes connect are identical to those of the other nodes in the cluster. New nodes must be able to communicate with the current nodes by using the same VLAN IDs for segmented networks.
  • Image the new nodes with AOS and hypervisor versions identical to the other nodes of the cluster. Reimaging the nodes as part of the cluster expansion is not supported if the cluster network is segmented.
Note: For ESXi clusters with vSphere Distributed Switches (DVS):
  • Before you expand the cluster, ensure that the node you want to add is part of the same vCenter cluster, the same DVS as the other nodes in the cluster, and is not in a disconnected state.

  • Ensure that the nodes that you add have more than 20 GB of memory.

Network Segmentation–Related Changes During an AOS Upgrade

When you upgrade from an AOS version which does not support network segmentation to an AOS version that does, the eth2 interface (used to segregate backplane traffic) is automatically created on each CVM. However, the network remains unsegmented, and the cluster services on the CVM continue to use eth0 until you configure network segmentation.

The vNICs ntnx0, ntnx1, and so on, are not created during an upgrade to a release that supports service-specific traffic isolation. They are created when you configure traffic isolation for a service.

Note:

Do not delete the eth2 interface that is created on the CVMs, even if you are not using the network segmentation feature.

Firewall Requirements

Ports and Protocols describes detailed port information (like protocol, service description, source, destination, and associated service) for Nutanix products and services. It includes port and protocol information for 1-click upgrades and LCM updates.

Log management

This chapter describes how to configure the cluster-wide setting for log forwarding and how to document the log fingerprint.

Log Forwarding

The Nutanix Controller VM provides a method for log integrity by using a cluster-wide setting to forward all the logs to a central log host. Due to the appliance form factor of the Controller VM, system and audit logs do not support local log retention periods, because a significant increase in log traffic can be used to orchestrate a distributed denial-of-service (DDoS) attack.

Nutanix recommends deploying a central log host in the management enclave to adhere to any compliance or internal policy requirement for log retention. In case of any system compromise, a central log host serves as a defense mechanism to preserve log integrity.

Note: The audit in the Controller VM uses the audisp plugin by default to ship all the audit logs to the rsyslog daemon (stored in /home/log/messages ). Searching for audispd in the central log host provides the entire content of the audit logs from the Controller VM. The audit daemon is configured with a rules engine that adheres to the auditing requirements of the Operating System Security Requirements Guide (OS SRG), and is embedded as part of the Controller VM STIG.

Use the nCLI to enable forwarding of system, audit, AIDE, and SCMA logs of all the Controller VMs in a cluster at the required log level. For more information, see Send Logs to Remote Syslog Server in the Acropolis Advanced Administration Guide.

Documenting the Log Fingerprint

For forensic analysis, non-repudiation is established by verifying the fingerprint of the public key for the log file entry.

Procedure

  1. Log in to the CVM.
  2. Run the following command to document the fingerprint for each public key assigned to an individual admin.
    nutanix@cvm$ ssh-keygen -lf /<location of>/id_rsa.pub

    The fingerprint can then be compared with the SSH daemon log entries ( /home/log/secure in the Controller VM), which are forwarded to the central log host.

    Note: After you include the SSH public keys in Prism and verify connectivity, disable password authentication for all the Controller VMs and AHV hosts. From the gear icon drop-down list in the Prism main menu, open the Cluster Lockdown configuration and clear the Enable Remote Login with password check box.
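The fingerprint step above can be exercised end to end with a throwaway key pair. This is an illustrative sketch: the key path is made up, and on a real CVM you would point ssh-keygen at each admin's existing id_rsa.pub.

```shell
# Illustrative sketch: generate a throwaway RSA key pair to stand in for an
# admin's key (on a real CVM you would use the existing id_rsa.pub).
ssh-keygen -t rsa -b 2048 -N '' -f /tmp/demo_admin_key -q

# Document the fingerprint of the public key (SHA256 by default).
fp=$(ssh-keygen -lf /tmp/demo_admin_key.pub | awk '{print $2}')
echo "Documented fingerprint: $fp"

# Later, the documented fingerprint can be matched against sshd entries in
# /home/log/secure (or on the central log host), for example:
#   grep "$fp" /home/log/secure
```

Recording the fingerprint at key-enrollment time is what makes the later comparison against forwarded log entries meaningful for non-repudiation.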

Security Management Using Prism Central (PC)

Prism Central provides several mechanisms and features to enforce security of your multi-cluster environment.

If you enable Identity and Access Management (IAM), see Security Management Using Identity and Access Management (Prism Central).

Configuring Authentication

Caution: Prism Central does not allow the use of the (not secure) SSLv2 and SSLv3 ciphers. To eliminate the possibility of an SSL Fallback situation and denied access to Prism Central, disable (uncheck) SSLv2 and SSLv3 in any browser used for access. However, TLS must be enabled (checked).

Prism Central supports user authentication with these authentication options:

  • Active Directory authentication. Users can authenticate using their Active Directory (or OpenLDAP) credentials when Active Directory support is enabled for Prism Central.
  • Local user authentication. Users can authenticate if they have a local Prism Central account. For more information, see Managing Local User Accounts.
  • SAML authentication. Users can authenticate through a supported identity provider when SAML support is enabled for Prism Central. The Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between two parties: an identity provider (IDP) and Prism Central as the service provider.

    If you do not enable Nutanix Identity and Access Management (IAM) on Prism Central, ADFS is the only supported IDP for Single Sign-on. If you enable IAM, additional IDPs are available. For more information, see Security Management Using Identity and Access Management (Prism Central) and Updating ADFS When Using SAML Authentication.


Adding An Authentication Directory (Prism Central)

Before you begin

Caution: Prism Central does not allow the use of the (not secure) SSLv2 and SSLv3 ciphers. To eliminate the possibility of an SSL Fallback situation and denied access to Prism Central, disable (uncheck) SSLv2 and SSLv3 in any browser used for access. However, TLS must be enabled (checked).

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.

    The Authentication Configuration window appears.

    Figure. Authentication Configuration Window Click to enlarge Authentication Configuration window main display

  2. To add an authentication directory, click the New Directory button.

    A set of fields is displayed. Do the following in the indicated fields:

    1. Directory Type : Select one of the following from the pull-down list.
      • Active Directory : Active Directory (AD) is a directory service implemented by Microsoft for Windows domain networks.
        Note:
        • Users with the "User must change password at next logon" attribute enabled cannot authenticate to Prism Central. Ensure that such users first log in to a domain workstation and change their password before accessing Prism Central. Also, if SSL is enabled on the Active Directory server, make sure that Nutanix has access to that port (open in the firewall).
        • Use of the "Protected Users" group is currently unsupported for Prism authentication. For more details on the "Protected Users" group, see “Guidance about how to configure protected accounts” on Microsoft documentation website.
        • An Active Directory user name or group name containing spaces is not supported for Prism Central authentication.
        • The Microsoft AD is LDAP v2 and LDAP v3 compliant.
        • The Microsoft AD servers supported are Windows Server 2012 R2, Windows Server 2016, and Windows Server 2019.
      • OpenLDAP : OpenLDAP is a free, open source directory service, which uses the Lightweight Directory Access Protocol (LDAP), developed by the OpenLDAP project.
        Note: Prism Central uses a service account to query OpenLDAP directories for user information and does not currently support certificate-based authentication with the OpenLDAP directory.
    2. Name : Enter a directory name.

      This is a name you choose to identify this entry; it need not be the name of an actual directory.

    3. Domain : Enter the domain name.

      Enter the domain name in DNS format, for example, nutanix.com .

    4. Directory URL : Enter the URL address to the directory.

      The URL format is as follows for an LDAP entry: ldap://host:ldap_port_num. The host value is either the IP address or fully qualified domain name. (In some environments, a simple domain name is sufficient.) The default LDAP port number is 389. Nutanix also supports LDAPS (port 636) and LDAP/S Global Catalog (ports 3268 and 3269). The following are example configurations appropriate for each port option:

      Note: LDAPS support does not require custom certificates or certificate trust import.
      • Port 389 (LDAP). Use this port number (in the following URL form) when the configuration is single domain, single forest, and not using SSL.
        ldap://ad_server.mycompany.com:389
      • Port 636 (LDAPS). Use this port number (in the following URL form) when the configuration is single domain, single forest, and using SSL. This requires all Active Directory Domain Controllers have properly installed SSL certificates.
        ldaps://ad_server.mycompany.com:636
      • Port 3268 (LDAP - GC). Use this port number (in the following URL form) when the configuration is multiple domain, single forest, and not using SSL.
        ldap://ad_server.mycompany.com:3268
      • Port 3269 (LDAPS - GC). Use this port number (in the following URL form) when the configuration is multiple domain, single forest, and using SSL.
        ldaps://ad_server.mycompany.com:3269
        Note:
        • When constructing your LDAP/S URL to use a Global Catalog server, ensure that the Domain Control IP address or name being used is a global catalog server within the domain being configured. If not, queries over 3268/3269 may fail.
        • Cross-forest trust between multiple AD forests is not supported.

      For the complete list of required ports, see Port Reference.
    5. [OpenLDAP only] Configure the following additional fields:
      • User Object Class : Enter the value that uniquely identifies the object class of a user.
      • User Search Base : Enter the base domain name in which the users are configured.
      • Username Attribute : Enter the attribute to uniquely identify a user.
      • Group Object Class : Enter the value that uniquely identifies the object class of a group.
      • Group Search Base : Enter the base domain name in which the groups are configured.
      • Group Member Attribute : Enter the attribute that identifies users in a group.
      • Group Member Attribute Value : Enter the attribute that identifies the users provided as value for Group Member Attribute .
    6. Search Type . How to search your directory when authenticating. Choose Non Recursive if you experience slow directory logon performance. For this option, ensure that users listed in Role Mapping are listed flatly in the group (that is, not nested). Otherwise, choose the default Recursive option.
    7. Service Account Username : Depending on the Directory Type you select in step 2.a, the service account user name format is as follows:
      • For Active Directory , enter the service account user name in the user_name@domain.com format.
      • For OpenLDAP , enter the service account user name in the following Distinguished Name (DN) format:

        cn=username, dc=company, dc=com

        A service account is created to run only a particular service or application with the credentials specified for the account. According to the requirement of the service or application, the administrator can limit access to the service account.

        A service account resides under the Managed Service Accounts in the Active Directory or OpenLDAP server. An application or service uses the service account to interact with the operating system. Enter your Active Directory or OpenLDAP service account credentials in this (username) field and the following (password) field.

        Note: Be sure to update the service account credentials here whenever the service account password changes or when a different service account is used.
    8. Service Account Password : Enter the service account password.
    9. When all the fields are correct, click the Save button (lower right).

      This saves the configuration and redisplays the Authentication Configuration dialog box. The configured directory now appears in the Directory List tab.

    10. Repeat this step for each authentication directory you want to add.
    Note:
    • No permissions are granted to the directory users by default. To grant permissions to the directory users, you must specify roles for the users in that directory (see Configuring Role Mapping).
    • The service account for both Active Directory and OpenLDAP must have full read permission on the directory service.
    Figure. Directory List Fields Click to enlarge Directory List tab display

  3. To edit a directory entry, click the pencil icon for that entry.

    After clicking the pencil icon, the relevant fields reappear. Enter the new information in the appropriate fields and then click the Save button.

  4. To delete a directory entry, click the X icon for that entry.

    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.

Adding a SAML-based Identity Provider

About this task

If you do not enable Nutanix Identity and Access Management (IAM) on Prism Central, ADFS is the only supported identity provider (IDP) for Single Sign-on and only one IDP is allowed at a time. If you enable IAM, additional IDPs are available. See Security Management Using Identity and Access Management (Prism Central) and also Updating ADFS When Using SAML Authentication.

Before you begin

  • An identity provider (typically a server or other computer) is the system that provides authentication through a SAML request. There are various implementations that can provide authentication services in line with the SAML standard.
  • If you enable IAM by enabling CMSP, you can specify other tested standard-compliant IDPs in addition to ADFS. See also the Prism Central release notes topic Identity and Access Management Software Support for specific support requirements.

    Only one identity provider is allowed at a time, so if one was already configured, the + New IDP link does not appear.

  • You must configure the identity provider to return the NameID attribute in SAML response. The NameID attribute is used by Prism Central for role mapping. See Configuring Role Mapping for details.

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. To add a SAML-based identity provider, click the + New IDP link.

    A set of fields is displayed. Do the following in the indicated fields:

    1. Configuration name : Enter a name for the identity provider. This name appears on the login authentication screen.
    2. Import Metadata : Click this radio button to upload a metadata file that contains the identity provider information.

      Identity providers typically provide an XML file on their website that includes metadata about that identity provider, which you can download from that site and then upload to Prism Central. Click + Import Metadata to open a search window on your local system and then select the target XML file that you downloaded previously. Click the Save button to save the configuration.

      Figure. Identity Provider Fields (metadata configuration) Click to enlarge

    This completes configuring an identity provider in Prism Central, but you must also configure the callback URL for Prism Central on the identity provider. To do this, click the Download Metadata link just below the Identity Providers table to download an XML file that describes Prism Central and then upload this metadata file to the identity provider.
  3. To edit an identity provider entry, click the pencil icon for that entry.

    After clicking the pencil icon, the relevant fields reappear. Enter the new information in the appropriate fields and then click the Save button.

  4. To delete an identity provider entry, click the X icon for that entry.

    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.

Enabling and Configuring Client Authentication

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. Click the Client tab, then do the following steps.
    1. Select the Configure Client Chain Certificate check box.

      The Client Chain Certificate is a list of certificates that includes all intermediate CA and root-CA certificates.

    2. Click the Choose File button, browse to and select a client chain certificate to upload, and then click the Open button to upload the certificate.
      Note:
      • Client and CAC authentication support only RSA 2048-bit certificates.
      • Uploaded certificate files must be PEM encoded. The web console restarts after the upload step.
    3. To enable client authentication, click Enable Client Authentication .
    4. To modify client authentication, do one of the following:
      Note: The web console restarts when you change these settings.
      • Click Enable Client Authentication to disable client authentication.
      • Click Remove to delete the current certificate. (This also disables client authentication.)
      • To enable OCSP or CRL based certificate revocation checking, see Certificate Revocation Checking.

    Client authentication allows you to securely access Prism by exchanging a digital certificate. Prism validates that the certificate is signed by your organization’s trusted signing certificate.

    Client authentication ensures that the Nutanix cluster gets a valid certificate from the user. Normally, a one-way authentication process occurs where the server provides a certificate so the user can verify the authenticity of the server (see Installing an SSL Certificate). When client authentication is enabled, this becomes a two-way authentication where the server also verifies the authenticity of the user. A user must provide a valid certificate when accessing the console either by installing the certificate on the local machine or by providing it through a smart card reader.
    Note: The CA must be the same for both the client chain certificate and the certificate on the local machine or smart card.
  3. To specify a service account that the Prism Central web console can use to log in to Active Directory and authenticate Common Access Card (CAC) users, select the Configure Service Account check box, and then do the following in the indicated fields:
    1. Directory : Select the authentication directory that contains the CAC users that you want to authenticate.
      This list includes the directories that are configured on the Directory List tab.
    2. Service Username : Enter the user name in the user name@domain.com format that you want the web console to use to log in to the Active Directory.
    3. Service Password : Enter the password for the service user name.
    4. Click Enable CAC Authentication .
      Note: For federal customers only.
      Note: The Prism Central console restarts after you change this setting.

    The Common Access Card (CAC) is a smart card about the size of a credit card, which some organizations use to access their systems. After you insert the CAC into the CAC reader connected to your system, the software in the reader prompts you to enter a PIN. After you enter a valid PIN, the software extracts your personal certificate that represents you and forwards the certificate to the server using the HTTP protocol.

    Nutanix Prism verifies the certificate as follows:

    • Validates that the certificate has been signed by your organization’s trusted signing certificate.
    • Extracts the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and uses the EDIPI to check the validity of an account within the Active Directory. The security context from the EDIPI is used for your Prism session.
    • Prism Central supports both certificate authentication and basic authentication so that Prism Central login can use a certificate while the REST API uses basic authentication. REST API clients cannot use CAC certificates. With this behavior, if a certificate is present during Prism Central login, certificate authentication is used; if it is not present, basic authentication is enforced.
    Note: Nutanix Prism does not support OpenLDAP as directory service for CAC.
    If you map a Prism Central role to a CAC user and not to an Active Directory group or organizational unit to which the user belongs, specify the EDIPI (User Principal Name, or UPN) of that user in the role mapping. A user who presents a CAC with a valid certificate is mapped to a role and taken directly to the web console home page. The web console login page is not displayed.
    Note: If you have logged on to Prism Central by using CAC authentication, to successfully log out of Prism Central, close the browser after you click Log Out .
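Before uploading a client chain certificate (step 2 above), you can sanity-check that the file meets the stated requirements (PEM encoded, RSA 2048-bit). The sketch below creates a self-signed demo certificate to stand in for a real chain file; the file names are illustrative, not part of the product.

```shell
# Create a demo self-signed RSA 2048-bit certificate to stand in for a
# client chain certificate file (illustrative only).
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/chain_demo.key \
  -out /tmp/chain_demo.pem -days 1 -subj "/CN=demo-ca" 2>/dev/null

# A PEM chain file is plain text with one BEGIN/END CERTIFICATE block per
# certificate; count the certificates in the chain:
grep -c "BEGIN CERTIFICATE" /tmp/chain_demo.pem

# Confirm the certificate parses and uses a 2048-bit key:
openssl x509 -in /tmp/chain_demo.pem -noout -text | grep "Public-Key"
```

A real chain file would show one BEGIN/END block per intermediate and root CA certificate rather than the single block of this demo.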

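The EDIPI check described above relies on the UPN carried in the certificate's Subject Alternative Name. As a hedged illustration of what that field looks like, the sketch below creates a self-signed certificate with a UPN-style otherName (OID 1.3.6.1.4.1.311.20.2.3) and inspects it; the UPN value and file names are fabricated for the example.

```shell
# Create a demo certificate carrying a UPN (EDIPI-style) othername in the
# SAN. Requires OpenSSL 1.1.1+ for -addext; the UPN value is fabricated.
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/cac_demo.key \
  -out /tmp/cac_demo.pem -days 1 -subj "/CN=cac-demo-user" \
  -addext "subjectAltName=otherName:1.3.6.1.4.1.311.20.2.3;UTF8:1234567890@mil" \
  2>/dev/null

# Inspect the SAN; recent OpenSSL versions print the UPN othername here.
openssl x509 -in /tmp/cac_demo.pem -noout -text | grep -A1 "Subject Alternative Name"
```

Inspecting a user certificate this way can help confirm that the UPN (EDIPI) it carries matches the value used in the Prism Central role mapping.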
Certificate Revocation Checking

Enabling Certificate Revocation Checking using Online Certificate Status Protocol (nCLI)

About this task

OCSP is the recommended method for checking certificate revocation in client authentication. You can enable certificate revocation checking using the OCSP method through the command line interface (nCLI).

To enable certificate revocation checking using OCSP for client authentication, do the following.

Procedure

  1. Set the OCSP responder URL.
    ncli authconfig set-certificate-revocation set-ocsp-responder=<ocsp url>

    <ocsp url> indicates the location of the OCSP responder.
  2. Verify if OCSP checking is enabled.
    ncli authconfig get-client-authentication-config

    The expected output if certificate revocation checking is enabled successfully is as follows.

    Auth Config Status: true
    File Name: ca.cert.pem
    OCSP Responder URI: http://<ocsp-responder-url>

Enabling Certificate Revocation Checking using Certificate Revocation Lists (nCLI)

About this task

Note: OCSP is the recommended method for checking certificate revocation in client authentication.

You can use the CRL certificate revocation checking method if required, as described in this section.

To enable certificate revocation checking using CRL for client authentication, do the following.

Procedure

Specify all the CRLs that are required for certificate validation.
ncli authconfig set-certificate-revocation set-crl-uri=<uri 1>,<uri 2> set-crl-refresh-interval=<refresh interval in seconds>
  • The above command resets any previous OCSP or CRL configurations.
  • The URIs must be percent-encoded and comma separated.
  • The CRLs are updated periodically as specified by the crl-refresh-interval value. This interval is common for the entire list of CRL distribution points. The default value for this is 86400 seconds (1 day).
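Because the CRL URIs must be percent-encoded before being comma-joined, a small helper can prepare the set-crl-uri value. This is an illustrative sketch (the PKI URIs are made up) that uses python3 purely for the encoding step:

```shell
# Percent-encode a URI so that reserved characters (including any commas)
# cannot collide with the comma separator of set-crl-uri.
encode() {
  python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$1"
}

uri1=$(encode "http://pki.example.com/crl/root ca.crl")
uri2=$(encode "http://pki.example.com/crl/issuing-ca.crl")
echo "${uri1},${uri2}"

# The joined value would then be passed to nCLI, for example:
#   ncli authconfig set-certificate-revocation \
#     set-crl-uri="${uri1},${uri2}" set-crl-refresh-interval=86400
```

Encoding every reserved character (safe="") is a conservative choice for a value embedded in a single nCLI parameter; at minimum, spaces and commas inside the URIs must be encoded.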

User Management

Managing Local User Accounts

About this task

The Prism Central admin user is created automatically, but you can add more (locally defined) users as needed. To add, update, or delete a user account, do the following:

Note:
  • To add user accounts through Active Directory, see Configuring Authentication. If you enable the Prism Self Service feature, an Active Directory is assigned as part of that process.
  • Changing the Prism Central admin user password does not impact registration (re-registering clusters is not required).

Procedure

  • Click the gear icon in the main menu and then select Local User Management in the Settings page.

    The Local User Management dialog box appears.

    Figure. User Management Window Click to enlarge displays user management window

  • To add a user account, click the New User button and do the following in the displayed fields:
    1. Username : Enter a user name.
    2. First Name : Enter a first name.
    3. Last Name : Enter a last name.
    4. Email : Enter a valid user email address.
    5. Password : Enter a password (maximum of 255 characters).
      Note: A second field to verify the password is not included, so be sure to enter the password correctly in this field.
    6. Language : Select the language setting for the user.

      English is selected by default. You have an option to select Simplified Chinese or Japanese . If you select either of these, the cluster locale is updated for the new user. For example, if you select Simplified Chinese , the user interface is displayed in Simplified Chinese when the new user logs in.

    7. Roles : Assign a role to this user.

      There are three options:

      • Checking the User Admin box allows the user to view information, perform any administrative task, and create or modify user accounts.
      • Checking the Prism Central Admin (formerly "Cluster Admin") box allows the user to view information and perform any administrative task, but it does not provide permission to manage (create or modify) other user accounts.
      • Leaving both boxes unchecked allows the user to view information, but it does not provide permission to perform any administrative tasks or manage other user accounts.
    8. When all the fields are correct, click the Save button (lower right).

      This saves the configuration and redisplays the dialog box with the new user appearing in the list.

    Figure. Create User Window Click to enlarge displays create user window

  • To modify a user account, click the pencil icon for that user and update one or more of the values as desired in the Update User window.
    Figure. Update User Window Click to enlarge displays update user window

  • To disable login access for a user account, click the Yes value in the Enabled field for that user; to enable the account, click the No value.

    A Yes value means the login is enabled; a No value means it is disabled. A user account is enabled (login access activated) by default.

  • To delete a user account, click the X icon for that user.
    A window prompt appears to verify the action; click the OK button. The user account is removed and the user no longer appears in the list.

Updating My Account

About this task

To update your account credentials (that is, credentials for the user you are currently logged in as), do the following:

Procedure

  1. To update your password, select Change Password from the user icon pull-down list of the main menu.
    The Change Password dialog box appears. Do the following in the indicated fields:
    1. Current Password : Enter the current password.
    2. New Password : Enter a new password.
    3. Confirm Password : Re-enter the new password.
    4. When the fields are correct, click the Save button (lower right). This saves the new password and closes the window.
    Note: Password complexity requirements might appear above the fields; if they do, your new password must comply with these rules.
    Figure. Change Password Window Click to enlarge change password window

  2. To update other details of your account, select Update Profile from the user icon pull-down list.
    The Update Profile dialog box appears. Do the following in the indicated fields for any parameters you want to change:
    1. First Name : Enter a different first name.
    2. Last Name : Enter a different last name.
    3. Email Address : Enter a different valid user email address.
    4. Language : Select a different language for your account from the pull-down list.
    5. API Key : Enter a new API key.
      Note: Your keys can be managed from the API Keys page on the Nutanix support portal. Your connection will be secure without the optional public key (following field), and the public key option is provided in the event that your default public key expires.
    6. Public Key : Click the Choose File button to upload a new public key file.
    7. When all the fields are correct, click the Save button (lower right). This saves the changes and closes the window.
    Figure. Update Profile Window Click to enlarge

Resetting Password (CLI)

This procedure describes how to reset a local user’s password on the Prism Element or the Prism Central web consoles.

About this task

To reset the password using nCLI, do the following:

Note:

Only a user with admin privileges can reset a password for other users.

Procedure

  1. Access the CVM via SSH.
  2. Log in with the admin credentials.
  3. Use the ncli user reset-password command and specify the username and password of the user whose password is to be reset:
    nutanix@cvm$ ncli user reset-password user-name=xxxxx password=yyyyy
    
    • Replace user-name=xxxxx with the name of the user whose password is to be reset.

    • Replace password=yyyyy with the new password.

What to do next

You can relaunch the Prism Element or the Prism Central web console and verify the new password setting.

Deleting a Directory User Account

About this task

To delete a directory-authenticated user, do the following:

Procedure

  1. Click the Hamburger icon, and go to Administration > Projects
    The Project page appears. This page lists all existing projects.
  2. Select the project that the user is associated with and go to Actions > Update Projects
    The Edit Projects page appears.
  3. Go to Users, Groups, Roles tab.
  4. Click the X icon to delete the user.
    Figure. Edit Project Window Click to enlarge

  5. Click Save

    Prism deletes the user account and also removes the user from any associated projects.

    Repeat the same steps if the user is associated with multiple projects.

Controlling User Access (RBAC)

Prism Central supports role-based access control (RBAC) that you can configure to provide customized access permissions for users based on their assigned roles. The roles dashboard allows you to view information about all defined roles and the users and groups assigned to those roles.

  • Prism Central includes a set of predefined roles (see Built-in Role Management).
  • You can also define additional custom roles (see Custom Role Management).
  • Configuring authentication confers default user permissions that vary depending on the type of authentication (full permissions from a directory service or no permissions from an identity provider). You can configure role maps to customize these user permissions (see Configuring Role Mapping).
  • You can refine access permissions even further by assigning roles to individual users or groups that apply to a specified set of entities (see Assigning a Role).
    Note: The entities are treated as separate instances. For example, to grant a user or a group permission to manage clusters and images, an administrator must add both of these entities to the list of assignments.
  • With RBAC, user roles do not depend on the project membership. You can use RBAC and log in to Prism Central even without a project membership.
Note: Defining custom roles and assigning roles are supported on AHV only.

Built-in Role Management

The following built-in roles are defined by default. You can see a more detailed list of permissions for any of the built-in roles through the details view for that role (see Displaying Role Permissions). The Project Admin, Developer, Consumer, and Operator roles are available when assigning roles in a project.

Role Privileges
Super Admin Full administrator privileges
Prism Admin Full administrator privileges except for creating or modifying the user accounts
Prism Viewer View-only privileges
Self-Service Admin Manages all cloud-oriented resources and services
Note: This is the only cloud administration role available.
Project Admin Manages cloud objects (roles, VMs, Apps, Marketplace) belonging to a project
Note: You can specify a role for a user when you assign a user to a project, so individual users or groups can have different roles in the same project.
Developer Develops, troubleshoots, and tests applications in a project
Consumer Accesses the applications and blueprints in a project
Operator Accesses the applications in a project
Note: Previously, the Super Admin role was called User Admin , the Prism Admin role was called Prism Central Admin and Cluster Admin , and the Prism Viewer was called Viewer .

Custom Role Management

If the built-in roles are not sufficient for your needs, you can create one or more custom roles (AHV only).

Creating a Custom Role

About this task

To create a custom role, do the following:

Procedure

  1. Go to the roles dashboard (select Administration > Roles in the pull-down menu) and click the Create Role button.

    The Roles page appears. See Custom Role Permissions for a list of the permissions available for each custom role option.

    Figure. Roles Page Click to enlarge displays the roles page

  2. In the Roles page, do the following in the indicated fields:
    1. Role Name : Enter a name for the new role.
    2. Description (optional): Enter a description of the role.
      Note: All entity types are listed by default, but you can display just a subset by entering a string in the Filter Entities search field.
      Figure. Filter Entities Click to enlarge Filters the available entities

    3. App : Click the radio button for the desired application permissions ( No Access , Basic Access , or Set Custom Permissions ). If you specify custom permissions, click the Change link to display the Custom App Permissions window, check all the permissions you want to enable, and then click the Save button.
      Figure. Custom App Permissions Window

    4. VM : Click the radio button for the desired VM permissions ( No Access , View Access , Basic Access , Edit Access , or Set Custom Permissions ). Check the Allow VM Creation box to allow this role to create VMs. If you specify custom permissions, click the Change link to display the Custom VM Permissions window, check all the permissions you want to enable, and then click the Save button.
      Figure. Custom VM Permissions Window

    5. Recovery Plan : Click the radio button for the desired permissions for recovery plan operations ( No Access , View Access , Test Execution Access , Full Execution Access , or Set Custom Permissions ). If you specify custom permissions, click the Change link to display the Custom Recovery Plan Permissions window, check all the permissions you want to enable (see Custom Role Permissions), and then click the Save button.
      Figure. Custom Recovery Plan Permissions Window

    6. Blueprint : Click the radio button for the desired blueprint permissions ( No Access , View Access , Basic Access , or Set Custom Permissions ). Check the Allow Blueprint Creation box to allow this role to create blueprints. If you specify custom permissions, click the Change link to display the Custom Blueprint Permissions window, check all the permissions you want to enable, and then click the Save button.
      Figure. Custom Blueprint Permissions Window

    7. Marketplace Item : Click the radio button for the desired marketplace permissions ( No Access , View marketplace and published blueprints , View marketplace and publish new blueprints , or Set custom permissions ). If you specify custom permissions, click the Change link to display the Custom Marketplace Item Permissions window, check all the permissions you want to enable, and then click the Save button.
      Note: The permission you enable for a Marketplace Item implicitly applies to the Catalog Item entity. For example, if you select the No Access permission for the Marketplace Item entity while creating the custom role, the custom role does not have access to the Catalog Item entity either.

      Figure. Custom Marketplace Item Permissions Window

    8. Report : Click the radio button for the desired report permissions ( No Access , View Only , Edit Access , or Set Custom Permissions ). If you specify custom permissions, click the Change link to display the Custom Report Permissions window, check all the permissions you want to enable, and then click the Save button.
      Figure. Custom Report Permissions Window

    9. Cluster : Click the radio button for the desired cluster permissions ( No Access or Cluster Access ).
    10. Subnet : Click the radio button for the desired subnet permissions ( No Access or Subnet Access ).
    11. Image : Click the radio button for the desired image permissions ( No Access , View Only , or Set Custom Permissions ). If you specify custom permissions, click the Change link to display the Custom Image Permissions window, check all the permissions you want to enable, and then click the Save button.
      Figure. Custom Image Permissions Window

  3. Click Save to add the role. The page closes and the new role appears in the Roles view list.
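
The UI workflow above can also be automated. The sketch below assumes the Prism Central v3 REST API (the /api/nutanix/v3/roles endpoint); the address, credentials, role name, and permission UUIDs are placeholders, so query the permissions list endpoint first to find the real UUIDs in your deployment.

```shell
# Build the request body for a custom role. The role name and the
# permission UUIDs below are placeholders, not real values.
cat > role_payload.json <<'EOF'
{
  "spec": {
    "name": "vm_operator_custom",
    "resources": {
      "permission_reference_list": [
        { "kind": "permission", "uuid": "<view-vm-permission-uuid>" },
        { "kind": "permission", "uuid": "<update-vm-power-state-uuid>" }
      ]
    }
  },
  "metadata": { "kind": "role" }
}
EOF

# Look up real permission UUIDs first (placeholder address/credentials;
# -k is only acceptable while the default self-signed certificate is in place):
# curl -k -u admin:'<password>' -X POST \
#   https://pc.example.com:9440/api/nutanix/v3/permissions/list \
#   -H 'Content-Type: application/json' -d '{"kind":"permission"}'

# Then submit the role:
# curl -k -u admin:'<password>' -X POST \
#   https://pc.example.com:9440/api/nutanix/v3/roles \
#   -H 'Content-Type: application/json' -d @role_payload.json
```

The role created this way appears in the same Roles view list as roles created through the UI.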
Modifying a Custom Role

About this task

Perform the following procedure to modify or delete a custom role.

Procedure

  1. Go to the roles dashboard and select (check the box for) the desired role from the list.
  2. Do one of the following:
    • To modify the role, select Update Role from the Actions pull-down list. The Roles page for that role appears. Update the field values as desired and then click Save . See Creating a Custom Role for field descriptions.
    • To delete the role, select Delete from the Actions pull-down list. A confirmation message is displayed. Click OK to delete and remove the role from the list.
Custom Role Permissions

A selection of permission options is available when creating a custom role.

The following table lists the permissions you can grant when creating or modifying a custom role. When you select an option for an entity, the permissions listed for that option are granted. If you select Set custom permissions , a complete list of available permissions for that entity appears. Select the desired permissions from that list.

Entity Option Permissions
App (application) No Access (none)
Basic Access Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create AWS VM, Create Image, Create VM, Delete AWS VM, Delete VM, Download App Runlog, Update AWS VM, Update VM, View App, View AWS VM, View VM
Set Custom Permissions (select from list) Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create App, Create AWS VM, Create Image, Create VM, Delete App, Delete AWS VM, Delete VM, Download App Runlog, Update App, Update AWS VM, Update VM, View App, View AWS VM, View VM
VM Recovery Point No Access (none)
View Only View VM Recovery Point
Full Access Delete VM Recovery Point, Restore VM Recovery Point, Snapshot VM, Update VM Recovery Point, View VM Recovery Point, Allow VM Recovery Point creation
Set Custom Permissions (Change) Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create App, Create AWS VM, Create Image, Create VM, Delete App, Delete AWS VM, Delete VM, Download App Runlog, Update App, Update AWS VM, Update VM, View App, View AWS VM, View VM
Note:

You can assign permissions for the VM Recovery Point entity to users or user groups in the following two ways.

  • Manually assign permission for each VM where the recovery point is created.
  • Assign permission using Categories in the Role Assignment workflow.
Tip: When a recovery point is created, it is associated with the same category as the VM.
VM No Access (none)
View Access Access Console VM, View VM
Basic Access Access Console VM, Update VM Power State, View VM
Edit Access Access Console VM, Update VM, View Subnet, View VM
Set Custom Permissions (select from list) Access Console VM, Clone VM, Create VM, Delete VM, Update VM, Update VM Boot Config, Update VM CPU, Update VM Categories, Update VM Disk List, Update VM GPU List, Update VM Memory, Update VM NIC List, Update VM Owner, Update VM Power State, Update VM Project, View Cluster, View Subnet, View VM
Allow VM creation (additional option) (n/a)
Blueprint No Access (none)
View Access View Account, View AWS Availability Zone, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet
Basic Access Access Console VM, Clone VM, Create App, Create Image, Create VM, Delete VM, Launch Blueprint, Update VM, View Account, View App, View AWS Availability Zone, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM
Set Custom Permissions (select from list) Access Console VM, Clone VM, Create App, Create Blueprint, Create Image, Create VM, Delete Blueprint, Delete VM, Download Blueprint, Export Blueprint, Import Blueprint, Launch Blueprint, Render Blueprint, Update Blueprint, Update VM, Upload Blueprint, View Account, View App, View AWS Availability Zone, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM
Marketplace Item No Access (none)
View marketplace and published blueprints View Marketplace Item
View marketplace and publish new blueprints Update Marketplace Item, View Marketplace Item
Set Custom Permissions (select from list) Config Marketplace Item, Create Marketplace Item, Delete Marketplace Item, Render Marketplace Item, Update Marketplace Item, View Marketplace Item
Report No Access (none)
View Only Notify Report Instance, View Common Report Config, View Report Config, View Report Instance
Edit Access Create Common Report Config, Create Report Config, Create Report Instance, Delete Common Report Config, Delete Report Config, Delete Report Instance, Notify Report Instance, Update Common Report Config, Update Report Config, View Common Report Config, View Report Config, View Report Instance
Set Custom Permissions (select from list) Create Common Report Config, Create Report Config, Create Report Instance, Delete Common Report Config, Delete Report Config, Delete Report Instance, Notify Report Instance, Update Common Report Config, Update Report Config, View Common Report Config, View Report Config, View Report Instance
Cluster No Access (none)
View Access View Cluster
Subnet No Access (none)
View Access View Subnet
Image No Access (none)
View Only View Image
Set Custom Permissions (select from list) Copy Image Remote, Create Image, Delete Image, Migrate Image, Update Image, View Image

The following table describes the permissions.

Note: By default, assigning certain permissions to a user role might implicitly assign more permissions to that role. However, the implicitly assigned permissions will not be displayed in the details page for that role. These permissions are displayed only if you manually assign them to that role.
Permission Description Assigned Implicitly By
Create App Allows to create an application.
Delete App Allows to delete an application.
View App Allows to view an application.
Action Run App Allows to run action on an application.
Download App Runlog Allows to download an application runlog.
Abort App Runlog Allows to abort an application runlog.
Access Console VM Allows to access the console of a virtual machine.
Create VM Allows to create a virtual machine.
View VM Allows to view a virtual machine.
Clone VM Allows to clone a virtual machine.
Delete VM Allows to delete a virtual machine.
Export VM Allows to export a virtual machine.
Snapshot VM Allows to snapshot a virtual machine.
View VM Recovery Point Allows to view a vm_recovery_point.
Update VM Recovery Point Allows to update a vm_recovery_point.
Delete VM Recovery Point Allows to delete a vm_recovery_point.
Restore VM Recovery Point Allows to restore a vm_recovery_point.
Update VM Allows to update a virtual machine.
Update VM Boot Config Allows to update a virtual machine's boot configuration. Update VM
Update VM CPU Allows to update a virtual machine's CPU configuration. Update VM
Update VM Categories Allows to update a virtual machine's categories. Update VM
Update VM Description Allows to update a virtual machine's description. Update VM
Update VM GPU List Allows to update a virtual machine's GPUs. Update VM
Update VM NIC List Allows to update a virtual machine's NICs. Update VM
Update VM Owner Allows to update a virtual machine's owner. Update VM
Update VM Project Allows to update a virtual machine's project. Update VM
Update VM NGT Config Allows updates to a virtual machine's Nutanix Guest Tools configuration. Update VM
Update VM Power State Allows updates to a virtual machine's power state. Update VM
Update VM Disk List Allows to update a virtual machine's disks. Update VM
Update VM Memory Allows to update a virtual machine's memory configuration. Update VM
Update VM Power State Mechanism Allows updates to a virtual machine's power state mechanism. Update VM or Update VM Power State
Allow VM Power Off Allows power off and shutdown operations on a virtual machine. Update VM or Update VM Power State
Allow VM Power On Allows power on operation on a virtual machine. Update VM or Update VM Power State
Allow VM Reboot Allows reboot operation on a virtual machine. Update VM or Update VM Power State
Expand VM Disk Size Allows to expand a virtual machine's disk size. Update VM or Update VM Disk List
Mount VM CDROM Allows to mount an ISO to a virtual machine's CDROM. Update VM or Update VM Disk List
Unmount VM CDROM Allows to unmount an ISO from a virtual machine's CDROM. Update VM or Update VM Disk List
Update VM Memory Overcommit Allows to update a virtual machine's memory overcommit configuration. Update VM or Update VM Memory
Allow VM Reset Allows reset (hard reboot) operation on a virtual machine. Update VM, Update VM Power State, or Allow VM Reboot
View Cluster Allows to view a cluster.
Update Cluster Allows to update a cluster.
Create Image Allows to create an image.
View Image Allows to view an image.
Copy Image Remote Allows to copy an image from local PC to remote PC.
Delete Image Allows to delete an image.
Migrate Image Allows to migrate an image from PE to PC.
Update Image Allows to update an image.
Create Image Placement Policy Allows to create an image placement policy.
View Image Placement Policy Allows to view an image placement policy.
Delete Image Placement Policy Allows to delete an image placement policy.
Update Image Placement Policy Allows to update an image placement policy.
Create AWS VM Allows to create an AWS virtual machine.
View AWS VM Allows to view an AWS virtual machine.
Update AWS VM Allows to update an AWS virtual machine.
Delete AWS VM Allows to delete an AWS virtual machine.
View AWS AZ Allows to view AWS Availability Zones.
View AWS Elastic IP Allows to view an AWS Elastic IP.
View AWS Image Allows to view an AWS image.
View AWS Key Pair Allows to view AWS keypairs.
View AWS Machine Type Allows to view AWS machine types.
View AWS Region Allows to view AWS regions.
View AWS Role Allows to view AWS roles.
View AWS Security Group Allows to view an AWS security group.
View AWS Subnet Allows to view an AWS subnet.
View AWS Volume Type Allows to view AWS volume types.
View AWS VPC Allows to view an AWS VPC.
Create Subnet Allows to create a subnet.
View Subnet Allows to view a subnet.
Update Subnet Allows to update a subnet.
Delete Subnet Allows to delete a subnet.
Create Blueprint Allows to create the blueprint of an application.
View Blueprint Allows to view the blueprint of an application.
Launch Blueprint Allows to launch the blueprint of an application.
Clone Blueprint Allows to clone the blueprint of an application.
Delete Blueprint Allows to delete the blueprint of an application.
Download Blueprint Allows to download the blueprint of an application.
Export Blueprint Allows to export the blueprint of an application.
Import Blueprint Allows to import the blueprint of an application.
Render Blueprint Allows to render the blueprint of an application.
Update Blueprint Allows to update the blueprint of an application.
Upload Blueprint Allows to upload the blueprint of an application.
Create OVA Allows to create an OVA.
View OVA Allows to view an OVA.
Update OVA Allows to update an OVA.
Delete OVA Allows to delete an OVA.
Create Marketplace Item Allows to create a marketplace item.
View Marketplace Item Allows to view a marketplace item.
Update Marketplace Item Allows to update a marketplace item.
Config Marketplace Item Allows to configure a marketplace item.
Render Marketplace Item Allows to render a marketplace item.
Delete Marketplace Item Allows to delete a marketplace item.
Create Report Config Allows to create a report_config.
View Report Config Allows to view a report_config.
Run Report Config Allows to run a report_config.
Share Report Config Allows to share a report_config.
Update Report Config Allows to update a report_config.
Delete Report Config Allows to delete a report_config.
Create Common Report Config Allows to create a common report_config.
View Common Report Config Allows to view a common report_config.
Update Common Report Config Allows to update a common report_config.
Delete Common Report Config Allows to delete a common report_config.
Create Report Instance Allows to create a report_instance.
View Report Instance Allows to view a report_instance.
Notify Report Instance Allows to notify a report_instance.
Share Report Instance Allows to share a report_instance.
Delete Report Instance Allows to delete a report_instance.
View Account Allows to view an account.
View Project Allows to view a project.
View User Allows to view a user.
View User Group Allows to view a user group.
View Name Category Allows to view a category's name.
View Value Category Allows to view a category's value.
View Virtual Switch Allows to view a virtual switch.
Granting Restore Permission to Project User

About this task

By default, only a self service admin or a cluster admin can view and restore the recovery points. However, a self service admin or cluster admin can grant permission to the project user to restore the VM from a recovery point.

To grant restore permission to a project user, do the following:

Procedure

  1. Log on to Prism Central with cluster admin or self service admin credentials.
  2. Go to the roles dashboard (select Administration > Roles in the pull-down menu) and do one of the following:
    • Click the Create Role button.
    • Select an existing role of a project user and then select Duplicate from the Actions drop-down menu. To modify the duplicate role, select Update Role from the Actions pull-down list.
  3. The Roles page for that role appears. In the Roles page, do the following in the indicated fields:
    1. Role Name : Enter a name for the new role.
    2. Description (optional): Enter a description of the role.
    3. Expand VM Recovery Point and do one of the following:
      • Select Full Access and then select Allow VM recovery point creation .
      • Click Change next to Set Custom Permissions to customize the permissions. Enable Restore VM Recovery Point permission. This permission also grants the permission to view the VM created from the restore process.
    4. Click Save to add the role. The page closes and the new role appears in the Roles view list.
  4. In the Roles view, select the newly created role and click Manage Assignment to assign the user to this role.
  5. In the Add New dialog, do the following:
    • Under Select Users or User Groups or OUs , enter the target user name. The search box displays the matched records. Select the required listing from the records.
    • Under Entities , select VM Recovery Point , select Individual Entry from the drop-down list, and then select All VM Recovery Points.
    • Click Save to finish.

Configuring Role Mapping

About this task

After user authentication is configured (see Configuring Authentication), users and authorized directories are not assigned any permissions by default. You must explicitly assign the required permissions to users, authorized directories, or organizational units using role mapping.

You can refine the authentication process by assigning a role with associated permissions to users, groups, and organizational units. This procedure allows you to map individual users or groups to the predefined roles in Prism Central, such as User Admin , Cluster Admin , and Viewer . To assign roles, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Role Mapping from the Settings page.

    The Role Mapping window appears.

    Figure. Role Mapping Window

  2. To create a role mapping, click the New Mapping button.

    The Create Role Mapping window appears. Enter the required information in the following fields.

    Figure. Create Role Mapping Window

  3. Directory or Provider : Select the target directory or identity provider from the pull-down list.
    Only directories and identity providers previously configured in the authentication settings are available. If the desired directory or provider does not appear in the list, add that directory or provider, and then return to this procedure.
  4. Type : Select the desired LDAP entity type from the pull-down list.
    This field appears only if you have selected a directory from the Directory or Provider pull-down list. The following entity types are available:
    • User : A named user. For example, dev_user_1.
    • Group : A group of users. For example, dev_grp1, dev_grp2, sr_dev_1, and staff_dev_1.
    • OU : An organizational unit with one or more users, groups, and even other organizational units. For example, all_dev consists of user dev_user_1 and groups dev_grp1, dev_grp2, sr_dev_1, and staff_dev_1.
  5. Role : Select a user role from the pull-down list.
    You can choose one of the following roles:
    • Viewer : Allows users view-only access to information; they cannot perform any administrative tasks.
    • Cluster Admin (Formerly Prism Central Admin): Allows users to view and perform all administrative tasks except creating or modifying user accounts.
    • User Admin : Allows users to view information, perform administrative tasks, and to create and modify user accounts.
  6. Values : Enter the entity names. The entity names are assigned the respective roles that you selected.
    The entity names are case sensitive. To provide more than one entity name, separate the names with commas (,) without any spaces between them.

    LDAP-based authentication

    • For AD

      Enter the actual names used by the organizational units (it applies to all users and groups in those OUs), groups (all users in those groups), or users (each named user) used in LDAP in the Values field.

      For example, entering sr_dev_1,staff_dev_1 in the Values field when the LDAP type is Group and the role is Cluster Admin, implies that all users in the sr_dev_1 and staff_dev_1 groups are assigned the administrative role for the cluster.

      Do not include the domain name in the value. For example, enter all_dev , and not all_dev@<domain_name> . However, users must include the domain along with the username when they log in.

      User : Enter the sAMAccountName or userPrincipalName in the values field.

      Group : Enter common name (cn) or name.

      OU : Enter name.

    • For OpenLDAP

      User : Use the username attribute (that was configured while adding the directory) value.

      Group : Use the group name attribute (cn) value.

      OU : Use the OU attribute (ou) value.

    SAML-based authentication:

    You must configure the NameID attribute in the identity provider. You can enter the NameID returned in the SAML response in the Values field.

    For SAML, only the User type is supported. Other types, such as Group and OU, are not supported.

    If you enable Identity and Access Management, see Security Management Using Identity and Access Management (Prism Central).

  7. Click Save .

    The role mapping configurations are saved, and the new role is listed in the Role Mapping window.

    You can create a role map for each authorized directory. You can also create multiple role maps that apply to a single directory. When there are multiple maps for a directory, the most specific rule for a user applies.

    For example, adding a Group map set to Cluster Admin and a User map set to Viewer for a few specific users in that group means all users in the group have administrator permission except those few specific users who have only viewing permission.

  8. To edit a role map entry, click the pencil icon for that entry.
    After clicking the pencil icon, the Edit Role Mapping window appears which is similar to the Create Role Mapping window. Edit the required information in the required fields and click the Save button to update the changes.
  9. To delete a role map entry, click the X icon for that entry and click the OK button to confirm the role map entry deletion.
    The role map entry is removed from the list.

Assigning a Role

About this task

In addition to configuring basic role maps (see Configuring Role Mapping), you can configure more precise role assignments (AHV only). To assign a role to selected users or groups that applies just to a specified set of entities, do the following:

Procedure

  1. Select the desired role in the roles dashboard and then click the Role Assignment button in the details page.
  2. Click the New Users button and enter the user or group name you want assigned to this role.

    Entering text in the field displays a list of users from which you can select, and you can enter multiple user names in this field.

  3. Click the New Entities button, select the entity type from the pull-down list, and then enter the entity name in the field.

    Entering text in the field displays a list of entities from which you can select, and you can enter multiple entity names in the field. You can choose from the following entity types:

    • AHV VM —allows management of VMs including create and edit
    • Category —custom role permissions
    • AHV Subnet —allows user to view subnet details
    • AHV Cluster —allows user to view cluster details and manage cluster details per permissions assigned
  4. Repeat for any combination of users/entities you want to define.

    You can specify various user/entity relationships when configuring the role assignment. To illustrate, in the following example the first line assigns the my_custom_role to a single user (ssp_admin) for two VMs (normal_vm and test_andrey). The second line assigns the role to two users (locus1 and locus2) for a single category (4gcC1Z). The third line again assigns the role to the user locus1 but this time for all subnets.

    Note: To allow users to create certain entities like a VM, you may also need to grant them access to related entities like clusters, networks, and images that the VM requires.
    Figure. Role Assignment Page

  5. Click the Save button (lower right) to save the role assignments.

Displaying Role Permissions

About this task

Do the following to display the privileges associated with a role.

Procedure

  1. Go to the roles dashboard and select the desired role from the list.

    For example, if you click the Consumer role, the details page for that role appears, and you can view all the privileges associated with the Consumer role.

    Figure. Role Summary Tab

  2. Click the Users tab to display the users that are assigned this role.
    Figure. Role Users Tab

  3. Click the User Groups tab to display the groups that are assigned this role.
  4. Click the Role Assignment tab to display the user/entity pairs assigned this role (see Assigning a Role).

Installing an SSL Certificate

About this task

Prism Central supports SSL certificate-based authentication for console access. To install a self-signed or custom SSL certificate, do the following:
Important: Ensure that SSL certificates are not password protected.
Note: Nutanix recommends that you replace the default self-signed certificate with a CA signed certificate.

Procedure

  1. Click the gear icon in the main menu and then select SSL Certificate in the Settings page.
  2. To replace (or install) a certificate, click the Replace Certificate button.
    Figure. SSL Certificate Window

  3. To create a new self-signed certificate, click the Regenerate Self Signed Certificate option and then click the Apply button.

    A dialog box appears to verify the action; click the OK button. This generates and applies a new RSA 2048-bit self-signed certificate for Prism Central.

    Figure. SSL Certificate Window: Regenerate

  4. To apply a custom certificate that you provide, do the following:
    1. Click the Import Key and Certificate option and then click the Next button.
      Figure. SSL Certificate Window: Import
    2. Do the following in the indicated fields, and then click the Import Files button.
      Note:
      • All three imported files for the custom certificate must be PEM encoded.
      • Ensure that the private key does not have any extra data (or custom attributes) before the beginning (-----BEGIN PRIVATE KEY-----) or after the end (-----END PRIVATE KEY-----) of the private key block.
      • Private Key Type : Select the appropriate type for the signed certificate from the pull-down list (RSA 4096 bit, RSA 2048 bit, EC DSA 256 bit, or EC DSA 384 bit).
      • Private Key : Click the Browse button and select the private key associated with the certificate to be imported.
      • Public Certificate : Click the Browse button and select the signed public portion of the server certificate corresponding to the private key.
      • CA Certificate/Chain : Click the Browse button and select the certificate or chain of the signing authority for the public certificate.
      Figure. SSL Certificate Window: Select Files

      To meet the high security standards of NIST SP800-131a compliance and the RFC 6460 requirements for NSA Suite B, and to provide optimal encryption performance, the certificate import process validates that the correct signature algorithm is used for a given key/cert pair. Refer to the following table to ensure the proper combination of key type, size/curve, and signature algorithm. The CA must sign all public certificates with the proper type, size/curve, and signature algorithm for the import process to validate successfully.
      Note: Prism does not have any specific requirement or enforcement logic for the subject name of the certificates (subject alternative names (SAN)) or wildcard certificates.
      Table 1. Supported Key Configurations
      Key Type Size/Curve Signature Algorithm
      RSA 4096 SHA256-with-RSAEncryption
      RSA 2048 SHA256-with-RSAEncryption
      EC DSA 256 prime256v1 ecdsa-with-sha256
      EC DSA 384 secp384r1 ecdsa-with-sha384
      EC DSA 521 secp521r1 ecdsa-with-sha512
      You can use the cat command to concatenate a list of CA certificates into a chain file.
      $ cat signer.crt inter.crt root.crt > server.cert
      Order is essential. The total chain should begin with the certificate of the signer and end with the root CA certificate as the final entry.
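
Before importing, you can verify that a key/cert pair matches one of the supported combinations with openssl. This sketch generates a throwaway self-signed RSA 2048 pair (the file names and subject are arbitrary examples) and checks the signature algorithm; the same grep check applies to a CA-signed certificate returned by your signing authority.

```shell
# Generate a self-signed RSA 2048 test pair; -sha256 yields the
# sha256WithRSAEncryption signature the table requires for RSA keys.
openssl req -x509 -newkey rsa:2048 -sha256 -nodes -days 365 \
  -keyout server.key -out server.crt -subj "/CN=pc.example.com"

# Confirm the signature algorithm before importing the certificate.
openssl x509 -in server.crt -noout -text | grep "Signature Algorithm"
# e.g. Signature Algorithm: sha256WithRSAEncryption

# The equivalent EC DSA 256 (prime256v1) pair would be:
# openssl ecparam -name prime256v1 -genkey -noout -out ec.key
# openssl req -new -x509 -key ec.key -sha256 -days 365 \
#   -out ec.crt -subj "/CN=pc.example.com"
```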

Results

After generating or uploading the new certificate, the interface gateway restarts. If the certificate and credentials are valid, the interface gateway uses the new certificate immediately, which means your browser session (and all other open browser sessions) will be invalid until you reload the page and accept the new certificate. If anything is wrong with the certificate (such as a corrupted file or wrong certificate type), the new certificate is discarded, and the system reverts back to the original default certificate provided by Nutanix.

Note: The system holds only one custom SSL certificate. If a new certificate is uploaded, it replaces the existing certificate. The previous certificate is discarded.

Controlling Remote (SSH) Access

About this task

Nutanix supports key-based SSH access to Prism Central. Enabling key-based SSH access disables password authentication, so only the keys you have provided can be used to access Prism Central (for the nutanix/admin users only). This makes Prism Central more secure.

You can create a key pair (or multiple key pairs) and add the public keys to enable key-based SSH access. However, when site security requirements do not allow such access, you can remove all public keys to prevent SSH access.

To control key-based SSH access to Prism Central, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Cluster Lockdown in the Settings page.

    The Cluster Lockdown dialog box appears. Enabled public keys (if any) are listed in this window.

    Figure. Cluster Lockdown Window Click to enlarge displays cluster lockdown window

  2. To disable (or enable) remote login access, uncheck (check) the Enable Remote Login with Password box.

    Remote login access is enabled by default.

  3. To add a new public key, click the New Public Key button and then do the following in the displayed fields:
    1. Name : Enter a key name.
    2. Key : Enter (paste) the key value into the field.
    3. Click the Save button (lower right) to save the key and return to the main Cluster Lockdown window.

    There are no public keys available by default, but you can add any number of public keys.

  4. To delete a public key, click the X on the right of that key line.
    Note: Deleting all the public keys and disabling remote login access locks down the cluster from SSH access.

Password Retry Lockout

For enhanced security, Prism Central and Prism Element lock out the default 'admin' account for 15 minutes after five unsuccessful login attempts. Once the account is locked out, the following message is displayed on the login screen.

Account locked due to too many failed attempts

You can try entering the password again after the 15-minute lockout period, or contact Nutanix Support if you have forgotten your password.

Security Policies using Flow

Nutanix Flow includes a policy-driven security framework that inspects traffic within the data center. For more information, see the Flow Microsegmentation Guide.

Security Management Using Identity and Access Management (Prism Central)

Enabled and administered from Prism Central, Identity and Access Management (IAM) is an authentication and authorization feature that uses attribute-based access control (ABAC). It is disabled by default. This section describes Prism Central IAM prerequisites, enablement, and SAML-based standard-compliant identity provider (IDP) configuration.

After you enable the Micro Services Infrastructure (CMSP) on Prism Central, IAM is automatically enabled. You can then configure a wider selection of identity providers, including Security Assertion Markup Language (SAML) based identity providers. The Prism Central web console presents an updated sign-on/authentication page.

The enablement process migrates existing directory, identity provider, and user configurations, including Common Access Card (CAC) client authentication configurations. After enabling IAM, if you want to enable a client to authenticate by using certificates, you must also enable CAC authentication. For more information, see Identity and Access Management Prerequisites and Considerations. Also, see the Identity and Access Management Software Support topic in the Prism Central Release Notes for specific support requirements.

The workflows for creating authentication configurations and providing user and role access described in Configuring Authentication are the same whether IAM is enabled or not.

IAM Features

Highly Scalable Architecture

Based on the Kubernetes open source platform, IAM uses independent pods for authentication (AuthN), authorization (AuthZ), and IAM data storage and replication.

  • Each pod automatically scales independently of Prism Central when required. No user intervention or control is required.
  • When new features or functions are available, you can update IAM pods independently of Prism Central updates through Life Cycle Manager (LCM).
  • IAM uses a rolling upgrade method to help ensure zero downtime.
Secure by Design
  • Mutual TLS authentication (mTLS) secures IAM component communication.
  • The Micro Services infrastructure (CMSP) on Prism Central provisions certificates for mTLS.
More SAML Identity Providers (IDP)

Without CMSP/IAM enabled on Prism Central, Active Directory Federation Services (ADFS) is the only supported IDP for Single Sign-on. After you enable CMSP/IAM, IAM supports more IDPs. Nutanix has tested the following IDPs with SAML IDP authentication configured for Prism Central.

  • Active Directory Federation Services (ADFS)
  • Azure Active Directory Federation Services (Azure ADFS)
  • Okta
  • PingOne
  • Shibboleth
  • KeyCloak

Users can log on from the Prism Central web console only. IDP-initiated authentication workflows are not supported; that is, logging on from an IDP web page or site is not supported.

Updated Authentication Page

After enabling IAM, the Prism Central login page is updated depending on your configuration. For example, if you have configured local user account and Active Directory authentication, this default page appears for directory users as follows. To log in as a local user, click the Log In with your Nutanix Local Account link.

Figure. Sample Default Prism Central IAM Logon Page, Active Directory And Local User Authentication Click to enlarge Sample Prism Central IAM Logon Page shows new credential fields

In another example, if you have configured SAML authentication instances named Shibboleth and AD2, Prism Central displays this page.

Figure. Sample Prism Central IAM Logon Page, Active Directory , Identity Provider, And Local User Authentication Click to enlarge Sample Prism Central IAM Logon Page shows new credential fields

Note: After upgrading to pc.2022.9, if a Security Assertion Markup Language (SAML) IDP is configured, you need to download the Prism Central metadata and reconfigure the SAML IDP to recognize Prism Central as the service provider. See Updating ADFS When Using SAML Authentication to create the required rules for ADFS.

Identity and Access Management Prerequisites and Considerations

IAM Prerequisites

For specific minimum software support and requirements for IAM, see the Prism Central release notes.

For microservices infrastructure requirements, see Enabling Microservices Infrastructure in the Prism Central Guide .

Prism Central
  • Ensure that Prism Central is hosted on an AOS cluster running AHV.
  • Ensure that you have created a Virtual IP address (VIP) for Prism Central. The Acropolis Upgrade Guide describes how to set the VIP for the Prism Central VM. Once set, do not change this address.
  • Ensure that you have created a fully qualified domain name (FQDN) for Prism Central. Once the Prism Central FQDN is set, do not change it. For more information about how to set the FQDN in the Cluster Details window, see Managing Prism Central in the Prism Central Guide .
  • When microservices infrastructure is enabled on a Prism Central scale-out three-node deployment, reconfiguring the IP address and gateway of the Prism Central VMs is not supported.
  • Ensure connectivity between Prism Central and its managed Prism Element clusters.
  • Enable Microservices Infrastructure on Prism Central (CMSP) first to enable and use IAM. For more information, see Enabling Microservices Infrastructure in the Prism Central Guide .
  • IAM supports small or large single PC VM deployments. However, you cannot expand the single VM deployment to a scale-out three-node deployment.
  • IAM supports scale-out three-node PC VMs deployments. Reverting this deployment to a single PC VM deployment is not supported.
  • Make sure Prism Central is managing at least one Prism Element cluster. For more information about how to register a cluster, see Register (Unregister) Cluster with Prism Central in the Prism Central Guide .
  • You cannot unregister the Prism Element cluster that is hosting the Prism Central deployment where you have enabled CMSP and IAM. You can unregister other clusters being managed by this Prism Central deployment.
Prism Element Clusters

Ensure that you have configured the following cluster settings. For more information, see Modifying Cluster Details in Prism Web Console Guide .

  • Virtual IP address (VIP). Once set, do not change this address
  • iSCSI data services IP address (DSIP). Once set, do not change this address
  • NTP server
  • Name server

IAM Considerations

Existing Authentication and Authorization Migrated After Enabling IAM
  • When you enable IAM by enabling CMSP, IAM migrates existing authentication and authorization configurations, including Common Access Card client authentication configurations.
Upgrading Prism Central After Enabling IAM
  • After you upgrade Prism Central, if CMSP (and therefore IAM) was previously enabled, both the services are enabled by default. You must contact Nutanix Support for any custom requirement.
Note: After upgrading to pc.2022.9, if a Security Assertion Markup Language (SAML) IDP is configured, you need to download the Prism Central metadata and reconfigure the SAML IDP to recognize Prism Central as the service provider. See Updating ADFS When Using SAML Authentication to create the required rules for ADFS.
User Session Lifetime
  • Each session has a maximum lifetime of 8 hours.
  • Session idle time is 15 minutes. After 15 minutes, a user or client is logged out and must re-authenticate.
Client Authentication and Common Access Card (CAC) Support
  • IAM supports deployments where CAC authentication and client authentication are enabled on Prism Central. After enabling IAM, however, Prism Central supports client authentication only if CAC authentication is also enabled. You can enable client authentication if you also enable CAC authentication.
  • Ensure that port 9441 is open in your firewall if you are using CAC client authentication.
Hypervisor Support
  • You can deploy IAM on an on-premises Prism Central (PC) deployment hosted on an AOS cluster running AHV. Clusters running other hypervisors are not supported.

Enabling IAM

Before you begin

  • IAM on Prism Central is disabled by default. When you enable the Micro Services Infrastructure on Prism Central, IAM is automatically enabled.
  • See Enabling Microservices Infrastructure in the Prism Central Guide .
  • See Identity and Access Management Prerequisites and Considerations and also the Identity and Access Management Software Support topic in the Prism Central release notes for specific support requirements.

Procedure

  1. Enable Micro Services Infrastructure on Prism Central as described in Enabling Micro Services Infrastructure in the Prism Central Guide .
  2. To view task status:
    1. Open a web browser and log in to the Prism Central web console.
    2. Go to the Activity > Tasks dashboard and find the IAM Migration & Bootstrap task.
    The task takes up to 60 minutes to complete. Part of the task is migrating existing authentication configurations.
  3. After the enablement tasks are completed, including the IAM Migration & Bootstrap task, log out of Prism Central. Wait at least 15 minutes before logging on to Prism Central.

    The Prism Central web console shows a new login page as shown below. This confirms that IAM is enabled.

    Note:

    Depending on your existing authentication configuration, the log in page might look different.

    Also, you can go to Settings > Prism Central Management page to verify if Prism Central on Microservices Infrastructure (CMSP) is enabled. CMSP and IAM enablement happen together.

    Figure. Sample Prism Central IAM Logon Page Click to enlarge Sample Prism Central IAM Logon Page shows new credential fields

What to do next

Configure authentication and access. If you are implementing SAML authentication with Active Directory Federated Services (ADFS), see Updating ADFS When Using SAML Authentication.

Configuring Authentication

Caution: Prism Central does not allow the use of the (not secure) SSLv2 and SSLv3 ciphers. To eliminate the possibility of an SSL Fallback situation and denied access to Prism Central, disable (uncheck) SSLv2 and SSLv3 in any browser used for access. However, TLS must be enabled (checked).

Prism Central supports user authentication with these authentication options:

  • SAML authentication. Users can authenticate through a supported identity provider when SAML support is enabled for Prism Central. The Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between two parties: an identity provider (IDP) and Prism Central as the service provider.

    With IAM, in addition to ADFS, other IDPs are available. For more information, see Security Management Using Identity and Access Management (Prism Central) and Updating ADFS When Using SAML Authentication.

  • Local user authentication. Users can authenticate if they have a local Prism Central account. For more information, see Managing Local User Accounts .
  • Active Directory authentication. Users can authenticate using their Active Directory (or OpenLDAP) credentials when Active Directory support is enabled for Prism Central.

Enabling and Configuring Client Authentication/CAC

Before you begin

  • If you have enabled Identity and Access Management (IAM) on Prism Central as described in Enabling IAM and want to enable a client to authenticate by using certificates, you must also enable CAC authentication.
  • Ensure that port 9441 is open in your firewall if you are using CAC client authentication. After enabling CAC client authentication, your CAC logon redirects the browser to use port 9441.

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. Click the Client tab, then do the following steps.
    1. Select the Configure Client Chain Certificate check box.
    2. Click the Choose File button, browse to and select a client chain certificate to upload, and then click the Open button to upload the certificate.
      Note: Uploaded certificate files must be PEM encoded. The web console restarts after the upload step.
    3. To enable client authentication, click Enable Client Authentication .
    4. To modify client authentication, do one of the following:
      Note: The web console restarts when you change these settings.
      • Click Enable Client Authentication to disable client authentication.
      • Click Remove to delete the current certificate. (This deletion also disables client authentication.)
      • To enable OCSP or CRL-based certificate revocation checking, see Certificate Revocation Checking.

    Client authentication allows you to securely access Prism by exchanging a digital certificate. Prism validates that the certificate is signed by your organization’s trusted signing certificate.

    Client authentication ensures that the Nutanix cluster gets a valid certificate from the user. Normally, a one-way authentication process occurs where the server provides a certificate so the user can verify the authenticity of the server. When client authentication is enabled, this becomes a two-way authentication where the server also verifies the authenticity of the user. A user must provide a valid certificate when accessing the console either by installing the certificate on the local machine or by providing it through a smart card reader.

    Note: The CA must be the same for both the client chain certificate and the certificate on the local machine or smart card.
  3. To specify a service account that the Prism Central web console can use to log in to Active Directory and authenticate Common Access Card (CAC) users, select the Configure Service Account check box, and then do the following in the indicated fields:
    1. Directory : Select the authentication directory that contains the CAC users that you want to authenticate.
      This list includes the directories that are configured on the Directory List tab.
    2. Service Username : Enter the user name in the user name@domain.com format that you want the web console to use to log in to the Active Directory.
    3. Service Password : Enter the password for the service user name.
    4. Click Enable CAC Authentication .
      Note: For federal customers only.
      Note: The Prism Central console restarts after you change this setting.

    The Common Access Card (CAC) is a smart card about the size of a credit card, which some organizations use to access their systems. After you insert the CAC into the CAC reader connected to your system, the software in the reader prompts you to enter a PIN. After you enter a valid PIN, the software extracts your personal certificate that represents you and forwards the certificate to the server using the HTTP protocol.

    Nutanix Prism verifies the certificate as follows:

    • Validates that the certificate has been signed by your organization’s trusted signing certificate.
    • Extracts the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and uses the EDIPI to check the validity of an account within Active Directory. The security context from the EDIPI is used for your Prism session.
    • Prism Central supports both certificate authentication and basic authentication so that it can handle Prism Central login with a certificate while allowing the REST API to use basic authentication. The REST API cannot use CAC certificates. With this behavior, if a certificate is present during Prism Central login, certificate authentication is used; if the certificate is not present, basic authentication is enforced.
    If you map a Prism Central role to a CAC user rather than to an Active Directory group or organizational unit to which the user belongs, specify the EDIPI (User Principal Name, or UPN) of that user in the role mapping. A user who presents a CAC with a valid certificate is mapped to a role and taken directly to the web console home page; the web console login page is not displayed.
    Note: If you have logged on to Prism Central by using CAC authentication, to successfully log out of Prism Central, close the browser after you click Log Out .
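Uploaded client chain certificates must be PEM encoded (as noted in step 2 above). The following is a quick local sanity check, using a throwaway self-signed certificate; all filenames are hypothetical:

```shell
# Generate a throwaway certificate, then confirm it is PEM encoded, as
# required for the client chain certificate upload (filenames are examples).
openssl req -x509 -newkey rsa:2048 -nodes -keyout client.key \
  -out client-chain.crt -subj "/CN=Example Client CA" -days 1

# A PEM file is base64 text with BEGIN/END CERTIFICATE markers.
grep -q 'BEGIN CERTIFICATE' client-chain.crt && echo 'PEM encoded'

# openssl can parse the file only if the encoding is valid PEM.
openssl x509 -in client-chain.crt -noout -subject
```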

Updating ADFS When Using SAML Authentication

With Nutanix IAM enabled, to maintain compatibility with new and existing IDP/SAML authentication configurations, update your Active Directory Federated Services (ADFS) configuration - specifically the Prism Central Relying Party Trust settings. For these configurations, you are using SAML as the open standard for exchanging authentication and authorization data between ADFS as the identity provider (IDP) and Prism Central as the service provider.

About this task

In your ADFS Server configuration, update the Prism Central Relying Party Trust settings by creating claim rules to send the selected LDAP attribute as the SAML NameID in Email address format. For example, map the User Principal Name to NameID in the SAML assertion claims.

As an example, this topic uses UPN as the LDAP Attribute to map. You could also map the email address attribute to NameID. See the Microsoft Active Directory Federation Services documentation for details about creating a claims aware Relying Party Trust and claims rules.

Procedure

  1. In the Relying Party Trust for Prism Central, configure a claims issuance policy with two rules.
    1. One rule based on the Send LDAP Attributes as Claims template.
    2. One rule based on the Transform an Incoming Claim template
  2. For the rule using the Send LDAP Attributes as Claims template, select the LDAP Attribute as User-Principal-Name and set Outgoing Claim Type to UPN .
    For User group configuration using the Send LDAP Attributes as Claims template, select the LDAP Attribute as Token-Groups - Unqualified-Names and set Outgoing Claim Type to Group .
  3. For the rule using the Transform an Incoming Claim template:
    1. Set Incoming claim type to UPN .
    2. Set the Outgoing claim type to Name ID .
    3. Set the Outgoing name ID format to Email .
    4. Select Pass through all claim values .
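The two rules above can also be expressed in the ADFS claim rule language. The following is a hedged sketch of what the GUI templates typically generate; the exact claim type URIs, attribute store name, and query string may differ in your environment, so treat this as illustrative only:

```text
Rule 1 (Send LDAP Attributes as Claims: User-Principal-Name -> UPN):
c:[Type == "http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname",
   Issuer == "AD AUTHORITY"]
 => issue(store = "Active Directory",
          types = ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn"),
          query = ";userPrincipalName;{0}", param = c.Value);

Rule 2 (Transform an Incoming Claim: UPN -> Name ID, Email format, pass through all values):
c:[Type == "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn"]
 => issue(Type = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier",
          Issuer = c.Issuer, OriginalIssuer = c.OriginalIssuer,
          Value = c.Value, ValueType = c.ValueType,
          Properties["http://schemas.xmlsoap.org/ws/2008/06/identity/claims/claimproperties/format"]
            = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress");
```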

Adding a SAML-based Identity Provider

About this task

If you do not enable Nutanix Identity and Access Management (IAM) on Prism Central, ADFS is the only supported identity provider (IDP) for Single Sign-on and only one IDP is allowed at a time. If you enable IAM, additional IDPs are available. See Security Management Using Identity and Access Management (Prism Central) and also Updating ADFS When Using SAML Authentication.

Before you begin

  • An identity provider (typically a server or other computer) is the system that provides authentication through a SAML request. There are various implementations that can provide authentication services in line with the SAML standard.
  • If you enable IAM by enabling CMSP, you can specify other tested standard-compliant IDPs in addition to ADFS. See also the Prism Central release notes topic Identity and Access Management Software Support for specific support requirements.

    Only one identity provider is allowed at a time, so if one was already configured, the + New IDP link does not appear.

  • You must configure the identity provider to return the NameID attribute in SAML response. The NameID attribute is used by Prism Central for role mapping. See Configuring Role Mapping for details.

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. To add a SAML-based identity provider, click the + New IDP link.

    A set of fields is displayed. Do the following in the indicated fields:

    1. Configuration name : Enter a name for the identity provider. This name will appear in the log in authentication screen.
    2. Import Metadata : Click this radio button to upload a metadata file that contains the identity provider information.

      Identity providers typically provide an XML file on their website that includes metadata about that identity provider, which you can download from that site and then upload to Prism Central. Click + Import Metadata to open a search window on your local system and then select the target XML file that you downloaded previously. Click the Save button to save the configuration.

      Figure. Identity Provider Fields (metadata configuration) Click to enlarge

    This completes configuring an identity provider in Prism Central, but you must also configure the callback URL for Prism Central on the identity provider. To do this, click the Download Metadata link just below the Identity Providers table to download an XML file that describes Prism Central and then upload this metadata file to the identity provider.
  3. To edit an identity provider entry, click the pencil icon for that entry.

    After clicking the pencil icon, the relevant fields reappear. Enter the new information in the appropriate fields and then click the Save button.

  4. To delete an identity provider entry, click the X icon for that entry.

    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.

Restoring Identity and Access Management Configuration Settings

Prism Central regularly backs up the Identity and Access Management (IAM) database, typically every 15 minutes. This procedure describes how to restore a specific IAM backup instance.

About this task

The IAM restore process restores any authentication and authorization configuration settings from the IAM database. You can choose an available time-stamped backup instance when you run the shell script in this procedure.

Procedure

  1. Log in to the Prism Central VM through an SSH session as the nutanix user.
  2. Run the restore shell script restore_iamv2.sh
    nutanix@pcvm$ sh /home/nutanix/cluster/bin/restore_iamv2.sh
    The script displays a numbered list of available backups, including the backup file time-stamp.
    Enter the Backup No. from the backup list (default is 1):
  3. Select a backup by number to start the restore process.
    The script displays a series of messages indicating restore progress, similar to:
    You Selected the Backup No 1
    Stopping the IAM services
    Waiting to stop all the IAM services and to start the restore process
    Restore Process Started
    Restore Process Completed
    ...
    Restarting the IAM services
    IAM Services Restarted Successfully

    After the script runs successfully, the command shell prompt returns and your IAM configuration is restored.

  4. To validate that your settings have been restored, log on to the Prism Central web console and go to Settings > Authentication and check the settings.

Accessing a List of Open Source Software Running on a Cluster

Use this procedure to access a text file that lists all of the open source software running on a cluster.

Procedure

  1. Log on to any Controller VM in the cluster as the admin user by using SSH.
  2. Access the text file by using the following command.
    less /usr/local/nutanix/license/blackduck_version_license.txt

Security Guide

AOS Security 6.5

Product Release Date: 2022-07-25

Last updated: 2022-12-14

Audience & Purpose

This Security Guide is intended for security-minded people responsible for architecting, managing, and supporting infrastructures, especially those who want to address security without adding more human resources or additional processes to their datacenters.

This guide offers an overview of the security development life cycle (SecDL) and the host of security features supported by Nutanix. It also demonstrates how Nutanix complies with security regulations to streamline infrastructure security management. In addition, this guide addresses site-specific technical requirements and compliance standards that must be adhered to but are not enabled by default.

Note:

Hardening of the guest OS or any applications running on top of the Nutanix infrastructure is beyond the scope of this guide. We recommend that you refer to the documentation of the products that you have deployed in your Nutanix environment.

Nutanix Security Infrastructure

Nutanix takes a holistic approach to security with a secure platform, extensive automation, and a robust partner ecosystem. The Nutanix security development life cycle (SecDL) integrates security into every step of product development, rather than applying it as an afterthought. The SecDL is a foundational part of product design. The strong pervasive culture and processes built around security harden the Enterprise Cloud Platform and eliminate zero-day vulnerabilities. Efficient one-click operations and self-healing security models easily enable automation to maintain security in an always-on hyperconverged solution.

Since traditional manual configuration and checks cannot keep up with the ever-growing list of security requirements, Nutanix conforms to RHEL 7 Security Technical Implementation Guides (STIGs) that use machine-readable code to automate compliance against rigorous common standards. With Nutanix Security Configuration Management Automation (SCMA), you can quickly and continually assess and remediate your platform to ensure that it meets or exceeds all regulatory requirements.

Nutanix has standardized the security profile of the Controller VM to a security compliance baseline that meets or exceeds the standard high-governance requirements.

The most commonly used references in the United States that guide vendors to build products according to a set of technical requirements are as follows.

  • The National Institute of Standards and Technology Special Publications Security and Privacy Controls for Federal Information Systems and Organizations (NIST 800.53)
  • The US Department of Defense Information Systems Agency (DISA) Security Technical Implementation Guides (STIG)

SCMA Implementation

The Nutanix platform and all products leverage the Security Configuration Management Automation (SCMA) framework to ensure that services are constantly inspected for variance to the security policy.

Nutanix has implemented security configuration management automation (SCMA) to check multiple security entities for both Nutanix storage and AHV. Nutanix automatically reports log inconsistencies and reverts them to the baseline.

With SCMA, you can schedule the STIG to run hourly, daily, weekly, or monthly. STIG has the lowest system priority within the virtual storage controller, ensuring that security checks do not interfere with platform performance.
Note: Only the SCMA schedule can be modified. The AIDE schedule is run on a fixed weekly schedule. To change the SCMA schedule for AHV or the Controller VM, see Hardening Instructions (nCLI).

RHEL 7 STIG Implementation in Nutanix Controller VM

Nutanix leverages SaltStack and SCMA to self-heal any deviation from the security baseline configuration of the operating system and hypervisor to remain in compliance. If any component is found to be non-compliant, the component is set back to the supported security settings without any intervention. To achieve this objective, Nutanix has implemented the Controller VM to support STIG compliance with the RHEL 7 STIG as published by DISA.

The STIG rules are capable of securing the boot loader, packages, file system, booting and service control, file ownership, authentication, kernel, and logging.

Example: STIG rules for Authentication

Prohibit direct root login, lock system accounts other than root , enforce several password maintenance details, cautiously configure SSH, enable screen-locking, configure user shell defaults, and display warning banners.

Security Updates

Nutanix provides continuous fixes and updates to address threats and vulnerabilities. Nutanix Security Advisories provide detailed information on the available security fixes and updates, including the vulnerability description and affected product/version.

To see the list of security advisories or search for a specific advisory, log on to the Support Portal and select Documentation , and then Security Advisories .

Nutanix Security Landscape

This topic highlights the Nutanix security landscape. The following topics identify the security features offered out of the box in the Nutanix infrastructure.

  • Authentication and Authorization
  • Network segmentation: VLAN-based, data-driven segmentation
  • Security Policy Management: implement security policies using microsegmentation
  • Data security and integrity
  • Hardening Instructions
  • Log monitoring and analysis
  • Flow Networking: see the Flow Networking Guide
  • UEFI: see UEFI Support for VMs in the AHV Administration Guide
  • Secure Boot: see Secure Boot Support for VMs in the AHV Administration Guide
  • Windows Credential Guard support: see Windows Defender Credential Guard Support in AHV in the AHV Administration Guide
  • RBAC: see Controlling User Access (RBAC)

Hardening Instructions (nCLI)

This chapter describes how to implement security hardening features for Nutanix AHV and Controller VM.

Hardening AHV

You can use the Nutanix command-line interface (nCLI) to customize the various AHV configuration settings as described below.

Table 1. Configuration Settings to Harden the AHV
Description Command or Settings Output
Getting the cluster-wide configuration of the SCMA policy. Run the following command:
nutanix@cvm$ ncli cluster get-hypervisor-security-config
Enable Aide : false
Enable Core : false
Enable High Strength P... : false
Enable Banner : false
Schedule : DAILY
Enabling the Advanced Intrusion Detection Environment (AIDE) to run on a weekly basis. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-aide=true
Enable Aide : true
Enable Core : false
Enable High Strength P... : false
Enable Banner : false
Schedule : DAILY 
Enabling the high-strength password policies (minlen=15, difok=8, maxclassrepeat=4). Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params \
enable-high-strength-password=true
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : false
Schedule : DAILY
Enabling the US Department of Defense (DoD) consent banner. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-banner=true
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : true
Schedule : DAILY
Changing the default schedule of running the SCMA. The schedule can be hourly, daily, weekly, or monthly. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params schedule=hourly
Enable Aide : true
Enable Core : false
Enable High Strength P... : true
Enable Banner : true
Schedule : HOURLY
Enabling the settings so that AHV can generate stack traces for any cluster issue. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-core=true
Note: Nutanix recommends that Core should not be set to true unless instructed by the Nutanix support team.
Enable Aide : true
Enable Core : true
Enable High Strength P... : true
Enable Banner : true
Schedule : HOURLY
Configuring security levels for the nutanix user for ssh login to the Nutanix Cluster. Run the following command:
nutanix@cvm$ ncli cluster edit-cvm-security-params ssh-security-level=limited
Enabling locking of the security configuration. Run the following command:
nutanix@cvm$ ncli cluster edit-cvm-security-params enable-lock-status=true
When a high governance official needs to run the hardened configuration, the settings should be as follows:
Enable Aide               : true
    Enable Core               : false
    Enable High Strength P... : true
    Enable Banner             : false
    Enable SNMPv3 Only        : true
    Schedule                  : HOURLY
    Enable Kernel Mitigations : false
    SSH Security Level        : LIMITED
    Enable Lock Status        : true
    Enable Kernel Core        : true
When a federal official needs to run the hardened configuration, the settings should be as follows:
Enable Aide               : true
    Enable Core               : false
    Enable High Strength P... : true
    Enable Banner             : true
    Enable SNMPv3 Only        : true
    Schedule                  : HOURLY
    Enable Kernel Mitigations : false
    SSH Security Level        : LIMITED
    Enable Lock Status        : true
    Enable Kernel Core        : true
Note: A banner file can be modified to support non-DoD customer banners.
Backing up the DoD banner file. Run the following command on the AHV host:
[root@AHV-host ~]# cp -a /etc/puppet/modules/kvm/files/issue.DoD \
/etc/puppet/modules/kvm/files/issue.DoD.bak
Important: Any changes in the banner file are not preserved across upgrades.
Modifying the DoD banner file. Run the following command on the AHV host:
[root@AHV-host ~]# vi /etc/puppet/modules/kvm/files/issue.DoD
Note: Repeat all the above steps on every AHV host in the cluster.
Important: Any changes in the banner file are not preserved across upgrades.
Setting the banner for all nodes through nCLI. Run the following command:
nutanix@cvm$ ncli cluster edit-hypervisor-security-params enable-banner=true

The following options are configured or customized to harden AHV:

  • Enable AIDE : Advanced Intrusion Detection Environment (AIDE) is a Linux utility that monitors a given node. After you install the AIDE package, run the aide --init command as the root user to generate a database that contains all the files you selected in your configuration file. You can move the database to a secure location on read-only media or on another machine. After you create the database, run the aide --check command to check the integrity of the files and directories by comparing them against the snapshot in the database. If there are unexpected changes, AIDE generates a report that you can review. If the changes to existing files, or any added files, are valid, run the aide --update command to update the database with the new changes.
  • Enable high strength password : You can run the command as shown in the table in this section to enable high-strength password policies (minlen=15, difok=8, maxclassrepeat=4).
    Note:
    • minlen is the minimum required length for a password.
    • difok is the minimum number of characters that must be different from the old password.
    • maxclassrepeat is the maximum number of consecutive characters of the same class that you can use in a password.
  • Enable Core : A core dump is the recorded state of the working memory of a computer program at a specific time, generally when the program crashes or terminates abnormally. Core dumps assist in diagnosing and debugging errors in computer programs. You can enable core dumps for troubleshooting purposes.
  • Enable Banner : You can set a banner to display a specific message. For example, set a banner to display a warning message that the system is available to authorized users only.
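
As a rough illustration of how the minlen, difok, and maxclassrepeat parameters interact, the following sketch checks a candidate password the way a pam_pwquality-style policy would. This is a simplified, hypothetical checker, not the actual policy engine used by AHV.

```python
# Simplified, hypothetical illustration of the minlen/difok/maxclassrepeat
# password rules described above; not the actual AHV policy engine.

def char_class(c):
    # Classify a character into one of four pam_pwquality-style classes.
    if c.islower():
        return "lower"
    if c.isupper():
        return "upper"
    if c.isdigit():
        return "digit"
    return "other"

def meets_policy(new, old, minlen=15, difok=8, maxclassrepeat=4):
    # minlen: minimum required password length.
    if len(new) < minlen:
        return False
    # difok: at least this many characters of the new password must not
    # appear in the old password.
    if len(set(new) - set(old)) < difok:
        return False
    # maxclassrepeat: reject runs of more than this many consecutive
    # characters of the same class.
    run, prev = 0, None
    for c in new:
        cls = char_class(c)
        run = run + 1 if cls == prev else 1
        prev = cls
        if run > maxclassrepeat:
            return False
    return True
```

For example, a 16-character password that alternates character classes passes, while a password made of long same-class runs fails the maxclassrepeat rule even if it is long enough.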

Hardening Controller VM

You can use the Nutanix command-line interface (nCLI) to customize the Controller VM configuration settings as described below.

For the complete list of cluster security parameters, see Edit the security params of a Cluster in the Command Reference guide.

  • Run the following command to display the cluster-wide configuration of the SCMA policy.

    nutanix@cvm$ ncli cluster get-cvm-security-config

    The current cluster configuration is displayed.

    Enable Aide               : false
        Enable Core               : false
        Enable High Strength P... : false
        Enable Banner             : false
        Enable SNMPv3 Only        : false
        Schedule                  : DAILY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : false
  • Run the following command to enable the Advanced Intrusion Detection Environment (AIDE), which runs on a weekly basis.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-aide=true

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : false
        Enable Banner             : false
        Enable SNMPv3 Only        : false
        Schedule                  : DAILY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : false
  • Run the following command to enable the strong password policy.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-high-strength-password=true

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : false
        Enable SNMPv3 Only        : false
        Schedule                  : DAILY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : false
  • Run the following command to enable the US Department of Defense (DoD) consent banner.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-banner=true

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : false
        Schedule                  : DAILY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : false
  • Run the following command to enable the settings to allow only SNMP version 3.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-snmpv3-only=true

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : true
        Schedule                  : DAILY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : false
  • Run the following command to change the default schedule of running the SCMA. The schedule can be hourly, daily, weekly, or monthly.

    nutanix@cvm$ ncli cluster edit-cvm-security-params schedule=hourly

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : true
        Schedule                  : HOURLY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : false
  • Run the following command to enable the settings so that the Controller VM can generate stack traces for any cluster issue.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-core=true

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : true
        Schedule                  : HOURLY
        Enable Kernel Mitigations : false
        SSH Security Level        : DEFAULT
        Enable Lock Status        : false
        Enable Kernel Core        : true
    Note: Nutanix recommends that Core should not be set to true unless instructed by the Nutanix support team.
  • Run the following command to configure security levels for the nutanix user for ssh login to the Nutanix Cluster.

    nutanix@cvm$ ncli cluster edit-cvm-security-params ssh-security-level=limited

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : true
        Schedule                  : HOURLY
        Enable Kernel Mitigations : false
        SSH Security Level        : LIMITED
        Enable Lock Status        : true
        Enable Kernel Core        : true
  • Run the following command to enable locking of the security configuration.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-lock-status=true

    The following output is displayed.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : true
        Schedule                  : HOURLY
        Enable Kernel Mitigations : false
        SSH Security Level        : LIMITED
        Enable Lock Status        : true
        Enable Kernel Core        : true
    Note: If set to true, users cannot edit the configuration settings, and you must contact Nutanix Support to unlock the configuration.
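
The "Key : value" output shown in the examples above lends itself to programmatic parsing, for example when auditing security settings across several clusters. A minimal sketch (a hypothetical helper, not a Nutanix-provided tool) might look like this:

```python
# Hypothetical helper that parses the "Key : value" output printed by
# ncli cluster get-cvm-security-config; not a Nutanix-provided tool.

def parse_security_params(output):
    params = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # skip lines that are not key/value pairs
        params[key.strip()] = value.strip()
    return params

sample = """Enable Aide               : true
    Enable Core               : false
    Schedule                  : DAILY"""

print(parse_security_params(sample))
# {'Enable Aide': 'true', 'Enable Core': 'false', 'Schedule': 'DAILY'}
```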

Scenario-Based Hardening

  • When a high governance official needs to run the hardened configuration, the settings should be as follows.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : false
        Enable SNMPv3 Only        : true
        Schedule                  : HOURLY
        Enable Kernel Mitigations : false
        SSH Security Level        : LIMITED
        Enable Lock Status        : true
        Enable Kernel Core        : true
  • When a federal official needs to run the hardened configuration, the settings should be as follows.

    Enable Aide               : true
        Enable Core               : false
        Enable High Strength P... : true
        Enable Banner             : true
        Enable SNMPv3 Only        : true
        Schedule                  : HOURLY
        Enable Kernel Mitigations : false
        SSH Security Level        : LIMITED
        Enable Lock Status        : true
        Enable Kernel Core        : true
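
Building on the profiles above, you can check a cluster for drift from a target profile. The snippet below is a hypothetical sketch using the federal profile values listed above; the dictionary keys mirror the field names in the ncli output.

```python
# Hypothetical drift check against the federal hardened profile above;
# the dict keys mirror the ncli security-params output fields.

FEDERAL_PROFILE = {
    "Enable Aide": "true",
    "Enable Core": "false",
    "Enable High Strength P...": "true",
    "Enable Banner": "true",
    "Enable SNMPv3 Only": "true",
    "Schedule": "HOURLY",
    "Enable Kernel Mitigations": "false",
    "SSH Security Level": "LIMITED",
    "Enable Lock Status": "true",
    "Enable Kernel Core": "true",
}

def non_compliant(current, profile=FEDERAL_PROFILE):
    # Return {setting: (current_value, expected_value)} for every deviation.
    return {key: (current.get(key), want)
            for key, want in profile.items()
            if current.get(key) != want}

drift = non_compliant({**FEDERAL_PROFILE, "Schedule": "DAILY"})
print(drift)  # {'Schedule': ('DAILY', 'HOURLY')}
```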

DoD Banner Configuration

  • Note: A banner file can be modified to support non-DoD customer banners.
  • Run the following command to back up the DoD banner file.

    nutanix@cvm$ sudo cp -a /srv/salt/security/CVM/sshd/DODbanner \
    /srv/salt/security/CVM/sshd/DODbannerbak
  • Run the following command to modify the DoD banner file.

    nutanix@cvm$ sudo vi /srv/salt/security/CVM/sshd/DODbanner
    Note: Repeat all the above steps on every Controller VM in a cluster.
  • Run the following command to back up the DoD banner file of the Prism Central VM.

    nutanix@pcvm$ sudo cp -a /srv/salt/security/PC/sshd/DODbanner \
    /srv/salt/security/PC/sshd/DODbannerbak
  • Run the following command to modify the DoD banner file of the Prism Central VM.

    nutanix@pcvm$ sudo vi /srv/salt/security/PC/sshd/DODbanner
  • Run the following command to set the banner for all nodes through nCLI.

    nutanix@cvm$ ncli cluster edit-cvm-security-params enable-banner=true

TCP Wrapper Integration

The Nutanix Controller VM uses the tcp_wrappers package to allow TCP-supported daemons to control the network subnets that can access the libwrapped daemons. By default, SCMA controls the /etc/hosts.allow file through /srv/salt/security/CVM/network/hosts.allow, which contains generic entries that allow access to NFS, secure shell, and SNMP.

sshd: ALL : ALLOW
rpcbind: ALL : ALLOW
snmpd: ALL : ALLOW
snmptrapd: ALL : ALLOW

Nutanix recommends changing the above configuration to include only the localhost entries and the management network subnet for restricted operations; this applies to both production and high-governance compliance environments. Ensure that all subnets used to communicate with the CVMs are included in the /etc/hosts.allow file.
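
For illustration, a restricted hosts.allow could be generated as follows. The 10.1.1.0/255.255.255.0 management subnet is a placeholder assumption, and this is a sketch rather than a Nutanix-supplied tool:

```python
# Hypothetical generator for a restricted hosts.allow in tcp_wrappers
# "daemon : client-list : option" syntax; the management subnet below is a
# placeholder -- substitute your own.

DAEMONS = ("sshd", "rpcbind", "snmpd", "snmptrapd")

def render_hosts_allow(mgmt_subnet, daemons=DAEMONS):
    clients = f"127.0.0.1 , {mgmt_subnet}"
    return "\n".join(f"{d}: {clients} : ALLOW" for d in daemons)

print(render_hosts_allow("10.1.1.0/255.255.255.0"))
```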

Common Criteria

Common Criteria is an international security certification that is recognized by many countries around the world. Nutanix AOS and AHV are Common Criteria certified by default and no additional configuration is required to enable the Common Criteria mode. For more information, see the Nutanix Trust website.
Note: Nutanix uses FIPS-validated cryptography by default.

Securing AHV VMs with Virtual Trusted Platform Module (vTPM)

Overview

A Trusted Platform Module (TPM) is used to manage cryptographic keys for security services like encryption and hardware (and software) integrity protection. AHV vTPM is software-based emulation of the TPM 2.0 specification that works as a virtual device.
Note: AHV vTPM does NOT require OR use a hardware TPM.
You can use the AHV vTPM feature to secure virtual machines running on AHV.
Note: You can enable vTPM using aCLI only.

vTPM Use Cases

AHV vTPM provides virtualization-based security support for the following primary use cases.

  • Support for storing cryptographic keys and certificates for Microsoft Windows BitLocker
  • TPM protection for storing VBS encryption keys for Windows Defender Credential Guard
See Microsoft Documentation for details on Microsoft Windows Defender Credential Guard and Microsoft Windows BitLocker.
Tip: Windows 11 installation requires TPM 2.0; see the Microsoft website for Windows 11 specs, features, and computer requirements.

Considerations for Enabling vTPM in AHV VMs

Requirements

Supported Software Versions:

  • AHV version 20220304.242 or above
  • AOS version 6.5.1 or above

VM Requirements:

  • You must enable UEFI on the VM on which you want to enable vTPM; see UEFI Support for VMs.
  • You must enable Secure Boot (applicable if using Microsoft Windows BitLocker); see Secure Boot Support for VMs.

Limitations

  • All Secure Boot limitations apply to vTPM VMs.
  • Disaster recovery is not supported for vTPM VMs.

Creating AHV VMs with vTPM (aCLI)

About this task

You can create a virtual machine with the vTPM configuration enabled using the following aCLI procedure.

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. At the CVM prompt, type acli to enter the Acropolis CLI mode.
  3. Create a VM with the required configuration using one of the following methods.
    • Create a VM using Prism Element or Prism Central web console. If you choose to create the VM using Prism Element or Prism Central, proceed to Step 4 .
      Note: For simplicity, it is recommended to use the Prism Element or Prism Central web console to create VMs; see Creating a VM.
    • Create a VM using aCLI. You can enable vTPM at the time of creating a VM. To enable vTPM during VM creation, do the following and proceed to step 5 (skip step 4 ).

      Use the "vm.create" command with required arguments to create a VM. For details on VM creation command ("vm.create") and supported arguments using aCLI, see "vm" in the Command Reference Guide.

      acli> vm.create <vm-name> machine_type=q35 uefi_boot=true secure_boot=true virtual_tpm=true <argument(s)>

      Replace <vm-name> with the name of the VM and <argument(s)> with one or more arguments as needed for your VM.

  4. Enable vTPM.
    acli> vm.update <vm-name> virtual_tpm=true

    In the above command, replace <vm-name> with the name of the newly created VM.

  5. Start the VM.
    acli> vm.on <vm-name>
    Replace <vm-name> with the name of the VM.
    vTPM is enabled on the VM.

Enabling vTPM for Existing AHV VMs (aCLI)

About this task

You can update the settings of an existing virtual machine (that satisfies vTPM requirements) to enable vTPM using the following aCLI procedure.

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. At the CVM prompt, type acli to enter the Acropolis CLI mode.
  3. Shut down the VM so that the update can be applied.
    acli> vm.shutdown <vm-name>
    Replace <vm-name> with the name of the VM.
  4. Enable vTPM.
    acli> vm.update <vm-name> virtual_tpm=true

    Replace <vm-name> with the name of the VM.

  5. Start the VM.
    acli> vm.on <vm-name>
    Replace <vm-name> with the name of the VM.
    vTPM is enabled on the VM.

Security Management Using Prism Element (PE)

Nutanix provides several mechanisms to maintain security in a cluster using Prism Element.

Configuring Authentication

About this task

Nutanix supports user authentication. To configure authentication types and directories, and to enable client authentication, do the following:
Caution: The web console (and nCLI) does not allow the use of the insecure SSLv2 and SSLv3 protocols. An SSL fallback situation in some browsers can deny access to the web console. To eliminate this possibility, disable (uncheck) SSLv2 and SSLv3 in any browser used for access; TLS must remain enabled (checked).

Procedure

  1. Click the gear icon in the main menu and then select Authentication in the Settings page.
    The Authentication Configuration window appears.
    Note: The following steps combine three distinct procedures, enabling authentication (step 2), configuring one or more directories for LDAP/S authentication (steps 3-5), and enabling client authentication (step 6). Perform the steps for the procedures you need. For example, perform step 6 only if you intend to enforce client authentication.
  2. To enable server authentication, click the Authentication Types tab and then check the box for either Local or Directory Service (or both). After selecting the authentication types, click the Save button.
    The Local setting uses the local authentication provided by Nutanix (see User Management) . This method is employed when a user enters just a login name without specifying a domain (for example, user1 instead of user1@nutanix.com ). The Directory Service setting validates user@domain entries and validates against the directory specified in the Directory List tab. Therefore, you need to configure an authentication directory if you select Directory Service in this field.
    Figure. Authentication Types Tab Click to enlarge
    Note: The Nutanix admin user can log on to the management interfaces, including the web console, even if the Local authentication type is disabled.
  3. To add an authentication directory, click the Directory List tab and then click the New Directory option.
    A set of fields is displayed. Do the following in the indicated fields:
    1. Directory Type : Select one of the following from the pull-down list.
      • Active Directory : Active Directory (AD) is a directory service implemented by Microsoft for Windows domain networks.
        Note:
        • Users with the "User must change password at next logon" attribute enabled cannot authenticate to the web console (or nCLI). Ensure that users with this attribute first log on to a domain workstation and change their password before accessing the web console. Also, if SSL is enabled on the Active Directory server, make sure that Nutanix has access to that port (open it in the firewall).
        • An Active Directory user name or group name containing spaces is not supported for Prism Element authentication.
        • Active Directory domain created by using non-ASCII text may not be supported. For more information about usage of ASCII or non-ASCII text in Active Directory configuration, see the Internationalization (i18n) section.
        • Use of the "Protected Users" group is currently unsupported for Prism authentication. For more details on the "Protected Users" group, see “Guidance about how to configure protected accounts” on Microsoft documentation website.
        • The Microsoft AD is LDAP v2 and LDAP v3 compliant.
        • The Microsoft AD servers supported are Windows Server 2012 R2, Windows Server 2016, and Windows Server 2019.
      • OpenLDAP : OpenLDAP is a free, open source directory service, which uses the Lightweight Directory Access Protocol (LDAP), developed by the OpenLDAP project. Nutanix currently supports the OpenLDAP 2.4 release running on CentOS distributions only.
    2. Name : Enter a directory name.
      This is a name you choose to identify this entry; it need not be the name of an actual directory.
    3. Domain : Enter the domain name.
      Enter the domain name in DNS format, for example, nutanix.com .
    4. Directory URL : Enter the URL address to the directory.
      The URL format for an LDAP entry is ldap://host:ldap_port_num. The host value is either the IP address or fully qualified domain name. (In some environments, a simple domain name is sufficient.) The default LDAP port number is 389. Nutanix also supports LDAPS (port 636) and LDAP/S Global Catalog (ports 3268 and 3269). The following are example configurations appropriate for each port option:
      Note: LDAPS support does not require custom certificates or certificate trust import.
      • Port 389 (LDAP). Use this port number (in the following URL form) when the configuration is single domain, single forest, and not using SSL.
        ldap://ad_server.mycompany.com:389
      • Port 636 (LDAPS). Use this port number (in the following URL form) when the configuration is single domain, single forest, and using SSL. This requires all Active Directory Domain Controllers have properly installed SSL certificates.
        ldaps://ad_server.mycompany.com:636
        Note: The LDAP server SSL certificate must include a Subject Alternative Name (SAN) that matches the URL provided during the LDAPS setup.
      • Port 3268 (LDAP - GC). Use this port number when the configuration is multiple domain, single forest, and not using SSL.
      • Port 3269 (LDAPS - GC). Use this port number when the configuration is multiple domain, single forest, and using SSL.
        Note: When constructing your LDAP/S URL to use a Global Catalog server, ensure that the Domain Control IP address or name being used is a global catalog server within the domain being configured. If not, queries over 3268/3269 may fail.
        Note: When querying the global catalog, the users sAMAccountName field must be unique across the AD forest. If the sAMAccountName field is not unique across the subdomains, authentication may fail intermittently or consistently.
      Note: For the complete list of required ports, see Port Reference .
    5. (OpenLDAP only) Configure the following additional fields:
      1. User Object Class : Enter the value that uniquely identifies the object class of a user.
      2. User Search Base : Enter the base domain name in which the users are configured.
      3. Username Attribute : Enter the attribute to uniquely identify a user.
      4. Group Object Class : Enter the value that uniquely identifies the object class of a group.
      5. Group Search Base : Enter the base domain name in which the groups are configured.
      6. Group Member Attribute : Enter the attribute that identifies users in a group.
      7. Group Member Attribute Value : Enter the attribute that identifies the users provided as value for Group Member Attribute .
    6. Search Type : Select how to search your directory when authenticating. Choose Non Recursive if you experience slow directory logon performance. For this option, ensure that users listed in Role Mapping are listed flatly in the group (that is, not nested). Otherwise, choose the default Recursive option.
    7. Service Account Username : Enter the service account user name in the user_name@domain.com format that you want the web console to use to log in to the Active Directory.

      A service account is created to run only a particular service or application with the credentials specified for the account. According to the requirement of the service or application, the administrator can limit access to the service account.

      A service account is under the Managed Service Accounts in the Active Directory server. An application or service uses the service account to interact with the operating system. Enter your Active Directory service account credentials in this (username) and the following (password) field.

      Note: Be sure to update the service account credentials here whenever the service account password changes or when a different service account is used.
    8. Service Account Password : Enter the service account password.
    9. When all the fields are correct, click the Save button (lower right).
      This saves the configuration and redisplays the Authentication Configuration dialog box. The configured directory now appears in the Directory List tab.
    10. Repeat this step for each authentication directory you want to add.
    Note:
    • The Controller VMs need access to the Active Directory server, so open the standard Active Directory ports to each Controller VM in the cluster (and the virtual IP if one is configured).
    • No permissions are granted to the directory users by default. To grant permissions to the directory users, you must specify roles for the users in that directory (see Assigning Role Permissions).
    • The service account for both Active Directory and OpenLDAP must have full read permission on the directory service. Additionally, for successful Prism Element authentication, users must also have search or read privileges.
    Figure. Directory List Tab Click to enlarge
  4. To edit a directory entry, click the Directory List tab and then click the pencil icon for that entry.
    After clicking the pencil icon, the Directory List fields reappear (see step 3). Enter the new information in the appropriate fields and then click the Save button.
  5. To delete a directory entry, click the Directory List tab and then click the X icon for that entry.
    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.
  6. To enable client authentication, do the following:
    1. Click the Client tab.
    2. Select the Configure Client Chain Certificate check box.
      Client Chain Certificate is a list of certificates that includes all intermediate CA and root-CA certificates.
      Note: To authenticate on Prism Element with a client chain certificate, the 'Subject name' field must be present. The subject name must match the userPrincipalName (UPN) in the AD. The UPN is a user name with the domain address, for example, user1@nutanix.com.
      Figure. Client Tab (1) Click to enlarge
    3. Click the Choose File button, browse to and select a client chain certificate to upload, and then click the Open button to upload the certificate.
      Note: Uploaded certificate files must be PEM encoded. The web console restarts after the upload step.
      Figure. Client Tab (2) Click to enlarge
    4. To enable client authentication, click Enable Client Authentication .
    5. To modify client authentication, do one of the following:
      Note: The web console restarts when you change these settings.
      • Click Enable Client Authentication to disable client authentication.
      • Click Remove to delete the current certificate. (This also disables client authentication.)
      • To enable OCSP or CRL based certificate revocation checking, see Certificate Revocation Checking.
      Figure. Authentication Window: Client Tab (3) Click to enlarge

    Client authentication allows you to securely access the Prism by exchanging a digital certificate. Prism will validate that the certificate is signed by your organization’s trusted signing certificate.

    Client authentication ensures that the Nutanix cluster receives a valid certificate from the user. Normally, a one-way authentication process occurs in which the server provides a certificate so that the user can verify the authenticity of the server (see Installing an SSL Certificate). When client authentication is enabled, this becomes a two-way authentication in which the server also verifies the authenticity of the user. A user must provide a valid certificate when accessing the console, either by installing the certificate on their local machine or by providing it through a smart card reader. Providing a valid certificate enables login from a client machine without entering a user name and password. If the user logs in from a client machine that does not have the certificate installed, authentication using a user name and password is still available.
    Note: The CA must be the same for both the client chain certificate and the certificate on the local machine or smart card.
  7. To specify a service account that the web console can use to log in to Active Directory and authenticate Common Access Card (CAC) users, select the Configure Service Account check box, and then do the following in the indicated fields:
    Figure. Common Access Card Authentication Click to enlarge
    1. Directory : Select the authentication directory that contains the CAC users that you want to authenticate.
      This list includes the directories that are configured on the Directory List tab.
    2. Service Username : Enter the user name in the user name@domain.com format that you want the web console to use to log in to the Active Directory.
    3. Service Password : Enter the password for the service user name.
    4. Click Enable CAC Authentication .
      Note: For federal customers only.
      Note: The web console restarts after you change this setting.

    The Common Access Card (CAC) is a smart card about the size of a credit card, which some organizations use to access their systems. After you insert the CAC into the CAC reader connected to your system, the software in the reader prompts you to enter a PIN. After you enter a valid PIN, the software extracts your personal certificate that represents you and forwards the certificate to the server using the HTTP protocol.

    Nutanix Prism verifies the certificate as follows:
    • Validates that the certificate has been signed by your organization’s trusted signing certificate.
    • Extracts the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and uses the EDIPI to check the validity of an account within the Active Directory. The security context from the EDIPI is used for your Prism session.
    • Prism Element supports both certificate authentication and basic authentication so that it can handle certificate-based Prism Element logins as well as REST API clients, which cannot use CAC certificates and must use basic authentication. With this behavior, if a certificate is presented during Prism Element login, certificate authentication is used; if no certificate is presented, basic authentication is enforced.
    Note: Nutanix Prism does not support OpenLDAP as directory service for CAC.
    If you map a Prism role to a CAC user and not to an Active Directory group or organizational unit to which the user belongs, specify the EDIPI (User Principal Name, or UPN) of that user in the role mapping. A user who presents a CAC with a valid certificate is mapped to a role and taken directly to the web console home page. The web console login page is not displayed.
    Note: If you have logged on to Prism by using CAC authentication, to successfully log out of Prism, close the browser after you click Log Out .
  8. Click the Close button to close the Authentication Configuration dialog box.
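The note above requires the client chain certificate and the user certificate to share the same CA. One way to confirm locally that a user certificate chains to the uploaded CA is with OpenSSL; this sketch assumes OpenSSL is installed, and the throwaway CA and client certificate are created only to make the example runnable (all file names are illustrative).

```shell
# Throwaway CA and client certificate, created only for demonstration.
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=Example CA" -days 1 2>/dev/null
openssl req -newkey rsa:2048 -nodes -keyout client.key -out client.csr \
  -subj "/CN=jane.doe" 2>/dev/null
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out client.crt -days 1 2>/dev/null
# The check itself: succeeds only if client.crt chains to ca.crt.
openssl verify -CAfile ca.crt client.crt   # prints "client.crt: OK"
```

With real files, only the final `openssl verify` command is needed, pointing at the CA chain you upload to Prism and the certificate the user presents.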

Assigning Role Permissions

About this task

When user authentication is enabled for a directory service (see Configuring Authentication), the directory users do not have any permissions by default. To grant permissions, you must assign roles (with their associated permissions) to organizational units (OUs), groups, or individual users within the directory.

If you are using Active Directory, you must also assign roles to entities or users, especially before upgrading from a previous AOS version.

To assign roles, do the following:

Procedure

  1. In the web console, click the gear icon in the main menu and then select Role Mapping in the Settings page.
    The Role Mapping window appears.
    Figure. Role Mapping Window Click to enlarge
  2. To create a role mapping, click the New Mapping button.

    The Create Role Mapping window appears. Do the following in the indicated fields:

    1. Directory : Select the target directory from the pull-down list.

      Only directories previously defined when configuring authentication appear in this list. If the desired directory does not appear, add that directory to the directory list (see Configuring Authentication) and then return to this procedure.

    2. LDAP Type : Select the desired LDAP entity type from the pull-down list.

      The entity types are GROUP , USER , and OU .

    3. Role : Select the user role from the pull-down list.
      There are three roles from which to choose:
      • Viewer : This role allows a user to view information only. It does not provide permission to perform any administrative tasks.
      • Cluster Admin : This role allows a user to view information and perform any administrative task (but not create or modify user accounts).
      • User Admin : This role allows the user to view information, perform any administrative task, and create or modify user accounts.
    4. Values : Enter the case-sensitive entity names (in a comma separated list with no spaces) that should be assigned this role.
      The values are the actual names of the organizational units (meaning it applies to all users in those OUs), groups (all users in those groups), or users (each named user) assigned this role. For example, entering value " admin-gp,support-gp " when the LDAP type is GROUP and the role is Cluster Admin means all users in the admin-gp and support-gp groups should be assigned the cluster administrator role.
      Note:
      • Do not include a domain in the value, for example enter just admin-gp , not admin-gp@nutanix.com . However, when users log into the web console, they need to include the domain in their user name.
      • The AD user UPN must be in the user@domain_name format.
      • When an admin defines a user role mapping using an AD forest setup, the mapping can match a user with the same name from any domain in the forest. To avoid this, set up the user role mapping with an AD configuration that specifies a single domain.
    5. When all the fields are correct, click Save .
      This saves the configuration and redisplays the Role Mapping window. The new role map now appears in the list.
      Note: All users in an authorized service directory have full administrator permissions when role mapping is not defined for that directory. However, after creating a role map, any users in that directory that are not explicitly granted permissions through the role mapping are denied access (no permissions).
    6. Repeat this step for each role map you want to add.
      You can create a role map for each authorized directory. You can also create multiple maps that apply to a single directory. When there are multiple maps for a directory, the most specific rule for a user applies. For example, adding a GROUP map set to Cluster Admin and a USER map set to Viewer for select users in that group means all users in the group have administrator permission except those specified users who have viewing permission only.
    Figure. Create Role Mapping Window Click to enlarge
  3. To edit a role map entry, click the pencil icon for that entry.
    After clicking the pencil icon, the Edit Role Mapping window appears, which contains the same fields as the Create Role Mapping window (see step 2). Enter the new information in the appropriate fields and then click the Save button.
  4. To delete a role map entry, click the "X" icon for that entry.
    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.
  5. Click the Close button to close the Role Mapping window.

Certificate Revocation Checking

Enabling Certificate Revocation Checking using Online Certificate Status Protocol (nCLI)

About this task

OCSP is the recommended method for checking certificate revocation in client authentication. You can enable certificate revocation checking using the OCSP method through the command line interface (nCLI).

To enable certificate revocation checking using OCSP for client authentication, do the following.

Procedure

  1. Set the OCSP responder URL.
    ncli authconfig set-certificate-revocation set-ocsp-responder=<ocsp url>
    Here, <ocsp url> indicates the location of the OCSP responder.
  2. Verify if OCSP checking is enabled.
    ncli authconfig get-client-authentication-config

    The expected output if certificate revocation checking is enabled successfully is as follows.

    Auth Config Status: true
    File Name: ca.cert.pem
    OCSP Responder URI: http://<ocsp-responder-url>
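If you do not know the responder location, the OCSP URL is often embedded in the certificate's Authority Information Access extension, where OpenSSL can read it. In this sketch the certificate and the responder URL are made up for illustration, and `-addext` assumes OpenSSL 1.1.1 or later; with a real client certificate, run only the second command.

```shell
# Throwaway certificate carrying an example OCSP URL.
openssl req -x509 -newkey rsa:2048 -nodes -keyout t.key -out t.crt \
  -subj "/CN=test" -days 1 \
  -addext "authorityInfoAccess=OCSP;URI:http://ocsp.example.com" 2>/dev/null
# Read the embedded responder URL back out of the certificate.
openssl x509 -in t.crt -noout -ocsp_uri   # prints http://ocsp.example.com
```

The printed URL is the value to supply as <ocsp url> in the nCLI command above.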

Enabling Certificate Revocation Checking using Certificate Revocation Lists (nCLI)

About this task

Note: OCSP is the recommended method for checking certificate revocation in client authentication.

You can use the CRL certificate revocation checking method if required, as described in this section.

To enable certificate revocation checking using CRL for client authentication, do the following.

Procedure

Specify all the CRLs that are required for certificate validation.
ncli authconfig set-certificate-revocation set-crl-uri=<uri 1>,<uri 2> set-crl-refresh-interval=<refresh interval in seconds> set-crl-expiration-interval=<expiration interval in seconds>
  • The above command resets any previous OCSP or CRL configurations.
  • The URIs must be percent-encoded and comma separated.
  • The CRLs are updated periodically as specified by the crl-refresh-interval value. This interval is common for the entire list of CRL distribution points. The default value for this is 86400 seconds (1 day).
  • The periodically updated CRLs are cached in memory for the duration specified by set-crl-expiration-interval; if a particular CRL distribution point becomes unreachable, its cached CRL expires after that duration. This duration applies to the entire list of CRL distribution points. The default value is 604800 seconds (7 days).
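Because the URIs passed to set-crl-uri must be percent-encoded, special characters in a distribution point path need escaping first. A minimal sketch of that encoding step, assuming python3 is available on the workstation (the CRL URI below is a made-up example):

```shell
# Hypothetical CRL distribution point URI containing characters that
# need escaping (a space and parentheses).
uri='http://pki.example.com/crl/AD CA(1).crl'
# Percent-encode everything except the scheme separator and slashes.
enc=$(python3 -c 'import sys, urllib.parse as u; print(u.quote(sys.argv[1], safe=":/"))' "$uri")
echo "$enc"   # http://pki.example.com/crl/AD%20CA%281%29.crl
```

The encoded value is what you pass (comma separated with any other URIs) to set-crl-uri.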

Authentication Best Practices

The authentication best practices listed here are guidance to secure the Nutanix platform by using the most common authentication security measures.

Emergency Local Account Usage

You must use the admin account as a local emergency account. The admin account ensures that both the Prism web console and the Controller VM remain accessible when external services such as Active Directory are unavailable.

Note: The local emergency account does not support any external access mechanisms, specifically external application authentication or external REST API authentication.

For all external authentication, configure the cluster to use an external IAM service such as Active Directory. Create service accounts on the IAM service, and grant those accounts access to the cluster through the Prism web console user account management configuration.

Modifying Default Passwords

You must change the default Controller VM password for the nutanix user account, adhering to the password complexity requirements.

Procedure

  1. SSH to the Controller VM.
  2. Change the "nutanix" user account password.
    nutanix@cvm$ passwd nutanix
  3. Respond to the prompts and provide the current and new nutanix user password.
    Changing password for nutanix.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.
    Note:
    • Changing the user account password on one of the Controller VMs is applied to all Controller VMs in the cluster.
    • Ensure that you preserve the modified nutanix user password, since the local authentication (PAM) module requires the previous password of the nutanix user to successfully start the password reset process.
    • For the root account, both console and direct SSH login are disabled.
    • It is recommended to use the admin user as the administrative emergency account.

Controlling Cluster Access

About this task

Nutanix supports the cluster lockdown feature, which enables key-based SSH access to the Controller VM and the AHV host (only for the nutanix/admin users).

Enabling cluster lockdown mode disables password authentication so that only the keys you have provided can be used to access cluster resources, making the cluster more secure.

You can create a key pair (or multiple key pairs) and add the public keys to enable key-based SSH access. However, when site security requirements do not allow such access, you can remove all public keys to prevent SSH access.
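Prism accepts RSA and ECDSA public keys. A minimal sketch for generating an ECDSA key pair on a workstation, assuming ssh-keygen is available; the file name is arbitrary:

```shell
# Generate an ECDSA P-256 key pair with no passphrase; the contents of
# the .pub file are what you paste into the New Public Key dialog.
ssh-keygen -t ecdsa -b 256 -N '' -f ./prism_lockdown_key -q
cat ./prism_lockdown_key.pub
```

Keep the private key (./prism_lockdown_key) secure on the machine from which you will SSH to the cluster.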

To control key-based SSH access to the cluster, do the following:
Note: Use this procedure to lock down SSH access to the Controller VM and the hypervisor host.

Procedure

  1. Click the gear icon in the main menu and then select Cluster Lockdown in the Settings page.
    The Cluster Lockdown dialog box appears. Enabled public keys (if any) are listed in this window.
    Figure. Cluster Lockdown Window Click to enlarge
  2. To disable (or enable) remote login access, uncheck (check) the Enable Remote Login with Password box.
    Remote login access is enabled by default.
  3. To add a new public key, click the New Public Key button and then do the following in the displayed fields:
    1. Name : Enter a key name.
    2. Key : Enter (paste) the key value into the field.
    Note: Prism supports the following key types.
    • RSA
    • ECDSA
    3. Click the Save button (lower right) to save the key and return to the main Cluster Lockdown window.
    There are no public keys available by default, but you can add any number of public keys.
  4. To delete a public key, click the X on the right of that key line.
    Note: Deleting all the public keys and disabling remote login access locks down the cluster from SSH access.

Setup Admin Session Timeout

By default, users are logged out automatically after being idle for 15 minutes. You can change the session timeout for users, and configure an override for it, by following these steps.

Procedure

  1. Click the gear icon in the main menu and then select UI Settings in the Settings page.
  2. Select the session timeout for the current user from the Session Timeout For Current User drop-down list.
    Figure. Session Timeout Settings Click to enlarge

  3. Select the appropriate option from the Session Timeout Override drop-down list to override the session timeout.

Password Retry Lockout

For enhanced security, Prism Element locks out the admin account for a period of 15 minutes after a default number of unsuccessful login attempts. Once the account is locked out, the following message is displayed at the logon screen.

Account locked due to too many failed attempts

You can retry the password after the 15-minute lockout period, or contact Nutanix Support if you have forgotten your password.

Internationalization (i18n)

The following table lists all the supported and unsupported entities in UTF-8 encoding.

Table 1. Internationalization Support
Supported Entities          Unsupported Entities
Cluster name                Acropolis file server
Storage Container name      Share path
Storage pool                Internationalized domain names
VM name                     E-mail IDs
Snapshot name               Hostnames
Volume group name           Integers
Protection domain name      Password fields
Remote site name            Any hardware-related names (for example, vSwitch, iSCSI initiator, VLAN name)
User management
Chart name
Caution: Creating any of the above entities with non-ASCII names is not supported on Hyper-V because of DR limitations.

Entities Support (ASCII or non-ASCII) for the Active Directory Server

  • In the New Directory Configuration, the Name field supports non-ASCII characters.
  • In the New Directory Configuration, the Domain field does not support non-ASCII characters.
  • In role mapping, the Values field supports non-ASCII characters.
  • User names and group names support non-ASCII characters.

User Management

Nutanix user accounts can be created or updated as needed using the Prism web console.

  • The web console allows you to add (see Creating a User Account), edit (see Updating a User Account), or delete (see Deleting a User Account (Local)) local user accounts at any time.
  • You can reset a local user account password using nCLI if you are locked out and cannot log in to the Prism Element or Prism Central web console (see Resetting Password (CLI)).
  • You can also configure user accounts through Active Directory and LDAP (see Configuring Authentication). Active Directory domain created by using non-ASCII text may not be supported.
Note: In addition to the Nutanix user account, there are IPMI, Controller VM, and hypervisor host users. Passwords for these accounts cannot be changed through the web console.

Creating a User Account

About this task

The admin user is created automatically when you get a Nutanix system, but you can add more users as needed. Note that you cannot delete the admin user. To create a user, do the following:
Note: You can also configure user accounts through Active Directory (AD) and LDAP (see Configuring Authentication).

Procedure

  1. Click the gear icon in the main menu and then select Local User Management in the Settings page.
    The User Management dialog box appears.
    Figure. User Management Window Click to enlarge
  2. To add a user, click the New User button and do the following in the displayed fields:
    1. Username : Enter a user name.
    2. First Name : Enter a first name.
    3. Last Name : Enter a last name.
    4. Email : Enter a valid user email address.
      Note: AOS uses the email address for client authentication and logging when the local user performs user and cluster tasks in the web console.
    5. Password : Enter a password (maximum of 255 characters).
      A second field to verify the password is not included, so be sure to enter the password correctly in this field.
    6. Language : Select the language setting for the user.
      By default, English is selected. You can select Simplified Chinese or Japanese . Depending on the language that you select here, the cluster locale is updated for the new user. For example, if you select Simplified Chinese , the next time that the new user logs on to the web console, the user interface is displayed in Simplified Chinese.
    7. Roles : Assign a role to this user.
      • Select the User Admin box to allow the user to view information, perform any administrative task, and create or modify user accounts. (Checking this box automatically selects the Cluster Admin box to indicate that this user has full permissions. However, a user administrator has full permissions regardless of whether the cluster administrator box is checked.)
      • Select the Cluster Admin box to allow the user to view information and perform any administrative task (but not create or modify user accounts).
      • Select the Backup Admin box to allow the user to perform backup-related administrative tasks. This role does not have permission to perform cluster or user tasks.

        Note: Backup admin user is designed for Nutanix Mine integrations as of AOS version 5.19 and has minimal functionality in cluster management. This role has restricted access to the Nutanix Mine cluster.
        • Health , Analysis , and Tasks features are available in read-only mode.
        • The File server and Data Protection options in the web console are not available for this user.
        • The following features are available for Backup Admin users with limited functionality.
            • Home - The user cannot register a cluster with Prism Central. The registration widget is disabled. Other read-only data is displayed and available.
            • Alerts - Alerts and events are displayed. However, the user cannot resolve or acknowledge any alert or event. The user cannot configure Alert Policy or Email configuration .
            • Hardware - The user cannot expand the cluster or remove hosts from the cluster. Read-only data is displayed and available.
            • Network - Networking data or configuration is displayed but configuration options are not available.
            • Settings - The user can only upload a new image using the Settings page.
            • VM - The user cannot configure options like Create VM and Network Configuration in the VM page. The following options are available for the user in the VM page:
              • Launch console
              • Power On
              • Power Off
      • Leaving all the boxes unchecked allows the user to view information, but it does not provide permission to perform cluster or user tasks.
    8. When all the fields are correct, click Save .
      This saves the configuration, and the web console redisplays the dialog box with the new user appearing in the list.
    Figure. Create User Window Click to enlarge

Updating a User Account

About this task

Update credentials and change the role for an existing user by using this procedure.
Note: To update your account credentials (that is, the user you are currently logged on as), see Updating My Account. Changing the password for a different user is not supported; you must log in as that user to change the password.

Procedure

  1. Click the gear icon in the main menu and then select Local User Management in the Settings page.
    The User Management dialog box appears.
  2. Enable or disable the login access for a user by clicking the toggle text Yes (enabled) or No (disabled) in the Enabled column.
    A Yes value in the Enabled column means that the login is enabled; a No value in the Enabled column means it is disabled.
    Note: A user account is enabled (login access activated) by default.
  3. To edit the user credentials, click the pencil icon for that user and update one or more of the values in the displayed fields:
    1. Username : The username is fixed when the account is created and cannot be changed.
    2. First Name : Enter a different first name.
    3. Last Name : Enter a different last name.
    4. Email : Enter a different valid email address.
      Note: Prism uses the email address for client authentication and logging when the local user performs user and cluster tasks in the web console.
    5. Roles : Change the role assigned to this user.
      • Select the User Admin box to allow the user to view information, perform any administrative task, and create or modify user accounts. (Checking this box automatically selects the Cluster Admin box to indicate that this user has full permissions. However, a user administrator has full permissions regardless of whether the cluster administrator box is checked.)
      • Select the Cluster Admin box to allow the user to view information and perform any administrative task (but not create or modify user accounts).
      • Select the Backup Admin box to allow the user to perform backup-related administrative tasks. This role does not have permission to perform cluster or user administrative tasks.
      • Leaving all the boxes unchecked allows the user to view information, but it does not provide permission to perform cluster or user administrative tasks.
    6. Reset Password : Change the password of this user.
      Enter the new password for Password and Confirm Password fields. Click the info icon to view the password complexity requirements.
    7. When all the fields are correct, click Save .
      This saves the configuration and redisplays the dialog box with the new user appearing in the list.
    Figure. Update User Window Click to enlarge

Updating My Account

About this task

To update your account credentials (that is, credentials for the user you are currently logged in as), do the following:

Procedure

  1. To update your password, select Change Password from the user icon pull-down list in the web console.
    The Change Password dialog box appears. Do the following in the indicated fields:
    1. Current Password : Enter the current password.
    2. New Password : Enter a new password.
    3. Confirm Password : Re-enter the new password.
    4. When the fields are correct, click the Save button (lower right). This saves the new password and closes the window.
    Note: You can change the password for the "admin" account only once per day. Contact Nutanix Support if you need to update the password multiple times in one day.
    Figure. Change Password Window Click to enlarge
  2. To update other details of your account, select Update Profile from the user icon pull-down list.
    The Update Profile dialog box appears. Update (as desired) one or more of the following fields:
    1. First Name : Enter a different first name.
    2. Last Name : Enter a different last name.
    3. Email : Enter a different valid user email address.
    4. Language : Select a language for your account.
    5. API Key : Enter the key value to use a new API key.
    6. Public Key : Click the Choose File button to upload a new public key file.
    7. When all the fields are correct, click the Save button (lower right). This saves the changes and closes the window.
    Figure. Update Profile Window Click to enlarge

Resetting Password (CLI)

This procedure describes how to reset a local user's password on the Prism Element or Prism Central web console.

About this task

To reset the password using nCLI, do the following:

Note:

Only a user with admin privileges can reset a password for other users.

Procedure

  1. Access the CVM via SSH.
  2. Log in with the admin credentials.
  3. Use the ncli user reset-password command and specify the username and password of the user whose password is to be reset:
    nutanix@cvm$ ncli user reset-password user-name=xxxxx password=yyyyy
    
    • Replace user-name=xxxxx with the name of the user whose password is to be reset.

    • Replace password=yyyyy with the new password.

What to do next

You can relaunch the Prism Element or the Prism Central web console and verify the new password setting.

Exporting an SSL Certificate for Third-party Backup Applications

Nutanix allows you to export an SSL certificate for Prism Element on a Nutanix cluster and use it with third-party backup applications.

Procedure

  1. Log on to a Controller VM in the cluster using SSH.
  2. Run the following command to obtain the virtual IP address of the cluster:
    nutanix@cvm$ ncli cluster info

    The current cluster configuration is displayed.

        Cluster Id           : 0001ab12-abcd-efgh-0123-012345678m89::123456
        Cluster Uuid         : 0001ab12-abcd-efgh-0123-012345678m89
        Cluster Name         : three
        Cluster Version      : 6.0
        Cluster Full Version : el7.3-release-fraser-6.0-a0b1c2345d6789ie123456fg789h1212i34jk5lm6
        External IP address  : 10.10.10.10
        Node Count           : 3
        Block Count          : 1
        . . . . .
    Note: The external IP address in the output is the virtual IP address of the cluster.
  3. Run the following command to enter the Python prompt:
    nutanix@cvm$ python

    The Python prompt appears.

  4. Run the following command to import the SSL library.
    >>> import ssl
  5. From the Python console, run the following command to print the SSL certificate.
    >>> print ssl.get_server_certificate(('virtual_IP_address', 9440), ssl_version=ssl.PROTOCOL_TLSv1_2)
    Example: In the following example, the virtual_IP_address value is replaced by 10.10.10.10.
    >>> print ssl.get_server_certificate(('10.10.10.10', 9440), ssl_version=ssl.PROTOCOL_TLSv1_2)
    The SSL certificate is displayed on the console.
    -----BEGIN CERTIFICATE-----
    0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01
    23456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123
    456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz012345
    6789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567
    89ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789AB
    CDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCD
    EFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEF
    GHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGH
    IJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJ
    KLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKL
    MNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMN
    OPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOP
    QRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQR
    STUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRST
    UVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUV
    WXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWX
    YZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
    abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZab
    cdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcd
    efghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef
    ghij
    -----END CERTIFICATE-----
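If a Python console is not convenient, the same certificate can be fetched with openssl s_client. To keep this sketch self-contained it starts a throwaway local TLS server; against a cluster, point the s_client command at <virtual_IP_address>:9440 instead of 127.0.0.1:14433.

```shell
# Throwaway local TLS server, used only to make this sketch runnable.
openssl req -x509 -newkey rsa:2048 -nodes -keyout demo.key \
  -out demo.crt -subj "/CN=demo" -days 1 2>/dev/null
openssl s_server -accept 14433 -cert demo.crt -key demo.key -quiet &
srv=$!
sleep 1
# The retrieval itself; against a cluster, connect to <virtual_IP_address>:9440.
pem=$(openssl s_client -connect 127.0.0.1:14433 -showcerts </dev/null \
        2>/dev/null | openssl x509 -outform PEM)
kill "$srv"
printf '%s\n' "$pem"
```

The captured PEM block can be saved to a file and supplied to the third-party backup application.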

Deleting a User Account (Local)

About this task

To delete an existing user, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Local User Management in the Settings page.
    The User Management dialog box appears.
    Figure. User Management Window Click to enlarge
  2. Click the X icon for that user. Note that you cannot delete the admin user.
    A window prompt appears to verify the action; click the OK button. The user account is removed and the user no longer appears in the list.

Certificate Management

This chapter describes how to install and replace an SSL certificate for use on the Nutanix Controller VM.

Note: Nutanix recommends that you check for the validity of the certificate periodically, and replace the certificate if it is invalid.

Installing an SSL Certificate

About this task

Nutanix supports SSL certificate-based authentication for console access. To install a self-signed or custom SSL certificate, do the following:
Important: Ensure that SSL certificates are not password protected.
Note:
  • Nutanix recommends that customers replace the default self-signed certificate with a CA signed certificate.
  • An SSL certificate (self-signed or CA-signed) can be installed only cluster-wide from Prism. SSL certificates cannot be customized for individual Controller VMs.

Procedure

  1. Click the gear icon in the main menu and then select SSL Certificate in the Settings page.
    The SSL Certificate dialog box appears.
    Figure. SSL Certificate Window Click to enlarge
  2. To replace (or install) a certificate, click the Replace Certificate button.
  3. To create a new self-signed certificate, click the Regenerate Self Signed Certificate option and then click the Apply button.
    A dialog box appears to verify the action; click the OK button. This generates and applies a new RSA 2048-bit self-signed certificate for the Prism user interface.
    Figure. SSL Certificate Window: Regenerate Click to enlarge
  4. To apply a custom certificate that you provide, do the following:
    1. Click the Import Key and Certificate option and then click the Next button.
      Figure. SSL Certificate Window: Import Click to enlarge
    2. Do the following in the indicated fields, and then click the Import Files button.
      Note:
      • All the three imported files for the custom certificate must be PEM encoded.
      • Ensure that the private key does not have any extra data (or custom attributes) before the beginning or after the end of the private key block (for example, before the -----BEGIN PRIVATE KEY----- marker or after the -----END PRIVATE KEY----- marker).
      • See Recommended Key Configurations to ensure proper set of key types, sizes/curves, and signature algorithms.
      • Private Key Type : Select the appropriate type for the signed certificate from the pull-down list (RSA 4096 bit, RSA 2048 bit, EC DSA 256 bit, or EC DSA 384 bit).
      • Private Key : Click the Browse button and select the private key associated with the certificate to be imported.
      • Public Certificate : Click the Browse button and select the signed public portion of the server certificate corresponding to the private key.
      • CA Certificate/Chain : Click the Browse button and select the certificate or chain of the signing authority for the public certificate.
      Figure. SSL Certificate Window: Select Files Click to enlarge
      In order to meet the high security standards of NIST SP800-131a compliance and the requirements of RFC 6460 for NSA Suite B, and to provide optimal encryption performance, the certificate import process validates that the correct signature algorithm is used for a given key/certificate pair. See Recommended Key Configurations for the proper set of key types, sizes/curves, and signature algorithms. The CA must sign all public certificates with the proper type, size/curve, and signature algorithm for the import process to validate successfully.
      Note: There is no specific requirement for the subject name of the certificates (subject alternative names (SAN) or wildcard certificates are supported in Prism).
      You can use the cat command to concatenate a list of CA certificates into a chain file.
      $ cat signer.crt inter.crt root.crt > server.cert
      Order is essential. The total chain should begin with the certificate of the signer and end with the root CA certificate as the final entry.
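Before importing, you can also confirm that the private key and public certificate belong together by comparing their RSA moduli. A sketch of that check; the key pair here is a throwaway generated only so the example runs, so with your own files skip the first command and compare your server.key and server.crt directly.

```shell
# Throwaway pair generated only so the sketch runs end to end.
openssl req -x509 -newkey rsa:2048 -nodes -keyout server.key \
  -out server.crt -subj "/CN=example" -days 1 2>/dev/null
# Hash the modulus of each; matching hashes mean a matching pair.
key_md5=$(openssl rsa  -in server.key -noout -modulus | openssl md5)
crt_md5=$(openssl x509 -in server.crt -noout -modulus | openssl md5)
[ "$key_md5" = "$crt_md5" ] && echo "key and certificate match"
```

A mismatch here means the import would fail or the gateway would reject the pair, so it is worth catching before uploading.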

Results

After generating or uploading the new certificate, the interface gateway restarts. If the certificate and credentials are valid, the interface gateway uses the new certificate immediately, which means that your current browser session (and all other open browser sessions) becomes invalid until you reload the page and accept the new certificate. If anything is wrong with the certificate (such as a corrupted file or wrong certificate type), the new certificate is discarded and the system reverts to the original default certificate provided by Nutanix.
Note: The system holds only one custom SSL certificate. If a new certificate is uploaded, it replaces the existing certificate. The previous certificate is discarded.

Recommended Key Configurations

This table provides the Nutanix recommended set of key types, sizes/curves, and signature algorithms.

Note:
  • Client and CAC authentication only supports RSA 2048 bit certificate.
  • RSA 4096 bit certificates might not work with certain AOS and Prism Central releases. Please see the release notes for your AOS and Prism Central versions. Specifying an RSA 4096 bit certificate might cause multiple cluster services to restart frequently. To work around the issue, see KB 12775.
  • Certificate import fails if you attempt to upload SHA-1 certificate (including root CA).

Replacing a Certificate

Nutanix simplifies certificate replacement to support Certificate Authority (CA) based chains of trust. Nutanix recommends that you replace the default self-signed certificate with a CA signed certificate.

Procedure

  1. Log in to Prism and click the gear icon.
  2. Click SSL Certificate .
  3. Select Replace Certificate to replace the certificate.
  4. Do one of the following.
    • Select Regenerate self signed certificate to generate a new self-signed certificate.
      Note:
      • This automatically generates and applies a certificate.
    • Select Import key and certificate to import the custom key and certificate.

    The following files are required and should be PEM encoded to import the keys and certificate.

    • The private key associated with the certificate. The following sections describe how to generate a private key in detail.
    • The signed public portion of the server certificate corresponding to the private key.
    • The CA certificate or chain of the signing authority for the certificate.
    Note:

    You must obtain the Public Certificate and CA Certificate/Chain from the certificate authority.

    Figure. Importing Certificate Click to enlarge

    Generating an RSA 4096 and RSA 2048 private key

    Tip: You can run the OpenSSL commands for generating private key and CSR on a Linux client with OpenSSL installed.
    Note: Some OpenSSL command parameters might not be supported on older OpenSSL versions and require OpenSSL version 1.1.1 or above to work.
    • Run the following OpenSSL command to generate an RSA 4096 private key and the Certificate Signing Request (CSR).
      openssl req -out server.csr -new -newkey rsa:4096 \
          -nodes -sha256 -keyout server.key
    • Run the following OpenSSL command to generate an RSA 2048 private key and the Certificate Signing Request (CSR).
      openssl req -out server.csr -new -newkey rsa:2048 \
          -nodes -sha256 -keyout server.key

      After executing the openssl command, the system prompts you to provide more details that are incorporated into your certificate. The mandatory fields are Country Name, State or Province Name, and Organization Name. The optional fields are Locality Name, Organizational Unit Name, Email Address, and Challenge Password.
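If you prefer not to answer the prompts interactively, the subject can be supplied on the command line with -subj. This is an optional variant of the command above; the field values shown are placeholders, not required values.

```shell
# Non-interactive CSR generation: the subject is passed with -subj instead
# of being collected through prompts. Adjust the fields for your organization.
openssl req -out server.csr -new -newkey rsa:2048 -nodes -sha256 \
    -keyout server.key \
    -subj "/C=US/ST=California/L=San Jose/O=Example Org/OU=IT"
```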

    Nutanix recommends including a DNS name for all CVMs in the certificate using the Subject Alternative Name (SAN) extension. This avoids SSL certificate errors when you access a CVM by direct DNS instead of the shared cluster IP. This example shows how to include a DNS name while generating an RSA 4096 private key:

    openssl req -out server.csr -new -newkey rsa:4096 -sha256 -nodes \
        -addext "subjectAltName = DNS:example.com" \
        -keyout server.key

    For a three-node cluster, you can provide the DNS names for all three nodes in a single command. For example:

    openssl req -out server.csr -new -newkey rsa:4096 -sha256 -nodes \
        -addext "subjectAltName = DNS:example1.com,DNS:example2.com,DNS:example3.com" \
        -keyout server.key

    If you have added a SAN ( subjectAltName ) extension to your certificate, then every time you add a node to or remove a node from the cluster, you must update the DNS names and generate or sign a new certificate.
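Before sending the CSR to the CA, you can confirm locally that the SAN entries were recorded. This check is optional and not part of the Prism workflow; server.csr is the file name from the examples above.

```shell
# Print the requested extensions of the CSR and show the SAN line with its
# DNS entries on the following line.
openssl req -in server.csr -noout -text | grep -A 1 "Subject Alternative Name"
```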

    Generating an EC DSA 256 and EC DSA 384 private key

    • Run the following OpenSSL commands to generate an EC DSA 256 private key and the Certificate Signing Request (CSR).
      openssl ecparam -out dsakey.pem -name prime256v1 -genkey
      openssl req -out dsacert.csr -new -key dsakey.pem -nodes -sha256
    • Run the following OpenSSL commands to generate an EC DSA 384 private key and the Certificate Signing Request (CSR).
      openssl ecparam -out dsakey.pem -name secp384r1 -genkey
      openssl req -out dsacert.csr -new -key dsakey.pem -nodes -sha384
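As an optional local check, you can confirm the curve of the generated key before creating the CSR (dsakey.pem is the file name from the commands above).

```shell
# The ASN1 OID line reports the curve of the EC key, for example
# prime256v1 (EC DSA 256) or secp384r1 (EC DSA 384).
openssl ec -in dsakey.pem -noout -text | grep "ASN1 OID"
```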
      

  5. If the CA chain certificate provided by the certificate authority is not in a single file, then run the following command to concatenate the list of CA certificates into a chain file.
    cat signer.crt inter.crt root.crt > server.cert
    Note: The chain should start with the certificate of the signer and end with the root CA certificate.
  6. Browse and add the Private Key, Public Certificate, and CA Certificate/Chain.
  7. Click Import Files .
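The chain assembled in step 5 can be checked locally before the import. This is an optional verification, not part of the Prism procedure; the file names are from the example in step 5.

```shell
# openssl verify walks the chain: the leaf (signer.crt) is checked against
# the untrusted intermediate, which in turn chains to the trusted root CA.
# It reports "signer.crt: OK" when the order and signatures are consistent.
openssl verify -CAfile root.crt -untrusted inter.crt signer.crt
```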

What to do next

Prism restarts, and you must log in again to use the application.

Exporting an SSL Certificate for Third-party Backup Applications

Nutanix allows you to export an SSL certificate for Prism Element on a Nutanix cluster and use it with third-party backup applications.

Procedure

  1. Log on to a Controller VM in the cluster using SSH.
  2. Run the following command to obtain the virtual IP address of the cluster:
    nutanix@cvm$ ncli cluster info

    The current cluster configuration is displayed.

        Cluster Id           : 0001ab12-abcd-efgh-0123-012345678m89::123456
        Cluster Uuid         : 0001ab12-abcd-efgh-0123-012345678m89
        Cluster Name         : three
        Cluster Version      : 6.0
        Cluster Full Version : el7.3-release-fraser-6.0-a0b1c2345d6789ie123456fg789h1212i34jk5lm6
        External IP address  : 10.10.10.10
        Node Count           : 3
        Block Count          : 1
        . . . . .
    Note: The external IP address in the output is the virtual IP address of the cluster.
  3. Run the following command to open a Python prompt:
    nutanix@cvm$ python

    The Python prompt appears.

  4. Run the following command to import the SSL library.
    >>> import ssl
  5. From the Python console, run the following command to print the SSL certificate.
    >>> print(ssl.get_server_certificate(('virtual_IP_address', 9440), ssl_version=ssl.PROTOCOL_TLSv1_2))
    Example: Refer to the following example where the virtual_IP_address value is replaced by 10.10.10.10.
    >>> print(ssl.get_server_certificate(('10.10.10.10', 9440), ssl_version=ssl.PROTOCOL_TLSv1_2))
    The SSL certificate is displayed on the console.
    -----BEGIN CERTIFICATE-----
    0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01
    23456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123
    456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz012345
    6789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567
    89ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789AB
    CDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCD
    EFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEF
    GHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGH
    IJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJ
    KLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKL
    MNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMN
    OPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOP
    QRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQR
    STUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRST
    UVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUV
    WXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWX
    YZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
    abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZab
    cdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcd
    efghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef
    ghij
    -----END CERTIFICATE-----
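If a Python prompt is inconvenient, the same certificate can be exported with OpenSSL from any machine that can reach the cluster. This is an equivalent alternative, not part of the documented procedure; 10.10.10.10 is the example virtual IP address from step 2, and prism.crt is an example output file name.

```shell
# Fetch the Prism certificate over TLS on port 9440 and write it to a PEM
# file that can be supplied to the third-party backup application.
echo | openssl s_client -connect 10.10.10.10:9440 2>/dev/null \
    | openssl x509 -outform pem > prism.crt
```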

Controlling Cluster Access

About this task

Nutanix supports the Cluster Lockdown feature. This feature enables key-based SSH access to the Controller VM and the AHV host (only for the nutanix and admin users).

Enabling cluster lockdown mode disables password authentication, so only the keys that you have provided can be used to access the cluster resources, making the cluster more secure.

You can create a key pair (or multiple key pairs) and add the public keys to enable key-based SSH access. However, when site security requirements do not allow such access, you can remove all public keys to prevent SSH access.

To control key-based SSH access to the cluster, do the following:
Note: Use this procedure to lock down SSH access to the Controller VM and the hypervisor host.

Procedure

  1. Click the gear icon in the main menu and then select Cluster Lockdown in the Settings page.
    The Cluster Lockdown dialog box appears. Enabled public keys (if any) are listed in this window.
    Figure. Cluster Lockdown Window
  2. To disable (or enable) remote login access, uncheck (check) the Enable Remote Login with Password box.
    Remote login access is enabled by default.
  3. To add a new public key, click the New Public Key button and then do the following in the displayed fields:
    1. Name : Enter a key name.
    2. Key : Enter (paste) the key value into the field.
    Note: Prism supports the following key types.
    • RSA
    • ECDSA
    3. Click the Save button (lower right) to save the key and return to the main Cluster Lockdown window.
    There are no public keys available by default, but you can add any number of public keys.
  4. To delete a public key, click the X on the right of that key line.
    Note: Deleting all the public keys and disabling remote login access locks down the cluster from SSH access.
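A key pair suitable for the Key field can be generated on an administrator workstation as follows; the file name and comment are examples, and RSA is one of the key types listed in step 3.

```shell
# Generate an RSA key pair with no passphrase. Paste the contents of
# lockdown_key.pub into the Key field in Prism, and keep lockdown_key
# private for SSH logins after lockdown is enabled.
ssh-keygen -t rsa -b 2048 -N "" -C "prism-lockdown" -f lockdown_key
cat lockdown_key.pub
```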

Data-at-Rest Encryption

Nutanix provides an option to secure data while it is at rest using either self-encrypting drives or software-only encryption, with key-based access management (the cluster's native KMS or an external KMS for software-only encryption).

Encryption Methods

Nutanix provides you with the following options to secure your data.

  • Self Encrypting Drives (SED) Encryption - You can use a combination of SEDs and an external KMS to secure your data while it is at rest.
  • Software-only Encryption - Nutanix AOS uses the AES-256 encryption standard to encrypt your data. Once enabled, software-only data-at-rest encryption cannot be disabled, thus protecting against accidental data leaks due to human errors. Software-only encryption supports both Nutanix Native Key Manager (local and remote) and External KMS to secure your keys.

Note the following points regarding data-at-rest encryption.

  • Encryption is supported for AHV, ESXi, and Hyper-V.
    • For ESXi and Hyper-V, software-only encryption can be implemented at a cluster level or container level. For AHV, encryption can be implemented at the cluster level only.
  • Nutanix recommends using cluster-level encryption. Cluster-level encryption eliminates the administrative overhead of selecting different containers for data storage.
  • Encryption cannot be disabled once it is enabled at a cluster level or container level.
  • Encryption can be implemented on an existing cluster that already contains data. If encryption is enabled on an existing cluster (AHV, ESXi, or Hyper-V), the unencrypted data is transformed into an encrypted format by a low-priority background task that is designed not to interfere with other workloads running in the cluster.
  • Data can be encrypted using either self-encrypted drives (SEDs) or software-only encryption. You can change the encryption method from SEDs to software-only. You can perform the following configurations.
    • For ESXi and Hyper-V clusters, you can switch from SEDs and External Key Management (EKM) combination to software-only encryption and EKM combination. First, you must disable the encryption in the cluster where you want to change the encryption method. Then, select the cluster and enable encryption to transform the unencrypted data into an encrypted format in the background.
    • For AHV, background encryption is supported.
  • Once the task to encrypt a cluster begins, you cannot cancel the operation. Even if you stop and restart the cluster, the system resumes the operation.
  • In the case of mixed clusters with ESXi and AHV nodes, where the AHV nodes are used for storage only, the encryption policies consider the cluster as an ESXi cluster. So, the cluster-level and container-level encryption are available.
  • You can use a combination of SED and non-SED drives in a cluster. After you encrypt a cluster using software-only encryption, all the drives are treated as unencrypted drives. If you switch from SED encryption to software-only encryption, you can add SED or non-SED drives to the cluster.
  • Data is not encrypted when it is replicated to another cluster. You must enable encryption for each cluster. Data is encrypted as a part of the write operation and decrypted as a part of the read operation. During the replication process, the system reads, decrypts, and then sends the data over to the other cluster. You can use a third-party network solution if there is a requirement to encrypt the data during transmission to another cluster.
  • Software-only encryption does not impact the data efficiency features such as deduplication, compression, erasure coding, zero block suppression, and so on. The software encryption is the last data transformation performed. For example, during the write operation, compression is performed first, followed by encryption.
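The ordering in the last point can be illustrated locally. This is a generic demonstration with gzip, not AOS code: encrypted output is statistically indistinguishable from random bytes, which do not compress, so compressing before encrypting preserves the space savings.

```shell
# Highly compressible data shrinks dramatically; random bytes (a stand-in
# for ciphertext) stay close to their original size.
head -c 1000000 /dev/zero > compressible.bin
head -c 1000000 /dev/urandom > random.bin
gzip -c compressible.bin | wc -c   # small result
gzip -c random.bin | wc -c         # close to 1000000
```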

Key Management

Nutanix supports a Native Key Management Server, also called the Local Key Manager (LKM), avoiding the dependency on an External Key Manager (EKM). The cluster-localized Key Management Service requires a minimum of three nodes in the cluster and is supported only for software-only encryption. Therefore, 1-node and 2-node clusters can use either the Native KMS (remote) option or an EKM.

The following types of keys are used for encryption.

  • Data Encryption Key (DEK) - A symmetric key, such as AES-256, that is used to encrypt the data.
  • Key Encryption Key (KEK) - This key is used to encrypt or decrypt the DEK.

Note the following points regarding the key management.

  • Nutanix does not support the use of the Local Key Manager with a third party External Key Manager.
  • Dual encryption (both SED and software-only encryption) requires an EKM. For more information, see Configuring Dual Encryption.
  • You can switch from an EKM to LKM, and inversely. For more information, see Switching between Native Key Manager and External Key Manager.
  • Rekey of keys stored in the Native KMS is supported for the Leader Keys. For more information, see Changing Key Encryption Keys (SEDs) and Changing Key Encryption Keys (Software Only).
  • You must back up the keys stored in the Native KMS. For more information, see Backing up Keys.
  • You must back up the encryption keys whenever you create a new container or remove an existing container. Nutanix Cluster Check (NCC) checks the status of the backup and sends an alert if you do not take a backup at the time of creating or removing a container.

Data-at-Rest Encryption (SEDs)

For customers who require enhanced data security, Nutanix provides a data-at-rest security option using Self Encrypting Drives (SEDs) included in the Ultimate license.

Note: If you are running the AOS Pro License on G6 platforms and above, you can use SED encryption by installing an add-on license.

The following features are supported:

  • Data is encrypted on all drives at all times.
  • Data is inaccessible in the event of drive or node theft.
  • Data on a drive can be securely destroyed.
  • A key authorization method allows password rotation at arbitrary times.
  • Protection can be enabled or disabled at any time.
  • No performance penalty is incurred despite encrypting all data.
  • Re-key of the leader encryption key (MEK) at arbitrary times is supported.
Note: If an SED cluster is present, then while configuring data-at-rest encryption, you get an option to select either data-at-rest encryption using SEDs or data-at-rest encryption using AOS.
Figure. SED and AOS Options

Note: This solution provides enhanced security for data on a drive, but it does not secure data in transit.

Data Encryption Model

To accomplish these goals, Nutanix implements a data security configuration that uses SEDs with keys maintained through a separate key management device. Nutanix uses open standards (TCG and KMIP protocols) and FIPS validated SED drives for interoperability and strong security.

Figure. Cluster Protection Overview: Graphical overview of the Nutanix data encryption methodology

This configuration involves the following workflow:

  1. The security implementation begins by installing SEDs for all data drives in a cluster.

    The drives are FIPS 140-2 validated and use FIPS 140-2 validated cryptographic modules.

    Creating a new cluster that includes only SEDs is straightforward, but an existing cluster can be converted to support data-at-rest encryption by replacing the existing drives with SEDs (after migrating all VMs and vDisks off the cluster while the drives are being replaced).

    Note: Contact Nutanix customer support for assistance before attempting to convert an existing cluster. A non-protected cluster can contain both SED and standard drives, but Nutanix does not support a mixed cluster when protection is enabled. All the disks in a protected cluster must be SED drives.
  2. Data on the drives is always encrypted, but read or write access to that data is open. By default, access to data on the drives is protected by the built-in manufacturer key. However, when data protection for the cluster is enabled, the Controller VM must provide the proper key to access data on a SED. The Controller VM communicates with the SEDs through a Trusted Computing Group (TCG) Security Subsystem Class (SSC) Enterprise protocol.

    A symmetric data encryption key (DEK) such as AES 256 is applied to all data being written to or read from the disk. The key is known only to the drive controller and never leaves the physical subsystem, so there is no way to access the data directly from the drive.

    Another key, known as a key encryption key (KEK), is used to encrypt/decrypt the DEK and authenticate to the drive. (Some vendors call this the authentication key or PIN.)

    Each drive has a separate KEK that is generated through the FIPS compliant random number generator present in the drive controller. The KEK is 32 bytes long to resist brute force attacks. The KEKs are sent to the key management server for secure storage and later retrieval; they are not stored locally on the node (even though they are generated locally).

    In addition to the above, the leader encryption key (MEK) is used to encrypt the KEKs.

    Each node maintains a set of certificates and keys in order to establish a secure connection with the external key management server.

  3. Keys are stored in a key management server that is outside the cluster, and the Controller VM communicates with the key management server using the Key Management Interoperability Protocol (KMIP) to upload and retrieve drive keys.

    Only one key management server device is required, but it is recommended that multiple devices are employed so the key management server is not a potential single point of failure. Configure the key manager server devices to work in clustered mode so they can be added to the cluster configuration as a single entity that is resilient to a single failure.

  4. When a node experiences a full power off and power on (and cluster protection is enabled), the controller VM retrieves the drive keys from the key management server and uses them to unlock the drives.

    If the Controller VM cannot get the correct keys from the key management server, it cannot access data on the drives.

    If a drive is re-seated, it becomes locked.

    If a drive is stolen, the data is inaccessible without the KEK (which cannot be obtained from the drive). If a node is stolen, the key management server can revoke the node certificates to ensure they cannot be used to access data on any of the drives.
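For illustration only (not AOS code), a 32-byte key of the kind described in step 2 can be produced with any cryptographic random number generator; in the cluster, the KEK is generated by the FIPS-compliant RNG inside the drive controller.

```shell
# Emit 32 random bytes as 64 hex characters, comparable in length to a
# per-drive KEK, using OpenSSL's cryptographically secure RNG.
openssl rand -hex 32
```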

Preparing for Data-at-Rest Encryption (External KMS for SEDs and Software Only)

About this task

Caution: DO NOT HOST A KEY MANAGEMENT SERVER VM ON THE ENCRYPTED CLUSTER THAT IS USING IT!

Doing so could result in complete data loss if there is a problem with the VM while it is hosted in that cluster.

If you are using an external KMS for encryption using AOS, preparation steps outside the web console are required. The information in this section is applicable if you choose to use an external KMS for configuring encryption.

You must install the license of the external key manager for all nodes in the cluster. See Compatibility and Interoperability Matrix for a complete list of the supported key management servers. For instructions on how to configure a key management server, refer to the documentation from the appropriate vendor.

The system accesses the EKM under the following conditions:

  • Starting a cluster

  • Regenerating a key (key regeneration occurs automatically every year by default)

  • Adding or removing a node (only when Self Encrypting Drives is used for encryption)

  • Switching from the Native KMS to an EKM, or from an EKM to the Native KMS

  • Starting or restarting a service (only if software-based encryption is used)

  • Upgrading AOS (only if Software-based encryption is used)

  • Running the NCC heartbeat check that verifies the EKM is reachable
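Because the cluster must reach the EKM at each of these points, a simple reachability probe of the KMIP port (default 5696) from a CVM can help when troubleshooting. This check is a suggestion, not part of the documented procedure; kms1.example.com is a placeholder for your key management server.

```shell
# Bash's /dev/tcp opens a plain TCP connection without extra tools; a zero
# exit status means the port accepted the connection.
timeout 5 bash -c 'cat < /dev/null > /dev/tcp/kms1.example.com/5696' \
    && echo "KMIP port reachable"
```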

Procedure

  1. Configure a key management server.

    The key management server devices must be configured into the network so the cluster has access to those devices. For redundant protection, it is recommended that you employ at least two key management server devices, either in active-active cluster mode or stand-alone.

    Note: The key management server must support KMIP version 1.0 or later.
    • SafeNet

      Ensure that Security > High Security > Key Security > Disable Creation and Use of Global Keys is checked.

    • Vormetric

      Set the appliance to compatibility mode. Suite B mode causes the SSL handshake to fail.

  2. Generate a certificate signing request (CSR) for each node in the cluster.
    • The Common Name field of the CSR is populated automatically with unique_node_identifier .nutanix.com to identify the node associated with the certificate.
      Tip: After generating the certificate from Prism, (if required) you can update the custom common name (CN) setting by running the following command using nCLI.
      ncli data-at-rest-encryption-certificate update-csr-information domain-name=abcd.test.com

      In the above command example, replace "abcd.test.com" with the actual domain name.

    • A UID field is populated with a value of Nutanix . This can be useful when configuring a Nutanix group for access control within a key management server, since it is based on fields within the client certificates.
    Note: Some vendors performing client certificate authentication expect the client username to be a field in the CSR. While the CN and UID are pre-generated, many of the user-populated fields can be used instead if desired. If a node-unique field such as CN is chosen, users must be created on a per-node basis for access control. If a cluster-unique field is chosen, customers must create a user for each cluster.
  3. Send the CSRs to a certificate authority (CA) and get them signed.
    • Safenet

      The SafeNet KeySecure key management server includes a local CA option to generate signed certificates, or you can use other third-party vendors to create the signed certificates.

      To enable FIPS compliance, add user nutanix to the CA that signed the CSR. Under Security > High Security > FIPS Compliance click Set FIPS Compliant .

    Note: Some CAs strip the UID field when returning a signed certificate.
    To comply with FIPS, Nutanix does not support the creation of global keys.

    In the SafeNet KeySecure management console, go to Device > Key Server > Key Server > KMIP Properties > Authentication Settings .

    Then do the following:

    • Set the Username Field in Client Certificate option to UID (User ID) .
    • Set the Client Certificate Authentication option to Used for SSL session and username .

    If you do not perform these settings, the KMS creates global keys and fails to encrypt the clusters or containers using the software only method.

  4. Upload the signed SSL certificates (one for each node) and the certificate for the CA to the cluster. These certificates are used to authenticate with the key management server.
  5. Generate keys (KEKs) for the SED drives and upload those keys to the key management server.
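When the CA returns the signed node certificates, you can confirm the subject and validity before uploading them in step 4. This is an optional local check; node1.crt is an example file name.

```shell
# Print the subject (the CN/UID fields the key management server matches
# on), the issuer, and the validity window of a signed node certificate.
openssl x509 -in node1.crt -noout -subject -issuer -dates
```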

Configuring Data-at-Rest Encryption (SEDs)

Nutanix offers an option to use self-encrypting drives (SEDs) to store data in a cluster. When SEDs are used, there are several configuration steps that must be performed to support data-at-rest encryption in the cluster.

Before you begin

A separate key management server is required to store the keys outside of the cluster. Each key management server device must be configured and addressable through the network. It is recommended that multiple key manager server devices be configured to work in clustered mode so they can be added to the cluster configuration as a single entity (see step 5) that is resilient to a single failure.

About this task

To configure cluster encryption, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
    The Data at Rest Encryption dialog box appears. Initially, encryption is not configured, and a message to that effect appears.
    Figure. Data at Rest Encryption Screen (initial)

  2. Click the Create Configuration button.
    Clicking the Continue Configuration button, configure it link, or Edit Config button performs the same action: displaying the Data-at-Rest Encryption configuration page.
  3. Select the Encryption Type as Drive-based Encryption . This option is displayed only when SEDs are detected.
  4. In the Certificate Signing Request Information section, do the following:
    Figure. Certificate Signing Request Section

    1. Enter appropriate credentials for your organization in the Email , Organization , Organizational Unit , Country Code , City , and State fields and then click the Save CSR Info button.
      The entered information is saved and is used when creating a certificate signing request (CSR). To specify more than one Organization Unit name, enter a comma separated list.
      Note: You can update this information until an SSL certificate for a node is uploaded to the cluster, at which point the information cannot be changed (the fields become read only) without first deleting the uploaded certificates.
    2. Click the Download CSRs button, and then in the new screen click the Download CSRs for all nodes to download a file with CSRs for all the nodes or click a Download link to download a file with the CSR for that node.
      Figure. Download CSRs Screen

    3. Send the files with the CSRs to the desired certificate authority.
      The certificate authority creates the signed certificates and returns them to you. Store the returned SSL certificates and the CA certificate where you can retrieve them in step 6.
      • The certificates must be in PEM-encoded X.509 format. (DER, PKCS, and PFX formats are not supported.)
      • The certificate and the private key should be in separate files.
  5. In the Key Management Server section, do the following:
    Figure. Key Management Server Section

    1. Click the Add New Key Management Server button.
    2. In the Add a New Key Management Server screen, enter a name, IP address, and port number for the key management server in the appropriate fields.
      The port is where the key management server is configured to listen for the KMIP protocol. The default port number is 5696. For the complete list of required ports, see Port Reference.
      • If you have configured multiple key management servers in cluster mode, click the Add Address button to provide the addresses for each key management server device in the cluster.
      • If you have stand-alone key management servers, click the Save button. Repeat this step ( Add New Key Management Server button) for each key management server device to add.
        Note: If your key management servers are configured into a leader/follower (active/passive) relationship and the architecture is such that the follower cannot accept write requests, do not add the follower into this configuration. The system sends requests (read or write) to any configured key management server, so both read and write access is needed for key management servers added here.
        Note: To prevent potential configuration problems, always use the Add Address button for key management servers configured into cluster mode. Only a stand-alone key management server should be added as a new server.
      Figure. Add Key Management Server Screen

    3. To edit any settings, click the pencil icon for that entry in the key management server list to redisplay the add page and then click the Save button after making the change. To delete an entry, click the X icon.
  6. In the Add a New Certificate Authority section, enter a name for the CA, click the Upload CA Certificate button, and select the certificate for the CA used to sign your node certificates (see step 4c). Repeat this step for all CAs that were used in the signing process.
    Figure. Certificate Authority Section

  7. Go to the Key Management Server section (see step 5) and do the following:
    1. Click the Manage Certificates button for a key management server.
    2. In the Manage Signed Certificates screen, upload the node certificates either by clicking the Upload Files button to upload all the certificates in one step or by clicking the Upload link (not shown in the figure) for each node individually.
    3. Test that the certificates are correct either by clicking the Test all nodes button to test the certificates for all nodes in one step or by clicking the Test CS (or Re-Test CS) link for each node individually. A status of Verified indicates the test was successful for that node.
    4. Repeat this step for each key management server.
    Note: Before removing a drive or node from an SED cluster, ensure that the testing is successful and the status is Verified. Otherwise, the drive or node will be locked.
    Figure. Upload Signed Certificates Screen

  8. When the configuration is complete, click the Protect button on the opening page to enable encryption protection for the cluster.
    A clear key icon appears on the page.
    Figure. Data-at-Rest Encryption Screen (unprotected)

    The key turns gold when cluster encryption is enabled.
    Note: If changes are made to the configuration after protection has been enabled, such as adding a new key management server, you must rekey the disks for the modification to take full effect (see Changing Key Encryption Keys (SEDs)).
    Figure. Data-at-Rest Encryption Screen (protected)

Enabling/Disabling Encryption (SEDs)

Data on a self-encrypting drive (SED) is always encrypted, but enabling or disabling data-at-rest encryption for the cluster determines whether a separate (and secured) key is required to access that data.

About this task

To enable or disable data-at-rest encryption after it has been configured for the cluster (see Configuring Data-at-Rest Encryption (SEDs)), do the following:
Note: The key management server must be accessible to disable encryption.

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, do one of the following:
    • If cluster encryption is enabled currently, click the Unprotect button to disable it.
    • If cluster encryption is disabled currently, click the Protect button to enable it.
    Enabling cluster encryption enforces the use of secured keys to access data on the SEDs in the cluster; disabling cluster encryption means the data can be accessed without providing a key.

Changing Key Encryption Keys (SEDs)

The key encryption key (KEK) can be changed at any time. This can be useful as a periodic key-rotation security precaution or when a key management server or node becomes compromised. If a key management server is compromised, only the KEK needs to be changed, because the KEK is independent of the data encryption key (DEK). There is no need to re-encrypt any data; only the DEK is re-encrypted.
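The KEK/DEK relationship can be sketched as envelope encryption: the DEK encrypts the data, and the KEK only wraps the DEK, so rotating the KEK re-wraps the DEK without touching the stored data. The XOR "cipher" below is a toy for illustration only, not real cryptography:

```python
import hashlib
import os

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy keystream cipher (illustration only -- NOT secure).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

dek = os.urandom(32)                    # data encryption key (per drive)
kek = os.urandom(32)                    # key encryption key (held by the KMS)

ciphertext = xor_cipher(dek, b"secret records")  # data encrypted with the DEK
wrapped_dek = xor_cipher(kek, dek)               # DEK protected by the KEK

# Rekey: only the KEK changes; the DEK is re-wrapped, the data is untouched.
new_kek = os.urandom(32)
wrapped_dek = xor_cipher(new_kek, xor_cipher(kek, wrapped_dek))

# Unwrapping with the new KEK still recovers the same DEK and data.
recovered_dek = xor_cipher(new_kek, wrapped_dek)
assert recovered_dek == dek
assert xor_cipher(recovered_dek, ciphertext) == b"secret records"
```

The point of the sketch is the last two assertions: after a rekey, the ciphertext on disk is byte-for-byte unchanged, which is why a rekey completes quickly regardless of how much data the cluster holds.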

About this task

To change the KEKs for a cluster, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, select Manage Keys and click the Rekey All Disks button under Hardware Encryption.
    Rekeying a cluster under heavy workloads may result in higher-than-normal I/O latency, and some data may become temporarily unavailable. To continue with the rekey operation, click Confirm Rekey.
    This step resets the KEKs for all the self-encrypting drives in the cluster.
    Note:
    • The Rekey All Disks button appears only when cluster protection is active.
    • If the cluster is already protected and a new key management server is added, you must click the Rekey All Disks button so that the new key management server is used for storing secrets.
    Figure. Cluster Encryption Screen

Destroying Data (SEDs)

Data on a self-encrypting drive (SED) is always encrypted, and the data encryption key (DEK) used to read the encrypted data is known only to the drive controller. All data on the drive can effectively be destroyed (that is, become permanently unreadable) by having the controller change the DEK. This is known as a crypto-erase.
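The crypto-erase principle can be sketched in a few lines: once the controller discards the DEK, the ciphertext on the platters is unrecoverable even though no data blocks are overwritten. A toy illustration (not real drive firmware, and the XOR keystream is not real cryptography):

```python
import hashlib
import os

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # Toy keystream cipher (illustration only -- NOT secure).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

class ToySelfEncryptingDrive:
    def __init__(self):
        self._dek = os.urandom(32)   # known only to the "controller"
        self._platter = b""          # always stores ciphertext

    def write(self, data: bytes):
        self._platter = toy_encrypt(self._dek, data)

    def read(self) -> bytes:
        return toy_encrypt(self._dek, self._platter)

    def crypto_erase(self):
        # Cycling the DEK makes all existing ciphertext unreadable.
        self._dek = os.urandom(32)

drive = ToySelfEncryptingDrive()
drive.write(b"patient records")
assert drive.read() == b"patient records"
drive.crypto_erase()
assert drive.read() != b"patient records"  # old data is now indecipherable
```

No blocks are rewritten by `crypto_erase`; the data becomes unreadable purely because the key that deciphers it no longer exists.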

About this task

To crypto-erase a SED, do the following:

Procedure

  1. In the web console, go to the Hardware dashboard and select the Diagram tab.
  2. Select the target disk in the diagram (upper section of screen) and then click the Remove Disk button (at the bottom right of the following diagram).

    As part of the disk removal process, the DEK for that disk is automatically cycled on the drive controller. The previous DEK is lost and all new disk reads are indecipherable. The key encryption key (KEK) is unchanged, and the new DEK is protected using the current KEK.

    Note: When a node is removed, all SEDs in that node are crypto-erased automatically as part of the node removal process.
    Figure. Removing a Disk

Data-at-Rest Encryption (Software Only)

For customers who require enhanced data security, Nutanix provides a software-only encryption option for data-at-rest security (SEDs not required) included in the Ultimate license.
Note: On G6 platforms running the AOS Pro license, you can use software encryption by installing an add-on license.
Software encryption using a local key manager (LKM) supports the following features:
  • For AHV, the data can be encrypted at the cluster level. This applies to an empty cluster or a cluster with existing data.
  • For ESXi and Hyper-V, the data can be encrypted at the cluster or container level. The cluster or container can be empty or contain existing data. Consider the following points for container-level encryption:
    • Once you enable container-level encryption, you cannot change the encryption type to cluster-level encryption later.
    • After encryption is enabled, the administrator must enable encryption for every new container.
  • Data is encrypted at all times.
  • Data is inaccessible in the event of drive or node theft.
  • Data on a drive can be securely destroyed.
  • Re-keying the leader encryption key at any time is supported.
  • The cluster's native KMS is supported.
Note: For mixed-hypervisor clusters, only the following combinations are supported:
  • ESXi and AHV
  • Hyper-V and AHV
Note: This solution provides enhanced security for data on a drive, but it does not secure data in transit.

Data Encryption Model

To accomplish these goals, Nutanix implements a data security configuration that uses AOS functionality along with the cluster's native key management server or an external key management server. Nutanix uses open standards (the KMIP protocol) for interoperability and strong security.

Figure. Cluster Protection Overview

This configuration involves the following workflow:

  • For software encryption, data protection must be enabled for the cluster before any data is encrypted. Also, the Controller VM must provide the proper key to access the data.
  • A symmetric data encryption key (DEK) such as AES 256 is applied to all data being written to or read from the disk. The key is known only to AOS, so there is no way to access the data directly from the drive.
  • In case of an external KMS:

    Each node maintains a set of certificates and keys in order to establish a secure connection with the key management server.

    Only one key management server device is required, but it is recommended that multiple devices are employed so the key management server is not a potential single point of failure. Configure the key management server devices to work in clustered mode so that they can be added to the cluster configuration as a single entity that is resilient to a single failure.

Configuring Data-at-Rest Encryption (Software Only)

Nutanix offers a software-only option to perform data-at-rest encryption in a cluster or container.

Before you begin

  • Nutanix provides the option to choose the KMS type as the Native KMS (local), Native KMS (remote), or External KMS.
  • Cluster Localised Key Management Service (Native KMS (local)) requires at least a 3-node cluster. 1-node and 2-node clusters are not supported.
  • Software encryption for remote office/branch office (ROBO) deployments is supported through the Native KMS (remote) KMS type.
  • For external KMS, a separate key management server is required to store the keys outside of the cluster. Each key management server device must be configured and addressable through the network. It is recommended that multiple key management server devices be configured to work in clustered mode so that they can be added to the cluster configuration as a single entity that is resilient to a single failure.
    Caution: Do not host a key management server VM on the encrypted cluster that is using it.

    Doing so could result in complete data loss if there is a problem with the VM while it is hosted in that cluster.

    Note: You must install the license of the external key manager for all nodes in the cluster. See Compatibility and Interoperability Matrix for a complete list of the supported key management servers. For instructions on how to configure a key management server, refer to the documentation from the appropriate vendor.
  • This feature requires an Ultimate license or, for the latest generation of products, an add-on to the Pro license. Ensure that you have procured the add-on license key before using data-at-rest encryption with AOS; contact the Nutanix Sales team to procure the license.
  • Caution: For security, you cannot disable software-only data-at-rest encryption once it is enabled.

About this task

To configure cluster or container encryption, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
    The Data at Rest Encryption dialog box appears. Initially, encryption is not configured, and a message to that effect appears.
    Figure. Data at Rest Encryption Screen (initial)

  2. Click the Create Configuration button.
    Clicking the Continue Configuration button, the configure it link, or the Edit Config button performs the same action: displaying the Data-at-Rest Encryption configuration page.
  3. Select the Encryption Type as Encrypt the entire cluster or Encrypt storage containers. Then click Save Encryption Type.
    Caution: You can enable encryption for the entire cluster or just for individual containers. However, if you enable encryption on a container and an encryption key issue occurs, such as the loss of an encryption key, the following can happen:
    • The entire cluster's data is affected, not just the encrypted container.
    • None of the user VMs in the cluster can access the data.
    The hardware option is displayed only when SEDs are detected. Otherwise, the software-based encryption type is used by default.
    Figure. Select encryption type

    Note: For ESXi and Hyper-V, the data can be encrypted at the cluster or container level. The cluster or container can be empty or contain existing data. Consider the following points for container-level encryption:
    • Once you enable container-level encryption, you cannot change the encryption type to cluster-level encryption later.
    • After encryption is enabled, the administrator must enable encryption for every new container.
    To enable encryption for every new storage container, do the following:
    1. In the web console, select Storage from the pull-down main menu (upper left of screen) and then select the Table and Storage Container tabs.
    2. To enable encryption, select the target storage container and then click the Update link.
      The Update Storage Container window appears.
    3. In the Advanced Settings area, select the Enable check box to enable encryption for the storage container you selected.
      Figure. Update storage container

    4. Click Save to complete.
  4. Select the Key Management Service.
    To keep the keys safe with the native KMS, select Native KMS (local) or Native KMS (remote) and click Save KMS type . If you select this option, skip to step 9 to complete the configuration.
    Note:
    • Cluster Localised Key Management Service (Native KMS (local)) requires at least a 3-node cluster. 1-node and 2-node clusters are not supported.
    • For enhanced security of ROBO environments (typically 1-node or 2-node clusters), select Native KMS (remote) for software-based encryption of ROBO clusters managed by Prism Central.
      Note: This option is available only if the cluster is registered to Prism Central.
    For external KMS type, select the External KMS option and click Save KMS type . Continue to step 5 for further configuration.
    Figure. Select KMS Type

    Note: You can switch between the KMS types at a later stage if the specific KMS prerequisites are met. See Switching between Native Key Manager and External Key Manager.
  5. In the Certificate Signing Request Information section, do the following:
    Figure. Certificate Signing Request Section

    1. Enter appropriate credentials for your organization in the Email , Organization , Organizational Unit , Country Code , City , and State fields and then click the Save CSR Info button.
      The entered information is saved and used when creating a certificate signing request (CSR). To specify more than one Organizational Unit name, enter a comma-separated list.
      Note: You can update this information until an SSL certificate for a node is uploaded to the cluster, at which point the information cannot be changed (the fields become read only) without first deleting the uploaded certificates.
    2. Click the Download CSRs button, and then in the new screen click the Download CSRs for all nodes to download a file with CSRs for all the nodes or click a Download link to download a file with the CSR for that node.
      Figure. Download CSRs Screen

    3. Send the files with the CSRs to the desired certificate authority.
      The certificate authority creates the signed certificates and returns them to you. Store the returned SSL certificates and the CA certificate where you can retrieve them in steps 7 and 8.
      • The certificates must be X.509 format. (DER, PKCS, and PFX formats are not supported.)
      • The certificate and the private key should be in separate files.
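Because DER, PKCS, and PFX are rejected, the uploaded files are expected to be PEM-encoded X.509 certificates. A quick pre-check of a file's contents can save a failed upload; this sketch uses simple heuristics (PEM is base64 text with BEGIN/END markers, DER starts with the ASN.1 SEQUENCE tag 0x30):

```python
def looks_like_pem_certificate(data: bytes) -> bool:
    """Heuristic: PEM-encoded X.509 certificates start with a text marker."""
    return data.lstrip().startswith(b"-----BEGIN CERTIFICATE-----")

def looks_like_der(data: bytes) -> bool:
    """Heuristic: DER-encoded certificates start with an ASN.1 SEQUENCE (0x30)."""
    return data[:1] == b"\x30"

# Example usage with abbreviated, hypothetical file contents:
pem = b"-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n"
assert looks_like_pem_certificate(pem)
assert not looks_like_pem_certificate(b"\x30\x82\x01\x0a")
```

A file that trips the DER check can usually be converted to PEM with standard certificate tooling before uploading; also confirm the private key is in its own file, as required above.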
  6. In the Key Management Server section, do the following:
    Figure. Key Management Server Section

    1. Click the Add New Key Management Server button.
    2. In the Add a New Key Management Server screen, enter a name, IP address, and port number for the key management server in the appropriate fields.
      The port is where the key management server is configured to listen for the KMIP protocol. The default port number is 5696. For the complete list of required ports, see Port Reference.
      • If you have configured multiple key management servers in cluster mode, click the Add Address button to provide the addresses for each key management server device in the cluster.
      • If you have stand-alone key management servers, click the Save button. Repeat this step (Add New Key Management Server button) for each key management server device you want to add.
        Note: If your key management servers are configured into a leader/follower (active/passive) relationship and the architecture is such that the follower cannot accept write requests, do not add the follower into this configuration. The system sends requests (read or write) to any configured key management server, so both read and write access is needed for key management servers added here.
        Note: To prevent potential configuration problems, always use the Add Address button for key management servers configured into cluster mode. Only a stand-alone key management server should be added as a new server.
      Figure. Add Key Management Server Screen

    3. To edit any settings, click the pencil icon for that entry in the key management server list to redisplay the add page and then click the Save button after making the change. To delete an entry, click the X icon.
  7. In the Add a New Certificate Authority section, enter a name for the CA, click the Upload CA Certificate button, and select the certificate for the CA used to sign your node certificates (see step 5c). Repeat this step for all CAs that were used in the signing process.
    Figure. Certificate Authority Section

  8. Go to the Key Management Server section (see step 6) and do the following:
    1. Click the Manage Certificates button for a key management server.
    2. In the Manage Signed Certificates screen, upload the node certificates either by clicking the Upload Files button to upload all the certificates in one step or by clicking the Upload link (not shown in the figure) for each node individually.
    3. Test that the certificates are correct either by clicking the Test all nodes button to test the certificates for all nodes in one step or by clicking the Test CS (or Re-Test CS) link for each node individually. A status of Verified indicates the test was successful for that node.
    4. Repeat this step for each key management server.
    Note: Before removing a drive or node from an SED cluster, ensure that the testing is successful and the status is Verified. Otherwise, the drive or node will be locked.
    Figure. Upload Signed Certificates Screen

  9. When the configuration is complete, click the Enable Encryption button.
    The Enable Encryption window is displayed.
    Figure. Data-at-Rest Encryption Screen (unprotected)

    Caution: To help ensure that your data is secure, you cannot disable software-only data-at-rest encryption once it is enabled. Nutanix recommends regularly backing up your data, encryption keys, and key management server.
  10. Enter ENCRYPT.
  11. Click the Encrypt button.
    The data-at-rest encryption is enabled. To view the status of the encrypted cluster or container, go to Data at Rest Encryption in the Settings menu.

    When you enable encryption, a low-priority background task runs to encrypt all the unencrypted data. This task is designed to take advantage of any available CPU capacity to encrypt the unencrypted data within a reasonable time. If the system is occupied with other workloads, the background task consumes less CPU. Depending on the amount of data in the cluster, the background task can take 24 to 36 hours to complete.

    Note: If changes are made to the configuration after protection has been enabled, such as adding a new key management server, you must perform a rekey operation for the modification to take full effect. For an EKM, rekey to change the KEKs stored in the EKM. For an LKM, rekey to change the leader key used by the native key manager. See Changing Key Encryption Keys (Software Only) for details.
    Note: Once the task to encrypt a cluster begins, you cannot cancel the operation. Even if you stop and restart the cluster, the system resumes the operation.
    Figure. Data-at-Rest Encryption Screen (protected)
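As a rough planning aid, you can estimate how long the background encryption task might run from the amount of unencrypted data and an assumed sustained encryption throughput. Both input values in the sketch below are hypothetical; actual throughput depends on cluster hardware and competing workloads:

```python
def estimated_encryption_hours(data_tib: float, rate_mib_per_s: float) -> float:
    """Rough completion-time estimate for the background encryption task."""
    data_mib = data_tib * 1024 * 1024          # TiB -> MiB
    return data_mib / rate_mib_per_s / 3600    # seconds -> hours

# e.g. 50 TiB at an assumed sustained 500 MiB/s across the cluster
# lands within the 24-36 hour window mentioned above:
hours = estimated_encryption_hours(50, 500)
```

Treat the result as an order-of-magnitude estimate only; the task deliberately yields CPU to foreground workloads, so busy clusters will take longer.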

Switching between Native Key Manager and External Key Manager

After software encryption has been established, Nutanix supports switching the KMS type from the External Key Manager to the Native Key Manager, or from the Native Key Manager to an External Key Manager, without any downtime.

Note:
  • The Native KMS requires at least a 3-node cluster.
  • For external KMS, a separate key management server is required to store the keys outside of the cluster. Each key management server device must be configured and addressable through the network. It is recommended that multiple key management server devices be configured to work in clustered mode so that they can be added to the cluster configuration as a single entity that is resilient to a single failure.
  • It is recommended that you back up and save the encryption keys with identifiable names before and after changing the KMS type. For backing up keys, see Backing up Keys.
To change the KMS type, change the KMS selection by editing the encryption configuration. For details, see step 4 in the Configuring Data-at-Rest Encryption (Software Only) section.
Figure. Select KMS type

After you change the KMS type and save the configuration, the encryption keys are re-generated on the selected KMS storage medium and data is re-encrypted with the new keys. The old keys are destroyed.
Note: This operation completes in a few minutes, depending on the number of encrypted objects and network speed.

Changing Key Encryption Keys (Software Only)

The key encryption key (KEK) can be changed at any time. This can be useful as a periodic key-rotation security precaution or when a key management server or node becomes compromised. If a key management server is compromised, only the KEK needs to be changed, because the KEK is independent of the data encryption key (DEK). There is no need to re-encrypt any data; only the DEK is re-encrypted.

About this task

To change the KEKs for a cluster, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, select Manage Keys and click the Rekey button under Software Encryption.
    Note: The Rekey button appears only when cluster protection is active.
    Note: If the cluster is already protected and a new key management server is added, you must click the Rekey button so that the new key management server is used for storing secrets.
    Figure. Cluster Encryption Screen

    Note: The system automatically regenerates the leader key yearly.

Destroying Data (Software Only)

Data on the AOS cluster is always encrypted, and the data encryption key (DEK) used to read the encrypted data is known only to AOS. All data on the drive can effectively be destroyed (that is, become permanently unreadable) by deleting the container or destroying the cluster. This is known as a crypto-erase.

About this task

Note: To help ensure that your data is secure, you cannot disable software-only data-at-rest encryption once it is enabled. Nutanix recommends regularly backing up your data, encryption keys, and key management server.

To crypto-erase the container or cluster, do the following:

Procedure

  1. Delete the storage container or destroy the cluster.
    • For information on how to delete a storage container, see Modifying a Storage Container in the Prism Web Console Guide .
    • For information on how to destroy a cluster, see Destroying a Cluster in the Acropolis Advanced Administration Guide .
    Note:

    When you delete a storage container, Curator scans for and deletes the DEKs and KEKs automatically.

    When you destroy a cluster:

    • The Native Key Manager (local) destroys the master key shares and the encrypted DEKs/KEKs.
    • The Native Key Manager (remote) retains the root key on the Prism Central (PC) if the cluster is still registered to a PC when it is destroyed. You must unregister the cluster from the PC and then destroy the cluster to delete the root key.
    • The External Key Manager deletes the encrypted DEKs. However, the KEKs remain on the EKM. You must use the external key manager UI to delete the KEKs.
  2. Delete the key backup files, if any.

Switching from SED-EKM to Software-LKM

This section describes the steps to switch from the SED and external KMS combination to the software-only and LKM combination.

About this task

To switch from SED-EKM to Software-LKM, do the following.

Procedure

  1. Perform the steps for the software-only encryption with External KMS. For more information, see Configuring Data-at-Rest Encryption (Software Only).
    After the background task completes, all the data gets encrypted by the software. The time taken to complete the task depends on the amount of data and foreground I/O operations in the cluster.
  2. Disable the SED encryption. Ensure that all the disks are unprotected.
    For more information, see Enabling/Disabling Encryption (SEDs).
  3. Switch the key management server from the External KMS to Local Key Manager. For more information, see Switching between Native Key Manager and External Key Manager.

Configuring Dual Encryption

About this task

Dual encryption protects the data on the clusters using both SED and software-only encryption. An external key manager is used to store the keys for dual encryption; the Native KMS is not supported.

To configure dual encryption, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  2. In the Cluster Encryption page, select the check boxes to enable both Drive-based and Software-based encryption.
  3. Click Save Encryption Type .
    Figure. Dual Encryption

  4. Continue with the rest of the encryption configuration, see:
    • Configuring Data-at-Rest Encryption (Software Only)
    • Configuring Data-at-Rest Encryption (SEDs)

Backing up Keys

About this task

You can take a backup of encryption keys:

  • when you enable Software-only Encryption for the first time
  • after you regenerate the keys

Backing up encryption keys is critical in the unlikely event that keys become corrupted.

You can download a key backup file for a single cluster from Prism Element (PE) or for all clusters from Prism Central (PC). To download the key backup file for all clusters, see Taking a Consolidated Backup of Keys (Prism Central).

To download the key backup file for a cluster, do the following:

Procedure

  1. Log on to the Prism Element web console.
  2. Click the gear icon in the main menu and then select Data at Rest Encryption in the Settings page.
  3. In the Cluster Encryption page, select Manage Keys .
  4. Enter and confirm the password.
  5. Click the Download Key Backup button.

    The backup file is saved in the default download location on your local machine.

    Note: Ensure you move the backup key file to a safe location.
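Because the backup file protects your only copy of the keys, it is worth recording a checksum when you download it and verifying the copy after moving it to its safe location. A minimal sketch using Python's standard library; the file name in the comment is a hypothetical example:

```python
import hashlib

def sha256_of_file(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Record this value at download time, then re-run on the copied file
# and compare (hypothetical file name):
# sha256_of_file("encryption_key_backup_2022-12-06")
```

Matching digests before and after the move confirm the copy is bit-for-bit identical; store the digest separately from the backup file itself.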

Taking a Consolidated Backup of Keys (Prism Central)

If you are using the Native KMS option with software encryption for your clusters, you can take a consolidated backup of all the keys from Prism Central.

About this task

To take a consolidated backup of keys for software encryption-enabled clusters (Native KMS-only), do the following:

Procedure

  1. Log on to the Prism Central web console.
  2. Click the hamburger icon, then select Clusters > List view.
  3. Select a cluster, go to Actions , then select Manage & Backup Keys .
  4. Download the backup keys:
    1. In Password , enter your password.
    2. In Confirm Password , reenter your password.
    3. To change the encryption key, select the Rekey Encryption Key (KEK) box .
    4. To download the backup key, click Backup Key .
    Note: Ensure that you move the backup key file to a safe location.

Importing Keys

You can import the encryption keys from a backup. Note the specific commands in this topic if you backed up your keys to an external key manager (EKM).

About this task

Note: Nutanix recommends that you contact Nutanix Support for this operation. Extended cluster downtime might result if you perform this task incorrectly.

Procedure

  1. Log on to any Controller VM in the cluster with SSH.
  2. Retrieve the encryption keys stored on the cluster and verify that all the keys you want to retrieve are listed.
    In this example, the password is Nutanix.123, and date is the timestamp portion of the backup file name.
    mantle_recovery_util --backup_file_path=/home/nutanix/encryption_key_backup_date \
    --password=Nutanix.123 --list_key_ids=true 
  3. Import the keys into the cluster.
    mantle_recovery_util --backup_file_path=/home/nutanix/key_backup \
    --password=Nutanix.123 --interactive_mode 
  4. If you are using an external key manager such as IBM Security Key Lifecycle Manager, Gemalto Safenet, or Vormetric Data Security Manager, use the --store_kek_remotely option to import the keys into the cluster.
    In this example, date is the timestamp portion of the backup file name.
    mantle_recovery_util --backup_file_path path/encryption_key_backup_date \
     --password key_password --store_kek_remotely

Securing Traffic Through Network Segmentation

Network segmentation enhances security, resilience, and cluster performance by isolating a subset of traffic to its own network.

You can achieve traffic isolation in one or more of the following ways:

Isolating Backplane Traffic by using VLANs (Logical Segmentation)
You can separate management traffic from storage replication (or backplane) traffic by creating a separate network segment (LAN) for storage replication. For more information about the types of traffic seen on the management plane and the backplane, see Traffic Types In a Segmented Network.

To enable the CVMs in a cluster to communicate over these separated networks, the CVMs are multihomed. Multihoming is facilitated by the addition of a virtual network interface card (vNIC) to the Controller VM and placing the new interface on the backplane network. Additionally, the hypervisor is assigned an interface on the backplane network.

The traffic associated with the CVM interfaces and host interfaces on the backplane network can be secured further by placing those interfaces on a separate VLAN.

In this type of segmentation, both network segments continue to use the same external bridge and therefore use the same set of physical uplinks. For physical separation, see Physically Isolating the Backplane Traffic on an AHV Cluster.

Isolating backplane traffic from management traffic requires minimal configuration through the Prism web console. No manual host (hypervisor) configuration steps are required.

For information about isolating backplane traffic, see Isolating the Backplane Traffic Logically on an Existing Cluster (VLAN-Based Segmentation Only).
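When planning the backplane VLAN, the backplane subnet should not overlap the management subnet; an overlap defeats the isolation. A quick pre-check using Python's standard library (the subnet values below are hypothetical examples):

```python
import ipaddress

def subnets_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if two IPv4/IPv6 subnets share any addresses."""
    a = ipaddress.ip_network(cidr_a, strict=False)
    b = ipaddress.ip_network(cidr_b, strict=False)
    return a.overlaps(b)

# Hypothetical management and backplane subnets:
assert not subnets_overlap("10.10.0.0/24", "192.168.5.0/24")  # disjoint: OK
assert subnets_overlap("10.10.0.0/16", "10.10.5.0/24")        # overlap: bad
```

Running such a check before entering the backplane subnet in Prism avoids having to reconfigure the segmentation after the CVMs have been multihomed.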

Isolating Backplane Traffic Physically (Physical Segmentation)

You can physically isolate the backplane traffic (intra-cluster traffic) from the management traffic (Prism, SSH, SNMP) by moving it to a separate vNIC on the CVM and using a dedicated virtual network that has its own physical NICs. This type of segmentation therefore offers true physical separation of the backplane traffic from the management traffic.

You can use Prism to configure the vNIC on the CVM and configure the backplane traffic to communicate over the dedicated virtual network. However, you must first manually configure the virtual network on the hosts and associate it with the physical NICs that it requires for true traffic isolation.

For more information about physically isolating backplane traffic, see Physically Isolating the Backplane Traffic on an AHV Cluster.

Isolating service-specific traffic
You can also secure traffic associated with a service (for example, Nutanix Volumes) by confining its traffic to a separate vNIC on the CVM and using a dedicated virtual network that has its own physical NICs. This type of segmentation therefore offers true physical separation for service-specific traffic.

You can use Prism to create the vNIC on the CVM and configure the service to communicate over the dedicated virtual network. However, you must first manually configure the virtual network on the hosts and associate it with the physical NICs that it requires for true traffic isolation. You need one virtual network for each service you want to isolate. For a list of the services whose traffic you can isolate in the current release, see Cluster Services That Support Traffic Isolation.

For information about isolating service-specific traffic, see Isolating Service-Specific Traffic.

Isolating Stargate-to-Stargate traffic over RDMA
Some Nutanix platforms support remote direct memory access (RDMA) for Stargate-to-Stargate service communication. You can create a separate virtual network for RDMA-enabled network interface cards. If a node has RDMA-enabled NICs, Foundation passes the NICs through to the CVMs during imaging. The CVMs use only the first of the two RDMA-enabled NICs for Stargate-to-Stargate communications. The virtual NIC on the CVM is named rdma0. Foundation does not configure the RDMA LAN. After creating a cluster, you need to enable RDMA by creating an RDMA LAN from the Prism web console. For more information about RDMA support, see Remote Direct Memory Access in the NX Series Hardware Administration Guide .

For information about isolating backplane traffic on an RDMA cluster, see Isolating the Backplane Traffic on an Existing RDMA Cluster.

Traffic Types In a Segmented Network

The traffic entering and leaving a Nutanix cluster can be broadly classified into the following types:

Backplane traffic
Backplane traffic is intra-cluster traffic that is necessary for the cluster to function, and it comprises traffic between CVMs and traffic between CVMs and hosts for functions such as storage RF replication, host management, high availability, and so on. This traffic uses eth2 on the CVM. In AHV, VM live migration traffic is also backplane, and uses the AHV backplane interface, VLAN, and virtual switch when configured. For nodes that have RDMA-enabled NICs, the CVMs use a separate RDMA LAN for Stargate-to-Stargate communications.
Management traffic
Management traffic is administrative traffic, or traffic associated with Prism and SSH connections, remote logging, SNMP, and so on. The current implementation simplifies the definition of management traffic to be any traffic that is not on the backplane network, and therefore also includes communications between user VMs and CVMs. This traffic uses eth0 on the CVM.

Traffic on the management plane can be further isolated per service or feature. An example of this type of traffic is the traffic that the cluster receives from external iSCSI initiators (Nutanix Volumes iSCSI traffic). For a list of services supported in the current release, see Cluster Services That Support Traffic Isolation.
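The interface assignments described above can be summarized in a small lookup. The mapping itself (management on eth0, backplane on eth2, RDMA on rdma0) comes from this section; the helper function is a hypothetical convenience for illustration, not a Nutanix API.

```python
# Traffic type to CVM vNIC mapping, as described in this section.
CVM_INTERFACE_BY_TRAFFIC = {
    "management": "eth0",  # Prism, SSH, SNMP, remote logging
    "backplane": "eth2",   # storage RF replication, HA, host management
    "rdma": "rdma0",       # Stargate-to-Stargate on RDMA-capable nodes
}

def cvm_interface(traffic_type: str) -> str:
    """Return the CVM vNIC that carries the given traffic type."""
    return CVM_INTERFACE_BY_TRAFFIC[traffic_type]

print(cvm_interface("backplane"))  # eth2
```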

Segmented and Unsegmented Networks

In the default unsegmented network in a Nutanix cluster (ESXi and AHV), the Controller VM has two virtual network interfaces—eth0 and eth1.

Interface eth0 is connected to the default external virtual switch, which is in turn connected to the external network through a bond or NIC team that contains the host physical uplinks.

Interface eth1 is connected to an internal network that enables the CVM to communicate with the hypervisor.

In the unsegmented networks shown below (see figures Unsegmented Network - ESXi Cluster and Unsegmented Network - AHV Cluster), all external CVM traffic, whether backplane or management traffic, uses interface eth0. These interfaces are on the default VLAN on the default virtual switch.

Figure. Unsegmented Network - ESXi Cluster

In AHV, VM live migration traffic is also backplane, and uses the AHV backplane interface, VLAN, and virtual switch when configured.

Figure. Unsegmented Network - AHV Cluster

If you further isolate service-specific traffic, additional vNICs are created on the CVM. Each service requiring isolation is assigned a dedicated virtual NIC on the CVM. The NICs are named ntnx0, ntnx1, and so on. Each service-specific NIC is placed on a configurable existing or new virtual network (vSwitch or bridge) and a VLAN and IP subnet are specified.
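The ntnx0, ntnx1, ... naming described above can be illustrated with a short sketch. The assumption that names are assigned in the order services are isolated is an illustration of the convention, not a Nutanix tool.

```python
# Hypothetical illustration of the service-specific vNIC naming convention:
# each isolated service gets a dedicated CVM vNIC named ntnx0, ntnx1, ...
def service_nic_names(services):
    """Map each isolated service to its CVM vNIC name (ntnx0, ntnx1, ...)."""
    return {svc: f"ntnx{i}" for i, svc in enumerate(services)}

print(service_nic_names(["volumes", "dr"]))
# {'volumes': 'ntnx0', 'dr': 'ntnx1'}
```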

Network with Segmentation

In a segmented network, management traffic uses CVM interface eth0 and additional services can be isolated to different VLANs or virtual switches. In backplane segmentation, the backplane traffic uses interface eth2. The backplane network uses either the default VLAN or, optionally, a separate VLAN that you specify when segmenting the network. In ESXi, you must select a port group for the new vmkernel interface. In AHV this internal interface is created automatically in the selected virtual switch. For physical separation of the backplane network, create this new port group on a separate virtual switch in ESXi, or select the desired virtual switch in the AHV GUI.

If you want to isolate service-specific traffic such as Volumes or Disaster Recovery as well as backplane traffic, then additional vNICs are needed on the CVM, but no new vmkernel adapters or internal interfaces are required. AOS creates additional vNICs on the CVM. Each service that requires isolation is assigned a dedicated vNIC on the CVM. The NICs are named ntnx0, ntnx1, and so on. Each service-specific NIC is placed on a configurable existing or new virtual network (vSwitch or bridge) and a VLAN and IP subnet are specified.

You can choose to perform backplane segmentation alone, with no other forms of segmentation. You can also choose to use one or more types of service specific segmentation with or without backplane segmentation. In all of these cases, you can choose to segment any service to either the existing, or a new virtual switch for further physical traffic isolation. The combination selected is driven by the security and networking requirements of the deployment. In most cases, the default configuration with no segmentation of any kind is recommended due to simplicity and ease of deployment.

The following figure shows an implementation scenario where backplane and service-specific segmentation is configured with two vSwitches on ESXi hypervisors.

Figure. Backplane and Service Specific Segmentation Configured with Two vSwitches on an ESXi Cluster

Here are the CVM to ESXi hypervisor connection details:

  • The eth0 vNIC on the CVM and vmk0 on the host carry management traffic and connect to the hypervisor through the existing PGm (port group) on vSwitch0.
  • The eth2 vNIC on the CVM and vmk2 on the host carry backplane traffic and connect to the hypervisor through a new user-created PGb on the existing vSwitch.
  • The ntnx0 vNIC on the CVM carries iSCSI traffic and connects to the hypervisor through PGi on vSwitch1. No new vmkernel adapter is required.
  • The ntnx1 vNIC on the CVM carries DR traffic and connects to the hypervisor through PGd on vSwitch2. Here as well, no new vmkernel adapter is required.

The following figure shows an implementation scenario where backplane and service-specific segmentation is configured with two vSwitches on an AHV hypervisor.

Figure. Backplane and Service Specific Segmentation Configured with Two vSwitches on an AHV Cluster

Here are the CVM to AHV hypervisor connection details:

  • The eth0 vNIC on the CVM is carrying management traffic and connected to the hypervisor through the existing vnet0.
  • Other vNICs such as eth2, ntnx0, and ntnx1 are connected to the hypervisor through the auto created interfaces on either the existing or new vSwitch.
Note: In the above figure the interface name 'br0-bp' is read as 'br0-backplane'.

The following table describes the vNIC, port group (PG), vmkernel (vmk), virtual network (vnet), and virtual switch connections for the CVM and hypervisor in different implementation scenarios. The table captures information for both ESXi and AHV hypervisors:

Table 1. CVM vNIC Connections in Different Implementation Scenarios

Backplane Segmentation with 1 vSwitch
  • eth0 carries DR, iSCSI, and Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • eth2 carries Backplane traffic. ESXi: new vmk2 via PGb on vSwitch0, and CVM vNIC via PGb on vSwitch0. AHV: auto-created interfaces on bridge br0.

Backplane Segmentation with 2 vSwitches
  • eth0 carries Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • eth2 carries Backplane traffic. ESXi: new vmk2 via PGb on new vSwitch, and CVM vNIC via PGb on new vSwitch. AHV: auto-created interfaces on new virtual switch.

Service Specific Segmentation for Volumes with 1 vSwitch
  • eth0 carries DR, Backplane, and Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • ntnx0 carries iSCSI (Volumes) traffic. ESXi: CVM vNIC via PGi on vSwitch0. AHV: auto-created interface on existing br0.

Service Specific Segmentation for Volumes with 2 vSwitches
  • eth0 carries DR, Backplane, and Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • ntnx0 carries iSCSI (Volumes) traffic. ESXi: CVM vNIC via PGi on new vSwitch. AHV: auto-created interface on new virtual switch.

Service Specific Segmentation for DR with 1 vSwitch
  • eth0 carries iSCSI, Backplane, and Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • ntnx1 carries DR traffic. ESXi: CVM vNIC via PGd on vSwitch0. AHV: auto-created interface on existing br0.

Service Specific Segmentation for DR with 2 vSwitches
  • eth0 carries iSCSI, Backplane, and Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • ntnx1 carries DR traffic. ESXi: CVM vNIC via PGd on new vSwitch. AHV: auto-created interface on new virtual switch.

Backplane and Service Specific Segmentation with 1 vSwitch
  • eth0 carries Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • eth2 carries Backplane traffic. ESXi: new vmk2 via PGb on vSwitch0, and CVM vNIC via PGb on vSwitch0. AHV: auto-created interfaces on br0.
  • ntnx0 carries iSCSI traffic. ESXi: CVM vNIC via PGi on vSwitch0. AHV: auto-created interface on br0.
  • ntnx1 carries DR traffic. ESXi: CVM vNIC via PGd on vSwitch0. AHV: auto-created interface on br0.

Backplane and Service Specific Segmentation with 2 vSwitches
  • eth0 carries Management traffic. ESXi: vmk0 via existing PGm on vSwitch0. AHV: existing vnet0.
  • eth2 carries Backplane traffic. ESXi: new vmk2 via PGb on new vSwitch, and CVM vNIC via PGb on new vSwitch. AHV: auto-created interfaces on new virtual switch.
  • ntnx0 carries iSCSI traffic. ESXi: CVM vNIC via PGi on vSwitch1; no new user-defined vmkernel adapter is required. AHV: auto-created interface on new virtual switch.
  • ntnx1 carries DR traffic. ESXi: CVM vNIC via PGd on vSwitch2; no new user-defined vmkernel adapter is required. AHV: auto-created interface on new virtual switch.

Implementation Considerations

Supported Environment

Network segmentation is supported in the following environment:

  • The hypervisor must be one of the following:
    • For network segmentation by traffic type (separating backplane traffic from management traffic):
      • AHV
      • ESXi
      • Hyper-V
    • For service-specific traffic isolation:
      • AHV
      • ESXi
  • For logical network segmentation, AOS version must be 5.5 or later. For physical segmentation and service-specific traffic isolation, the AOS version must be 5.11 or later.
  • RDMA requirements:
    • Network segmentation is supported with RDMA for AHV and ESXi hypervisors only.
    • For more information about RDMA, see Remote Direct Memory Access in the NX Series Hardware Administration Guide .

Prerequisites

For Nutanix Volumes

Stargate does not monitor the health of a segmented network. If physical network segmentation is configured, network failures or connectivity issues are not tolerated. To overcome this issue, configure redundancy in the network. That is, use two or more uplinks in a fault tolerant configuration, connected to two separate physical switches.

For Disaster Recovery

  • Ensure that the VLAN and subnet that you plan to use for the network segment are routable.
  • Make sure that you have a pool of IP addresses to specify when configuring segmentation. For each cluster, you need n+1 IP addresses, where n is the number of nodes in the cluster. The additional IP address is for the virtual IP address requirement.
  • Enable network segmentation for disaster recovery at both sites (local and remote) before configuring remote sites at those sites.
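The n+1 sizing rule above can be sanity-checked before configuration. This is a minimal sketch using Python's standard ipaddress module; the helper function and the sample pool boundaries are hypothetical, only the n+1 requirement comes from this guide.

```python
import ipaddress

# Checks that an IP pool is large enough for DR network segmentation:
# an n-node cluster needs n + 1 addresses (one per node plus the
# additional virtual IP address). Hypothetical helper for illustration.
def dr_pool_is_sufficient(first_ip: str, last_ip: str, num_nodes: int) -> bool:
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    pool_size = end - start + 1
    return pool_size >= num_nodes + 1

# A 4-node cluster needs at least 5 addresses:
print(dr_pool_is_sufficient("10.10.20.10", "10.10.20.14", 4))  # True
print(dr_pool_is_sufficient("10.10.20.10", "10.10.20.13", 4))  # False
```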

Limitations

For Nutanix Volumes

  • If network segmentation is enabled for Volumes, volume group attachments are not recovered during VM recovery.
  • Nutanix service VMs such as Objects worker nodes continue to communicate with the CVM eth0 interface when using Volumes for iSCSI traffic. Other external clients such as Files use the new service-specific CVM interface.

For Disaster Recovery

The system does not support configuring Leap DR together with DR service-specific traffic isolation.

Cluster Services That Support Traffic Isolation

You can isolate traffic associated with the following services to its own virtual network:

  • Management (The default network that cannot be moved from CVM eth0)

  • Backplane

  • RDMA

  • Service Specific Disaster Recovery

  • Service Specific Volumes

Configurations in Which Network Segmentation Is Not Supported

Network segmentation is not supported in the following configurations:

  • Clusters on which the CVMs have a manually created eth2 interface.
  • Clusters on which the eth2 interface on one or more CVMs have been assigned an IP address manually. During an upgrade to an AOS release that supports network segmentation, an eth2 interface is created on each CVM in the cluster. Even though the cluster does not use these interfaces until you configure network segmentation, you must not manually configure these interfaces in any way.
Caution:

Nutanix has deprecated support for manual multi-homed CVM network interfaces from AOS version 5.15 and later. Such a manual configuration can lead to unexpected issues on these releases. If you have configured an eth2 interface on the CVM manually, refer to the KB-9479 and Nutanix Field Advisory #78 for details on how to remove the eth2 interface.

Configuring the Network on an AHV Host

These steps describe how to configure host networking for physical and service-specific network segmentation on an AHV host. These steps are prerequisites for physical and service-specific network segmentation and you must perform these steps before you perform physical or service-specific traffic isolation. If you are configuring networking on an ESXi host, perform the equivalent steps by referring to the ESXi documentation. On ESXi, you create vSwitches and port groups to achieve the same results.

About this task

For information about the procedures to create, update and delete a virtual switch in Prism Element Web Console, see Configuring a Virtual Network for Guest VMs in the Prism Web Console Guide .

Note: The term unconfigured node in this procedure refers to a node that is not part of a cluster and is being prepared for cluster expansion.

To configure host networking for physical and service-specific network segmentation, do the following:

Note: If you are segmenting traffic on nodes that are already part of a cluster, perform the first step. If you are segmenting traffic on an unconfigured node that is not part of a cluster, perform the second step directly.

Procedure

  1. If you are segmenting traffic on nodes that are already part of a cluster, do the following:
    1. From the default virtual switch vs0, remove the uplinks that you want to add to the new virtual switch (which you create in the next step) by updating the default virtual switch.

      For information about updating the default virtual switch vs0 to remove the uplinks, see Creating or Updating a Virtual Switch in the Prism Web Console Guide .

    2. Create a virtual switch for the backplane traffic or service whose traffic you want to isolate.
      Add the uplinks to the new virtual switch.

      For information about creating a new virtual switch, see Creating or Updating a Virtual Switch in the Prism Web Console Guide .

  2. If you are segmenting traffic on an unconfigured node (new host) that is not part of a cluster, do the following:
    1. Create a bridge for the backplane traffic or service whose traffic you want to isolate by logging on to the new AHV host.
      ovs-vsctl add-br br1
    2. Log on to the Controller VM of the new host and update the default bridge br0 so that its uplink bond contains only eth0 and eth1.
      manage_ovs --bridge_name br0 --interfaces eth0,eth1 --bond_name br0-up --bond_mode active-backup update_uplinks
    3. While logged on to the Controller VM, add eth2 and eth3 to the uplink bond of br1.
      manage_ovs --bridge_name br1 --interfaces eth2,eth3 --bond_name br1-up --bond_mode active-backup update_uplinks
      Note: If this step is not done correctly, a network loop can be created that causes a network outage. Ensure that no other uplink interfaces exist on this bridge before adding the new interfaces, and always add interfaces into a bond.

What to do next

Prism can configure a VLAN only on AHV hosts. Therefore, if the hypervisor is ESXi, in addition to configuring the VLAN on the physical switch, make sure to configure the VLAN on the port group.

If you are performing physical network segmentation, see Physically Isolating the Backplane Traffic on an Existing Cluster.

If you are performing service-specific traffic isolation, see Service-Specific Traffic Isolation.

Network Segmentation for Traffic Types (Backplane, Management, and RDMA)

You can segment the network on a Nutanix cluster in the following ways:

  • You can segment the network on an existing cluster by using the Prism web console.
  • You can segment the network when creating a cluster by using Nutanix Foundation version 3.11.2 or later.

The following topics describe network segmentation procedures for existing clusters and changes during AOS upgrade and cluster expansion. For more information about segmenting the network when creating a cluster, see the Field Installation Guide.

Isolating the Backplane Traffic Logically on an Existing Cluster (VLAN-Based Segmentation Only)

You can segment the network on an existing cluster by using the Prism web console. You must configure a separate VLAN for the backplane network to achieve logical segmentation. The network segmentation process creates a separate network for backplane communications on the existing default virtual switch. The process then places the eth2 interfaces (which the process creates on the CVMs during upgrade) and the host interfaces on the newly created network. This method achieves logical segmentation of traffic over the selected VLAN. The process assigns each new interface an IP address from the specified subnet, so you need two IP addresses per node. When you specify the VLAN ID, AHV places the newly created interfaces on the specified VLAN.

Before you begin

If your cluster has RDMA-enabled NICs, follow the procedure in Isolating the Backplane Traffic on an Existing RDMA Cluster.

  • For ESXi clusters, you must create and manage the port groups used for CVM and host backplane networking. Ensure that you create these port groups on the default virtual switch vs0 for the ESXi hosts and CVMs.

    Because backplane traffic segmentation is logical, it is based on the VLAN tagged on the port groups. Therefore, when creating the port groups, ensure that you tag the new port groups created for the ESXi hosts and CVMs with the appropriate VLAN ID. Consult your networking team to acquire the necessary VLANs for use with Nutanix nodes.

  • For new backplane networks, you must specify a non-routable subnet. The interfaces on the backplane network are automatically assigned IP addresses from this subnet, so reserve the entire subnet for the backplane network segmentation. See the Configuring Backplane IP Pool topic to create an IP pool for backplane interfaces.
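The subnet guidance above (non-routable, two addresses per node, headroom for growth) can be checked programmatically. A minimal sketch using the standard ipaddress module; the helper and its parameters are hypothetical illustrations of the stated requirements.

```python
import ipaddress

# Sanity-checks a candidate backplane subnet: it should be non-routable
# (private) and large enough for two IP addresses per node, with spare
# capacity for future nodes. Hypothetical helper for illustration.
def check_backplane_subnet(cidr: str, num_nodes: int, spare_nodes: int = 0) -> bool:
    net = ipaddress.IPv4Network(cidr)
    required = 2 * (num_nodes + spare_nodes)
    usable = net.num_addresses - 2  # exclude network and broadcast addresses
    return net.is_private and usable >= required

print(check_backplane_subnet("192.168.5.0/24", num_nodes=4, spare_nodes=4))  # True
print(check_backplane_subnet("8.8.8.0/28", num_nodes=8))  # False (public subnet)
```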

About this task

You need separate VLANs for Management network and Backplane network. For example, configure VLAN 100 as Management network VLAN and VLAN 200 as Backplane network VLAN on the Ethernet links that connect the Nutanix nodes to the physical switch.
Note: Nutanix does not control these VLAN IDs. Consult your networking team to acquire VLANs for the Management and Backplane networks.

To segment the network on an existing ESXi or Hyper-V cluster for a backplane LAN, do the following:

To segment the network on an existing AHV cluster for a backplane LAN, follow the procedure described in the Physically Isolating the Backplane Traffic on an AHV Cluster topic.

Note:

In this method, for AHV nodes, logical segmentation (VLAN-based segmentation) is done on the default bridge. The process creates the host backplane interface on the Backplane Network port group on ESXi or br0-backplane (interface) on br0 bridge in case of AHV. The eth2 interface on the CVM is on CVM Backplane Network by default.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
    The Network Configuration dialog box appears.
  2. In the Network Configuration > Internal Interfaces > Backplane LAN row, click Configure .
    The Create Interface dialog box appears.
  3. In the Create Interface dialog box, provide the necessary information.
    • In Subnet IP , specify a non-routable subnet.

      Ensure that the subnet has sufficient IP addresses. The segmentation process requires two IP addresses per node. Reconfiguring the backplane to increase the size of the subnet involves cluster downtime, so you might also want to make sure that the subnet can accommodate new nodes in the future.

    • In Netmask , specify the netmask.
    • If you want to assign the interfaces on the network to a VLAN, specify the VLAN ID in the VLAN ID field.

      Nutanix recommends that you use a VLAN. If you do not specify a VLAN ID, the default VLAN on the virtual switch is used.

  4. Click Verify and Save .
    The network segmentation process creates the backplane network if the network settings that you specified pass validation.

Isolating the Backplane Traffic on an Existing RDMA Cluster

Segment the network on an existing RDMA cluster by using the Prism web console.

About this task

The network segmentation process creates a separate network for RDMA communications on the existing default virtual switch and places the rdma0 interface (created on the CVMs during upgrade) and the host interfaces on the newly created network. From the specified subnet, IP addresses are assigned to each new interface. Two IP addresses are therefore required per node. If you specify the optional VLAN ID, the newly created interfaces are placed on the VLAN. A separate VLAN is highly recommended for the RDMA network to achieve true segmentation.

Before you begin

  • For new RDMA networks, you must specify a non-routable subnet. The interfaces on the backplane network are automatically assigned IP addresses from this subnet, so reserve the entire subnet for the backplane network alone.
  • If you plan to specify a VLAN for the RDMA network, make sure that the VLAN is configured on the physical switch ports to which the nodes are connected.
  • Configure the switch interface as a Trunk port.
  • Ensure that this cluster was configured to support RDMA during installation with Foundation.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
    The Network Configuration dialog box is displayed.
  2. Click the Internal Interfaces tab.
  3. Click Configure in the RDMA row.
    Ensure that you have configured the switch interface as a trunk port.
    The Create Interface dialog box is displayed.
    Figure. Create Interface Dialog Box

  4. In the Create Interface dialog box, do the following:
    1. In Subnet IP and Netmask , specify a non-routable subnet and netmask, respectively. Make sure that the subnet can accommodate cluster expansion in the future.
    2. In VLAN , specify a VLAN ID for the RDMA LAN.
      A VLAN ID is optional but highly recommended for true network segmentation and enhanced security.
    3. From the PFC list, select the priority flow control value configured on the physical switch port.
  5. Click Verify and Save .
  6. Click Close .

Physically Isolating the Backplane Traffic on an Existing Cluster

By using the Prism web console, you can configure the eth2 interface on a separate virtual switch if you wish to isolate the backplane traffic to a separate physical network.

If you do not configure a separate virtual switch, the backplane traffic uses another VLAN on the default virtual switch for VLAN-based traffic isolation.

A virtual switch is known by the following name in each hypervisor:

  • AHV: Virtual Switch
  • ESXi: vSwitch
  • Hyper-V: Hyper-V Virtual Switch

The network segmentation process creates a separate network for backplane communications on the new virtual switch. The segmentation process places the CVM eth2 interfaces and the host interfaces on the newly created network. Specify a subnet with a network mask and, optionally, a VLAN ID. The process assigns each new interface on the new network an IP address from the specified subnet or an IP pool. You require a minimum of two IP addresses per node.

If you specify the optional VLAN ID, the newly created interfaces are placed on that VLAN.

Nutanix highly recommends a separate VLAN for the backplane network to achieve true segmentation.

Requirements and Limitations

  • Ensure that physical isolation of backplane traffic is supported by the AOS version deployed.
  • Ensure that you configure the network (port groups or bridges) on the hosts and associate the network with the required physical NICs before you enable physical isolation of the backplane traffic.

    For AHV, see Configuring the Network on an AHV Host. For ESXI and Hyper-V, see VMware and Microsoft documentation respectively.

  • Segmenting backplane traffic can involve up to two rolling reboots of the CVMs. The first rolling reboot moves the backplane interface (eth2) of the CVM to the selected port group, virtual switch, or Hyper-V switch. This reboot occurs only for CVMs whose backplane interface is not already connected to the selected port group, virtual switch, or Hyper-V switch. The second rolling reboot migrates the cluster services to the newly configured backplane interface.
Physically Isolating the Backplane Traffic on an AHV Cluster

Before you begin

On the AHV hosts, do the following:

  1. From the default virtual switch vs0, remove the uplinks (physical NICs) that you want to add to a new virtual switch you create for the backplane traffic in the next step.
  2. Create a virtual switch for the backplane traffic.

    Add the uplinks to the new bond when you create the new virtual switch.

See Configuring the Network on an AHV Host for instructions about how to perform these tasks on a host.

Note: Before you perform the following procedure, ensure that the uplinks you added to the virtual switch are in the UP state.

About this task

Perform the following procedure to physically segment the backplane traffic on an AHV cluster.

Procedure

  1. Shut down all the guest VMs in the cluster from within the guest OS or use the Prism Element web console.
  2. Place all nodes of a cluster into the maintenance mode.
    1. Use SSH to log on to a Controller VM in the cluster.
    2. Determine the IP address of the node you want to put into the maintenance mode:
      nutanix@cvm$ acli host.list
      Note the value of Hypervisor IP for the node you want to put in the maintenance mode.
    3. Put the node into the maintenance mode:
      nutanix@cvm$ acli host.enter_maintenance_mode hypervisor-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
      Note: Never put the Controller VM and AHV host into maintenance mode on single-node clusters. Nutanix recommends shutting down user VMs before proceeding with disruptive changes.

      Replace hypervisor-IP-address with either the IP address or host name of the AHV host that you want to put into maintenance mode.

      The following are optional parameters for running the acli host.enter_maintenance_mode command:

      • wait
      • non_migratable_vm_action

      Do not continue if the host has failed to enter the maintenance mode.

    4. Verify if the host is in the maintenance mode:
      nutanix@cvm$ acli host.get host-ip

      In the output that is displayed, ensure that node_state equals EnteredMaintenanceMode and schedulable equals False.

  3. Enable backplane network segmentation.
    1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
    2. On the Internal Interfaces tab, in the Backplane LAN row, click Configure .
    3. In the Backplane LAN dialog box, do the following:
      • In Subnet IP , specify a non-routable subnet that is different from the subnet used by the AHV host and CVMs.

        The AOS CVM default route uses the CVM eth0 interface, and there is no route on the backplane interface. Therefore, Nutanix recommends only using a non-routable subnet for the backplane network. To avoid split routing, do not use a routable subnet for the backplane network.

        Make sure that the backplane subnet has a sufficient number of IP addresses. Two IP addresses are required per node. Reconfiguring the backplane to increase the size of the subnet involves cluster downtime, so you might also want to make sure that the subnet can accommodate new nodes in the future.

      • In Netmask , specify the network mask.
      • If you want to assign the interfaces on the network to a VLAN, specify the VLAN ID in the VLAN ID field.

        Nutanix strongly recommends configuring a separate VLAN. If you do not specify a VLAN ID, AOS applies the untagged VLAN on the virtual switch.

      • In the Virtual Switch list, select the virtual switch you created for the backplane traffic.
    4. Click Verify and Save .
      If the network settings you specified pass validation, the backplane network is created and the CVMs perform a reboot in a rolling fashion (one at a time), after which the services use the new backplane network. The progress of this operation can be tracked on the Prism tasks page.
  4. Log on to a CVM in the cluster with SSH and stop Acropolis cluster-wide:
    nutanix@cvm$ allssh genesis stop acropolis 
  5. Restart Acropolis cluster-wide:
    nutanix@cvm$ cluster start 
  6. Remove all nodes from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode:
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip

      Replace host-ip with the IP address of the host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify if the host has exited the maintenance mode:
      nutanix@cvm$ acli host.get host-ip

      In the output that is displayed, ensure that node_state equals kAcropolisNormal or AcropolisNormal and schedulable equals True .

  7. Power on the guest VMs from the Prism Element web console.
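Before you pick the backplane subnet, it can help to confirm that it holds at least two IP addresses per node (one for the CVM, one for the host) plus headroom for future nodes, since growing the subnet later involves cluster downtime. The following is a minimal sketch using Python's standard `ipaddress` module; it is an illustrative helper, not part of the Nutanix tooling, and the node counts are examples.

```python
import ipaddress

def backplane_subnet_ok(cidr: str, nodes: int, spare_nodes: int = 0) -> bool:
    """Check that a subnet holds 2 usable IPs per node (CVM + host),
    including headroom for planned future nodes."""
    net = ipaddress.ip_network(cidr, strict=True)
    # Exclude the network and broadcast addresses for typical IPv4 subnets.
    usable = net.num_addresses - 2 if net.prefixlen < 31 else net.num_addresses
    return usable >= 2 * (nodes + spare_nodes)

# A /28 gives 14 usable addresses: enough for 4 nodes (8 IPs),
# but not for 4 current nodes plus 4 future nodes (16 IPs).
print(backplane_subnet_ok("192.168.5.0/28", nodes=4))                 # True
print(backplane_subnet_ok("192.168.5.0/28", nodes=4, spare_nodes=4))  # False
```

Run the check with your planned subnet and expected final node count before entering the subnet in the Backplane LAN dialog box.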
Physically Isolating the Backplane Traffic on an ESXi Cluster

Before you begin

On the ESXi hosts, do the following:

  1. Create a vSwitch for the backplane traffic.
  2. From vSwitch0, remove the uplinks (physical NICs) that you want to add to the vSwitch you created for the backplane traffic.
  3. On the backplane vSwitch, create one port group for the CVM and another for the host. Ensure that at least one uplink is present in the Active Adapters list for each port group if you have overridden the failover order.

See the ESXi documentation for instructions about how to perform these tasks.

Note: Before you perform the following procedure, ensure that the uplinks you added to the vSwitch are in the UP state.

About this task

Perform the following procedure to physically segment the backplane traffic.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
  2. On the Internal Interfaces tab, in the Backplane LAN row, click Configure .
  3. In the Backplane LAN dialog box, do the following:
    1. In Subnet IP , specify a non-routable subnet.
      If you do not specify a non-routable subnet, the CVM uses the routable subnet on the default gateway. AOS does not route packets from the backplane network. Therefore, Nutanix recommends using only a secure, non-routable subnet for the backplane network. Do not use a routable subnet for this purpose.

      Make sure that the subnet has a sufficient number of IP addresses. Two IP addresses are required per node. Reconfiguring the backplane to increase the size of the subnet involves cluster downtime, so you might also want to make sure that the subnet can accommodate new nodes in the future.

    2. In Netmask , specify the network mask.
    3. If you want to assign the interfaces on the network to a VLAN, specify the VLAN ID in the VLAN ID field.
      Nutanix strongly recommends configuring a separate VLAN. If you do not specify a VLAN ID, AOS applies the default VLAN on the virtual switch.
    4. In the Host Port Group list, select the port group you created for the host.
    5. In the CVM Port Group list, select the port group you created for the CVM.
    Note:

    Nutanix clusters support both vSphere Standard Switches and vSphere Distributed Switches. However, you must configure only one type of virtual switch in a cluster: place all backplane and management traffic on either vSphere Standard Switches or vSphere Distributed Switches. Do not mix Standard and Distributed vSwitches in a single cluster.

  4. Click Verify and Save .
    If the network settings you specified pass validation, the backplane network is created and the CVMs perform a reboot in a rolling fashion (one at a time), after which the services use the new backplane network. The progress of this operation can be tracked on the Prism tasks page.
Physically Isolating the Backplane Traffic on a Hyper-V Cluster

Before you begin

On the Hyper-V hosts, do the following:

  1. Create a Hyper-V Virtual Switch for the backplane traffic.
  2. From the default External Switch, remove the uplinks (physical NICs) that you want to add to the backplane Virtual Switch you created for the backplane traffic.
  3. On the backplane Virtual Switch, create a subnet and, optionally, assign a VLAN.

See the Hyper-V documentation on the Microsoft portal for instructions about how to perform these tasks.

Note: Before you perform the following procedure, ensure that the uplinks you added to the backplane Virtual Switch are in the UP state.

About this task

Perform the following procedure to physically segment the backplane traffic.

Procedure

  1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration in the Settings page.
  2. On the Internal Interfaces tab, in the Backplane LAN row, click Configure .
  3. In the Backplane LAN dialog box, do the following:
    1. In Subnet IP , specify a non-routable subnet.
      If you do not specify a non-routable subnet, the CVM uses the routable subnet on the default gateway. AOS does not route packets from the backplane network. Therefore, Nutanix recommends using only a secure, non-routable subnet for the backplane network. Do not use a routable subnet for this purpose.

      Make sure that the subnet has a sufficient number of IP addresses. Two IP addresses are required per node. Reconfiguring the backplane to increase the size of the subnet involves cluster downtime, so you might also want to make sure that the subnet can accommodate new nodes in the future.

    2. In Netmask , specify the network mask.
    3. If you want to assign the interfaces on the network to a VLAN, specify the VLAN ID in the VLAN ID field.
      Nutanix strongly recommends configuring a separate VLAN. If you do not specify a VLAN ID, AOS applies the default VLAN on the virtual switch.
    4. In the Bridge list, select the Hyper-V switch you created for the backplane traffic.
  4. Click Verify and Save .
    If the network settings you specified pass validation, the backplane network is created and the CVMs perform a reboot in a rolling fashion (one at a time), after which the services use the new backplane network. The progress of this operation can be tracked on the Prism tasks page.
    Note: Segmenting backplane traffic can involve up to two rolling reboots of the CVMs. The first rolling reboot moves the backplane interface (eth2) of the CVM to the selected port group or virtual switch; it occurs only for CVMs whose backplane interface is not already connected to the selected port group or virtual switch. The second rolling reboot migrates the cluster services to the newly configured backplane interface.

Reconfiguring the Backplane Network

Backplane network reconfiguration is a CLI-driven procedure that you perform on any one of the CVMs in the cluster. The change is propagated to the remaining CVMs.

About this task

Caution: At the end of this procedure, the cluster stops and restarts, even if only the VLAN changes. The procedure therefore involves cluster downtime.

To reconfigure the cluster, do the following:

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Reconfigure the backplane network.
    nutanix@cvm$ backplane_ip_reconfig [--backplane_vlan=vlan-id] \
    [--backplane_ip_pool=ip_pool_name]

    Replace vlan-id with the new VLAN ID, and ip_pool_name with the name of the newly created backplane IP pool.

    See Configuring Backplane IP Pool to create a backplane IP pool.

    For example, reconfigure the backplane network to use VLAN ID 10 and a newly created backplane IP pool named NewBackplanePool.

    nutanix@cvm$ backplane_ip_reconfig --backplane_vlan=10 \
    --backplane_ip_pool=NewBackplanePool

    Output similar to the following is displayed:

    This operation will do a 'cluster stop', resulting in disruption of 
    cluster services. Do you still want to continue? (Type "yes" (without quotes) 
    to continue)
    Caution: During the reconfiguration process, you might receive an error message similar to the following.
    Failed to reach a node.
    You can safely ignore this error message; do not stop the script manually.
    Note: The backplane_ip_reconfig command is not supported on ESXi clusters with vSphere Distributed Switches. To reconfigure the backplane network on a vSphere Distributed Switch setup, disable the backplane network (see Disabling Network Segmentation on an ESXi and Hyper-V Clusters) and enable again with a different subnet or VLAN.
  3. Type yes to confirm that you want to reconfigure the backplane network.
    The reconfiguration procedure takes a few minutes and includes a cluster restart. If you type anything other than yes , network reconfiguration is aborted.
  4. After the process completes, verify that the backplane was reconfigured.
    1. Verify that the IP addresses of the eth2 interfaces on the CVM are set correctly.
      nutanix@cvm$ svmips -b
      Output similar to the following is displayed:
      172.30.25.1 172.30.25.3 172.30.25.5
    2. Verify that the IP addresses of the backplane interfaces of the hosts are set correctly.
      nutanix@cvm$ hostips -b
      Output similar to the following is displayed:
      172.30.25.2 172.30.25.4 172.30.25.6
    The svmips and hostips commands, when used with the -b option, display the IP addresses assigned to the interfaces on the backplane.
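The verification in the last step can be automated: every node should have exactly one CVM and one host backplane IP, all unique and all inside the reconfigured subnet. The following is a sketch of such a check (my own helper, not part of the Nutanix CLI); the sample addresses are the ones shown above.

```python
import ipaddress

def check_backplane_ips(svmips_out: str, hostips_out: str, subnet: str) -> bool:
    """Verify one backplane IP per node for CVMs and hosts,
    all unique and all inside the expected backplane subnet."""
    net = ipaddress.ip_network(subnet)
    cvm_ips = svmips_out.split()
    host_ips = hostips_out.split()
    if len(cvm_ips) != len(host_ips):
        return False  # every node has exactly one CVM and one host IP
    all_ips = [ipaddress.ip_address(ip) for ip in cvm_ips + host_ips]
    return all(ip in net for ip in all_ips) and len(set(all_ips)) == len(all_ips)

print(check_backplane_ips("172.30.25.1 172.30.25.3 172.30.25.5",
                          "172.30.25.2 172.30.25.4 172.30.25.6",
                          "172.30.25.0/24"))  # True
```

Paste the actual outputs of `svmips -b` and `hostips -b` into the two string arguments to run the same check against your cluster.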

Disabling Network Segmentation on ESXi and Hyper-V Clusters

Disabling the backplane network is a CLI-driven procedure that you perform on any one of the CVMs in the cluster. The change is propagated to the remaining CVMs.

About this task

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Disable the backplane network.
    • Run the following command to disable network segmentation on an ESXi or Hyper-V cluster:
      nutanix@cvm$ network_segmentation --backplane_network --disable

      Output similar to the following appears:

      Operation type : Disable
      Network type : kBackplane
      Params : {}
      Please enter [Y/y] to confirm or any other key to cancel the operation

      Type Y/y to confirm that you want to disable the backplane network.

      If you type Y/y, network segmentation is disabled and the cluster restarts in a rolling manner, one CVM at a time. If you type anything other than Y/y, network segmentation is not disabled.

      This method does not involve cluster downtime.

  3. Verify that network segmentation was successfully disabled. You can verify this in one of two ways:
    • Verify that the backplane is disabled.
      nutanix@cvm$ network_segment_status

      Output similar to the following is displayed:

      2017-11-23 06:18:23 INFO zookeeper_session.py:110 network_segment_status is attempting to connect to Zookeeper

      Network segmentation is disabled

    • Verify that the commands that show the backplane IP addresses of the CVMs and hosts list the management IP addresses. Run the svmips and hostips commands once without the -b option and once with the -b option, and then compare the IP addresses shown in the output.
      nutanix@cvm$ svmips
      192.127.3.2 192.127.3.3 192.127.3.4
      nutanix@cvm$ svmips -b
      192.127.3.2 192.127.3.3 192.127.3.4
      nutanix@cvm$ hostips
      192.127.3.5 192.127.3.6 192.127.3.7
      nutanix@cvm$ hostips -b
      192.127.3.5 192.127.3.6 192.127.3.7

      In the example above, the outputs of the svmips and hostips commands with and without the -b option are the same, indicating that backplane network segmentation is disabled.
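This output comparison can be expressed as a one-line rule: segmentation is disabled when the `-b` variants return the same address sets as the management commands. A sketch of the comparison (my own helper; the sample values are the ones shown above):

```python
def segmentation_disabled(svmips: str, svmips_b: str,
                          hostips: str, hostips_b: str) -> bool:
    """Backplane segmentation is disabled when the backplane (-b)
    addresses match the management addresses for CVMs and hosts."""
    return (set(svmips.split()) == set(svmips_b.split())
            and set(hostips.split()) == set(hostips_b.split()))

mgmt_cvm = "192.127.3.2 192.127.3.3 192.127.3.4"
mgmt_host = "192.127.3.5 192.127.3.6 192.127.3.7"
print(segmentation_disabled(mgmt_cvm, mgmt_cvm, mgmt_host, mgmt_host))  # True
```

If the function returns False, the backplane interfaces still carry their own addresses and segmentation is still active.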

Disabling Network Segmentation on an AHV Cluster

About this task

You perform backplane network reconfiguration procedure on any one of the CVMs in the cluster. The change propagates to the remaining CVMs.

Procedure

  1. Shut down all the guest VMs in the cluster from within the guest OS or use the Prism Element web console.
  2. Place all nodes of a cluster into the maintenance mode.
    1. Use SSH to log on to a Controller VM in the cluster.
    2. Determine the IP address of the node you want to put into the maintenance mode:
      nutanix@cvm$ acli host.list
      Note the value of Hypervisor IP for the node you want to put in the maintenance mode.
    3. Put the node into the maintenance mode:
      nutanix@cvm$ acli host.enter_maintenance_mode hypervisor-IP-address [wait="{ true | false }" ] [non_migratable_vm_action="{ acpi_shutdown | block }" ]
      Note: Never put the Controller VM or AHV host into maintenance mode on single-node clusters. Shut down user VMs before proceeding with disruptive changes.

      Replace hypervisor-IP-address with the IP address of the AHV host you want to put into the maintenance mode.

      The following are optional parameters for running the acli host.enter_maintenance_mode command:

      • wait
      • non_migratable_vm_action

      Do not continue if the host has failed to enter the maintenance mode.

    4. Verify that the host is in the maintenance mode:
      nutanix@cvm$ acli host.get host-ip

      In the output that is displayed, ensure that node_state equals EnteredMaintenanceMode and schedulable equals False .

  3. Disable backplane network segmentation from the Prism Web Console.
    1. Log on to the Prism web console, click the gear icon in the top-right corner, and then click Network Configuration under Settings .
    2. In the Internal Interfaces tab, in the Backplane LAN row, click Disable .
      Figure. Disable Network Configuration

    3. Click Yes to disable Backplane LAN.

      This involves a rolling reboot of CVMs to migrate the cluster services back to the external interface.

  4. Log on to a CVM in the cluster with SSH and stop Acropolis cluster-wide:
    nutanix@cvm$ allssh genesis stop acropolis 
  5. Restart Acropolis cluster-wide:
    nutanix@cvm$ cluster start 
  6. Remove all nodes from the maintenance mode.
    1. From any CVM in the cluster, run the following command to exit the AHV host from the maintenance mode:
      nutanix@cvm$ acli host.exit_maintenance_mode host-ip

      Replace host-ip with the IP address of the host.

      This command migrates (live migration) all the VMs that were previously running on the host back to the host.

    2. Verify that the host has exited the maintenance mode:
      nutanix@cvm$ acli host.get host-ip

      In the output that is displayed, ensure that node_state equals kAcropolisNormal or AcropolisNormal and schedulable equals True .

  7. Power on the guest VMs from the Prism Element web console.
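The maintenance-mode checks in steps 2 and 6 both come down to reading the node_state and schedulable fields from the `acli host.get` output. A small parser sketch follows; the sample text is a simplified, illustrative stand-in for the real acli output, not a verbatim transcript.

```python
def parse_host_state(acli_output: str) -> dict:
    """Extract key: value fields (such as node_state and schedulable)
    from acli host.get-style output."""
    fields = {}
    for line in acli_output.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

sample = """
node_state: kAcropolisNormal
schedulable: True
"""
state = parse_host_state(sample)
print(state["node_state"] in ("kAcropolisNormal", "AcropolisNormal")
      and state["schedulable"] == "True")  # True
```

The same parser can verify EnteredMaintenanceMode and schedulable False before you proceed with the disruptive part of the procedure.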

Service-Specific Traffic Isolation

Isolating the traffic associated with a specific service is a two-step process. The process is as follows:

  • Configure the networks and uplinks on each host manually. Prism creates only the VNIC that the service requires and places that VNIC on the bridge or port group that you specify. Therefore, you must manually create the bridge or port group on each host and add the required physical NICs as uplinks to that bridge or port group.
  • Configure network segmentation for the service by using Prism. Create an extra VNIC for the service, specify any additional parameters that are required (for example, IP address pools), and specify the bridge or port group that you want to dedicate to the service.

Isolating Service-Specific Traffic

Before you begin

  • Ensure that you configure each host as described in Configuring the Network on an AHV Host.
  • Review Prerequisites.

About this task

To isolate a service to a separate virtual network, do the following:

Procedure

  1. Log on to the Prism web console and click the gear icon at the top-right corner of the page.
  2. In the left pane, click Network Configuration .
  3. In the details pane, on the Internal Interfaces tab, click Create New Interface .
    The Create New Interface dialog box is displayed.
  4. On the Interface Details tab, do the following:
    1. Specify a descriptive name for the network segment.
    2. (On AHV) Optionally, in VLAN ID , specify a VLAN ID.
      Make sure that the VLAN ID is configured on the physical switch.
    3. In Bridge (on AHV) or CVM Port Group (on ESXi), select the bridge or port group that you created for the network segment.
    4. To specify an IP address pool for the network segment, click Create New IP Pool , and then, in the IP Pool dialog box, do the following:
      • In Name , specify a name for the pool.
      • In Netmask , specify the network mask for the pool.
      • Click Add an IP Range , and then specify the start and end IP addresses in the IP Range dialog box that is displayed.
      • Use Add an IP Range to add as many IP address ranges as you need.
        Note: Add at least n+1 IP addresses to the pool, where n is the number of nodes in the cluster.
      • Click Save .
      • Use Add an IP Pool to add more IP address pools. You can use only one IP address pool at any given time.
      • Select the IP address pool that you want to use, and then click Next .
        Note: You can also use an existing unused IP address pool.
  5. On the Feature Selection tab, do the following:
    You cannot enable network segmentation for multiple services at the same time. Complete the configuration for one service before you enable network segmentation for another service.
    1. Select the service whose traffic you want to isolate.
    2. Configure the settings for the selected service.
      The settings on this page depend on the service you select. For information about service-specific settings, see Service-Specific Settings and Configurations.
    3. Click Save .
  6. In the Create New Interface dialog box, click Save .
    The CVMs are rebooted multiple times, one after another. This procedure might trigger more tasks on the cluster. For example, if you configure network segmentation for disaster recovery, the firewall rules are added on the CVM to allow traffic on the specified ports through the new CVM interface and updated when a new recovery cluster is added or an existing cluster is modified.
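The n+1 sizing rule for the IP pool can be checked before you save the configuration. The following is a sketch using the standard `ipaddress` module (my own helpers, not part of Prism); range boundaries are inclusive, matching the start and end fields in the IP Range dialog box.

```python
import ipaddress

def pool_size(ranges) -> int:
    """Count addresses across inclusive (first, last) IP ranges."""
    total = 0
    for first, last in ranges:
        total += int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1
    return total

def pool_ok(ranges, nodes: int) -> bool:
    """A service IP pool needs at least n+1 addresses for n nodes."""
    return pool_size(ranges) >= nodes + 1

print(pool_ok([("10.10.10.10", "10.10.10.14")], nodes=4))  # True: 5 addresses >= 5
print(pool_ok([("10.10.10.10", "10.10.10.13")], nodes=4))  # False: 4 addresses < 5
```

If the check fails, use Add an IP Range to extend the pool before enabling the service.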

What to do next

See Service-Specific Settings and Configurations for any additional tasks that are required after you segment the network for a service.

Modifying Network Segmentation Configured for a Service

To modify network segmentation configured for a service, you must first disable network segmentation for that service and then create the network interface again for that service with the new IP address pool and VLAN.

About this task

For example, if the interface of the service you want to modify is ntnx0, after the reconfiguration, the same interface (ntnx0) is assigned to that service if that interface is not assigned to any other service. If ntnx0 is assigned to another service, a new interface (for example ntnx1) is created and assigned to that service.

Perform the following to reconfigure network segmentation configured for a service.

Procedure

  1. Disable the network segmentation configured for a service by following the instructions in Disabling Network Segmentation Configured for a Service.
  2. Create the network again by following the instructions in Isolating Service-Specific Traffic.

Disabling Network Segmentation Configured for a Service

To disable network segmentation configured for a service, you must disable the dedicated vNIC. Disabling network segmentation frees up the vNIC's name; the freed name is reused in a subsequent network segmentation configuration.

About this task

At the end of this procedure, the cluster performs a rolling restart. Disabling network segmentation might also disrupt the functioning of the associated service. To restore normal operations, you might have to perform other tasks immediately after the cluster has completed the rolling restart. For information about the follow-up tasks, see Service-Specific Settings and Configurations.

To disable the network segmentation configured for a service, do the following:

Procedure

  1. Log on to the Prism web console and click the gear icon at the top-right corner of the page.
  2. In the left pane, click Network Configuration .
  3. On the Internal Interfaces tab, for the interface that you want to disable, click Disable .
    Note: The defined IP address pool is available even after disabling the network segmentation.

Deleting a vNIC Configured for a Service

If you disable network segmentation for a service, the vNIC for that service is not deleted. AOS reuses the vNIC if you enable network segmentation again. However, you can manually delete a vNIC by logging into any CVM in the cluster with SSH.

Before you begin

Ensure that the following prerequisites are met before you delete the vNIC configured for a Service:
  • Disable the network segmentation configured for a service by following the instructions in Disabling Network Segmentation Configured for a Service.
  • Observe the limitation specified in the Limitation for vNIC Hot-Unplugging topic in the AHV Administration Guide .

About this task

Perform the following to delete a vNIC.

Procedure

  1. Log on to any CVM in the cluster with SSH.
  2. Delete the vNIC.
    nutanix@cvm$ network_segmentation --service_network --interface="interface-name" --delete

    Replace interface-name with the name of the interface you want to delete. For example, ntnx0.

Service-Specific Settings and Configurations

The following sections describe the settings required by the services that support network segmentation.

Nutanix Volumes

Network segmentation for Volumes also requires you to migrate iSCSI client connections to the new segmented network. If you no longer require segmentation for Volumes traffic, you must also migrate connections back to eth0 after disabling the vNIC used for Volumes traffic.

You can create two different networks for Nutanix Volumes with different IP pools, VLANs, and data services IP addresses. For example, you can create two iSCSI networks for production and non-production traffic on the same Nutanix cluster.

Follow the instructions in Isolating Service-Specific Traffic again to create the second network for Volumes after you create the first network.

Table 1. Settings to be Specified When Configuring Traffic Isolation
Parameter or Setting Description
Virtual IP (Optional) Virtual IP address for the service. If specified, the IP address must be picked from the specified IP address pool. If not specified, an IP address from the specified IP address pool is selected for you.
Client Subnet The network (in CIDR notation) that hosts the iSCSI clients. Required if the vNIC created for the service on the CVM is not on the same network as the clients.
Gateway Gateway to the subnetwork that hosts the iSCSI clients. Required if you specify the client subnet.
Migrating iSCSI Connections to the Segmented Network

After you enable network segmentation for Volumes, you must manually migrate connections from existing iSCSI clients to the newly segmented network.

Before you begin

Make sure that the task for enabling network segmentation for the service succeeds.

About this task

Note: Even though support is available to run iSCSI traffic on both the segmented and management networks at the same time, Nutanix recommends that you move the iSCSI traffic for guest VMs to the segmented network to achieve true isolation.

To migrate iSCSI connections to the segmented network, do the following:

Procedure

  1. Log out from all the clients connected to iSCSI targets that are using CVM eth0 or the Data Service IP address.
  2. Optionally, remove all the discovery records for the Data Services IP address (DSIP) on eth0.
  3. If the clients are allowlisted by their IP address, remove the client IP address that is on the management network from the allowlist, and then add the client IP address on the new network to the allowlist.
    nutanix@cvm$ acli vg.detach_external vg_name initiator_network_id=old_vm_IP
    nutanix@cvm$ acli vg.attach_external vg_name initiator_network_id=new_vm_IP
    

    Replace vg_name with the name of the volume group and old_vm_IP and new_vm_IP with the old and new client IP addresses, respectively.

  4. Discover the virtual IP address specified for Volumes.
  5. Connect to the iSCSI targets from the client.
Migrating Existing iSCSI Connections to the Management Network (Controller VM eth0)

About this task

To migrate existing iSCSI connections to eth0, do the following:

Procedure

  1. Log out from all the clients connected to iSCSI targets using the CVM vNIC dedicated to Volumes.
  2. Remove all the discovery records for the DSIP on the new interface.
  3. Discover the DSIP for eth0.
  4. Connect the clients to the iSCSI targets.
Disaster Recovery with Protection Domains

The settings for configuring network segmentation for disaster recovery apply to all Asynchronous, NearSync, and Metro Availability replication schedules. You can use disaster recovery with Asynchronous, NearSync, and Metro Availability replications only if both the primary site and the recovery site are configured with network segmentation. Before enabling or disabling network segmentation on a host, disable all the disaster recovery replication schedules running on that host.

Note: Network segmentation does not support disaster recovery with Leap.
Table 1. Settings to be Specified When Configuring Traffic Isolation
Parameter or Setting Description
Virtual IP (Optional) Virtual IP address for the service. If specified, the IP address must be picked from the specified IP address pool. If not specified, an IP address from the specified IP address pool is selected for you.
Note: Virtual IP address is different from the external IP address and the data services IP address of the cluster.
Gateway Gateway to the subnetwork.
Remote Site Configuration

After configuring network segmentation for disaster recovery, configure remote sites at both locations. You also need to reconfigure remote sites if you disable network segmentation.

For information about configuring remote sites, see Remote Site Configuration in the Data Protection and Recovery with Prism Element Guide.

Segmenting a Stretched Layer 2 Network for Disaster Recovery

A stretched Layer 2 network configuration allows the source and remote metro clusters to be in the same broadcast domain and communicate without a gateway.

About this task

You can enable network segmentation for disaster recovery on a stretched Layer 2 network that does not have a gateway. Such a network is usually configured across physically remote clusters, as in a metro availability deployment, and allows the source and remote clusters to be in the same broadcast domain without the usual gateway.

See AOS Release Notes for minimum AOS version required to configure a stretched Layer 2 network.

To configure a network segment as a stretched L2 network, do the following.

Procedure

Run the following command:
nutanix@cvm$ network_segmentation --service_network --service_name=kDR --ip_pool=DR-ip-pool-name --service_vlan=DR-vlan-id --desc_name=Description --host_physical_network=portgroup/bridge --stretched_metro

Replace the following: (See Isolating Service-Specific Traffic for the information)

  • DR-ip-pool-name with the name of the IP Pool created for the DR service or any existing unused IP address pool.
  • DR-vlan-id with the VLAN ID being used for the DR service.
  • Description with a suitable description of this stretched L2 network segment.
  • portgroup/bridge with the details of Bridge or CVM Port Group used for the DR service.

For more information about the network_segmentation command, see the Command Reference guide.

Configuring Backplane IP Pool

This procedure shows how to create an IP pool for backplane interfaces using the new CLI.

About this task

Network Segmentation for backplane traffic previously required an entire subnet even if a cluster has a small number of nodes. This resulted in inefficient use of IP addresses. The backplane IP pool feature enables you to provide a small IP pool instead of an entire subnet.

You can create an IP address pool using the new network_segmentation ip_pool command. The named IP pool includes one or more IP ranges. For example, an IP address from 172.16.1.100 to 172.16.1.105 can be one IP range and 172.16.1.120 to 172.16.1.125 can be another IP range within the same named IP pool and same IP subnet.

At present in the Prism interface, there is no option to create an IP address pool for backplane segmentation. However, the Prism interface allows creating small IP address pools for service-specific traffic such as Volumes and DR. You can use the new network_segmentation ip_pool CLI to create IP address pools for Backplane, Volumes, and DR as well. You can also manage (edit, delete, and update) IP address pools that are created for Backplane, Volumes and DR using the new CLI.

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Create a new IP pool and define the IP ranges:
    nutanix@cvm$ network_segmentation --ip_pool_name=IP-Pool-name --ip_pool_netmask=netmask 
    --ip_ranges="[('First-IP-Address', 'Last-IP-Address'), ('First-IP-Address', 'Last-IP-Address')]" 
    ip_pool create

    Replace:

    • IP-Pool-name with a user-defined IP pool name
    • netmask with a network mask in dotted-decimal notation
    • First-IP-Address with the first IP address in the range
    • Last-IP-Address with the last IP address in the same range

    For example:

    nutanix@cvm$ network_segmentation --ip_pool_name=BackplanePool --ip_pool_netmask=255.255.255.0 
    --ip_ranges="[('172.16.1.100', '172.16.1.105'), ('172.16.1.120', '172.16.1.125')]" 
    ip_pool create
    
  3. Enable network segmentation for the backplane using the new IP pool:
    nutanix@cvm$ network_segmentation --backplane_network --ip_pool=BackplanePool --backplane_vlan=1234
    --host_virtual_switch=vs1
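As noted above, all ranges in a named IP pool must fall in the same IP subnet (here, the /24 implied by netmask 255.255.255.0). The following is a quick validity check for the example pool, using the standard `ipaddress` module; it is my own helper, not part of the network_segmentation CLI.

```python
import ipaddress

def ranges_in_one_subnet(ranges, netmask: str) -> bool:
    """All (first, last) ranges must fall inside a single subnet
    derived from the first address and the given netmask."""
    first_ip = ranges[0][0]
    net = ipaddress.ip_network(f"{first_ip}/{netmask}", strict=False)
    return all(ipaddress.ip_address(ip) in net
               for pair in ranges for ip in pair)

# The two ranges from the example pool above, both in 172.16.1.0/24.
ranges = [("172.16.1.100", "172.16.1.105"), ("172.16.1.120", "172.16.1.125")]
print(ranges_in_one_subnet(ranges, "255.255.255.0"))  # True
```

A False result means one of the ranges strays outside the pool's subnet and the pool definition should be corrected before you enable backplane segmentation with it.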

Enabling Backplane Network Segmentation on a Mixed Hypervisor Cluster

This procedure shows how to enable Backplane Network Segmentation on a mixed hypervisor cluster.

About this task

You can enable Backplane Network Segmentation on a mixed hypervisor cluster containing:

  • ESXi nodes and AHV storage-only nodes.
  • Hyper-V nodes and AHV storage-only nodes.

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Enable network segmentation for backplane traffic

    On a cluster containing ESXi nodes and AHV storage-only nodes:

    nutanix@cvm$ network_segmentation --backplane_network 
    --ip_pool=IP-pool-name  
    --backplane_vlan=VLAN-ID  
    [--esx_host_physical_network=ESXi-host-portgroup-name ]
    [--esx_cvm_physical_network=ESXi-cvm-portgroup-name ]
    [--ahv_host_physical_network=AHV-network-name ]
    

    On a cluster containing Hyper-V nodes and AHV storage-only nodes:

    nutanix@cvm$ network_segmentation --backplane_network 
    --ip_pool=IP-pool-name  
    --backplane_vlan=VLAN-ID 
    [--hyperv_host_physical_network=HyperV-host-network-name ] 
    [--ahv_host_physical_network=AHV-network-name ]
    

    In the above command, replace:

    • IP-pool-name with a user-defined IP pool name

    • VLAN-ID with the backplane VLAN ID

    • ESXi-host-portgroup-name with the ESXi host port group name

    • ESXi-cvm-portgroup-name with the ESXi CVM port group name

    • AHV-network-name with the AHV storage-only node bridge name

    • HyperV-host-network-name with the Hyper-V switch name

    For example, enable network segmentation on a mixed hypervisor cluster containing ESXi nodes and AHV storage-only nodes:

    nutanix@cvm$ network_segmentation --backplane_network 
    --ip_pool=BackplanePool 
    --backplane_vlan=1234 
    --esx_host_physical_network=host-pg 
    --esx_cvm_physical_network=cvm-pg 
    --ahv_host_physical_network=br1

Updating Backplane Portgroup

You can update the backplane portgroups that are assigned to the CVM and host on each node. Previously, to change a portgroup assigned to a CVM and host, you had to disable network segmentation and re-enable it with new portgroups.

This feature is only supported on a cluster running ESXi hypervisor.

Updating backplane portgroups helps you to:

  • Move from one vSphere Standard Switch (VSS) portgroup to another VSS portgroup within the same virtual standard switch
  • Move from one VSS portgroup to another VSS portgroup in a different Virtual Standard Switch
  • Move from a VSS portgroup to a vSphere Distributed Switch (VDS) portgroup
  • Move from a VDS portgroup to a VSS portgroup
Note:

To rename existing VSS or VDS portgroups, perform the rename operation manually from the vCenter application or the ESXi CLI, and then run the update operation with the new portgroup name. This ensures that the configuration stored in the Nutanix internal database is up to date.

Limitations of Updating Backplane Portgroup

Consider the following limitations before updating backplane portgroups:

  • This feature is not supported on clusters running the AHV or Hyper-V hypervisors.
  • This feature does not support updating any other configuration, such as the VLAN ID or IP address.
Note: This feature does not perform any network validation on the new portgroups. Therefore, ensure that the portgroup settings are accurate before proceeding with the portgroup update operation. If the settings are not accurate, the CVM on that node might be unable to communicate with its peers, resulting in a stuck rolling reboot.

Updating Backplane Portgroup

This procedure shows how to update backplane portgroups:

About this task

You can update the backplane portgroups that are assigned to the CVMs and hosts.

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Update the CVM and host portgroups:
    nutanix@cvm$ network_segmentation --backplane_network 
    --host_physical_network=new-host-portgroup-name 
    --cvm_physical_network=new-cvm-portgroup-name
    --update 
    

    In the above command replace:

    • new-host-portgroup-name with the new host portgroup name
    • new-cvm-portgroup-name with the new CVM portgroup name

    For example:

    nutanix@cvm$ network_segmentation --backplane_network 
    --host_physical_network=new-bp-host-pgroup 
    --cvm_physical_network=new-bp-cvm-pgroup 
    --update
    

    To create a port group, see Creating Port Groups on the Distributed Switch in the vSphere Administration Guide for Acropolis.

IP Address Customization for each CVM and Host

This procedure shows how to configure custom IP addresses for each CVM and host for network segmentation.

About this task

The IP address customization feature enables you to manually allocate an IP address to each CVM and host. This helps maintain a correspondence between the external and segmented IP addresses. This feature is supported when configuring backplane segmentation, service-specific traffic isolation for Volumes, and service-specific traffic isolation for Disaster Recovery.

Procedure

  1. Create a JSON file with a mapping of IP addresses.

    You must manually define the mapping of the CVM external IP address to the new segmented IP address in a JSON file. The segmented IP addresses must belong to the IP address pool created from the Prism UI or the CLI before you start the network segmentation operation. You can create and save the JSON file on any CVM in the cluster.

    Here is an example of the JSON file format:

    • JSON file format for Backplane Segmentation:
      {
        "svmips": {
          `cvm_external_ip1`: `cvm_backplane_ip1`,
          `cvm_external_ip2`: `cvm_backplane_ip2`,
          `cvm_external_ip3`: `cvm_backplane_ip3`
        },
        "hostips": {
          `host_external_ip1`: `host_backplane_ip1`,
          `host_external_ip2`: `host_backplane_ip2`,
          `host_external_ip3`: `host_backplane_ip3`
        }
      }
      For example:
      {
        "svmips": {
          "10.47.240.141": "172.16.10.141",
          "10.47.240.142": "172.16.10.142",
          "10.47.240.143": "172.16.10.143"
        },
        "hostips": {
          "10.47.240.137": "172.16.10.137",
          "10.47.240.138": "172.16.10.138",
          "10.47.240.139": "172.16.10.139"
        }
      }
      
    • JSON file format for Service Segmentation:
      {
        "svmips": {
          `cvm_external_ip1`: `cvm_service_ip1`,
          `cvm_external_ip2`: `cvm_service_ip2`,
          `cvm_external_ip3`: `cvm_service_ip3`
          }
      }
      For example:
      {
        "svmips": {
          "10.47.240.141": "10.47.6.141",
          "10.47.240.142": "10.47.6.142",
          "10.47.240.143": "10.47.6.143"
        }
      }
      
  2. Using SSH, log on to the CVM in the cluster where the JSON file exists.
  3. Enable service-specific traffic isolation for Volumes using the JSON file:
    nutanix@CVM:~$ network_segmentation --service_network 
    --ip_pool=pool1 --desc_name="Volumes Seg 1" 
    --service_name=kVolumes 
    --host_physical_network=dv-volumes-network-1 
    --service_vlan=151 
    --ip_map_filepath=/home/nutanix/ip_map.json
    

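The IP-mapping file from step 1 can also be generated and sanity-checked with a short script. The following is a minimal sketch, assuming the backplane pool is backed by a single subnet; the function name, sample addresses, and subnet are illustrative, not part of any Nutanix tooling:

```python
import ipaddress
import json

def build_ip_map(cvm_map, host_map, backplane_subnet):
    """Build the JSON mapping of external to segmented IP addresses.

    cvm_map / host_map: dicts of external IP -> segmented IP.
    backplane_subnet: the subnet backing the backplane IP pool (assumption:
    the pool maps to one subnet).
    """
    subnet = ipaddress.ip_network(backplane_subnet)
    segmented = list(cvm_map.values()) + list(host_map.values())
    # Every segmented address must be unique and inside the pool subnet.
    if len(set(segmented)) != len(segmented):
        raise ValueError("duplicate segmented IP address")
    for ip in segmented:
        if ipaddress.ip_address(ip) not in subnet:
            raise ValueError(f"{ip} is outside {backplane_subnet}")
    return json.dumps({"svmips": cvm_map, "hostips": host_map}, indent=2)

# Sample values taken from the documentation example above.
doc = build_ip_map(
    {"10.47.240.141": "172.16.10.141", "10.47.240.142": "172.16.10.142"},
    {"10.47.240.137": "172.16.10.137", "10.47.240.138": "172.16.10.138"},
    "172.16.10.0/24",
)
print(doc)
```

Save the printed output as the JSON file (for example, /home/nutanix/ip_map.json) before running the network_segmentation command.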
Enabling Physical Backplane Segmentation on Hyper-V Using CLI

This procedure shows how to enable physical backplane segmentation on a cluster containing Hyper-V nodes by using the CLI.

About this task

Physical backplane segmentation support is now available on clusters containing Hyper-V nodes. Previously, support was available only on AHV and ESXi nodes.

Procedure

  1. Log on to any CVM in the cluster using SSH.
  2. Enable backplane segmentation on a Hyper-V node:
    nutanix@CVM:~$ network_segmentation --backplane_network
    --ip_pool=IP-Pool-name 
    --backplane_vlan=VLAN-ID 
    --host_physical_network=hyperv_host_physical_network

    In the above command replace:

    • IP-Pool-name with a user defined IP pool name

    • VLAN-ID with a backplane VLAN ID

    • hyperv_host_physical_network with the Hyper-V switch name

    For example:

    nutanix@CVM:~$ network_segmentation --backplane_network 
    --ip_pool=BackplanePool 
    --backplane_vlan=1234 
    --host_physical_network=BackplaneSwitch

Network Segmentation during Cluster Expansion

When expanding a cluster:

  • If you enable backplane network segmentation, Prism allocates two IP addresses for every new node from the backplane IP pool.
  • If you enable service-specific traffic isolation, Prism allocates one IP address for every new node from the respective (Volumes or DR) IP pools.
  • If not enough IP addresses are available in the specified network, the Prism Element web console displays a failure message on the Tasks page. To add more IP ranges to the IP pool, see Configuring Backplane IP Pool.
  • If you cannot add more IPs to the IP pool, then reconfigure that specific network segmentation. For more information about how to reconfigure the network, see Reconfiguring the Backplane Network.
  • The network settings on the physical switch to which the new nodes are connected must be identical to the other nodes in the cluster. New nodes communicate with current nodes using the same VLAN ID for segmented networks. Otherwise, the expand cluster task will fail in the network validation stage.
  • After fulfilling the earlier points, you can add nodes to the cluster. For instructions about how to add nodes to your Nutanix cluster, see Expanding a Cluster in the Prism Web Console Guide .
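Based on the allocation rules above, you can estimate how many free addresses each pool needs before expanding a cluster. The following is a minimal sketch (an illustrative helper, not a Nutanix tool; the per-node counts come from the bullets above):

```python
def addresses_needed(new_nodes, backplane=True, volumes=False, dr=False):
    """Estimate IP addresses consumed per pool when adding nodes.

    Backplane segmentation uses two addresses per node (CVM and host);
    each service-specific isolation (Volumes, DR) uses one per node.
    """
    need = {}
    if backplane:
        need["backplane"] = 2 * new_nodes
    if volumes:
        need["volumes"] = new_nodes
    if dr:
        need["dr"] = new_nodes
    return need

# A 3-node expansion with backplane and Volumes segmentation enabled.
print(addresses_needed(3, backplane=True, volumes=True))
```

If a pool has fewer free addresses than the estimate, extend the pool (or reconfigure the network) before adding the nodes.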

Network Segmentation–Related Changes During an AOS Upgrade

When you upgrade from an AOS version which does not support network segmentation to an AOS version that does, the eth2 interface (used to segregate backplane traffic) is automatically created on each CVM. However, the network remains unsegmented, and the cluster services on the CVM continue to use eth0 until you configure network segmentation.

The vNICs ntnx0, ntnx1, and so on, are not created during an upgrade to a release that supports service-specific traffic isolation. They are created when you configure traffic isolation for a service.

Note:

Do not delete the eth2 interface that is created on the Controller VMs, even if you are not using the network segmentation feature.

Firewall Requirements

Ports and Protocols describes detailed port information (like protocol, service description, source, destination, and associated service) for Nutanix products and services. It includes port and protocol information for 1-click upgrades and LCM updates.

Log management

This chapter describes how to configure the cluster-wide setting for log forwarding and how to document the log fingerprint.

Log Forwarding

The Nutanix Controller VM provides a method for log integrity by using a cluster-wide setting to forward all logs to a central log host. Due to the appliance form factor of the Controller VM, system and audit logs do not support local log retention periods, and a significant increase in log traffic could be used to orchestrate a distributed denial-of-service (DDoS) attack.

Nutanix recommends deploying a central log host in the management enclave to adhere to any compliance or internal policy requirements for log retention. In the event of a system compromise, a central log host serves as a defense mechanism that preserves log integrity.

Note: The audit daemon in the Controller VM uses the audisp plugin by default to ship all audit logs to the rsyslog daemon (stored in /home/log/messages ). Searching for audispd on the central log host returns the entire content of the audit logs from the Controller VM. The audit daemon is configured with a rules engine that adheres to the auditing requirements of the Operating System Security Requirements Guide (OS SRG), and is embedded as part of the Controller VM STIG.

Use the nCLI to enable forwarding of system, audit, aide, and SCMA logs of all the Controller nodes in a cluster at the required log level. For more information, see Send Logs to Remote Syslog Server in the Acropolis Advanced Administration Guide.

Documenting the Log Fingerprint

For forensic analysis, non-repudiation is established by verifying the fingerprint of the public key for the log file entry.

Procedure

  1. Log in to the CVM.
  2. Run the following command to document the fingerprint of each public key assigned to an individual admin:
    nutanix@cvm$ ssh-keygen -lf /<location of>/id_rsa.pub

    The fingerprint can then be compared with the SSH daemon log entries ( /home/log/secure on the Controller VM), which are forwarded to the central log host.

    Note: After you include the SSH public key in Prism and verify connectivity, disable password authentication for all the Controller VMs and AHV hosts. From the gear icon drop-down list in the Prism main menu, select Cluster Lockdown and clear the Enable Remote Login with password check box.
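For reference, the SHA256 fingerprint that ssh-keygen -lf prints is the base64-encoded SHA-256 digest of the raw public-key blob. The following Python sketch reproduces that computation for any OpenSSH public-key line; the key material shown is a made-up placeholder, not a real key:

```python
import base64
import hashlib

def ssh_fingerprint(pubkey_line):
    """Compute the SHA256 fingerprint of an OpenSSH public-key line,
    in the same format that `ssh-keygen -lf` prints."""
    # An OpenSSH public-key line looks like: "<type> <base64-blob> [comment]".
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    # ssh-keygen prints the digest base64-encoded without '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")

# Placeholder key material for illustration only -- not a usable key.
fp = ssh_fingerprint("ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQC7 admin@example")
print(fp)
```

Running the function against the real id_rsa.pub yields the same fingerprint string that appears in the SSH daemon log entries.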

Security Management Using Prism Central (PC)

Prism Central provides several mechanisms and features to enforce security of your multi-cluster environment.

If you enable Identity and Access Management (IAM), see Security Management Using Identity and Access Management (Prism Central).

Configuring Authentication

Caution: Prism Central does not support the SSLv2 and SSLv3 ciphers. Therefore, disable the SSLv2 and SSLv3 options in the browser before accessing Prism Central; this avoids SSL fallback and access-denial situations. However, you must enable the TLS protocol in the browser.

Prism Central supports these user authentication options:

  • SAML authentication. Users can authenticate through a supported identity provider when SAML support is enabled for Prism Central. The Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between two parties: an identity provider (IDP) and Prism Central as the service provider.

    If you do not enable Nutanix Identity and Access Management (IAM) on Prism Central, ADFS is the only supported IDP for Single Sign-on. If you enable IAM, additional IDPs are available. For more information, see Security Management Using Identity and Access Management (Prism Central) and Updating ADFS When Using SAML Authentication.

  • Local user authentication. Users can authenticate if they have a local Prism Central account. For more information, see Managing Local User Accounts .
  • Active Directory authentication. Users can authenticate using their Active Directory (or OpenLDAP) credentials when Active Directory support is enabled for Prism Central.

Adding An Authentication Directory (Prism Central)

Before you begin

Caution: Prism Central does not allow the use of the (not secure) SSLv2 and SSLv3 ciphers. To eliminate the possibility of an SSL Fallback situation and denied access to Prism Central, disable (uncheck) SSLv2 and SSLv3 in any browser used for access. However, TLS must be enabled (checked).

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.

    The Authentication Configuration window appears.

    Figure. Authentication Configuration Window

  2. To add an authentication directory, click the New Directory button.

    A set of fields is displayed. Do the following in the indicated fields:

    1. Directory Type : Select one of the following from the pull-down list.
      • Active Directory : Active Directory (AD) is a directory service implemented by Microsoft for Windows domain networks.
        Note:
        • Users with the "User must change password at next logon" attribute enabled cannot authenticate to Prism Central. Ensure that such users first log in to a domain workstation and change their password before accessing Prism Central. Also, if SSL is enabled on the Active Directory server, make sure that Nutanix has access to that port (open it in the firewall).
        • Use of the "Protected Users" group is currently unsupported for Prism authentication. For more details on the "Protected Users" group, see “Guidance about how to configure protected accounts” on Microsoft documentation website.
        • An Active Directory user name or group name containing spaces is not supported for Prism Central authentication.
        • The Microsoft AD is LDAP v2 and LDAP v3 compliant.
        • The Microsoft AD servers supported are Windows Server 2012 R2, Windows Server 2016, and Windows Server 2019.
      • OpenLDAP : OpenLDAP is a free, open source directory service, which uses the Lightweight Directory Access Protocol (LDAP), developed by the OpenLDAP project.
        Note: Prism Central uses a service account to query OpenLDAP directories for user information and does not currently support certificate-based authentication with the OpenLDAP directory.
    2. Name : Enter a directory name.

      This is a name you choose to identify this entry; it need not be the name of an actual directory.

    3. Domain : Enter the domain name.

      Enter the domain name in DNS format, for example, nutanix.com .

    4. Directory URL : Enter the URL address to the directory.

      The URL format is as follows for an LDAP entry: ldap:// host : ldap_port_num . The host value is either the IP address or fully qualified domain name. (In some environments, a simple domain name is sufficient.) The default LDAP port number is 389. Nutanix also supports LDAPS (port 636) and LDAP/S Global Catalog (ports 3268 and 3269). The following are example configurations appropriate for each port option:

      Note: LDAPS support does not require custom certificates or certificate trust import.
      • Port 389 (LDAP). Use this port number (in the following URL form) when the configuration is single domain, single forest, and not using SSL.
        ldap://ad_server.mycompany.com:389
      • Port 636 (LDAPS). Use this port number (in the following URL form) when the configuration is single domain, single forest, and using SSL. This requires all Active Directory Domain Controllers have properly installed SSL certificates.
        ldaps://ad_server.mycompany.com:636
      • Port 3268 (LDAP - GC). Use this port number when the configuration is multiple domain, single forest, and not using SSL.
      • Port 3269 (LDAPS - GC). Use this port number when the configuration is multiple domain, single forest, and using SSL.
        Note:
        • When constructing your LDAP/S URL to use a Global Catalog server, ensure that the Domain Control IP address or name being used is a global catalog server within the domain being configured. If not, queries over 3268/3269 may fail.
        • Cross-forest trust between multiple AD forests is not supported.

      For the complete list of required ports, see Port Reference.
    5. [OpenLDAP only] Configure the following additional fields:
      Note:

      The value for the following variables depend on your OpenLDAP configuration.

      • User Object Class : Enter the value that uniquely identifies the object class of a user.
      • User Search Base : Enter the base domain name in which the users are configured.
      • Username Attribute : Enter the attribute to uniquely identify a user.
      • Group Object Class : Enter the value that uniquely identifies the object class of a group.
      • Group Search Base : Enter the base domain name in which the groups are configured.
      • Group Member Attribute : Enter the attribute that identifies users in a group.
      • Group Member Attribute Value : Enter the attribute that identifies the users provided as value for Group Member Attribute .

      Here are some of the possible options for the fields:

      • User Object Class: user | person | inetOrgPerson | organizationalPerson | posixAccount
      • User Search Base: ou=<organizational unit>, dc=<domain>
      • Username Attribute: uid
      • Group Object Class: posixGroup | groupOfNames
      • Group Search Base: ou=<organizational unit>, dc=<domain>
      • Group Member Attribute: member | memberUid
      • Group Member Attribute Value: uid
    6. Search Type . How to search your directory when authenticating. Choose Non Recursive if you experience slow directory logon performance. For this option, ensure that users listed in Role Mapping are listed flatly in the group (that is, not nested). Otherwise, choose the default Recursive option.
    7. Service Account Username : Depending on the Directory Type that you selected in step 2.a, the service account user name format is as follows:
      • For Active Directory , enter the service account user name in the user_name@domain.com format.
      • For OpenLDAP , enter the service account user name in the following Distinguished Name (DN) format:

        cn=username, dc=company, dc=com

        A service account is created to run only a particular service or application with the credentials specified for the account. According to the requirement of the service or application, the administrator can limit access to the service account.

        A service account resides under Managed Service Accounts in the Active Directory or OpenLDAP server. An application or service uses the service account to interact with the operating system. Enter your Active Directory or OpenLDAP service account credentials in this (username) field and the following (password) field.

        Note: Be sure to update the service account credentials here whenever the service account password changes or when a different service account is used.
    8. Service Account Password : Enter the service account password.
    9. When all the fields are correct, click the Save button (lower right).

      This saves the configuration and redisplays the Authentication Configuration dialog box. The configured directory now appears in the Directory List tab.

    10. Repeat this step for each authentication directory you want to add.
    Note:
    • No permissions are granted to the directory users by default. To grant permissions to the directory users, you must specify roles for the users in that directory (see Configuring Role Mapping).
    • The service account for both Active Directory and OpenLDAP must have full read permission on the directory service.
    Figure. Directory List Fields

  3. To edit a directory entry, click the pencil icon for that entry.

    After clicking the pencil icon, the relevant fields reappear. Enter the new information in the appropriate fields and then click the Save button.

  4. To delete a directory entry, click the X icon for that entry.

    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.
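The Directory URL port guidance above reduces to a small decision table (domain scope and SSL use determine the port). The following is a minimal sketch of that table; the function names are illustrative, and only the four documented port options are covered:

```python
def ldap_port(multi_domain, use_ssl):
    """Pick the LDAP port per the Directory URL guidance:
    389 (LDAP), 636 (LDAPS), 3268 (LDAP Global Catalog),
    3269 (LDAPS Global Catalog)."""
    if multi_domain:
        return 3269 if use_ssl else 3268
    return 636 if use_ssl else 389

def directory_url(host, multi_domain=False, use_ssl=False):
    """Build the Directory URL string for the given host and options."""
    scheme = "ldaps" if use_ssl else "ldap"
    return f"{scheme}://{host}:{ldap_port(multi_domain, use_ssl)}"

# Single domain, single forest, no SSL -> the port 389 form shown above.
print(directory_url("ad_server.mycompany.com"))
```

Remember that the Global Catalog ports (3268/3269) require the target host to actually be a global catalog server in the configured domain.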

Adding a SAML-based Identity Provider

Before you begin

  • An identity provider (typically a server or other computer) is the system that provides authentication through a SAML request. There are various implementations that can provide authentication services in line with the SAML standard.
  • You can specify other tested standard-compliant IDPs in addition to ADFS. See the Prism Central release notes topic Identity and Access Management Software Support for specific support requirements and also Security Management Using Identity and Access Management (Prism Central).

    IAM allows only one identity provider at a time, so if you already configured one, the + New IDP link does not appear.

  • You must configure the identity provider to return the NameID attribute in SAML response. Prism Central uses the NameID attribute for role mapping.

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. To add a SAML-based identity provider, click the + New IDP link.

    A set of fields is displayed. Do the following in the indicated fields:

    1. Configuration name : Enter a name for the identity provider. This name appears in the logon authentication screen.
    2. Group Attribute Name (Optional) : Optionally, enter the group attribute name such as groups . Ensure that this name matches the group attribute name provided in the IDP configuration.
    3. Group Attribute Delimiter (Optional) : Optionally, enter a delimiter that needs to be used when multiple groups are selected for the Group attribute.
    4. Import Metadata : Click this option to upload a metadata file that contains the identity provider information.

      Identity providers typically provide an XML file on their website that includes metadata about that identity provider, which you can download from that site and then upload to Prism Central. Click + Import Metadata to open a search window on your local system and then select the target XML file that you downloaded previously. Click the Save button to save the configuration.

      Figure. Identity Provider Fields (metadata configuration)

    This step completes configuring an identity provider in Prism Central, but you must also configure the callback URL for Prism Central on the identity provider. To configure the callback URL, click the Download Metadata link just below the Identity Providers table to download an XML file that describes Prism Central and then upload this metadata file to the identity provider.
  3. To edit an identity provider entry, click the pencil icon for that entry.

    After clicking the pencil icon, the relevant fields reappear. Enter the new information in the appropriate fields and then click the Save button.

  4. To delete an identity provider entry, click the X icon for that entry.

    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.

Enabling and Configuring Client Authentication

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. Click the Client tab, then do the following steps.
    1. Select the Configure Client Chain Certificate check box.

      The Client Chain Certificate is a list of certificates that includes all intermediate CA and root-CA certificates.

    2. Click the Choose File button, browse to and select a client chain certificate to upload, and then click the Open button to upload the certificate.
      Note:
      • Client and CAC authentication only supports RSA 2048-bit certificates.
      • Uploaded certificate files must be PEM encoded. The web console restarts after the upload step.
    3. To enable client authentication, click Enable Client Authentication .
    4. To modify client authentication, do one of the following:
      Note: The web console restarts when you change these settings.
      • Click Enable Client Authentication to disable client authentication.
      • Click Remove to delete the current certificate. (This also disables client authentication.)
      • To enable OCSP or CRL based certificate revocation checking, see Certificate Revocation Checking.

    Client authentication allows you to securely access Prism by exchanging a digital certificate. Prism validates that the certificate is signed by your organization’s trusted signing certificate.

    Client authentication ensures that the Nutanix cluster gets a valid certificate from the user. Normally, a one-way authentication process occurs where the server provides a certificate so the user can verify the authenticity of the server (see Installing an SSL Certificate). When client authentication is enabled, this becomes a two-way authentication where the server also verifies the authenticity of the user. A user must provide a valid certificate when accessing the console either by installing the certificate on the local machine or by providing it through a smart card reader.
    Note: The CA must be the same for both the client chain certificate and the certificate on the local machine or smart card.
  3. To specify a service account that the Prism Central web console can use to log in to Active Directory and authenticate Common Access Card (CAC) users, select the Configure Service Account check box, and then do the following in the indicated fields:
    1. Directory : Select the authentication directory that contains the CAC users that you want to authenticate.
      This list includes the directories that are configured on the Directory List tab.
    2. Service Username : Enter the user name in the user name@domain.com format that you want the web console to use to log in to the Active Directory.
    3. Service Password : Enter the password for the service user name.
    4. Click Enable CAC Authentication .
      Note: For federal customers only.
      Note: The Prism Central console restarts after you change this setting.

    The Common Access Card (CAC) is a smart card about the size of a credit card, which some organizations use to access their systems. After you insert the CAC into the CAC reader connected to your system, the software in the reader prompts you to enter a PIN. After you enter a valid PIN, the software extracts your personal certificate that represents you and forwards the certificate to the server using the HTTP protocol.

    Nutanix Prism verifies the certificate as follows:

    • Validates that the certificate has been signed by your organization’s trusted signing certificate.
    • Extracts the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and uses the EDIPI to check the validity of an account within the Active Directory. The security context from the EDIPI is used for your Prism session.
    • Prism Central supports both certificate authentication and basic authentication so that it can handle Prism Central login using a certificate while still allowing the REST API to use basic authentication (the REST API cannot use CAC certificates). With this behavior, if a certificate is present during Prism Central login, certificate authentication is used; if not, basic authentication is enforced.
    Note: Nutanix Prism does not support OpenLDAP as directory service for CAC.
    If you map a Prism Central role to a CAC user rather than to an Active Directory group or organizational unit to which the user belongs, specify the EDIPI (User Principal Name, or UPN) of that user in the role mapping. A user who presents a CAC with a valid certificate is mapped to a role and taken directly to the web console home page; the web console login page is not displayed.
    Note: If you have logged on to Prism Central by using CAC authentication, to successfully log out of Prism Central, close the browser after you click Log Out .

Certificate Revocation Checking

Enabling Certificate Revocation Checking using Online Certificate Status Protocol (nCLI)

About this task

OCSP is the recommended method for checking certificate revocation in client authentication. You can enable certificate revocation checking using the OCSP method through the command line interface (nCLI).

To enable certificate revocation checking using OCSP for client authentication, do the following.

Procedure

  1. Set the OCSP responder URL.
    ncli authconfig set-certificate-revocation set-ocsp-responder=<ocsp url>

    <ocsp url> indicates the location of the OCSP responder.
  2. Verify that OCSP checking is enabled.
    ncli authconfig get-client-authentication-config

    The expected output if certificate revocation checking is enabled successfully is as follows.

    Auth Config Status: true
    File Name: ca.cert.pem
    OCSP Responder URI: http://<ocsp-responder-url>

Enabling Certificate Revocation Checking using Certificate Revocation Lists (nCLI)

About this task

Note: OCSP is the recommended method for checking certificate revocation in client authentication.

You can use the CRL certificate revocation checking method if required, as described in this section.

To enable certificate revocation checking using CRL for client authentication, do the following.

Procedure

Specify all the CRLs that are required for certificate validation.
ncli authconfig set-certificate-revocation set-crl-uri=<uri 1>,<uri 2> set-crl-refresh-interval=<refresh interval in seconds> set-crl-expiration-interval=<expiration interval in seconds>
  • The above command resets any previous OCSP or CRL configurations.
  • The URIs must be percent-encoded and comma separated.
  • The CRLs are updated periodically as specified by the crl-refresh-interval value. This interval is common for the entire list of CRL distribution points. The default value for this is 86400 seconds (1 day).
  • The periodically refreshed CRLs are cached in memory for the duration specified by the set-crl-expiration-interval value and expire after that duration if a particular CRL distribution point is unreachable. This duration applies to the entire list of CRL distribution points. The default value is 604800 seconds (7 days).
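Since the URIs passed to set-crl-uri must be percent-encoded and comma separated, a small helper can prepare the argument. The following is a sketch, assuming the entire URI (including : and /) is percent-encoded so that the commas separating list entries are unambiguous; the URLs shown are placeholders:

```python
from urllib.parse import quote

def crl_uri_argument(uris):
    """Percent-encode each CRL distribution point and join them with
    commas, for use as the set-crl-uri value."""
    # safe="" encodes every reserved character, including ':' and '/',
    # so nothing in a URI can be mistaken for the comma list separator.
    return ",".join(quote(u, safe="") for u in uris)

# Placeholder CRL distribution points for illustration only.
arg = crl_uri_argument([
    "http://pki.example.com/crl/root.crl",
    "http://pki.example.com/crl/issuing.crl",
])
print(arg)
```

Pass the resulting string as the set-crl-uri value in the ncli command above.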

User Management

Managing Local User Accounts

About this task

The Prism Central admin user is created automatically, but you can add more (locally defined) users as needed. To add, update, or delete a user account, do the following:

Note:
  • To add user accounts through Active Directory, see Configuring Authentication. If you enable the Prism Self Service feature, an Active Directory is assigned as part of that process.
  • Changing the Prism Central admin user password does not impact registration (re-registering clusters is not required).

Procedure

  • Click the gear icon in the main menu and then select Local User Management in the Settings page.

    The Local User Management dialog box appears.

    Figure. User Management Window

  • To add a user account, click the New User button and do the following in the displayed fields:
    1. Username : Enter a user name.
    2. First Name : Enter a first name.
    3. Last Name : Enter a last name.
    4. Email : Enter a valid user email address.
    5. Password : Enter a password (maximum of 255 characters).
      Note: A second field to verify the password is not included, so be sure to enter the password correctly in this field.
    6. Language : Select the language setting for the user.

      English is selected by default. You have an option to select Simplified Chinese or Japanese . If you select either of these, the cluster locale is updated for the new user. For example, if you select Simplified Chinese , the user interface is displayed in Simplified Chinese when the new user logs in.

    7. Roles : Assign a role to this user.

      There are three options:

      • Checking the User Admin box allows the user to view information, perform any administrative task, and create or modify user accounts.
      • Checking the Prism Central Admin (formerly "Cluster Admin") box allows the user to view information and perform any administrative task, but it does not provide permission to manage (create or modify) other user accounts.
      • Leaving both boxes unchecked allows the user to view information, but it does not provide permission to perform any administrative tasks or manage other user accounts.
    8. When all the fields are correct, click the Save button (lower right).

      This saves the configuration and redisplays the dialog box with the new user appearing in the list.

    Figure. Create User Window

  • To modify a user account, click the pencil icon for that user and update one or more of the values as desired in the Update User window.
    Figure. Update User Window

  • To disable login access for a user account, click the Yes value in the Enabled field for that user; to enable the account, click the No value.

    A Yes value means the login is enabled; a No value means it is disabled. A user account is enabled (login access activated) by default.

  • To delete a user account, click the X icon for that user.
    A window prompt appears to verify the action; click the OK button. The user account is removed and the user no longer appears in the list.

Updating My Account

About this task

To update your account credentials (that is, credentials for the user you are currently logged in as), do the following:

Procedure

  1. To update your password, select Change Password from the user icon pull-down list of the main menu.
    The Change Password dialog box appears. Do the following in the indicated fields:
    1. Current Password : Enter the current password.
    2. New Password : Enter a new password.
    3. Confirm Password : Re-enter the new password.
    4. When the fields are correct, click the Save button (lower right). This saves the new password and closes the window.
    Note: Password complexity requirements might appear above the fields; if they do, your new password must comply with these rules.
    Figure. Change Password Window

  2. To update other details of your account, select Update Profile from the user icon pull-down list.
    The Update Profile dialog box appears. Do the following in the indicated fields for any parameters you want to change:
    1. First Name : Enter a different first name.
    2. Last Name : Enter a different last name.
    3. Email Address : Enter a different valid user email address.
    4. Language : Select a different language for your account from the pull-down list.
    5. API Key : Enter a new API key.
      Note: Your keys can be managed from the API Keys page on the Nutanix support portal (see Licensing). Your connection is secure without the optional public key (following field); the public key option is provided in case your default public key expires.
    6. Public Key : Click the Choose File button to upload a new public key file.
    7. When all the fields are correct, click the Save button (lower right). This saves the changes and closes the window.
    Figure. Update Profile Window

Resetting Password (CLI)

This procedure describes how to reset a local user's password for the Prism Element or Prism Central web console.

About this task

To reset the password using nCLI, do the following:

Note:

Only a user with admin privileges can reset a password for other users.

Procedure

  1. Access the CVM via SSH.
  2. Log in with the admin credentials.
  3. Use the ncli user reset-password command and specify the username and password of the user whose password is to be reset:
    nutanix@cvm$ ncli user reset-password user-name=xxxxx password=yyyyy
    
    • Replace user-name=xxxxx with the name of the user whose password is to be reset.

    • Replace password=yyyyy with the new password.
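The command above can be assembled programmatically, for example from a helper script run on the CVM. The sketch below is illustrative only; the username "jsmith" and password are hypothetical placeholders:

```python
import shlex

# Illustrative sketch only: assemble the nCLI invocation shown above.
# "jsmith" and the password are hypothetical placeholders.
def build_reset_command(user_name: str, password: str) -> str:
    # Quote values so shell metacharacters in the password survive.
    return ("ncli user reset-password "
            f"user-name={shlex.quote(user_name)} "
            f"password={shlex.quote(password)}")

print(build_reset_command("jsmith", "N3wPassw0rd"))
# ncli user reset-password user-name=jsmith password=N3wPassw0rd
```

Quoting matters here because complex passwords often contain characters the shell would otherwise interpret.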

What to do next

You can relaunch the Prism Element or the Prism Central web console and verify the new password setting.

Deleting a Directory User Account

About this task

To delete a directory-authenticated user, do the following:

Procedure

  1. Click the Hamburger icon, and go to Administration > Projects.
    The Projects page appears. This page lists all existing projects.
  2. Select the project that the user is associated with and go to Actions > Update Projects.
    The Edit Projects page appears.
  3. Go to the Users, Groups, Roles tab.
  4. Click the X icon to delete the user.
    Figure. Edit Project Window

  5. Click Save

    Prism deletes the user account and also removes the user from any associated projects.

    Repeat the same steps if the user is associated with multiple projects.

Controlling User Access (RBAC)

Prism Central supports role-based access control (RBAC) that you can configure to provide customized access permissions for users based on their assigned roles. The roles dashboard allows you to view information about all defined roles and the users and groups assigned to those roles.

  • Prism Central includes a set of predefined roles (see Built-in Role Management).
  • You can also define additional custom roles (see Custom Role Management).
  • Configuring authentication confers default user permissions that vary depending on the type of authentication (full permissions from a directory service or no permissions from an identity provider). You can configure role maps to customize these user permissions (see Configuring Role Mapping).
  • You can refine access permissions even further by assigning roles to individual users or groups that apply to a specified set of entities (see Assigning a Role).
    Note: The entities are treated as separate instances. For example, to grant a user or a group permission to manage both clusters and images, an administrator must add both of these entities to the list of assignments.
  • With RBAC, user roles do not depend on the project membership. You can use RBAC and log in to Prism Central even without a project membership.
Note: Defining custom roles and assigning roles are supported on AHV only.

Built-in Role Management

The following built-in roles are defined by default. You can see a more detailed list of permissions for any of the built-in roles through the details view for that role (see Displaying Role Permissions). The Project Admin, Developer, Consumer, and Operator roles are available when assigning roles in a project.

Role Privileges
Super Admin Full administrator privileges
Prism Admin Full administrator privileges except for creating or modifying the user accounts
Prism Viewer View-only privileges
Self-Service Admin Manages all cloud-oriented resources and services
Note: This is the only cloud administration role available.
Project Admin Manages cloud objects (roles, VMs, Apps, Marketplace) belonging to a project
Note: You can specify a role for a user when you assign a user to a project, so individual users or groups can have different roles in the same project.
Developer Develops, troubleshoots, and tests applications in a project
Consumer Accesses the applications and blueprints in a project
Operator Accesses the applications in a project
VPC Admin Manages VPCs and related entities across a Nutanix deployment, independent of the physical network and infrastructure.
Note: Previously, the Super Admin role was called User Admin , the Prism Admin role was called Prism Central Admin and Cluster Admin , and the Prism Viewer was called Viewer .

Custom Role Management

If the built-in roles are not sufficient for your needs, you can create one or more custom roles (AHV only).

Creating a Custom Role

About this task

To create a custom role, do the following:

Procedure

  1. Go to the roles dashboard (select Administration > Roles in the pull-down menu) and click the Create Role button.

    The Roles page appears. See Custom Role Permissions for a list of the permissions available for each custom role option.

  2. In the Roles page, do the following in the indicated fields:
    1. Role Name : Enter a name for the new role.
    2. Description (optional): Enter a description of the role.
      Note: All entity types are listed by default, but you can display just a subset by entering a string in the Filter Entities search field.
      Figure. Filter Entities

    3. Select an entity you want to add to this role and provide desired access permissions from the available options. The access permissions vary depending on the selected entity.

      For example, for the VM entity, click the radio button for the desired VM permissions:

      • No Access
      • View Access
      • Basic Access
      • Edit Access
      • Set Custom Permissions

      If you select Set Custom Permissions , click the Change link to display the Custom VM Permissions window, check all the permissions you want to enable, and then click the Save button. Optionally, check the Allow VM Creation box to allow this role to create VMs.

      Figure. Custom VM Permissions Window

  3. Click Save to create the role. The page closes and the new role appears in the Roles view list.
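Custom roles can also be created programmatically. The sketch below only builds a request body in the shape of the Prism Central v3 REST API role schema (assumed here; POST /api/nutanix/v3/roles); the permission UUID is a hypothetical placeholder, and real UUIDs would have to be looked up on your deployment:

```python
import json

# Hedged sketch: build a request body for creating a custom role, assuming
# the Prism Central v3 API role schema (POST /api/nutanix/v3/roles).
# The permission UUID below is a hypothetical placeholder.
def build_role_payload(name: str, description: str, permission_uuids: list) -> dict:
    return {
        "api_version": "3.1.0",
        "metadata": {"kind": "role"},
        "spec": {
            "name": name,
            "description": description,
            "resources": {
                "permission_reference_list": [
                    {"kind": "permission", "uuid": uuid}
                    for uuid in permission_uuids
                ]
            },
        },
    }

payload = build_role_payload(
    "vm_operator", "Power-cycle VMs only",
    ["11111111-1111-1111-1111-111111111111"])
print(json.dumps(payload, indent=2))
```

This only constructs the JSON; sending it requires an authenticated HTTPS request to Prism Central, which is outside the scope of this sketch.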
Modifying a Custom Role

About this task

Perform the following procedure to modify or delete a custom role.

Procedure

  1. Go to the roles dashboard and select (check the box for) the desired role from the list.
  2. Do one of the following:
    • To modify the role, select Update Role from the Actions pull-down list. The Roles page for that role appears. Update the field values as desired and then click Save . See Creating a Custom Role for field descriptions.
    • To delete the role, select Delete from the Actions pull-down list. A confirmation message is displayed. Click OK to delete the role and remove it from the list.
Custom Role Permissions

A selection of permission options are available when creating a custom role.

The following table lists the permissions you can grant when creating or modifying a custom role. When you select an option for an entity, the permissions listed for that option are granted. If you select Set custom permissions , a complete list of available permissions for that entity appears. Select the desired permissions from that list.

Entity Option Permissions
App (application) No Access (none)
Basic Access Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create AWS VM, Create Image, Create VM, Delete AWS VM, Delete VM, Download App Runlog, Update AWS VM, Update VM, View App, View AWS VM, View VM
Set Custom Permissions (select from list) Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create App, Create AWS VM, Create Image, Create VM, Delete App, Delete AWS VM, Delete VM, Download App Runlog, Update App, Update AWS VM, Update VM, View App, View AWS VM, View VM
VM Recovery Point No Access (none)
View Only View VM Recovery Point
Full Access Delete VM Recovery Point, Restore VM Recovery Point, Snapshot VM, Update VM Recovery Point, View VM Recovery Point, Allow VM Recovery Point creation
Set Custom Permissions (Change) Abort App Runlog, Access Console VM, Action Run App, Clone VM, Create App, Create AWS VM, Create Image, Create VM, Delete App, Delete AWS VM, Delete VM, Download App Runlog, Update App, Update AWS VM, Update VM, View App, View AWS VM, View VM
Note:

You can assign permissions for the VM Recovery Point entity to users or user groups in the following two ways.

  • Manually assign permission for each VM where the recovery point is created.
  • Assign permission using Categories in the Role Assignment workflow.
Tip: When a recovery point is created, it is associated with the same category as the VM.
VM No Access (none)
View Access Access Console VM, View VM
Basic Access Access Console VM, Update VM Power State, View VM
Edit Access Access Console VM, Update VM, View Subnet, View VM
Full Access Access Console VM, Clone VM, Create VM, Delete VM, Export VM, Update VM, Update VM Boot Config, Update VM CPU, Update VM Categories, Update VM Description, Update VM Disk List, Update VM GPU List, Update VM Memory, Update VM NIC List, Update VM Owner, Update VM Power State, Update VM Project, View Cluster, View Subnet, View VM.
Set Custom Permissions (select from list) Access Console VM, Clone VM, Create VM, Delete VM, Update VM, Update VM Boot Config, Update VM CPU, Update VM Categories, Update VM Disk List, Update VM GPU List, Update VM Memory, Update VM NIC List, Update VM Owner, Update VM Power State, Update VM Project, View Cluster, View Subnet, View VM.

Granular permissions (applicable only if IAM is enabled; see Granular Role-Based Access Control (RBAC) for details):

Allow VM Power Off, Allow VM Power On, Allow VM Reboot, Allow VM Reset, Expand VM Disk Size, Mount VM CDROM, Unmount VM CDROM, Update VM Memory Overcommit, Update VM NGT Config, Update VM Power State Mechanism

Allow VM creation (additional option) (n/a)
Blueprint No Access (none)
View Access View Account, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet
Basic Access Access Console VM, Clone VM, Create App,Create Image, Create VM, Delete VM, Launch Blueprint, Update VM, View Account, View App, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM
Full Access Access Console VM, Clone Blueprint, Clone VM, Create App, Create Blueprint, Create Image, Create VM, Delete Blueprint, Delete VM, Download Blueprint, Export Blueprint, Import Blueprint, Launch Blueprint, Render Blueprint, Update Blueprint, Update VM, Upload Blueprint, View Account, View App, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM
Set Custom Permissions (select from list) Access Console VM, Clone VM, Create App, Create Blueprint, Create Image, Create VM, Delete Blueprint, Delete VM, Download Blueprint, Export Blueprint, Import Blueprint, Launch Blueprint, Render Blueprint, Update Blueprint, Update VM, Upload Blueprint, View Account, View App, View AWS AZ, View AWS Elastic IP, View AWS Image, View AWS Key Pair, View AWS Machine Type, View AWS Region, View AWS Role, View AWS Security Group, View AWS Subnet, View AWS Volume Type, View AWS VPC, View Blueprint, View Cluster, View Image, View Project, View Subnet, View VM
Marketplace Item No Access (none)
View marketplace and published blueprints View Marketplace Item
View marketplace and publish new blueprints Update Marketplace Item, View Marketplace Item
Full Access Config Marketplace Item, Create Marketplace Item, Delete Marketplace Item, Render Marketplace Item, Update Marketplace Item, View Marketplace Item
Set Custom Permissions (select from list) Config Marketplace Item, Create Marketplace Item, Delete Marketplace Item, Render Marketplace Item, Update Marketplace Item, View Marketplace Item
Report No Access (none)
View Only Notify Report Instance, View Common Report Config, View Report Config, View Report Instance
Full Access Create Common Report Config, Create Report Config, Create Report Instance, Delete Common Report Config, Delete Report Config, Delete Report Instance, Notify Report Instance, Run Report Config, Share Report Config, Share Report Instance, Update Common Report Config, Update Report Config, View Common Report Config, View Report Config, View Report Instance, View User, View User Group
Cluster No Access (none)
View Access View Cluster
Update Access Update Cluster
Full Access Update Cluster, View Cluster
Subnet No Access (none)
View Access View Subnet, View Virtual Switch
Image No Access (none)
View Only View Image
Set Custom Permissions (select from list) Copy Image Remote, Create Image, Delete Image, Migrate Image, Update Image, View Image
OVA No Access (none)
View Access View OVA
Full Access View OVA, Create OVA, Update OVA and Delete OVA
Set custom permissions Change View OVA, Create OVA, Update OVA and Delete OVA
Object Store No Access (none)
View Access View Object Store
Full Access View Object Store, Create Object Store, Update Object Store and Delete Object Store
Set custom permissions Change View Object Store, Create Object Store, Update Object Store and Delete Object Store
Analysis Session No Access (none)
View Only View Analysis Session
Full Access Create Analysis Session, Delete Analysis Session, Share Analysis Session, Update Analysis Session, View Analysis Session, View User and View User Group
Dashboard No Access (none)
View Only View Dashboard
Full Access Create Dashboard, Delete Dashboard, Share Dashboard, Update Dashboard, View Dashboard, View User and View User Group
Capacity Scenario No Access (none)
View Only View Capacity Scenario
Full Access Create Whatif, Delete Whatif, Share Whatif, Update Whatif, View Whatif, View User and View User Group

The following table describes the permissions.

Note: By default, assigning certain permissions to a user role might implicitly assign more permissions to that role. However, the implicitly assigned permissions will not be displayed in the details page for that role. These permissions are displayed only if you manually assign them to that role.
Permission Description Assigned Implicitly By
Create App Allows to create an application.
Delete App Allows to delete an application.
View App Allows to view an application.
Action Run App Allows to run action on an application.
Download App Runlog Allows to download an application runlog.
Abort App Runlog Allows to abort an application runlog.
Access Console VM Allows to access the console of a virtual machine.
Create VM Allows to create a virtual machine.
View VM Allows to view a virtual machine.
Clone VM Allows to clone a virtual machine.
Delete VM Allows to delete a virtual machine.
Export VM Allows to export a virtual machine.
Snapshot VM Allows to snapshot a virtual machine.
View VM Recovery Point Allows to view a vm_recovery_point.
Update VM Recovery Point Allows to update a vm_recovery_point.
Delete VM Recovery Point Allows to delete a vm_recovery_point.
Restore VM Recovery Point Allows to restore a vm_recovery_point.
Update VM Allows to update a virtual machine.
Update VM Boot Config Allows to update a virtual machine's boot configuration. Update VM
Update VM CPU Allows to update a virtual machine's CPU configuration. Update VM
Update VM Categories Allows to update a virtual machine's categories. Update VM
Update VM Description Allows to update a virtual machine's description. Update VM
Update VM GPU List Allows to update a virtual machine's GPUs. Update VM
Update VM NIC List Allows to update a virtual machine's NICs. Update VM
Update VM Owner Allows to update a virtual machine's owner. Update VM
Update VM Project Allows to update a virtual machine's project. Update VM
Update VM NGT Config Allows updates to a virtual machine's Nutanix Guest Tools configuration. Update VM
Update VM Power State Allows updates to a virtual machine's power state. Update VM
Update VM Disk List Allows to update a virtual machine's disks. Update VM
Update VM Memory Allows to update a virtual machine's memory configuration. Update VM
Update VM Power State Mechanism Allows updates to a virtual machine's power state mechanism. Update VM or Update VM Power State
Allow VM Power Off Allows power off and shutdown operations on a virtual machine. Update VM or Update VM Power State
Allow VM Power On Allows power on operation on a virtual machine. Update VM or Update VM Power State
Allow VM Reboot Allows reboot operation on a virtual machine. Update VM or Update VM Power State
Expand VM Disk Size Allows to expand a virtual machine's disk size. Update VM or Update VM Disk List
Mount VM CDROM Allows to mount an ISO to virtual machine's CDROM. Update VM or Update VM Disk List
Unmount VM CDROM Allows to unmount ISO from virtual machine's CDROM. Update VM or Update VM Disk List
Update VM Memory Overcommit Allows to update a virtual machine's memory overcommit configuration. Update VM or Update VM Memory
Allow VM Reset Allows reset (hard reboot) operation on a virtual machine. Update VM, Update VM Power State, or Allow VM Reboot
View Cluster Allows to view a cluster.
Update Cluster Allows to update a cluster.
Create Image Allows to create an image.
View Image Allows to view an image.
Copy Image Remote Allows to copy an image from local PC to remote PC.
Delete Image Allows to delete an image.
Migrate Image Allows to migrate an image from PE to PC.
Update Image Allows to update an image.
Create Image Placement Policy Allows to create an image placement policy.
View Image Placement Policy Allows to view an image placement policy.
Delete Image Placement Policy Allows to delete an image placement policy.
Update Image Placement Policy Allows to update an image placement policy.
Create AWS VM Allows to create an AWS virtual machine.
View AWS VM Allows to view an AWS virtual machine.
Update AWS VM Allows to update an AWS virtual machine.
Delete AWS VM Allows to delete an AWS virtual machine.
View AWS AZ Allows to view AWS Availability Zones.
View AWS Elastic IP Allows to view an AWS Elastic IP.
View AWS Image Allows to view an AWS image.
View AWS Key Pair Allows to view AWS keypairs.
View AWS Machine Type Allows to view AWS machine types.
View AWS Region Allows to view AWS regions.
View AWS Role Allows to view AWS roles.
View AWS Security Group Allows to view an AWS security group.
View AWS Subnet Allows to view an AWS subnet.
View AWS Volume Type Allows to view AWS volume types.
View AWS VPC Allows to view an AWS VPC.
Create Subnet Allows to create a subnet.
View Subnet Allows to view a subnet.
Update Subnet Allows to update a subnet.
Delete Subnet Allows to delete a subnet.
Create Blueprint Allows to create the blueprint of an application.
View Blueprint Allows to view the blueprint of an application.
Launch Blueprint Allows to launch the blueprint of an application.
Clone Blueprint Allows to clone the blueprint of an application.
Delete Blueprint Allows to delete the blueprint of an application.
Download Blueprint Allows to download the blueprint of an application.
Export Blueprint Allows to export the blueprint of an application.
Import Blueprint Allows to import the blueprint of an application.
Render Blueprint Allows to render the blueprint of an application.
Update Blueprint Allows to update the blueprint of an application.
Upload Blueprint Allows to upload the blueprint of an application.
Create OVA Allows to create an OVA.
View OVA Allows to view an OVA.
Update OVA Allows to update an OVA.
Delete OVA Allows to delete an OVA.
Create Marketplace Item Allows to create a marketplace item.
View Marketplace Item Allows to view a marketplace item.
Update Marketplace Item Allows to update a marketplace item.
Config Marketplace Item Allows to configure a marketplace item.
Render Marketplace Item Allows to render a marketplace item.
Delete Marketplace Item Allows to delete a marketplace item.
Create Report Config Allows to create a report_config.
View Report Config Allows to view a report_config.
Run Report Config Allows to run a report_config.
Share Report Config Allows to share a report_config.
Update Report Config Allows to update a report_config.
Delete Report Config Allows to delete a report_config.
Create Common Report Config Allows to create a common report_config.
View Common Report Config Allows to view a common report_config.
Update Common Report Config Allows to update a common report_config.
Delete Common Report Config Allows to delete a common report_config.
Create Report Instance Allows to create a report_instance.
View Report Instance Allows to view a report_instance.
Notify Report Instance Allows to notify a report_instance.
Share Report Instance Allows to share a report_instance.
Delete Report Instance Allows to delete a report_instance.
View Account Allows to view an account.
View Project Allows to view a project.
View User Allows to view a user.
View User Group Allows to view a user group.
View Name Category Allows to view a category's name.
View Value Category Allows to view a category's value.
View Virtual Switch Allows to view a virtual switch.
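The "Assigned Implicitly By" relationships in the table above can be modeled as a lookup. The sketch below is a simplified illustration covering only a few rows; it is not how Prism implements permission resolution:

```python
# Simplified sketch of the "Assigned Implicitly By" column above: a
# granular permission is effective if it is granted directly or if any
# of its parent permissions is granted. Only a few rows are modeled.
IMPLIED_BY = {
    "Update VM Boot Config": {"Update VM"},
    "Allow VM Power Off": {"Update VM", "Update VM Power State"},
    "Expand VM Disk Size": {"Update VM", "Update VM Disk List"},
}

def is_effective(permission: str, granted: set) -> bool:
    # Effective if granted directly, or implied by any granted parent.
    return permission in granted or bool(IMPLIED_BY.get(permission, set()) & granted)

print(is_effective("Allow VM Power Off", {"Update VM Power State"}))   # True
print(is_effective("Expand VM Disk Size", {"Update VM Power State"}))  # False
```

Note the caveat from the table's introduction: permissions assigned implicitly this way are not displayed on the role's details page unless they are also assigned manually.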
Granting Restore Permission to Project User

About this task

By default, only a self service admin or a cluster admin can view and restore the recovery points. However, a self service admin or cluster admin can grant permission to the project user to restore the VM from a recovery point.

To grant restore permission to a project user, do the following:

Procedure

  1. Log on to Prism Central with cluster admin or self service admin credentials.
  2. Go to the roles dashboard (select Administration > Roles in the pull-down menu) and do one of the following:
    • Click the Create Role button.
    • Select an existing role of a project user and then select Duplicate from the Actions drop-down menu. To modify the duplicate role, select Update Role from the Actions pull-down list.
  3. The Roles page for that role appears. In the Roles page, do the following in the indicated fields:
    1. Role Name : Enter a name for the new role.
    2. Description (optional): Enter a description of the role.
    3. Expand VM Recovery Point and do one of the following:
      • Select Full Access and then select Allow VM recovery point creation .
      • Click Change next to Set Custom Permissions to customize the permissions. Enable Restore VM Recovery Point permission. This permission also grants the permission to view the VM created from the restore process.
    4. Click Save to add the role. The page closes and the new role appears in the Roles view list.
  4. In the Roles view, select the newly created role and click Manage Assignment to assign the user to this role.
  5. In the Add New dialog, do the following:
    • Under Select Users or User Groups or OUs , enter the target user name. The search box displays the matched records. Select the required listing from the records.
    • Under Entities , select VM Recovery Point , select Individual Entry from the drop-down list, and then select All VM Recovery Points.
    • Click Save to finish.

Configuring Role Mapping

About this task

After user authentication is configured (see Configuring Authentication), users and authorized directories are not assigned any permissions by default. You must explicitly assign the required permissions to users, authorized directories, or organizational units through role mapping.

You can refine the authentication process by assigning a role with associated permissions to users, groups, and organizational units. This procedure allows you to map users to the predefined roles in Prism Central, such as User Admin , Cluster Admin , and Viewer . To assign roles, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Role Mapping from the Settings page.

    The Role Mapping window appears.

    Figure. Role Mapping Window

  2. To create a role mapping, click the New Mapping button.

    The Create Role Mapping window appears. Enter the required information in the following fields.

    Figure. Create Role Mapping Window

  3. Directory or Provider : Select the target directory or identity provider from the pull-down list.
    Only directories and identity providers previously configured in the authentication settings are available. If the desired directory or provider does not appear in the list, add that directory or provider, and then return to this procedure.
  4. Type : Select the desired LDAP entity type from the pull-down list.
    This field appears only if you have selected a directory from the Directory or Provider pull-down list. The following entity types are available:
    • User : A named user. For example, dev_user_1.
    • Group : A group of users. For example, dev_grp1, dev_grp2, sr_dev_1, and staff_dev_1.
    • OU : An organizational unit with one or more users, groups, and even other organizational units. For example, all_dev consists of user dev_user_1 and groups dev_grp1, dev_grp2, sr_dev_1, and staff_dev_1.
  5. Role : Select a user role from the pull-down list.
    You can choose one of the following roles:
    • Viewer : Grants users view-only access to information; they cannot perform any administrative tasks.
    • Cluster Admin (Formerly Prism Central Admin): Allows users to view and perform all administrative tasks except creating or modifying user accounts.
    • User Admin : Allows users to view information, perform administrative tasks, and to create and modify user accounts.
  6. Values : Enter the entity names. The entity names are assigned the role that you selected.
    The entity names are case sensitive. To provide more than one entity name, separate the names with commas (,) without any spaces between them.

    LDAP-based authentication

    • For AD

      Enter the actual names used by the organizational units (it applies to all users and groups in those OUs), groups (all users in those groups), or users (each named user) used in LDAP in the Values field.

      For example, entering sr_dev_1,staff_dev_1 in the Values field when the LDAP type is Group and the role is Cluster Admin, implies that all users in the sr_dev_1 and staff_dev_1 groups are assigned the administrative role for the cluster.

      Do not include the domain name in the value. For example, enter all_dev , and not all_dev@<domain_name> . However, users must include the domain name along with the username when they log in.

      User : Enter the sAMAccountName or userPrincipalName in the values field.

      Group : Enter common name (cn) or name.

      OU : Enter name.

    • For OpenLDAP

      User : Use the username attribute (that was configured while adding the directory) value.

      Group : Use the group name attribute (cn) value.

      OU : Use the OU attribute (ou) value.

    SAML-based authentication:

    You must configure the NameID attribute in the identity provider. You can enter the NameID returned in the SAML response in the Values field.

    For SAML, only the User type is supported. Other types, such as Group and OU, are not supported.

    If you enable Identity and Access Management, see Security Management Using Identity and Access Management (Prism Central)

  7. Click Save .

    The role mapping configurations are saved, and the new role is listed in the Role Mapping window.

    You can create a role map for each authorized directory. You can also create multiple role maps that apply to a single directory. When there are multiple maps for a directory, the most specific rule for a user applies.

    For example, adding a Group map set to Cluster Admin and a User map set to Viewer for a few specific users in that group means all users in the group have administrator permission except those few specific users who have only viewing permission.

  8. To edit a role map entry, click the pencil icon for that entry.
    After clicking the pencil icon, the Edit Role Mapping window appears, which is similar to the Create Role Mapping window. Edit the information in the required fields and click the Save button to update the changes.
  9. To delete a role map entry, click the X icon for that entry and click the OK button to confirm the role map entry deletion.
    The role map entry is removed from the list.

Granular Role-Based Access Control (RBAC)

Granular Role-Based Access Control (RBAC) allows you to assign fine-grained VM operation permissions to users based on your specific requirements. This feature enables you to create custom roles with finer permission entities, such as "Allow VM Power On" or "Allow VM Power Off", compared to broader permission categories such as "Update VM".

The procedure to configure Granular RBAC to users is similar to the procedure outlined in Creating a Custom Role. For the complete list of available permissions, see Custom Role Permissions.

Note:
  • Granular RBAC is supported for VMs running on an AHV cluster.
  • Ensure that IAM is enabled; see Enabling IAM for details.
  • You must be running Prism Central version pc.2021.9 or later and AOS version 6.0.1 or later.

Cluster Role-Based Access Control (RBAC)

The cluster role-based access control (RBAC) feature enables a super-admin user to grant the Prism Admin and Prism Viewer roles access to one or more clusters registered with Prism Central. A user with the Prism Central admin or viewer role can view and act on entities such as VMs, hosts, containers, VM recovery points, and recovery plans from the allowed clusters.

Cluster RBAC is currently supported on an on-prem Prism Central instance hosted in a Prism Element cluster running AHV or ESXi. After you enable the Micro Services Infrastructure feature on Prism Central, the Cluster RBAC feature is then automatically enabled.

This feature supports clusters that are hosted on AHV and VMware ESXi.

Note: Prism Central supports assigning up to 15 clusters to any user or user group.
Configuring Cluster RBAC

About this task

To configure Cluster RBAC in Prism Central for users or user groups, do the following.

Procedure

  1. Log on to Prism Central as an admin user or any user with super admin access.
  2. Configure active directory settings.
    Note: You can skip this step if an active directory is already configured.
    Go to Prism Central Settings > Authentication , click + New Directory and add your preferred active directory.
  3. Click the hamburger menu and go to Administration > Roles .
    The page displays system defined and custom roles.
  4. Select Prism Admin or Prism Viewer role, then click Actions > Manage Assignment .
  5. Click Add New to add a new user or user groups or OU (IDP users or user groups) to this role.
    Figure. Role Assignment Click to enlarge role assignment view

    You will add users or user groups and assign clusters to the new role in the upcoming steps.

  6. In the Select Users or User Groups or OUs field, do the following:
    1. Select the configured AD or IDP from the drop-down.
      The drop-down displays the available types of users or user groups, such as local users, AD-based users or user groups, or SAML-based users or user groups. Select Organizational Units or OU for AD directories or directories that use a SAML-based IDP for authentication.
      Figure. User, User Group or OU selection Click to enlarge Displaying the User, User Group or OU selection drop-down list.

    2. Search and add the users or groups in the Search User field.

      Typing a few letters in the search field displays a list of users from which you can select, and you can add multiple user names in this field.

  7. In the Select Clusters field, you can provide cluster access to AD users or User Groups using the Individual entity option (one or more registered clusters) or ALL Clusters option.
    Figure. Select Clusters Click to enlarge ahv cluster selection view

  8. Click Save .
    AD or IDP users or user groups can log on to Prism Central as a Prism Admin or Prism Viewer, and view or act on entities such as VMs, hosts, and containers from the configured clusters.

Cluster RBAC for Volume Group

The cluster role-based access control (RBAC) for Volume Group feature enables a super-admin user to grant the Prism Admin and Prism Viewer roles access to one or more clusters registered with Prism Central. A user with the Prism Admin role can view and update entities such as volume groups, virtual disks, and storage containers from the allowed clusters. A user with the Prism Viewer role can only view these entities.

Cluster RBAC is currently supported on an on-prem Prism Central instance hosted in a Prism Element cluster running AHV. After you enable the Micro Services Infrastructure feature on Prism Central, the Cluster RBAC feature is then automatically enabled.

The Cluster RBAC for Volume Group feature is supported on AHV and ESXi clusters.

Note: Prism Central supports the Cluster RBAC for Volume Group feature from the pc.2022.6 release.
Table 1. List of Permissions for Prism Admin and Prism Viewer Roles
Role Privileges
Prism Admin Full administrator privileges except for creating or modifying the user accounts
Prism Viewer View-only privileges
Configuring Cluster RBAC for Volume Group

About this task

To configure Cluster RBAC for Volume Group in Prism Central for users or user groups, do the following.

Procedure

  1. Log on to Prism Central as an admin user or any user with super admin access.
  2. Configure Active Directory settings.
    Note: You can skip this step if an Active Directory is already configured.
    Go to Prism Central Settings > Authentication , click + New Directory and add your preferred Active Directory.
  3. Click the hamburger menu and go to Administration > Roles .
    The page displays system defined and custom roles.
  4. Select Prism Admin or Prism Viewer role, then click Actions > Manage Assignment .
    Figure. Prism Central Roles Click to enlarge roles view

    For illustration purposes, the Prism Admin role is selected in this step.
  5. Click Add New to add a new user or user groups to this role.
    Figure. Role Assignment Click to enlarge role assignment view

    You will add users or user groups and assign clusters to the Prism Admin or Prism Viewer role in the upcoming steps.

  6. In the Select Users or Groups field, do the following:
    1. Select the configured active directory (AD) from the drop-down.
    2. Search and add the users or user groups.
    To search for a user or user group, start typing a few letters and the system automatically suggests matching names.
  7. In the Select Clusters field, you can provide cluster access to AD users or User Groups using the Individual entity option (one or more registered clusters) or ALL Clusters option.
    Figure. Select Clusters Click to enlarge ahv cluster selection view

  8. Click Save .
    AD users or User Groups can log on and access Prism Central as a Prism Admin or Prism Viewer. They can view or act on the available entities in the configured clusters such as volume groups, virtual disks, and storage containers.

Assigning a Role

About this task

In addition to configuring basic role maps (see Configuring Role Mapping), you can configure more precise role assignments (AHV only). To assign a role to selected users or groups that applies just to a specified set of entities, do the following:

Procedure

  1. Log on to Prism Central as "admin" user or any user with "super admin" access.
  2. Configure Active Directory settings.
    Note: You can skip this step if an active directory is already configured.
    Go to Prism Central Settings > Authentication , click + New Directory and add your preferred active directory.
  3. Click the hamburger menu and go to Administration > Roles .
    The page displays system defined and custom roles.
  4. Select the desired role in the roles dashboard, then click Actions > Manage Assignment .
  5. Click Add New to add Active Directory based users or user groups, or IDP users or user groups (or OUs) to this role.
    Figure. Role Assignment Click to enlarge role assignment view

    You are adding users or user groups and assigning entities to the new role in the next steps.

  6. In the Select Users or User Groups or OUs field, do the following:
    1. Select the configured AD or IDP from the drop-down.
      The drop-down displays the available types of users or user groups, such as local users, AD-based users or user groups, or SAML-based users or user groups. Select Organizational Units or OU for AD directories or directories that use a SAML-based IDP for authentication.
      Figure. User, User Group or OU selection Click to enlarge Displaying the User, User Group or OU selection drop-down list.

    2. Search and add the users or groups in the Search User field.

      Typing a few letters in the search field displays a list of users from which you can select, and you can add multiple user names in this field.

  7. In the Select Entities field, you can provide access to various entities. The list of available entities depends on the role selected in Step 4.

This table lists the available entities for each role:

Table 1. Available Entities for a Role
Role Entities
Consumer AHV VM, Image, Image Placement Policy, OVA, Subnets: VLAN
Developer AHV VM, Cluster, Image, Image Placement Policy, OVA, Subnets:VLAN
Operator AHV VM, Subnets:VLAN
Prism Admin Individual entity (one or more clusters), All Clusters
Prism Viewer Individual entity (one or more clusters), All Clusters
Custom role (User defined role) Individual entity, In Category (only AHV VMs)

This table shows the description of each entity:

Table 2. Description of Entities
Entity Description
AHV VM Allows you to manage VMs including create and edit permission
Image Allows you to access and manage image details
Image Placement Policy Allows you to access and manage image placement policy details
OVA Allows you to view and manage OVA details
Subnets: VLAN Allows you to view subnet details
Cluster Allows you to view and manage details of assigned clusters (AHV and ESXi clusters)
All Clusters Allows you to view and manage details of all clusters
VM Recovery Points Allows you to perform recovery operations with recovery points.
Recovery Plan (Single PC only) Allows you to view, validate, and test recovery plans. Also allows you to clean up VMs created after a recovery plan test.
Individual entity Allows you to view and manage individual entities such as AHV VM, Clusters, and Subnets:VLAN
  8. Repeat Step 5 and Step 6 for any combination of users/entities you want to define.
    Note: To allow users to create certain entities like a VM, you may also need to grant them access to related entities like clusters, networks, and images that the VM requires.
  9. Click Save .

Displaying Role Permissions

About this task

Do the following to display the privileges associated with a role.

Procedure

  1. Go to the roles dashboard and select the desired role from the list.

    For example, if you click the Consumer role, the details page for that role appears, and you can view all the privileges associated with the Consumer role.

    Figure. Role Summary Tab Click to enlarge

  2. Click the Users tab to display the users that are assigned this role.
    Figure. Role Users Tab Click to enlarge

  3. Click the User Groups tab to display the groups that are assigned this role.
  4. Click the Role Assignment tab to display the user/entity pairs assigned this role (see Assigning a Role).

Installing an SSL Certificate

About this task

Prism Central supports SSL certificate-based authentication for console access. To install a self-signed or custom SSL certificate, do the following:
Important: Ensure that SSL certificates are not password protected.
Note: Nutanix recommends that you replace the default self-signed certificate with a CA signed certificate.

Procedure

  1. Click the gear icon in the main menu and then select SSL Certificate in the Settings page.
  2. To replace (or install) a certificate, click the Replace Certificate button.
  3. To create a new self-signed certificate, click the Regenerate Self Signed Certificate option and then click the Apply button.

    A dialog box appears to verify the action; click the OK button. This generates and applies a new RSA 2048-bit self-signed certificate for Prism Central.

    Figure. SSL Certificate Window: Regenerate
    Click to enlarge

  4. To apply a custom certificate that you provide, do the following:
    1. Click the Import Key and Certificate option and then click the Next button.
      Figure. SSL Certificate Window: Import Click to enlarge
    2. Do the following in the indicated fields, and then click the Import Files button.
      Note:
      • All the three imported files for the custom certificate must be PEM encoded.
      • Ensure that the private key does not have any extra data (or custom attributes) before the beginning (-----BEGIN PRIVATE KEY-----) or after the end (-----END PRIVATE KEY-----) of the private key block.
      • See Recommended Key Configurations to ensure proper set of key types, sizes/curves, and signature algorithms.
      • Private Key Type : Select the appropriate type for the signed certificate from the pull-down list (RSA 4096 bit, RSA 2048 bit, EC DSA 256 bit, or EC DSA 384 bit).
      • Private Key : Click the Browse button and select the private key associated with the certificate to be imported.
      • Public Certificate : Click the Browse button and select the signed public portion of the server certificate corresponding to the private key.
      • CA Certificate/Chain : Click the Browse button and select the certificate or chain of the signing authority for the public certificate.
      Figure. Importing Certificate Click to enlarge
      To meet the high security standards of NIST SP 800-131a compliance and the RFC 6460 requirements for NSA Suite B, and to provide optimal encryption performance, the certificate import process validates that the correct signature algorithm is used for a given key/certificate pair. See Recommended Key Configurations to ensure a proper set of key types, sizes/curves, and signature algorithms. The CA must sign all public certificates with the proper type, size/curve, and signature algorithm for the import process to validate successfully.
      Note: There is no specific requirement for the subject name of the certificates (subject alternative names (SAN) or wildcard certificates are supported in Prism).
      You can use the cat command to concatenate a list of CA certificates into a chain file.
      $ cat signer.crt inter.crt root.crt > server.cert
      Order is essential. The chain must begin with the certificate of the signer and end with the root CA certificate as the final entry.
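
The chain-order rule above can be sanity-checked from a shell. The sketch below builds three throwaway self-signed certificates standing in for the signer, intermediate, and root (in practice your CA issues these), concatenates them in the documented order, and prints each certificate's subject so you can confirm the order:

```shell
# Demo only: generate placeholder signer, intermediate, and root certificates.
for name in signer inter root; do
  openssl req -x509 -newkey rsa:2048 -nodes -keyout "$name.key" \
    -out "$name.crt" -days 1 -subj "/CN=$name" 2>/dev/null
done

# Concatenate in the required order: signer first, root CA last.
cat signer.crt inter.crt root.crt > server.cert

# Split the chain file and print each certificate's subject in order.
csplit -s -z -f chunk_ server.cert '/-----BEGIN CERTIFICATE-----/' '{*}'
for f in chunk_*; do
  openssl x509 -in "$f" -noout -subject
done
# Subjects should list signer, then inter, then root.
```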

Results

After generating or uploading the new certificate, the interface gateway restarts. If the certificate and credentials are valid, the interface gateway uses the new certificate immediately, which means your browser session (and all other open browser sessions) becomes invalid until you reload the page and accept the new certificate. If anything is wrong with the certificate (such as a corrupted file or wrong certificate type), the new certificate is discarded, and the system reverts to the original default certificate provided by Nutanix.

Note: The system holds only one custom SSL certificate. If a new certificate is uploaded, it replaces the existing certificate. The previous certificate is discarded.
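
Because an invalid upload is discarded and the system reverts to the previous certificate, it is worth pre-checking the pair before importing. A minimal sketch, assuming OpenSSL is available (the demo generates a throwaway pair; substitute your real key and certificate files):

```shell
# Demo only: create a throwaway RSA 2048-bit key and self-signed certificate.
openssl req -x509 -newkey rsa:2048 -nodes -keyout server.key \
  -out server.crt -days 1 -subj "/CN=pc.example.com" 2>/dev/null

# The import succeeds only if the private key and public certificate match;
# comparing their public keys catches a mismatched pair before upload.
key_pub="$(openssl pkey -in server.key -pubout 2>/dev/null)"
crt_pub="$(openssl x509 -in server.crt -noout -pubkey)"
if [ "$key_pub" = "$crt_pub" ]; then
  echo "key and certificate match"
else
  echo "MISMATCH: do not import this pair"
fi

# Confirm the signature algorithm against the Recommended Key Configurations.
openssl x509 -in server.crt -noout -text | grep -m1 "Signature Algorithm"
```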

Controlling Remote (SSH) Access

About this task

Nutanix supports key-based SSH access to Prism Central. Enabling key-based SSH access disables password authentication so that only the keys you have provided can be used to access Prism Central (for the nutanix and admin users only), making Prism Central more secure.

You can create a key pair (or multiple key pairs) and add the public keys to enable key-based SSH access. However, when site security requirements do not allow such access, you can remove all public keys to prevent SSH access.

To control key-based SSH access to Prism Central, do the following:

Procedure

  1. Click the gear icon in the main menu and then select Cluster Lockdown in the Settings page.

    The Cluster Lockdown dialog box appears. Enabled public keys (if any) are listed in this window.

    Figure. Cluster Lockdown Window Click to enlarge displays cluster lockdown window

  2. To disable (or enable) remote login access, uncheck (or check) the Enable Remote Login with Password box.

    Remote login access is enabled by default.

  3. To add a new public key, click the New Public Key button and then do the following in the displayed fields:
    1. Name : Enter a key name.
    2. Key : Enter (paste) the key value into the field.
    3. Click the Save button (lower right) to save the key and return to the main Cluster Lockdown window.

    There are no public keys available by default, but you can add any number of public keys.

  4. To delete a public key, click the X on the right of that key line.
    Note: Deleting all the public keys and disabling remote login access locks down the cluster from SSH access.
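
The public key pasted into the Key field in step 3 is an ordinary OpenSSH public key. A hedged sketch of generating a pair (the file name and IP address below are placeholders):

```shell
# Generate an ed25519 key pair with no passphrase (file name is an example).
ssh-keygen -t ed25519 -N "" -f ./pc_lockdown_key -C "pc-lockdown-key" -q

# The contents of the .pub file are what you paste into the Key field in Prism.
cat pc_lockdown_key.pub

# After saving the key, connect as the nutanix user (placeholder address):
# ssh -i pc_lockdown_key nutanix@10.0.0.10
```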

Security Policies using Flow

Nutanix Flow includes a policy-driven security framework that inspects traffic within the data center. For more information, see the Flow Microsegmentation Guide.

Data-in-Transit Encryption

Data-in-Transit Encryption allows you to encrypt service level traffic between the cluster nodes. Data-in-Transit Encryption, along with Data-at-Rest Encryption, protects the entire life cycle of data and is an essential countermeasure against unauthorized access to critical data.

To enable Data-in-Transit Encryption, see Enabling Data-in-Transit Encryption.
Note:
  • Data-in-Transit Encryption can have an impact on I/O latency and CPU performance.
  • Intra-cluster traffic encryption is supported only for the Stargate service.
  • RDMA traffic encryption is not supported.
  • When a Controller VM goes down, the traffic from guest VM to remote Controller VM is not encrypted.
  • Traffic between guest VMs connected to Volume Groups is not encrypted when the target disk is on a remote Controller VM.

Enabling Data-in-Transit Encryption

About this task

Data-in-Transit Encryption allows you to encrypt service level traffic between the cluster nodes. To enable Data-in-Transit Encryption, do the following.

Before you begin

  1. Ensure that the cluster is running AOS version 6.1.1 and Prism Central version pc.2022.4.
  2. Ensure that you allow port 2009, which is used for Data-in-Transit Encryption.
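
A quick way to confirm that port 2009 is reachable between nodes is a TCP connect test. The helper below uses bash's /dev/tcp, so no extra tools are needed (the CVM IP is a placeholder):

```shell
# Check TCP reachability of the port used for Data-in-Transit Encryption.
check_port() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "port $2 reachable on $1"
  else
    echo "port $2 blocked on $1"
  fi
}
check_port 10.0.0.11 2009   # placeholder CVM IP
```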

Procedure

  1. Log on to Prism Central and click the gear icon.
  2. Go to Hardware > Clusters > Actions and select Enable Data-In-Transit Encryption .
    The following confirmation dialog box is displayed.
    Figure. Enable Data-in-Transit Encryption Click to enlarge

  3. Click Enable to confirm.

What to do next

You can disable Data-in-Transit Encryption after you have enabled it. To disable Data-in-Transit Encryption, see Disabling Data-in-Transit Encryption.

Disabling Data-in-Transit Encryption

About this task

You can disable Data-in-Transit Encryption after you have enabled it. To disable Data-in-Transit Encryption, do the following.

Procedure

  1. Log on to Prism Central and click the gear icon.
  2. Go to Hardware > Clusters > Actions and select Disable Data-In-Transit Encryption .
    The following confirmation dialog box is displayed.
    Figure. Disable Data-in-Transit Encryption Click to enlarge

  3. Click Disable to confirm.

Password Retry Lockout

For enhanced security, Prism Central and Prism Element lock out the default 'admin' account for 15 minutes after five unsuccessful login attempts. When the account is locked out, the following message is displayed on the login screen.

Account locked due to too many failed attempts

You can try entering the password again after the 15-minute lockout period, or contact Nutanix Support if you have forgotten your password.

Security Management Using Identity and Access Management (Prism Central)

Enabled and administered from Prism Central, Identity and Access Management (IAM) is an authentication and authorization feature that uses attribute-based access control (ABAC). It is disabled by default. This section describes Prism Central IAM prerequisites, enablement, and SAML-based standard-compliant identity provider (IDP) configuration.

After you enable the Micro Services Infrastructure (CMSP) on Prism Central, IAM is automatically enabled. You can configure a wider selection of identity providers, including Security Assertion Markup Language (SAML) based identity providers. The Prism Central web console presents an updated sign-on/authentication page.

The enable process migrates existing directory, identity provider, and user configurations, including Common Access Card (CAC) client authentication configurations. After enabling IAM, if you want to enable a client to authenticate by using certificates, you must also enable CAC authentication. For more information, see Identity and Access Management Prerequisites and Considerations. Also, see the Identity and Access Management Software Support topic in the Prism Central Release Notes for specific support requirements.

The workflows for creating authentication configurations and providing user and role access (described in Configuring Authentication) are the same whether IAM is enabled or not.

IAM Features

Highly Scalable Architecture

Based on the Kubernetes open source platform, IAM uses independent pods for authentication (AuthN), authorization (AuthZ), and IAM data storage and replication.

  • Each pod automatically scales independently of Prism Central when required. No user intervention or control is required.
  • When new features or functions are available, you can update IAM pods independently of Prism Central updates through Life Cycle Manager (LCM).
  • IAM uses a rolling upgrade method to help ensure zero downtime.
Secure by Design
  • Mutual TLS authentication (mTLS) secures IAM component communication.
  • The Micro Services infrastructure (CMSP) on Prism Central provisions certificates for mTLS.
More SAML Identity Providers (IDP)

Without CMSP/IAM enabled on Prism Central, Active Directory Federation Services (ADFS) is the only supported IDP for single sign-on. After you enable CMSP/IAM, IAM supports more IDPs. Nutanix has tested the following IDPs with SAML IDP authentication configured for Prism Central.

  • Active Directory Federation Services (ADFS)
  • Azure Active Directory Federation Services (Azure ADFS)
    Note: Azure AD is not supported.
  • Okta
  • PingOne
  • Shibboleth
  • Keycloak

Users can log on from the Prism Central web console only. IDP-initiated authentication work flows are not supported. That is, logging on or signing on from an IDP web page or site is not supported.

Updated Authentication Page

After enabling IAM, the Prism Central login page is updated depending on your configuration. For example, if you have configured local user account and Active Directory authentication, this default page appears for directory users as follows. To log in as a local user, click the Log In with your Nutanix Local Account link.

Figure. Sample Default Prism Central IAM Logon Page, Active Directory And Local User Authentication Click to enlarge Sample Prism Central IAM Logon Page shows new credential fields

In another example, if you have configured SAML authentication instances named Shibboleth and AD2, Prism Central displays this page.

Figure. Sample Prism Central IAM Logon Page, Active Directory , Identity Provider, And Local User Authentication Click to enlarge Sample Prism Central IAM Logon Page shows new credential fields

Note: After upgrading to pc.2022.9, if a Security Assertion Markup Language (SAML) IDP is configured, you must download the Prism Central metadata and reconfigure the SAML IDP to recognize Prism Central as the service provider. See Updating ADFS When Using SAML Authentication to create the required rules for ADFS.

Identity and Access Management Prerequisites and Considerations

Make sure you meet the requirements listed before you enable the microservices infrastructure, which enables IAM.

IAM Prerequisites

For specific minimum software support and requirements for IAM, see the Prism Central release notes.

For microservices infrastructure requirements, see Enabling Microservices Infrastructure in the Prism Central Guide .

Prism Central
  • The Microservices Infrastructure and IAM are supported on clusters running AHV or ESXi only. For ESXi clusters, you may need to enter your vCenter credentials (user name and password) and a network for deployment.
  • The host cluster must be registered with this Prism Central instance.
  • During installation or upgrade, ensure that you allocate a Virtual IP address (VIP) for Prism Central. For information about how to set the VIP for the Prism Central VM, see Installing Prism Central (1-Click Method) in the Acropolis Upgrade Guide . Once set, do not change this address.
  • Ensure that you have created a fully qualified domain name (FQDN) for Prism Central. Once the Prism Central FQDN is set, do not change it. For more information about how to set the FQDN in the Cluster Details window, see Managing Prism Central in the Prism Central Guide .
  • When microservices infrastructure is enabled on a Prism Central scale-out three-node deployment, reconfiguring the IP address and gateway of the Prism Central VMs is not supported.
  • Ensure connectivity between Prism Central and its managed Prism Element clusters.
  • Enable Microservices Infrastructure on Prism Central (CMSP) first to enable and use IAM. For more information, see Enabling Microservices Infrastructure in the Prism Central Guide .
  • IAM supports small or large single PC VM deployments. However, you cannot expand the single VM deployment to a scale-out three-node deployment once CMSP has been enabled.
  • IAM supports scale-out three-node PC VM deployments. Reverting this deployment to a single PC VM deployment is not supported.
  • Make sure Prism Central is managing at least one Prism Element cluster. For more information about how to register a cluster, see Register (Unregister) Cluster with Prism Central in the Prism Central Guide .
  • You cannot unregister the Prism Element cluster that is hosting the Prism Central deployment where you have enabled CMSP and IAM. You can unregister other clusters being managed by this Prism Central deployment.
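
The VIP and FQDN prerequisites above can be sanity-checked from any workstation before enabling CMSP. In the sketch below, pc.example.com and the address are placeholders for your own values:

```shell
# Confirm the Prism Central FQDN resolves (placeholder name).
getent hosts pc.example.com || echo "FQDN does not resolve"

# Confirm the FQDN resolves to the expected VIP (placeholder comparison).
expected_vip="10.0.0.20"
resolved="$(getent hosts pc.example.com | awk '{print $1}')"
[ "$resolved" = "$expected_vip" ] && echo "FQDN resolves to VIP" \
  || echo "check DNS: resolved '$resolved'"
```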
Prism Element Clusters

Ensure that you have configured the following cluster settings. For more information, see Modifying Cluster Details in Prism Web Console Guide .

  • Virtual IP address (VIP). Once set, do not change this address
  • iSCSI data services IP address (DSIP). Once set, do not change this address
  • NTP server
  • Name server

IAM Considerations

Existing Authentication and Authorization Migrated After Enabling IAM
  • When you enable IAM by enabling CMSP, IAM migrates existing authentication and authorization configurations, including Common Access Card client authentication configurations.
Upgrading Prism Central After Enabling IAM
  • After you upgrade Prism Central, if CMSP (and therefore IAM) was previously enabled, both services remain enabled by default. Contact Nutanix Support for any custom requirements.
Note: After upgrading to pc.2022.9, if a Security Assertion Markup Language (SAML) IDP is configured, you must download the Prism Central metadata and reconfigure the SAML IDP to recognize Prism Central as the service provider. See Updating ADFS When Using SAML Authentication to create the required rules for ADFS.
User Session Lifetime
  • Each session has a maximum lifetime of 8 hours.
  • Session idle timeout is 15 minutes. After 15 minutes of inactivity, a user or client is logged out and must re-authenticate.
Client Authentication and Common Access Card (CAC) Support
  • IAM supports deployments where CAC authentication and client authentication are enabled on Prism Central. After enabling IAM, if you want to enable a client to authenticate by using certificates, you must also enable CAC authentication.
  • Ensure that port 9441 is open in your firewall if you are using CAC client authentication.
Hypervisor Support
  • You can deploy IAM on an on-premise Prism Central (PC) deployment hosted on an AOS cluster running AHV or ESXi. Clusters running other hypervisors are not supported.

Enabling IAM

Before you begin

  • IAM on Prism Central is disabled by default. When you enable the Micro Services Infrastructure on Prism Central, IAM is automatically enabled.
  • See Enabling Microservices Infrastructure in the Prism Central Guide .
  • See Identity and Access Management Prerequisites and Considerations and also the Identity and Access Management Software Support topic in the Prism Central release notes for specific support requirements.

Procedure

  1. Enable Micro Services Infrastructure on Prism Central as described in Enabling Micro Services Infrastructure in the Prism Central Guide .
  2. To view task status:
    1. Open a web browser and log in to the Prism Central web console.
    2. Go to the Activity > Tasks dashboard and find the IAM Migration & Bootstrap task.
    The task takes up to 60 minutes to complete. Part of the task is migrating existing authentication configurations.
  3. After the enablement tasks are completed, including the IAM Migration & Bootstrap task, log out of Prism Central. Wait at least 15 minutes before logging on to Prism Central.

    The Prism Central web console shows a new login page as shown below, confirming that IAM is enabled.

    Note:

    Depending on your existing authentication configuration, the login page might look different.

    Also, you can go to Settings > Prism Central Management page to verify if Prism Central on Microservices Infrastructure (CMSP) is enabled. CMSP and IAM enablement happen together.

    Figure. Sample Prism Central IAM Logon Page Click to enlarge Sample Prism Central IAM Logon Page shows new credential fields

What to do next

Configure authentication and access. If you are implementing SAML authentication with Active Directory Federated Services (ADFS), see Updating ADFS When Using SAML Authentication.

Configuring Authentication

Caution: Prism Central does not support the SSLv2 and SSLv3 ciphers. Therefore, you must disable the SSLv2 and SSLv3 options in a browser before accessing Prism Central. Disabling these options avoids SSL fallback and access-denial situations. However, you must enable the TLS protocol in the browser.

Prism Central supports user authentication with these authentication options:

  • SAML authentication. Users can authenticate through a supported identity provider when SAML support is enabled for Prism Central. The Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between two parties: an identity provider (IDP) and Prism Central as the service provider.

    With IAM, in addition to ADFS, other IDPs are available. For more information, see Security Management Using Identity and Access Management (Prism Central) and Updating ADFS When Using SAML Authentication.

  • Local user authentication. Users can authenticate if they have a local Prism Central account. For more information, see Managing Local User Accounts .
  • Active Directory authentication. Users can authenticate using their Active Directory (or OpenLDAP) credentials when Active Directory support is enabled for Prism Central.

Enabling and Configuring Client Authentication/CAC

Before you begin

  • To enable a client to authenticate by using certificates, you must also enable CAC authentication.
  • Ensure that port 9441 is open in your firewall if you are using CAC client authentication. After enabling CAC client authentication, your CAC logon redirects the browser to use port 9441.

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. Click the Client tab, then do the following steps.
    1. Select the Configure Client Chain Certificate check box.
    2. Click the Choose File button, browse to and select a client chain certificate to upload, and then click the Open button to upload the certificate.
      Note: Uploaded certificate files must be PEM encoded. The web console restarts after the upload step.
    3. To enable client authentication, click Enable Client Authentication .
    4. To modify client authentication, do one of the following:
      Note: The web console restarts when you change these settings.
      • Click Enable Client Authentication again to disable client authentication.
      • Click Remove to delete the current certificate. (This deletion also disables client authentication.)
      • To enable OCSP or CRL-based certificate revocation checking, see Certificate Revocation Checking.

    Client authentication allows you to securely access Prism by exchanging a digital certificate. Prism validates that the certificate is signed by the trusted signing certificate of your organization.

    Client authentication ensures that the Nutanix cluster gets a valid certificate from the user. Normally, a one-way authentication process occurs where the server provides a certificate so the user can verify the authenticity of the server. When client authentication is enabled, this process becomes a two-way authentication where the server also verifies the authenticity of the user. A user must provide a valid certificate when accessing the console either by installing the certificate on the local machine, or by providing it through a smart card reader.

    Note: The CA must be the same for both the client chain certificate and the certificate on the local machine or smart card.
  3. To specify a service account that the Prism Central web console can use to log in to Active Directory and authenticate Common Access Card (CAC) users, select the Configure Service Account check box. Then do the following in the indicated fields:
    1. Directory : Select the authentication directory that contains the CAC users that you want to authenticate.
      This list includes the directories that are configured on the Directory List tab.
    2. Service Username : Enter the user name, in username@domain.com format, that you want the web console to use to log on to Active Directory.
    3. Service Password : Enter the password for the service user name.
    4. Click Enable CAC Authentication .
      Note: CAC authentication applies to federal customers only.
      Note: The Prism Central console restarts after you change this setting.

    The Common Access Card (CAC) is a smart card about the size of a credit card, which some organizations use to access their systems. After you insert the CAC into the CAC reader connected to your system, the software in the reader prompts you to enter a PIN. After you enter a valid PIN, the software extracts your personal certificate that represents you and forwards the certificate to the server using the HTTP protocol.

    Nutanix Prism verifies the certificate as follows:

    • Validates that the certificate has been signed by the trusted signing certificate of your organization.
    • Extracts the Electronic Data Interchange Personal Identifier (EDIPI) from the certificate and uses the EDIPI to check the validity of an account within the Active Directory. The security context from the EDIPI is used for your Prism session.
    • Prism Central supports both certificate authentication and basic authentication: certificate authentication handles Prism Central login with a certificate, while basic authentication remains available for the REST API, which cannot use CAC certificates. If a certificate is present during Prism Central login, certificate authentication is used; if not, basic authentication is enforced.
    If you map a Prism Central role to a CAC user and not to an Active Directory group or organizational unit to which the user belongs, specify the EDIPI (User Principal Name, or UPN) of that user in the role mapping. A user who presents a CAC with a valid certificate is mapped to a role and taken directly to the web console home page. The web console login page is not displayed.
    Note: If you have logged on to Prism Central by using CAC authentication, to successfully log out of Prism Central, close the browser after you click Log Out .
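The two requirements in this procedure, that uploaded certificate files be PEM encoded and that the client chain certificate and the user certificate share the same CA, can be rehearsed locally with openssl before uploading anything. The following is a self-contained sketch; all file names and subjects are illustrative.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

# Toy CA: private key plus a self-signed CA certificate.
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.pem \
  -subj "/CN=Example Org CA" -days 1

# Client key and certificate signing request for a CAC-style user.
openssl req -newkey rsa:2048 -nodes -keyout client.key -out client.csr \
  -subj "/CN=user@example.com"

# The CA signs the client certificate.
openssl x509 -req -in client.csr -CA ca.pem -CAkey ca.key \
  -CAcreateserial -out client.pem -days 1

# Uploaded certificate files must be PEM encoded; this parses the file
# and prints its subject only if it is valid PEM.
openssl x509 -in client.pem -noout -subject

# The user certificate must chain to the same CA as the uploaded
# client chain certificate; openssl verify performs that check.
openssl verify -CAfile ca.pem client.pem
```

Running the same `openssl x509` and `openssl verify` commands against your real chain and user certificates is a quick way to catch DER-encoded files or a CA mismatch before the web console restarts on upload.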

Updating ADFS When Using SAML Authentication

With Nutanix IAM, to maintain compatibility with new and existing IDP/SAML authentication configurations, update your Active Directory Federated Services (ADFS) configuration, specifically the Prism Central Relying Party Trust settings. In these configurations, SAML is the open standard for exchanging authentication and authorization data between ADFS as the identity provider (IDP) and Prism Central as the service provider. See the Microsoft Active Directory Federation Services documentation for details.

About this task

In your ADFS Server configuration, update the Prism Central Relying Party Trust settings by creating claim rules to send the selected LDAP attribute as the SAML NameID in email address format. For example, map the User Principal Name to NameID in the SAML assertion claims.

As an example, this topic uses UPN as the LDAP attribute to map. You could also map the email address attribute to NameID. See the Microsoft Active Directory Federation Services documentation for details about creating a claims aware Relying Party Trust and claims rules.

Procedure

  1. In the Relying Party Trust for Prism Central, configure a claims issuance policy with two rules.
    1. One rule based on the Send LDAP Attributes as Claims template.
    2. One rule based on the Transform an Incoming Claim template.
  2. For the rule using the Send LDAP Attributes as Claims template, select the LDAP Attribute as User-Principal-Name and set Outgoing Claim Type to UPN .
    For User group configuration using the Send LDAP Attributes as Claims template, select the LDAP Attribute as Token-Groups - Unqualified-Names and set Outgoing Claim Type to Group .
  3. For the rule using the Transform an Incoming Claim template:
    1. Set Incoming claim type to UPN .
    2. Set the Outgoing claim type to Name ID .
    3. Set the Outgoing name ID format to Email .
    4. Select Pass through all claim values .
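For reference, the two rules that the wizard templates generate are roughly equivalent to the following ADFS claim rule language (the claim type URIs shown are the standard ADFS values; the optional group rule is omitted for brevity, and your generated rules may differ in formatting):

```
c:[Type == "http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname", Issuer == "AD AUTHORITY"]
 => issue(store = "Active Directory",
          types = ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn"),
          query = ";userPrincipalName;{0}", param = c.Value);

c:[Type == "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn"]
 => issue(Type = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier",
          Issuer = c.Issuer, OriginalIssuer = c.OriginalIssuer,
          Value = c.Value, ValueType = c.ValueType,
          Properties["http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier/format"]
            = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress");
```

The first rule queries Active Directory for the userPrincipalName attribute and issues it as a UPN claim; the second transforms that UPN claim into the NameID claim in email address format that Prism Central expects.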

Adding a SAML-based Identity Provider

Before you begin

  • An identity provider (typically a server or other computer) is the system that provides authentication through a SAML request. There are various implementations that can provide authentication services in line with the SAML standard.
  • You can specify other tested standard-compliant IDPs in addition to ADFS. See the Prism Central release notes topic Identity and Access Management Software Support for specific support requirements and also Security Management Using Identity and Access Management (Prism Central).

    IAM allows only one identity provider at a time, so if you already configured one, the + New IDP link does not appear.

  • You must configure the identity provider to return the NameID attribute in SAML response. Prism Central uses the NameID attribute for role mapping.

Procedure

  1. In the web console, click the gear icon in the main menu and then select Authentication in the Settings page.
  2. To add a SAML-based identity provider, click the + New IDP link.

    A set of fields is displayed. Do the following in the indicated fields:

    1. Configuration name : Enter a name for the identity provider. This name appears in the logon authentication screen.
    2. Group Attribute Name (Optional) : Enter the group attribute name, such as groups . Ensure that this name matches the group attribute name provided in the IDP configuration.
    3. Group Attribute Delimiter (Optional) : Enter the delimiter to use when multiple groups are specified in the group attribute.
    4. Import Metadata : Click this option to upload a metadata file that contains the identity provider information.

      Identity providers typically publish an XML metadata file on their website, which you can download from that site and then upload to Prism Central. Click + Import Metadata to open a search window on your local system, select the XML file that you downloaded, and then click the Save button to save the configuration.

      Figure. Identity Provider Fields (metadata configuration)

    This step completes configuring an identity provider in Prism Central, but you must also configure the callback URL for Prism Central on the identity provider. To configure the callback URL, click the Download Metadata link just below the Identity Providers table to download an XML file that describes Prism Central and then upload this metadata file to the identity provider.
  3. To edit an identity provider entry, click the pencil icon for that entry.

    After clicking the pencil icon, the relevant fields reappear. Enter the new information in the appropriate fields and then click the Save button.

  4. To delete an identity provider entry, click the X icon for that entry.

    After clicking the X icon, a window prompt appears to verify the delete action; click the OK button. The entry is removed from the list.
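Before uploading an IDP metadata file in step 2, it can be worth a quick sanity check that the file actually names the IDP and at least one single sign-on endpoint. The following sketch fabricates a minimal stand-in metadata file and then greps it the way you would grep the real download; the entityID and Location values are illustrative.

```shell
# idp-metadata.xml is a minimal stand-in for the file your IDP publishes.
cat > idp-metadata.xml <<'EOF'
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
                  entityID="https://idp.example.com/adfs/services/trust">
  <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://idp.example.com/adfs/ls/"/>
  </IDPSSODescriptor>
</EntityDescriptor>
EOF

# The file should name the IDP (entityID) and at least one SSO endpoint.
grep -o 'entityID="[^"]*"' idp-metadata.xml
grep -o 'Location="[^"]*"' idp-metadata.xml
```

If either grep returns nothing on the real file, re-download the metadata from the IDP rather than uploading a truncated or wrong file to Prism Central.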

Restoring Identity and Access Management Configuration Settings

Prism Central regularly backs up the Identity and Access Management (IAM) database, typically every 15 minutes. This procedure describes how to restore a specific time-stamped IAM backup instance.

About this task

You can restore the authentication and authorization configuration settings available in the IAM database to a previous state. When you run the shell script in this procedure, choose one of the available time-stamped backup instances; your authentication and authorization configuration is then restored to the settings captured in that point-in-time backup.

Procedure

  1. Log in to the Prism Central VM through an SSH session as the nutanix user.
  2. Run the restore shell script restore_iamv2.sh
    nutanix@pcvm$ sh /home/nutanix/cluster/bin/restore_iamv2.sh
    The script displays a numbered list of available backups, including the backup file time-stamp.
    Enter the Backup No. from the backup list (default is 1):
  3. Select a backup by number to start the restore process.
    The script displays a series of messages indicating restore progress, similar to:
    You Selected the Backup No 1
    Stopping the IAM services
    Waiting to stop all the IAM services and to start the restore process
    Restore Process Started
    Restore Process Completed
    ...
    Restarting the IAM services
    IAM Services Restarted Successfully

    After the script runs successfully, the command shell prompt returns and your IAM configuration is restored.

  4. To validate that your settings have been restored, log on to the Prism Central web console and go to Settings > Authentication and check the settings.

Accessing a List of Open Source Software Running on a Cluster

As an admin user, you can access a text file that lists all of the open source software running on a cluster.

About this task

Perform the following procedure to access a list of the open source software running on a cluster.

Procedure

  1. Log on to any Controller VM in the cluster as the admin user by using SSH.
  2. Access the text file by using the following command.
    less /usr/local/nutanix/license/blackduck_version_license.txt