vSphere reports the following error while attempting to format a VMFS datastore using a Pure Storage iSCSI LUN:
"HostDatastoreSystem.CreateVmfsDatastore" for object "<...>" on vCenter Server "<...>" failed
The LUN will report as online and available under the "Storage Adapters" section in the vSphere Client.
This error can be due to improper configuration in the network path causing jumbo frames to be fragmented from the ESXi Host to the FlashArray.
How to confirm Jumbo Frames can pass through the network
Run the following command from the ESXi Host in question via SSH:
vmkping -d -s 8972 <target portal ipaddress>
If no response is received, or the following message is returned, then jumbo frames are not successfully traversing the network:
sendto() failed (Message too long)
sendto() failed (Message too long)
sendto() failed (Message too long)
There is an L2 device between the ESXi host and FlashArray that is not allowing jumbo frames to properly pass. Please have the customer check virtual and physical switches on the subnet to ensure jumbo frames are configured from end-to-end.
Make sure all network devices allow jumbo frames to pass from the ESXi host to the Pure Storage FlashArray.
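The 8972-byte payload in the vmkping command above follows from a 9000-byte MTU: an IPv4 header without options takes 20 bytes and the ICMP echo header takes 8. The sketch below only shows that arithmetic. The function name is illustrative, and it assumes IPv4 with no IP options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption: no options in use)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_icmp_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError("MTU too small for an ICMP echo request")
    return mtu - overhead


if __name__ == "__main__":
    # Print the payload size to pass to `vmkping -d -s` for common MTUs.
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: vmkping -d -s {max_icmp_payload(mtu)} <target portal ipaddress>")
```

If the same test succeeds with the 1500-byte MTU payload but fails with the 9000-byte one, that points to a device in the path that is not configured for jumbo frames.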
Enabling CHAP authentication causes ESXi hosts to disconnect, and they are unable to reconnect.
The array has CHAP authentication enabled, and the ESXi host is unable to reconnect after CHAP is configured on the host.
Purity does not support Dynamic Discovery with CHAP.
Follow this blog post for a more detailed guide.
Configure the ESXi host to use static CHAP, and confirm that Dynamic CHAP is not set up and that "Inherit from parent" is not checked.
Two methods of configuring CHAP to the pure array:
When trying to increase an existing volume you will see the new volume size in vCenter. However, when the increase button is clicked for the datastore, the available capacity on the volume does not show up.
If the correct volume size is not being reflected, the management services on the ESX hosts that aren't reflecting the volume size correctly may need to be restarted and storage rescanned.
Please refer to VMware support for this recommendation as restarting management services has the potential to impact tasks that are running on the ESXi host at the time of the restart.
There are a few reasons this may be happening. You may have one of the following issues; please refer to the following:
If none of these references fix the issue, please open a VMware Support case and reach out to Pure Storage Support for further assistance.
The Pure Storage FlashArray is consistently used with VMware environments and there's a good chance that Support will have cases where they need to troubleshoot and diagnose how the FlashArray interacts with the VMware environment.
During live troubleshooting, both the customer and Pure Support can look at the logs live as needed. However, to investigate events that have already occurred, VMware support logs must be provided to Support to move the investigation forward.
VMware has a detailed KB on how to Gather vCenter and ESXi Logs. Please review VMware's documentation.
Since the vCenter 6.7 release, adoption of the HTML5 Client has increased. There is no export option in the Monitor tab of that client, so the process is a little different.
Here is an example of the HTML5 Client.
Right-click the vCenter and select Export System Logs.
Check the box to include vCenter logs. If the Support case is related to vVols, SRM or Plugins, this is very important to gather.
Then you can export the logs. The default selections for the hosts are usually enough for Pure Support.
Keep in mind that the HTML5 log exports will be named a little differently from those produced by the Flash client.
Then here is an example of the Flash client and the monitor tab.
Navigate to the vCenter that the logs are being gathered for.
Go to the Monitor tab and click on system logs. Then click on export system logs.
Check the box to include vCenter logs. If the Support case is related to vVols, SRM or Plugins, this is very important to gather.
Then click Finish. Another window will pop up asking where to save the compressed file.
Once the logs are downloaded, they can be uploaded via Pure1 for Support.
There is an Uploading Files to a Support Case KB that outlines that process.
The SRA's logs are located in:
/var/log/vmware/srm/SRAs/sha256{characterSpamHere}
This directory should have the logs referenced below.
The SRM's logs are located in:
/var/log/vmware/srm/
The SRA's logs are located in:
%PROGRAMDATA%\VMware\VMware vCenter Site Recovery Manager\Logs\SRAs\purestorage
Each invocation of the SRA produces one log file. Sort by Date Modified to see the commands executed in chronological order.
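Sorting by Date Modified can also be done in a script when many invocations have accumulated. The following is a minimal sketch, assuming the log directory is readable from wherever the script runs. The function name is illustrative and not part of the SRA.

```python
from pathlib import Path


def logs_in_chronological_order(log_dir):
    """Return the files in log_dir sorted by modification time, oldest first."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)
```

Because each SRA invocation writes its own file, the resulting order is the order in which SRM issued the commands.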
The SRM's logs are located in:
%PROGRAMDATA%\VMware\VMware vCenter Site Recovery Manager\Logs
Look for vmware-dr-##.log files. The file with the largest ## is the most recent. The SRM logs are useful for diagnosing problems when (a) The SRA responded correctly, but SRM still failed an operation, and (b) The SRA crashed on launch before being able to log anything. Before collecting SRM logs, be sure to quit SRM and wait a few seconds for all the logs to be flushed to disk.
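When picking the most recent vmware-dr-##.log file, compare the ## part as a number rather than as text, otherwise vmware-dr-9.log would sort after vmware-dr-10.log. A minimal sketch of that selection follows. The helper name is illustrative.

```python
import re

# Matches names such as vmware-dr-12.log and captures the number.
_DR_LOG = re.compile(r"^vmware-dr-(\d+)\.log$")


def newest_dr_log(filenames):
    """Return the vmware-dr-##.log name with the largest ##, or None if there is none."""
    numbered = []
    for name in filenames:
        match = _DR_LOG.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    if not numbered:
        return None
    # Tuples compare by their first element, so max() picks the largest number.
    return max(numbered)[1]
```

For example, `newest_dr_log(os.listdir(log_dir))` would return the file to open first, where `log_dir` is the SRM log directory given above.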
The SRA installer's logs are located in:
%PROGRAMDATA%\PureStorage\PureSRAInstaller.log
To verify that the SRA has been installed correctly, do the following:
Look up HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\VMware vCenter Site Recovery Manager\InstallPath in the Registry. It is usually C:\Program Files\VMware\VMware vCenter Site Recovery Manager\bin.
command.pl PureSRA.exe PureSRA.pdb PureSRA.exe.config PureStorage.Rest.dll PureStorage.Rest.pdb Newtonsoft.Json
%windir%\Microsoft.NET\Framework\v3.5
The SRA log starts by logging the SRA version and information about the environment it runs in. The SRA expects to run as an admin and as a 64-bit process. It will not work correctly otherwise. Look for entries like the following to confirm this is the case:
[02/12/2015 11:03:30,Logging session for discoverArrays,V] Process is 64-Bit.
[02/12/2015 11:03:30,Logging session for discoverArrays,V] Running as the administrator.
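This check can be scripted when reviewing many log files. The sketch below simply searches the log text for the two phrases shown in the example entries. It assumes the wording is the same in the SRA version being examined, and the function name is illustrative.

```python
def check_sra_environment(log_text):
    """Report whether the log contains the 64-bit and administrator confirmations."""
    return {
        "is_64_bit": "Process is 64-Bit." in log_text,
        "is_administrator": "Running as the administrator." in log_text,
    }
```

If either value is False, read the top of the log by hand before drawing conclusions, since a different SRA version may phrase these entries differently.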
The SRA logs the input from SRM. Look for "Received input:" followed by an XML string near the top of the log file. In the XML string, there are usually two Connection nodes, with the ids "localArray" and "peerArray". They are the pair of arrays the operation is being applied to. This information is entered by the user when configuring the SRA from inside SRM. Make sure they are actually the arrays you are using for SRM. Example:
<Connection id="localArray"> <Addresses> <Address id="spA">vss-purity-vm1.dev.purestorage.com</Address> </Addresses> ... </Connection>
<Connection id="peerArray"> <Addresses> <Address id="spA">vss-purity-vm2.dev.purestorage.com</Address> </Addresses> ... </Connection>
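To compare the configured arrays quickly, the Connection nodes can be pulled out of the logged XML. The sketch below is an illustration only: it assumes the XML after "Received input:" has already been copied into a string, and it matches elements by local name because the examples above do not show the namespace or the enclosing elements.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def connection_addresses(xml_text):
    """Map each Connection id (e.g. localArray, peerArray) to its list of addresses."""
    root = ET.fromstring(xml_text)
    result = {}
    for elem in root.iter():
        if local_name(elem.tag) != "Connection":
            continue
        result[elem.get("id")] = [
            addr.text.strip()
            for addr in elem.iter()
            if local_name(addr.tag) == "Address" and addr.text
        ]
    return result
```

The returned mapping can then be checked against the management addresses of the FlashArrays that are expected to be paired in SRM.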
Each entry in the SRA log is associated with a verbosity level. Search for ",E]" and ",W]" in the log file to see the logged Errors and Warnings, respectively. They are usually indicative of what went wrong. For example, the entries below indicate that the SRA could not connect to an array:
[02/11/2015 09:40:56,SRMCommandHandlerBase.cs:ConnectToInputArrays,W] Connection failed to FlashArray at 10.66.50.90 using connection localArray
[02/11/2015 09:40:56,SRMCommandHandlerBase.cs:ConnectToInputArrays,E] "PureRestException: HttpStatusCode = 'BadRequest', RestErrorCode = 'InvalidVersion', Details = '', InnerException = ''"
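The search for errors and warnings can be automated. The sketch below assumes each entry starts on its own line in the form [timestamp,source,LEVEL] message, as in the examples above. The regular expression and function name are illustrative, and continuation lines of multi-line messages are ignored.

```python
import re

# [timestamp,source,LEVEL] message  (LEVEL is a single letter such as V, W or E)
ENTRY = re.compile(
    r"^\[(?P<time>[^,\]]+),(?P<source>[^\]]*),(?P<level>[A-Z])\]\s*(?P<message>.*)$"
)


def errors_and_warnings(log_text):
    """Return (level, source, message) for every E or W entry, in log order."""
    hits = []
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group("level") in ("E", "W"):
            hits.append((match.group("level"), match.group("source"), match.group("message")))
    return hits
```

Reading the warnings together with the error that follows them usually gives the clearest picture, as in the connection failure shown above.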
If the entire operation failed, the SRA will output an error. Look for "Setting output:" followed by some XML string. You should see an Error node with an error ID, such as:
<Error code="1004">
  <purestorage:PureExceptionMessage>...</purestorage:PureExceptionMessage>
  <purestorage:LogFile>...</purestorage:LogFile>
</Error>
The meanings of the error codes are listed below. For example, 1004 stands for "array unreachable".
WarningSyncInProgress = 500, // Defined by VMware
ErrorUnhandledException = 1001,
ErrorUnknownCommand = 1002,
ErrorPureException = 1003,
ErrorArrayUnreachable = 1004,
ErrorArrayUnauthorized = 1005,
ErrorArrayIdNotAvailable = 1006,
ErrorArrayIDMissing = 1007,
ErrorBadArrayPair = 1008,
ErrorVolumeNotInPGroup = 1009,
ErrorCannotFindSyncStatus = 1010,
ErrorCannotFindSnapshot = 1011,
ErrorCannotFindVolume = 1012,
ErrorTestFailoverStartInProgress = 1013,
ErrorVolumeConnectionFailed = 1014,
WarningCannotFindPgroup = 1015,
ErrorCannotCreatePgroup = 1016,
WarningDeviceAlreadyFailedOver = 1017,
WarningPrepareFailoverInProgress = 1018,
ErrorArrayInsufficientPermissions = 1019,
WarningHostConnectionFailed = 1020,
ErrorCannotCreateVolume = 1021,
ErrorCannotDisconnectVolume = 1022,
ErrorCannotRenameVolume = 1023,
ErrorVolumeNotDisconnected = 1024,
ErrorCannotDeleteVolume = 1025,
ErrorCannotSnapshotPGroup = 1026,
ErrorEmptyOrMissingDeviceID = 1027,
WarningAlreadyPerformedPrepareRestoreReplication = 1028,
WarningAlreadyPerformedRestoreReplication = 1029,
WarningAlreadyPerformedPrepareReverseReplication = 1030
The "PureExceptionMessage" portion of the error will contain more specific information about why the operation failed. Examples are under the subheadings for specific errors below.
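For triage scripts, a small lookup can turn a numeric code into its name and severity. The sketch below deliberately carries only a handful of the codes listed above, and it treats the "Warning" and "Error" name prefixes as the severity, which is an assumption drawn from the naming rather than documented SRA behavior.

```python
# Partial table copied from the code list above; extend it from that list as needed.
SRA_CODES = {
    500: "WarningSyncInProgress",
    1001: "ErrorUnhandledException",
    1003: "ErrorPureException",
    1004: "ErrorArrayUnreachable",
    1005: "ErrorArrayUnauthorized",
    1009: "ErrorVolumeNotInPGroup",
    1015: "WarningCannotFindPgroup",
    1020: "WarningHostConnectionFailed",
}


def describe_code(code):
    """Return 'code: name (severity)' for a known code, or a placeholder otherwise."""
    name = SRA_CODES.get(code)
    if name is None:
        return f"{code}: not in this partial table"
    # Assumption: the name prefix reflects the severity.
    severity = "warning" if name.startswith("Warning") else "error"
    return f"{code}: {name} ({severity})"
```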
Additionally, the SRA logs the HTTP requests (URL only) and their response codes. Look for "Rest Library transcript:" in the log. In the HTTP transcript that follows, look for "PureStorage.Rest Error:". Note that many HTTP errors are benign and expected (e.g. we test for the existence of a volume by asking the array about it; if the array responds with a does-not-exist error, we know it doesn't exist), so view errors in the HTTP transcript in the context of other clues in the log.
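A quick way to collect the candidate lines is to scan everything after the "Rest Library transcript:" marker for "PureStorage.Rest Error:". The sketch below relies only on those two phrases. The exact layout of transcript lines is not shown in this article, so each matching line is returned unparsed.

```python
def rest_errors(log_text):
    """Return lines containing 'PureStorage.Rest Error:' that appear after the transcript marker."""
    in_transcript = False
    found = []
    for line in log_text.splitlines():
        if "Rest Library transcript:" in line:
            in_transcript = True
            continue
        if in_transcript and "PureStorage.Rest Error:" in line:
            found.append(line.strip())
    return found
```

Since many of these errors are benign, treat the result as a list of places to look rather than a list of failures.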
Example error:
[01/04/2018 11:54:01,DiscoverArrays.cs:ProcessCommand,V] Exiting Setting output:
<?xml version="1.0" encoding="utf-8"?>
<Response xmlns="http://www.vmware.com/srm/sra/v2" xmlns:purestorage="http://www.purestorage.com/sra">
  <Error code="1004">
    <purestorage:PureExceptionMessage>The remote server returned an error: (400) Bad Request. Message from Purity='ctx:,msg:invalid credentials'</purestorage:PureExceptionMessage>
    <purestorage:LogFile>C:\ProgramData\VMware\VMware vCenter Site Recovery Manager\Logs\SRAs\purestorage\discoverArrays_2018-01-04-11-53-58-443789-6607c4a5-ed27-4f4a-8d3b-46c8db303e85.log</purestorage:LogFile>
  </Error>
</Response>
This usually means the array managers are configured incorrectly. If it is a "one to many" or "many to one" configuration, ensure that the Pure FlashArray username and password are the same for each array.
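The error code and message can be extracted from the "Setting output:" XML programmatically. The sketch below uses the two namespaces shown in the example response and assumes the Response element is the last XML document in the log text. The function name and returned keys are illustrative.

```python
import re
import xml.etree.ElementTree as ET

# Namespaces as they appear in the example response above.
NS = {
    "sra": "http://www.vmware.com/srm/sra/v2",
    "purestorage": "http://www.purestorage.com/sra",
}


def extract_error(log_text):
    """Return the error code, message and log file from the SRA output, or None."""
    marker = "Setting output:"
    start = log_text.find(marker)
    if start == -1:
        return None
    xml_text = log_text[start + len(marker):]
    end = xml_text.find("</Response>")
    if end == -1:
        return None
    xml_text = xml_text[: end + len("</Response>")].strip()
    # Drop the XML declaration so the parser only sees the Response element.
    xml_text = re.sub(r"^<\?xml[^>]*\?>\s*", "", xml_text)
    error = ET.fromstring(xml_text).find("sra:Error", NS)
    if error is None:
        return None

    def text_of(tag):
        node = error.find(f"purestorage:{tag}", NS)
        return node.text.strip() if node is not None and node.text else None

    return {
        "code": int(error.get("code")),
        "message": text_of("PureExceptionMessage"),
        "log_file": text_of("LogFile"),
    }
```

For the example above, the extracted message would contain "invalid credentials", which is the detail that points to the array manager configuration.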
If the SRA does not receive a response from the array within 60 seconds for a REST call, it times out (indicated by wording about "timed out" in the logs).
To change the HTTP timeout value from the default value of 60 seconds, do the following (on all machines where the SRA is installed):
Create the registry key HKEY_LOCAL_MACHINE\SOFTWARE\PureStorage\SRA (or navigate to it if it already exists).
The change will take effect the next time the SRA is invoked.
By default, the SRA prioritizes host group connections when asked by SRM to connect to hosts (e.g. if a hostgroup HG contains a host H, the SRA will connect to HG when asked to connect to H). Most users should use this behavior.
However, if the user wishes to disable this behavior (i.e. only connect to hosts on failover), they can add a DWORD value (named "DisableHostGroupConnectionOnFailover") under the registry key HKEY_LOCAL_MACHINE\SOFTWARE\PureStorage\SRA, and set its value to 1.
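When the same setting has to be applied on several machines, one option is to generate a .reg file and import it on each one. The sketch below only builds the file text in the standard Windows Registry Editor format. The helper name is illustrative, and importing the file (for example by double-clicking it as an administrator) remains a manual step.

```python
def dword_reg_entry(key, name, value):
    """Build .reg file text that sets one DWORD value under the given registry key."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("DWORD values must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{key}]\r\n"
        f'"{name}"=dword:{value:08x}\r\n'
    )


if __name__ == "__main__":
    # Key and value name are the ones given in the text above.
    print(
        dword_reg_entry(
            r"HKEY_LOCAL_MACHINE\SOFTWARE\PureStorage\SRA",
            "DisableHostGroupConnectionOnFailover",
            1,
        )
    )
```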
One useful tool for debugging problems is Fiddler. You will need to follow Decrypting HTTPS-protected traffic to set up debugging for HTTPS traffic. Repeat the failed SRM operation with Fiddler running to see the HTTP traffic. You can save the Fiddler trace to a file and give it to the dev team for further debugging.
Another noteworthy reminder is that the user needs to rescan for the SRA after upgrading it (it's covered in the manual accompanying the SRA), or they may see errors.
Additionally, it may be useful to enable a higher level of logging if the existing logs do not provide sufficient information. Note that more detailed logging can fill up the volume the logs are written to, so enable it only for short periods while the customer reproduces the issue, and disable it immediately afterward. More details can be found in this VMware KB.