Morning! Welcome to virtualcloudblog.com and thanks for checking it out. Today's post is about the vSAN health commands. They are really useful when vSAN is having issues, or simply to check the current vSAN status.
First, I recommend reading the RVC post, which explains what RVC is and how to connect to it.
vSAN Health commands
vsan.health.health_summary <clusterID> or <cluster_name>
Provides an overall vSAN health report. This is really useful for checking what is wrong and what should be remediated.
/localhost/Prod-vSAN/computers> vsan.health.health_summary 1
Overall health: red (Physical disk issue)
+------------------------------------------------------+---------+
| Health check                                         | Result  |
+------------------------------------------------------+---------+
| Data                                                 | Error   |
|   vSAN object health                                 | Error   |
+------------------------------------------------------+---------+
| Limits                                               | Error   |
|   Current cluster situation                          | Warning |
|   After 1 additional host failure                    | Error   |
|   Host component limit                               | Passed  |
+------------------------------------------------------+---------+
| Physical disk                                        | Warning |
|   Overall disks health                               | Warning |
|   Metadata health                                    | Passed  |
|   Disk capacity                                      | Warning |
|   Software state health                              | Passed  |
|   Congestion                                         | Warning |
|   Component limit health                             | Passed  |
|   Component metadata health                          | Passed  |
|   Memory pools (heaps)                               | Passed  |
|   Memory pools (slabs)                               | Passed  |
+------------------------------------------------------+---------+
| Cluster                                              | Warning |
|   ESXi vSAN Health service installation              | Passed  |
|   vSAN Health Service up-to-date                     | Passed  |
|   Advanced vSAN configuration in sync                | Passed  |
|   vSAN CLOMD liveness                                | Passed  |
|   vSAN Disk Balance                                  | Warning |
|   Resync operations throttling                       | Passed  |
|   vCenter state is authoritative                     | Passed  |
|   vSAN cluster configuration consistency             | Passed  |
|   Time is synchronized across hosts and VC           | Passed  |
|   vSphere cluster members match vSAN cluster members | Passed  |
|   Software version compatibility                     | Passed  |
|   Disk format version                                | Passed  |
+------------------------------------------------------+---------+
| Hardware compatibility                               | Warning |
|   vSAN HCL DB up-to-date                             | Warning |
|   vSAN HCL DB Auto Update                            | skipped |
|   SCSI controller is VMware certified                | Passed  |
|   Controller is VMware certified for ESXi release    | Passed  |
|   Controller driver is VMware certified              | Passed  |
|   Controller firmware is VMware certified            | Passed  |
|   Controller disk group mode is VMware certified     | Passed  |
+------------------------------------------------------+---------+
| Network                                              | Passed  |
|   Hosts disconnected from VC                         | Passed  |
|   Hosts with connectivity issues                     | Passed  |
|   vSAN cluster partition                             | Passed  |
|   All hosts have a vSAN vmknic configured            | Passed  |
|   All hosts have matching subnets                    | Passed  |
|   vSAN: Basic (unicast) connectivity check           | Passed  |
|   vSAN: MTU check (ping with large packet size)      | Passed  |
|   vMotion: Basic (unicast) connectivity check        | Passed  |
|   vMotion: MTU check (ping with large packet size)   | Passed  |
|   Network latency check                              | Passed  |
+------------------------------------------------------+---------+
| Performance service                                  | Passed  |
|   Stats DB object                                    | Passed  |
|   Stats master election                              | Passed  |
|   Performance data collection                        | Passed  |
|   All hosts contributing stats                       | Passed  |
|   Stats DB object conflicts                          | Passed  |
+------------------------------------------------------+---------+
| vSAN Build Recommendation                            | Passed  |
|   vSAN Build Recommendation Engine Health            | skipped |
|   vSAN build recommendation                          | Passed  |
+------------------------------------------------------+---------+
| Online health                                        | skipped |
|   Customer experience improvement program (CEIP)     | skipped |
+------------------------------------------------------+---------+
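If you save that report to a text file, a few lines of Python can pull out just the checks that need attention. This is only a sketch on my side, not part of RVC: the function name `failing_checks` is made up, and it assumes the report keeps the two-column `| Health check | Result |` layout shown above.

```python
def failing_checks(summary_text):
    """Return (check, result) pairs whose result is neither Passed nor skipped.

    Assumes the two-column table layout printed by the health summary.
    """
    rows = []
    for line in summary_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border lines start with "+", other text is ignored
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Health check":
            continue  # not a data row, or the table header
        name, result = cells
        if result.lower() not in ("passed", "skipped"):
            rows.append((name, result))
    return rows


# A few rows copied from the report above.
sample = """\
+------------------------------------------------------+---------+
| Health check                                         | Result  |
+------------------------------------------------------+---------+
| Data                                                 | Error   |
|   vSAN object health                                 | Error   |
|   Host component limit                               | Passed  |
|   vSAN HCL DB Auto Update                            | skipped |
|   vSAN Disk Balance                                  | Warning |
+------------------------------------------------------+---------+
"""

for name, result in failing_checks(sample):
    print(f"{result:8} {name}")
```

Skipped checks are treated as fine here, which is a choice of this example; change the tuple if you want them listed too.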
If there are inaccessible objects, the same command reports them. Note: if a vSAN resync is ongoing, I would bet the object becomes accessible again as soon as the resync finishes.
Kindly see these two posts for more on the following:
| inaccessible | 1 | dda6a259-1001-a6f1-816b-246e9631f334 |
It also shows the global vSAN limits, both for the current cluster and for the case where one vSAN host is down. This is really useful for capacity planning.
Limits - Current cluster situation: yellow
+-------------------------+----------------------------+---------+
| Resource                | Utilization                | Health  |
+-------------------------+----------------------------+---------+
| Component utilization   | 4% (3210 of 72000)         | Passed  |
| Disk space utilization  | 82% (218446GB of 263243GB) | Warning |
| Read Cache reservations | 0% (0.0GB of 16549.6GB)    | Passed  |
+-------------------------+----------------------------+---------+

Limits - After 1 additional host failure: red
+-------------------------+----------------------------+--------+
| Resource                | Utilization                | Health |
+-------------------------+----------------------------+--------+
| Component utilization   | 5% (3210 of 63000)         | Passed |
| Disk space utilization  | 95% (218446GB of 229708GB) | Error  |
| Read Cache reservations | 0% (0.0GB of 14480.9GB)    | Passed |
+-------------------------+----------------------------+--------+
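To make the capacity planning angle concrete, here is a small Python sketch that redoes the arithmetic with the figures from the report above. The helper name is hypothetical, and rounding down to a whole percent is an assumption that happens to match the numbers shown.

```python
def utilization_percent(used, total):
    """Return utilization as a whole percent, rounded down (assumed rounding)."""
    return used * 100 // total


# Figures copied from the Limits report above, in GB.
used_gb = 218446
capacity_now_gb = 263243
capacity_after_failure_gb = 229708

now_pct = utilization_percent(used_gb, capacity_now_gb)
after_pct = utilization_percent(used_gb, capacity_after_failure_gb)

# Capacity that disappears with the host, and what is left free afterwards.
lost_gb = capacity_now_gb - capacity_after_failure_gb
free_after_failure_gb = capacity_after_failure_gb - used_gb

print(f"Disk space used now:            {now_pct}%")
print(f"Disk space used after failure:  {after_pct}%")
print(f"Capacity lost with one host:    {lost_gb} GB")
print(f"Free space left after failure:  {free_after_failure_gb} GB")
```

The used space stays the same in both tables; only the total shrinks, which is why a cluster sitting at 82% today jumps to 95% once a host is gone.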
The overall disk health for all disks in the vSAN cluster. Please note this is an extract.
Physical disk - Overall disks health: yellow
+-----------------------+----------------------------------------------------------+---------+----------+-------------+-----------------------------------------------------------+----------------+--------------------------------------+
| Host                  | Disk                                                     | Overall | Metadata | Operational | Operational State                                         | Recommendation | UUID                                 |
+-----------------------+----------------------------------------------------------+---------+----------+-------------+-----------------------------------------------------------+----------------+--------------------------------------+
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61aa3b) | Warning | Passed   | Passed      |                                                           |                | 52da4df4-e09e-8f71-0b20-019a5913df79 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61a83b) | Warning | Passed   | Passed      |                                                           |                | 521aac7e-f4a1-0a8e-f31f-e7a373dd1e66 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf6196ef) | Warning | Passed   | Passed      |                                                           |                | 5290b671-ff00-978f-bebc-0f31347ebc61 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61a9f3) | Warning | Passed   | Passed      | Impending permanent disk failure, data is being evacuated |                | 52401c0a-d8fc-835b-bc7d-51248f7f5ea2 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf62088f) | Warning | Passed   | Passed      |                                                           |                | 52a0e869-f473-cf4c-8c3e-7529ea1ace3f |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61cce7) | Warning | Passed   | Passed      |                                                           |                | 526cbfcd-b007-34e6-4abd-afa94dbaff56 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf618293) | Warning | Passed   | Passed      |                                                           |                | 5229166c-bdb4-f308-4196-bd8cf905ae9f |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61d94f) | Warning | Passed   | Passed      |                                                           |                | 52e48ccc-e11c-06de-59f8-70b67fbb5617 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61c347) | Warning | Passed   | Passed      |                                                           |                | 522f96c1-1145-2dde-96b0-f68ab1a0fdb2 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf619e4b) | Warning | Passed   | Passed      |                                                           |                | 52a69b78-690c-d337-fbf5-bc7994517757 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61ac37) | Warning | Passed   | Passed      |                                                           |                | 521182a6-f5d8-d259-9c23-53a2b1a12b76 |
And the overall physical disk capacity:
Physical disk - Disk capacity: yellow
+-----------------------+----------------------------------------------------------+----------+------------------+-----------------------------------------+--------------------------------------+
| Host                  | Disk                                                     | Capacity | Free Space       | Rebalance State                         | UUID                                 |
+-----------------------+----------------------------------------------------------+----------+------------------+-----------------------------------------+--------------------------------------+
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61aa3b) | Warning  | 161.95 GB (9 %)  | Reactive rebalance task is in progress  | 52da4df4-e09e-8f71-0b20-019a5913df79 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61a83b) | Warning  | 200.93 GB (11 %) | Proactive rebalance is needed           | 521aac7e-f4a1-0a8e-f31f-e7a373dd1e66 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf6196ef) | Warning  | 142.81 GB (8 %)  | Reactive rebalance task is in progress  | 5290b671-ff00-978f-bebc-0f31347ebc61 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61a9f3) | Warning  | 139.98 GB (8 %)  | Reactive rebalance task is in progress  | 52401c0a-d8fc-835b-bc7d-51248f7f5ea2 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf62088f) | Warning  | 119.50 GB (7 %)  | Reactive rebalance task is in progress  | 52a0e869-f473-cf4c-8c3e-7529ea1ace3f |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61cce7) | Warning  | 125.01 GB (7 %)  | Reactive rebalance task is in progress  | 526cbfcd-b007-34e6-4abd-afa94dbaff56 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf618293) | Warning  | 119.77 GB (7 %)  | Reactive rebalance task is in progress  | 5229166c-bdb4-f308-4196-bd8cf905ae9f |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61d94f) | Warning  | 166.54 GB (9 %)  | Reactive rebalance task is in progress  | 52e48ccc-e11c-06de-59f8-70b67fbb5617 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf61c347) | Warning  | 196.55 GB (11 %) | Proactive rebalance is needed           | 522f96c1-1145-2dde-96b0-f68ab1a0fdb2 |
| vSANHost1.contoso.com | TOSHIBA Serial Attached SCSI Disk (naa.xxxxyzzzzf619e4b) | Warning  | 269.96 GB (16 %) | Proactive rebalance is needed           | 52a69b78-690c-d337-fbf5-bc7994517757 |
And the hardware compatibility view, showing how up to date the vSAN HCL database is.
Hardware compatibility - vSAN HCL DB up-to-date: yellow
+--------------------------------+---------------------+
| Entity                         | Time in UTC         |
+--------------------------------+---------------------+
| Current time                   | 2019-02-01 09:28:03 |
| Local HCL DB copy last updated | 2018-12-25 08:04:46 |
+--------------------------------+---------------------+

vSAN Build Recommendation - vSAN Build Recommendation Engine Health: skipped
+----------------------------------------------------------------------------------------------------------------------+
| Issue Information                                                                                                    |
+----------------------------------------------------------------------------------------------------------------------+
| Internet access is unavailable. Please restore Internet connectivity for vSAN to come up with build recommendations. |
+----------------------------------------------------------------------------------------------------------------------+
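The two timestamps tell you how old the local HCL DB copy is. Below is a minimal Python sketch of that subtraction, using the values from the output above. The function name is my own, and the timestamp format is assumed to stay as shown.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # format used in the report above


def hcl_db_age_days(current_utc, last_updated_utc):
    """Return the whole number of days between the two report timestamps."""
    current = datetime.strptime(current_utc, TIMESTAMP_FORMAT)
    last_updated = datetime.strptime(last_updated_utc, TIMESTAMP_FORMAT)
    return (current - last_updated).days


age = hcl_db_age_days("2019-02-01 09:28:03", "2018-12-25 08:04:46")
print(f"Local HCL DB copy is {age} days old")
```

In this report the copy is a bit over five weeks old, which goes hand in hand with the "Internet access is unavailable" message further down.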
vsan.health.multicast_speed_test <clusterID> or <cluster_name>
Performs a multicast speed test. One host is selected to send multicast traffic, and all other hosts attempt to receive the packets. The test is designed so that the sender sends more than most physical networks can handle, i.e. the physical network is expected to drop packets, which then won’t be received by the receivers.
Assuming a TCP speed test shows good performance, the most likely cause of a failed multicast speed test is a multicast bottleneck in the physical switches.
The key question this test tries to answer is: What bandwidth is the receiver able to get? For vSAN to work well, this number should be at least 20MB/s. Typical enterprise environments should be able to do 50MB/s or more.
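As a rough illustration of those two numbers, the sketch below rates a measured receiver bandwidth against the 20MB/s minimum and the 50MB/s typical figure. The function name and the three labels are made up for this example; the command itself does not print them.

```python
MINIMUM_MB_PER_S = 20  # below this, vSAN is not expected to work well
TYPICAL_MB_PER_S = 50  # what a typical enterprise network should reach


def rate_multicast_bandwidth(mb_per_s):
    """Classify a receiver bandwidth (MB/s) using the two thresholds above.

    The labels are illustrative only and are not RVC output.
    """
    if mb_per_s < MINIMUM_MB_PER_S:
        return "too low"
    if mb_per_s < TYPICAL_MB_PER_S:
        return "acceptable"
    return "good"


for measured in (12.5, 35.0, 80.0):
    print(f"{measured:5.1f} MB/s -> {rate_multicast_bandwidth(measured)}")
```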
vsan.health.silent_health_check_status <clusterID> or <cluster_name>
It lists every vSAN health check with its ID and whether it is currently silenced (Silent) or reported normally (Normal). The checks are the same ones covered by the first command explained previously.
/localhost/Prod-vSAN/computers> vsan.health.silent_health_check_status 1
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Health Check                                                    | Health Check Id                     | Silent Status |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Cloud Health                                                    |                                     |               |
|   Customer experience improvement program (CEIP)                | vsancloudhealthceipexception        | Silent        |
|   Online health connectivity                                    | vsancloudhealthconnectionexception  | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Cluster                                                         |                                     |               |
|   Advanced vSAN configuration in sync                           | advcfgsync                          | Normal        |
|   Deduplication and compression configuration consistency       | physdiskdedupconfig                 | Normal        |
|   Deduplication and compression usage health                    | physdiskdedupusage                  | Normal        |
|   Disk format version                                           | upgradelowerhosts                   | Normal        |
|   ESXi vSAN Health service installation                         | healtheaminstall                    | Normal        |
|   Resync operations throttling                                  | resynclimit                         | Normal        |
|   Software version compatibility                                | upgradesoftware                     | Normal        |
|   Time is synchronized across hosts and VC                      | timedrift                           | Normal        |
|   vCenter state is authoritative                                | vcauthoritative                     | Normal        |
|   vSAN CLOMD liveness                                           | clomdliveness                       | Normal        |
|   vSAN Disk Balance                                             | diskbalance                         | Normal        |
|   vSAN Health Service up-to-date                                | healthversion                       | Normal        |
|   vSAN cluster configuration consistency                        | consistentconfig                    | Normal        |
|   vSphere cluster members match vSAN cluster members            | clustermembership                   | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Data                                                            |                                     |               |
|   vSAN VM health                                                | vmhealth                            | Normal        |
|   vSAN object health                                            | objecthealth                        | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Encryption                                                      |                                     |               |
|   CPU AES-NI is enabled on hosts                                | hostcpuaesni                        | Normal        |
|   vCenter and all hosts are connected to Key Management Servers | kmsconnection                       | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Hardware compatibility                                          |                                     |               |
|   Controller disk group mode is VMware certified                | controllerdiskmode                  | Normal        |
|   Controller driver is VMware certified                         | controllerdriver                    | Normal        |
|   Controller firmware is VMware certified                       | controllerfirmware                  | Normal        |
|   Controller is VMware certified for ESXi release               | controllerreleasesupport            | Normal        |
|   Host issues retrieving hardware info                          | hclhostbadstate                     | Normal        |
|   SCSI controller is VMware certified                           | controlleronhcl                     | Normal        |
|   vSAN HCL DB Auto Update                                       | autohclupdate                       | Silent        |
|   vSAN HCL DB up-to-date                                        | hcldbuptodate                       | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Limits                                                          |                                     |               |
|   After 1 additional host failure                               | limit1hf                            | Normal        |
|   Current cluster situation                                     | limit0hf                            | Normal        |
|   Host component limit                                          | nodecomponentlimit                  | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Network                                                         |                                     |               |
|   Active multicast connectivity check                           | multicastdeepdive                   | Normal        |
|   All hosts have a vSAN vmknic configured                       | vsanvmknic                          | Normal        |
|   All hosts have matching multicast settings                    | multicastsettings                   | Normal        |
|   All hosts have matching subnets                               | matchingsubnet                      | Normal        |
|   Hosts disconnected from VC                                    | hostdisconnected                    | Normal        |
|   Hosts with connectivity issues                                | hostconnectivity                    | Normal        |
|   Multicast assessment based on other checks                    | multicastsuspected                  | Normal        |
|   Network latency check                                         | hostlatencycheck                    | Normal        |
|   vMotion: Basic (unicast) connectivity check                   | vmotionpingsmall                    | Normal        |
|   vMotion: MTU check (ping with large packet size)              | vmotionpinglarge                    | Normal        |
|   vSAN cluster partition                                        | clusterpartition                    | Normal        |
|   vSAN: Basic (unicast) connectivity check                      | smallping                           | Normal        |
|   vSAN: MTU check (ping with large packet size)                 | largeping                           | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Performance service                                             |                                     |               |
|   All hosts contributing stats                                  | hostsmissing                        | Normal        |
|   Performance data collection                                   | collection                          | Normal        |
|   Performance service status                                    | perfsvcstatus                       | Normal        |
|   Stats DB object                                               | statsdb                             | Normal        |
|   Stats DB object conflicts                                     | renameddirs                         | Normal        |
|   Stats master election                                         | masterexist                         | Normal        |
|   Verbose mode                                                  | verbosemode                         | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Physical disk                                                   |                                     |               |
|   Component limit health                                        | physdiskcomplimithealth             | Normal        |
|   Component metadata health                                     | componentmetadata                   | Normal        |
|   Congestion                                                    | physdiskcongestion                  | Normal        |
|   Disk capacity                                                 | physdiskcapacity                    | Normal        |
|   Memory pools (heaps)                                          | lsomheap                            | Normal        |
|   Memory pools (slabs)                                          | lsomslab                            | Normal        |
|   Metadata health                                               | physdiskmetadata                    | Normal        |
|   Overall disks health                                          | physdiskoverall                     | Normal        |
|   Physical disk health retrieval issues                         | physdiskhostissues                  | Normal        |
|   Software state health                                         | physdisksoftware                    | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| Stretched cluster                                               |                                     |               |
|   Invalid preferred fault domain on witness host                | witnesspreferredfaultdomaininvalid  | Normal        |
|   Invalid unicast agent                                         | hostwithinvalidunicastagent         | Normal        |
|   No disk claimed on witness host                               | witnesswithnodiskmapping            | Normal        |
|   Preferred fault domain unset                                  | witnesspreferredfaultdomainnotexist | Normal        |
|   Site latency health                                           | siteconnectivity                    | Normal        |
|   Unexpected number of fault domains                            | clusterwithouttwodatafaultdomains   | Normal        |
|   Unicast agent configuration inconsistent                      | clusterwithmultipleunicastagents    | Normal        |
|   Unicast agent not configured                                  | hostunicastagentunset               | Normal        |
|   Unsupported host version                                      | hostwithnostretchedclustersupport   | Normal        |
|   Witness host fault domain misconfigured                       | witnessfaultdomaininvalid           | Normal        |
|   Witness host not found                                        | clusterwithoutonewitnesshost        | Normal        |
|   Witness host within vCenter cluster                           | witnessinsidevccluster              | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| vSAN Build Recommendation                                       |                                     |               |
|   vSAN Build Recommendation Engine Health                       | vumconfig                           | Silent        |
|   vSAN build recommendation                                     | vumrecommendation                   | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
| vSAN iSCSI target service                                       |                                     |               |
|   Home object                                                   | iscsihomeobjectstatustest           | Normal        |
|   Network configuration                                         | iscsiservicenetworktest             | Normal        |
|   Service runtime status                                        | iscsiservicerunningtest             | Normal        |
+-----------------------------------------------------------------+-------------------------------------+---------------+
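On a long list like this one, the interesting part is usually which checks are silenced. Assuming the three-column layout shown above, a short Python sketch (the function name `silenced_checks` is hypothetical) could pick them out:

```python
def silenced_checks(status_text):
    """Return (check_id, check_name) pairs whose Silent Status is 'Silent'.

    Assumes rows of the form: | name | id | status |
    """
    silenced = []
    for line in status_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip "+----+" borders and the command prompt line
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Group rows have empty id/status cells; the header says "Silent Status".
        if len(cells) == 3 and cells[2] == "Silent":
            silenced.append((cells[1], cells[0]))
    return silenced


# A few rows copied from the output above.
sample_status = """\
| Health Check                                     | Health Check Id                    | Silent Status |
| Cloud Health                                     |                                    |               |
|   Customer experience improvement program (CEIP) | vsancloudhealthceipexception       | Silent        |
|   Online health connectivity                     | vsancloudhealthconnectionexception | Normal        |
|   vSAN HCL DB Auto Update                        | autohclupdate                      | Silent        |
"""

for check_id, name in silenced_checks(sample_status):
    print(f"{check_id}: {name}")
```

In the full output above, this would pick out the three checks marked Silent: the CEIP check, the HCL DB auto update and the build recommendation engine health.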
vsan.health.health_check_interval_status <clusterID> or <cluster_name>
A vSAN health check runs periodically, every 60 minutes by default. This command shows the interval configured for the cluster.
/localhost/Prod-vSAN/computers> vsan.health.health_check_interval_status 1
+------------------+-----------------------+
| Cluster          | Health Check Interval |
+------------------+-----------------------+
| Workload-DataHub | 60 mins               |
+------------------+-----------------------+
I hope it helps you!
Please use these commands at your own risk