Normally VMware default settings and common sense go together. There are just a few scenarios where you need to make some particular changes, and then the Advanced Settings, the ones you were never aware of before, are going to be your best friend for a while.
After that surprising moment when you wonder why an ESXi host is down, you finally end up finding that the DCUI has that purple color you are never going to miss. The situation has only just started: now it's time to reboot the host and put it in maintenance mode while you spend a lot of time checking logs to find out why this happened to you. That's another tough task you'll get used to, but most of the time, if it is not a hardware problem, you probably need to check your firmware versions.
The point here is that sometimes “shit happens” and you are going to live with that error for a long time unless you decide to upgrade the entire hardware (blades, enclosures, etc.). I wouldn't recommend it, but you may decide for your own reasons that having the server back online automatically is more important when this PSOD happens. In that case, be prepared to learn about new secret corners and commands 🙂 .
This time, open an SSH session and run this command, which will show you lots of ESXi parameters:
esxcfg-advcfg -l
Let's focus on the one this post is about, so type:
esxcfg-advcfg -l | grep BlueScreen
And now you’ll get:
/Misc/BlueScreenTimeout [Integer] : timeout in seconds, 0 is no timeout
By default the value is 0, meaning that no timeout is set; that's why your host gets stuck on the PSOD. So now we are going to set a 60-second timeout:
esxcfg-advcfg -s 60 /Misc/BlueScreenTimeout
The next time ESXi crashes unexpectedly, the host will reboot after 60 seconds. I do not recommend this, so only do it at home 🙂
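If you want to confirm that the change really took effect, a small script like the one below could help. It is only a sketch: parse_value and set_psod_timeout are hypothetical helper names of my own, and it assumes that esxcfg-advcfg -g prints a line of the form "Value of BlueScreenTimeout is 0", which you should check on your own build before relying on it.

```shell
#!/bin/sh
# Sketch: set /Misc/BlueScreenTimeout and read it back to verify.
# Assumption: `esxcfg-advcfg -g <option>` prints "Value of <name> is <n>".

OPTION=/Misc/BlueScreenTimeout

# Hypothetical helper: extract the trailing number from a
# "Value of <name> is <n>" line read on stdin. Prints nothing
# if the line does not match that shape.
parse_value() {
    sed -n 's/^Value of .* is \([0-9][0-9]*\)$/\1/p'
}

# Hypothetical helper: set the timeout, then read it back.
# Returns 0 only if the value reported by the host matches.
set_psod_timeout() {
    wanted=$1
    esxcfg-advcfg -s "$wanted" "$OPTION" >/dev/null || return 1
    current=$(esxcfg-advcfg -g "$OPTION" | parse_value)
    [ "$current" = "$wanted" ]
}

# Usage on the host (commented out so that sourcing this file changes nothing):
#   set_psod_timeout 60 && echo "Host will reboot 60 s after a PSOD"
```

Reading the value back with -g instead of trusting the output of -s keeps the check independent of the command that made the change.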
Obviously you can also change this parameter from your vSphere Client by modifying the Advanced Settings for this host, but it is worth having a look at the esxcfg-advcfg command, at least to know what it can do. Just in case 🙂