The last thing you want in a clustered virtualization environment is to lose access to your storage. No storage means no virtual server.
A former RHEV client of mine had a scenario where their storage was somehow inaccessible to all the hypervisors in the cluster. They power cycled the RHEV hypervisors and even rebooted the RHEV-M management server a few times, but no matter what they did, all the virtual servers stayed offline, yet the RHEV-M web interface reported them as being in an “Unknown” state. Not up, where you could power them off, and not down, where you could power them up, but “Unknown”, where you basically can’t do anything.
Without a doubt, the first thing you should do in this case is phone Red Hat support and let them know what’s going on. They will always know the best way to proceed.
I’d like to thank ‘eprasad’ on the #rhev channel on FreeNode for the assistance with this issue.
I have just had the same problem as my former client. All my virtual guests were in an unknown state after my storage domain became inaccessible.
I am using RHEV 3.1, with RHEV-H hypervisors connected to NFS storage.
Basically, the ins and outs of this problem seem to be that the RHEV-M database is left in a state of limbo after losing the storage, and a bit of manual intervention is required.
Again, I stress you should be speaking with Red Hat before doing this.
If you are in this exact scenario, you can manipulate the backend RHEV-M database and manually set the virtual guests to a down state.
To do this, log in to a shell on the RHEV-M server and connect to the local Postgres database.
    [root@rhevm ~]# psql -U engine
    Password for user engine:
    psql (8.4.13)
    Type "help" for help.

    engine=>
Once you’ve connected to the Postgres shell, we need to find the VM GUID that is associated with our virtual guest.
In this example, we will use the virtual guest “server01-example-com”.
To grab the VM GUID, run the following.
    engine=> select vm_guid from vm_static where vm_name='server01-example-com';
                   vm_guid
    --------------------------------------
     df13ca7f-b6fa-4820-8a7b-28d785221b64
    (1 row)
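If you want to see the guest’s current status next to its GUID, the two tables used in this post can be joined on vm_guid. What follows is only a rough sketch under a few assumptions: the function name vm_status_sql is made up for illustration, the join relies on nothing more than the table and column names shown in this post, and the name check is deliberately strict. It prints the SQL rather than running it, so you can read the statement before piping it into psql.

```shell
#!/bin/sh
# Hypothetical helper (not part of RHEV): print a query that shows the
# name, GUID and current status of one guest. It only prints SQL; nothing
# is sent to the database until you pipe the output into psql yourself.
vm_status_sql() {
    vm_name=$1
    # Accept only letters, digits, dot, underscore and hyphen so the name
    # can be placed safely inside single quotes.
    case $vm_name in
        '' | *[!A-Za-z0-9._-]*)
            echo "refusing unusual VM name: $vm_name" >&2
            return 1
            ;;
    esac
    printf "SELECT s.vm_name, s.vm_guid, d.status FROM vm_static s JOIN vm_dynamic d ON d.vm_guid = s.vm_guid WHERE s.vm_name = '%s';\n" "$vm_name"
}

# Example:
#   vm_status_sql server01-example-com | psql -U engine
```

A status other than 0 for a guest that is clearly not running is the situation this post deals with.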
Now that we have the GUID, copy it, as we will use it in the next command.
Here we are telling RHEV-M to set the status of our virtual guest to “0”, which represents “down”.
    engine=> UPDATE vm_dynamic SET status=0 WHERE vm_guid='df13ca7f-b6fa-4820-8a7b-28d785221b64';
    UPDATE 1
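If you have a lot of guests to fix, copying GUIDs around by hand gets tedious. One possible shortcut, sketched here and not something Red Hat gave me, is to look the GUID up by name inside the UPDATE itself. The function name set_vm_down_sql is hypothetical, and it uses only the tables and columns shown in this post. It prints the statement instead of executing it, so nothing changes until you pipe the output into psql.

```shell
#!/bin/sh
# Hypothetical helper (not part of RHEV): print an UPDATE that sets one guest
# to status 0 ("down"), looking the GUID up by name in the same statement.
# It only prints SQL; nothing changes until the output is piped into psql.
set_vm_down_sql() {
    vm_name=$1
    # Accept only letters, digits, dot, underscore and hyphen so the name
    # can be placed safely inside single quotes.
    case $vm_name in
        '' | *[!A-Za-z0-9._-]*)
            echo "refusing unusual VM name: $vm_name" >&2
            return 1
            ;;
    esac
    printf "UPDATE vm_dynamic SET status=0 WHERE vm_guid = (SELECT vm_guid FROM vm_static WHERE vm_name = '%s');\n" "$vm_name"
}

# Example (review the printed SQL first, then pipe it in):
#   set_vm_down_sql server01-example-com
#   set_vm_down_sql server01-example-com | psql -U engine
```

If the name matches no guest, psql should report UPDATE 0 and nothing is changed. If the name matches more than one row, Postgres rejects the subquery with an error instead of updating several guests at once. For several guests, the function could be called in a loop with the combined output piped into a single psql session.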
You should now see in the RHEV-M web interface that the status of your affected virtual guest has been set to “down”. If your storage and hypervisors are back online, you will be able to power on the virtual guest and let it go through a normal boot.