Azure VM Unresponsive, Can’t SSH

My VM Was Non-Responsive

Today I had an Azure virtual machine go down very unexpectedly.

I received error reports from users and tried to go to the related service endpoint myself… and sure enough, it didn’t come up.  Then, I tried to ssh onto the VM and I couldn’t.

I hopped into the Azure portal, went to the VM, and things actually looked alright… it wasn’t stopped, or de-allocated, or anything.

Why?

After multiple minutes of digging around the Azure portal for more information, suddenly the “Activity Log” popped up with a new entry.   This was relatively disconcerting as the issue had been reported over half an hour ago and I had been on the portal for multiple minutes.

The activity log said I had a “health event” which was “updated”.  Upon expanding it, I could see more events that had been “in progress”.  When you click the “in progress” event, you can get JSON for it and look into the details.  In my case, the bottom of the details said this:

    "properties": {
        "title": "We're sorry, your virtual machine isn't available because an unexpected failure on the host server",
        "details": null,
        "currentHealthStatus": "Unavailable",
        "previousHealthStatus": "Unknown",
        "type": "Downtime",
        "cause": "PlatformInitiated"
    }

So, the physical host which was running my VM in azure died. Azure automatically noticed this and moved it to a new physical host, though much slower than I would have appreciated.

The VM came up after a few more minutes and all was right with the world. So… the moral of the story is that if your VM is unresponsive, it may be because the host died, and you may have to wait quite a while to see information on that in the activity log. But it does auto resolve apparently which is nice.

Azure – Linux VM Image Creation – Powershell – With Service Principal/Account

Overview

I was working on creating generalized VM images for use with scale sets and auto-scaling and I found it rather painful to get the complete set of examples for:

  1. De-provision user/etc from VM.
  2. Use Azure Powershell with a Service principal.
  3. Generalize the VM and create an image.

So, here’s a short mostly-code post on how to do that.

Specific Steps

Fair warning… as far as I know, you can’t use the VM after doing this… but you can create a new copy of it from the image, so that doesn’t matter much.

Before getting to Powershell, run this in your VM to de-provision the most recently set up user account (e.g. I’ll install everything on user “john” created with the Azure VM).  This will remove that user.

sudo waagent -deprovision+user

Now, just run the below command after setting your own values for the 5 variables up top.  This will log in to the RM with the credentials you provide in the pop-up, and then it will stop and generalize the VM, adn tehn create an image from it and store that image in the same resource group as the VM.

$vmName = "YOUR_VM_NAME"
$rgName = "YOUR_RG_NAME"
$location = "YOUR_REGION"
$imageName = "YOUR_IMAGE_NAME"
$tenant = "YOUR_TENANT_ID"

$c = Get-Credential # Input your service principal client-id/secret.
Connect-AzureRmAccount -Credential $c -ServicePrincipal -Tenant $tenant

Stop-AzureRmVM -ResourceGroupName $rgName -Name $vmName -Force
Set-AzureRmVm -ResourceGroupName $rgName -Name $vmName -Generalized
$vm = Get-AzureRmVM -Name $vmName -ResourceGroupName $rgName
$image = New-AzureRmImageConfig -Location $location -SourceVirtualMachineId $vm.Id
New-AzureRmImage -Image $image -ImageName $imageName -ResourceGroupName $rgName

Configuration Trouble?

  • If you’re not sure what a service account / principal is or how to create one, the process is quite involved and I highly recommend following one of the many Microsoft-provided tutorials.
  • You can find your tenant ID by clicking the directory + subscription button at the top of the portal OR by hovering over your name/info at the top right corner.
  • The region strings can be tricky; but just Google the Microsoft site if you’re not sure.  A US East 2 example is “EastUS2”.

What’s Next?

Your VM image can now be found in that resource group – go to the portal and see.  You can go into the image in the portal and create a new VM from it, or you can use it to boot up a scale set, etc.