Why Was I Worried?
I have habitually used scale sets whenever all I needed was multiple copies of a VM image running safely. Then I started to worry about the difference between a scale set and an availability set… were my scale set VMs not safe?
TL;DR: I actually read the Azure and Azure CLI documentation and put together a simple but cool command (below) that put my mind at ease about scale sets, so feel free to skip to that if you like.
There is a good Stack Overflow question right here, which I just added an answer to. It has quite a few good answers about availability sets vs scale sets, including the detail that a scale set has 5 fault domains by default. I recommend starting there if you’re interested in digging in.
A good summary of what I found is that:
- By default, availability sets spread your resources over fault domains so that an outage in one (due to a power or network issue, etc.) does not affect another.
- Availability sets also allow mixing of resources, e.g. 2 VMs with different configurations.
- Scale sets only allow you to deploy an identical image, and they provide the ability to scale it out linearly.
- Scale sets implicitly have one “placement group”. If you want to go over 100 VMs, you have to remove that restriction.
- A placement group has 5 fault domains and is similar to (or maybe the same as) an availability set.
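For the 100 VM point above, here is a minimal sketch of creating a scale set without the single placement group restriction. The function name `create_large_scale_set` and its four positional arguments are hypothetical, used only for illustration. The `--single-placement-group` flag is the relevant part, but it is worth checking `az vmss create --help` for your CLI version before relying on it.

```shell
# Hypothetical wrapper: create a scale set that may grow past 100 VMs.
# Arguments (all placeholders): resource group, scale set name, image, instance count.
create_large_scale_set() {
  az vmss create \
    --resource-group "$1" \
    --name "$2" \
    --image "$3" \
    --instance-count "$4" \
    --single-placement-group false   # lift the one-placement-group limit
}
```

It would be called as something like `create_large_scale_set "your-rg" "your-scale-set-name" "your-image" 150`.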
As I’m responsible for highly available infrastructure, I wasn’t keen on just accepting this. So I fiddled around with the Azure CLI for scale sets and came up with this simple command, which shows that my 10-instance scale set is indeed spread across multiple fault domains. I hope you find it useful too.
    az vmss get-instance-view --subscription "your-subscription-id" \
      --resource-group "your-rg" --name "your-scale-set-name" \
      --instance-id "*" | grep platformFaultDomain

    "platformFaultDomain": 0,
    "platformFaultDomain": 1,
    "platformFaultDomain": 2,
    "platformFaultDomain": 4,
    "platformFaultDomain": 0,
    "platformFaultDomain": 1,
    "platformFaultDomain": 3,
    "platformFaultDomain": 4,
    "platformFaultDomain": 2,
    "platformFaultDomain": 3
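With more instances, that raw list gets hard to eyeball, so here is a rough sketch that tallies instances per fault domain. The helper name `count_fault_domains` is my own invention, not part of the Azure CLI. It assumes the instance view JSON arrives on stdin with `"platformFaultDomain": N` fields, as in the output above.

```shell
# Hypothetical helper: read instance-view JSON on stdin and print
# how many instances sit in each fault domain.
count_fault_domains() {
  grep -o '"platformFaultDomain": *[0-9][0-9]*' \
    | grep -o '[0-9][0-9]*$' \
    | sort -n \
    | uniq -c \
    | awk '{ printf "FD %s: %s instance(s)\n", $2, $1 }'
}
```

To use it, pipe the same `az vmss get-instance-view ... --instance-id "*"` command into `count_fault_domains` instead of `grep`. For the 10 instances listed above, each of the 5 fault domains holds 2 instances.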
Here are some additional good resources: