Data disks
Managed disks can be attached to data nodes for use as each node's data directory. The ARM template can attach Standard HDD disks, or Premium managed disks for those VM SKUs that support them:
storageAccountType -
The performance tier of managed disks. Standard will use Standard HDD disks, whilst Default will use Premium managed disks for those VM SKUs that support Premium managed disks, and Standard HDD disks for those that do not. The default is Default.
vmDataDiskSize -
The size of each attached managed disk. Choose between
32TiB: 32 Tebibytes
16TiB: 16 Tebibytes
8TiB: 8 Tebibytes
4TiB: 4 Tebibytes
2TiB: 2 Tebibytes
1TiB: 1 Tebibyte
512GiB: 512 Gibibytes
256GiB: 256 Gibibytes
128GiB: 128 Gibibytes
64GiB: 64 Gibibytes
32GiB: 32 Gibibytes
Default is 1TiB.
vmDataDiskCount -
The number of managed disks to attach to each data node. The total number of managed disks will be
vmDataNodeCount * vmDataDiskCount
If more disks are requested than the data node VM SKU can attach, the maximum number of disks supported by that VM SKU will be used instead. This is equivalent to
Math.min(vmDataDiskCount, data node VM SKU maximum attached disks)
Must be greater than or equal to 0. Default is the maximum number of disks supported by the data node VM SKU.
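The parameter rules above can be sketched in a few lines of Python. This is an illustrative model only, not the template's own logic: the function names are hypothetical, and the SKU maximum is passed in as a plain number rather than looked up, since the real limit depends on the VM SKU chosen. It also assumes the cluster total is computed from the capped per-node count.

```python
# Allowed vmDataDiskSize values, as listed in this section.
ALLOWED_DISK_SIZES = (
    "32TiB", "16TiB", "8TiB", "4TiB", "2TiB", "1TiB",
    "512GiB", "256GiB", "128GiB", "64GiB", "32GiB",
)


def check_disk_size(vm_data_disk_size="1TiB"):
    """Return the size if it is one of the allowed values (default 1TiB)."""
    if vm_data_disk_size not in ALLOWED_DISK_SIZES:
        raise ValueError(f"unsupported vmDataDiskSize: {vm_data_disk_size}")
    return vm_data_disk_size


def effective_disk_count(sku_max_disks, vm_data_disk_count=None):
    """Disks attached per data node: min(requested, SKU maximum).

    sku_max_disks is a caller-supplied assumption about the VM SKU.
    """
    if vm_data_disk_count is None:
        # Default: the maximum number of disks the SKU supports.
        return sku_max_disks
    if vm_data_disk_count < 0:
        raise ValueError("vmDataDiskCount must be greater than or equal to 0")
    return min(vm_data_disk_count, sku_max_disks)


def total_managed_disks(vm_data_node_count, sku_max_disks, vm_data_disk_count=None):
    """Total disks in the cluster: data nodes * disks attached per node."""
    per_node = effective_disk_count(sku_max_disks, vm_data_disk_count)
    return vm_data_node_count * per_node
```

For example, requesting 8 disks per node on a SKU assumed to support at most 4 gives `effective_disk_count(4, 8) == 4`, so a three-node cluster would end up with `total_managed_disks(3, 4, 8) == 12` managed disks.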
Disks are partitioned with fdisk when smaller than 2TB and with parted when larger, and are formatted with an ext4 filesystem using a 4096 byte block size.
Data is striped across the disks attached to each data node in a RAID 0 array, using mdadm on Linux. When only one managed disk is attached, no RAID 0 array is configured. When vmDataDiskCount is 0, the data node will use the temp storage of the VM.
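As a rough illustration of the three cases just described, the sketch below maps a disk count to the storage layout a data node would use and estimates the raw capacity of the stripe. The function names and layout labels are hypothetical, and the capacity figure assumes equal-sized disks and ignores filesystem and RAID metadata overhead.

```python
# Binary unit multipliers for the size strings used by vmDataDiskSize.
_UNITS = {"GiB": 2**30, "TiB": 2**40}


def data_storage_layout(attached_disks):
    """Describe where a data node keeps its data for a given disk count."""
    if attached_disks < 0:
        raise ValueError("disk count must be greater than or equal to 0")
    if attached_disks == 0:
        return "temp storage"          # ephemeral disk on the VM host
    if attached_disks == 1:
        return "single managed disk"   # no RAID 0 array is configured
    return "RAID 0 array"              # striped across all attached disks


def size_in_bytes(disk_size):
    """Convert a size such as '512GiB' or '1TiB' to bytes."""
    for suffix, factor in _UNITS.items():
        if disk_size.endswith(suffix):
            return int(disk_size[: -len(suffix)]) * factor
    raise ValueError(f"unrecognised size: {disk_size}")


def striped_capacity_bytes(attached_disks, disk_size):
    """Raw capacity of the attached managed disks on one data node.

    With RAID 0 the member capacities add up; overhead is not modelled.
    """
    return attached_disks * size_in_bytes(disk_size)
```

Under these assumptions, four 1TiB disks on a node give a stripe of `4 * 2**40` bytes, while a count of 0 yields no managed-disk capacity because the node falls back to temp storage.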
Temp storage, with filesystem /dev/sdb1 mounted on /mnt in Ubuntu,
is present on the physical machine hosting the VM. It is ephemeral and
not persistent: a VM can move to a different host at any point in time for various
reasons, including hardware failures. When this happens, the VM will be created on
the new host using the OS disk from the storage account, and new temp storage
will be created on the new host, so any data held on the previous temp storage is lost.
Using temp storage can be a cost effective way of running an Elasticsearch cluster on Azure with decent performance, so long as you understand the tradeoffs and mitigate them by snapshotting frequently and ensuring adequate data redundancy through sufficient replica shards.
Striping data across attached disks is recommended to improve Input/Output operations per second (IOPS) performance, since the IOPS and throughput limits of the individual disks can be combined. Premium disks offer higher IOPS than Standard HDD disks, so Premium disks are recommended where application performance is paramount.
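The arithmetic behind combining per-disk limits can be sketched as follows. The function name is hypothetical and the per-disk figures in the example are placeholders, not published Azure limits; the result should be read as an upper bound, since any limit imposed by the VM itself is not modelled here.

```python
def combined_stripe_limits(disk_count, iops_per_disk, throughput_mbps_per_disk):
    """Sum the per-disk IOPS and throughput limits across a RAID 0 stripe.

    All inputs are caller-supplied assumptions about the chosen disk tier.
    """
    if disk_count < 1:
        raise ValueError("at least one attached disk is required")
    return {
        "iops": disk_count * iops_per_disk,
        "throughput_mbps": disk_count * throughput_mbps_per_disk,
    }


# Placeholder figures: 4 disks, each assumed to allow 500 IOPS and 60 MB/s.
limits = combined_stripe_limits(4, 500, 60)
print(limits)
```

Doubling the disk count doubles both figures in this model, which is why attaching several smaller disks can outperform a single larger one of the same tier.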