Invalid CPU reservation for the latency-sensitive VM

Recently some VMs went down during regular patching. When I checked, they were powered off, and when I tried to power them on I got an error: “Invalid CPU reservation for the latency-sensitive VM, (sched.cpu.min) should be at least 6990 MHz.”

What had happened was that someone had changed this VM’s latency sensitivity to “High” without configuring the corresponding CPU and RAM reservations. That would not have been a problem during a normal VM restart, but I had set an advanced setting called “vmx.reboot.PowerCycle” to TRUE because the VM needed to pick up some new CPU features. This setting causes the VM to power cycle during a normal OS reboot, and since the reservations were not properly configured, the VM failed to power on. After fixing the reservations the VM powered on successfully. The corresponding error message for the RAM reservation looks like this: “Invalid memory setting: memory reservation (sched.mem.min) should be equal to memsize(4096)”.
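To set the required reservations with PowerCLI before powering the VM back on, something like the following sketch should work (the 6990 MHz and 4096 MB values come from the error messages above and the VM name is a placeholder; adjust both to your VM):

```powershell
# Reserve the CPU and memory required by the latency-sensitive VM
# (values taken from the error messages above; adjust to your VM's size)
Get-VM -Name "VMNAME" |
    Get-VMResourceConfiguration |
    Set-VMResourceConfiguration -CpuReservationMhz 6990 -MemReservationMB 4096
```

For a latency sensitivity of “High”, vSphere expects the full CPU frequency and full memory size of the VM to be reserved.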

To check VM latency sensitivity using PowerCLI I use a module written by Florian Grehl (his blog). The module has three useful commands for viewing and changing VM latency sensitivity: Get-VMLatencySensitivity, Get-VMLatencySensitivityBulk and Set-VMLatencySensitivity.

Virtual Machine network latency issue

Recently we were debugging an issue where network latency on some VMs was higher than usual as soon as the vCPU was utilized. When the CPU was loaded we were seeing ping response times of up to 30 ms within the same VLAN; the normal value is usually below 0.5 ms. After several failed attempts, one of my colleagues found a thread on the SUSE Forums which described the issue we were having. From the thread we got a hint: a VM advanced setting called “sched.cpu.latencySensitivity”. It was exactly this issue in our environment as well: all problematic VMs had this setting set to “low”. We shut down the VMs, changed the “sched.cpu.latencySensitivity” value to “normal”, and the issue was fixed. The latency is now consistently below 0.5 ms.

To check the value for an individual VM you can use the Web Client or the following command:
Get-VM -Name <VMNAME> | Get-AdvancedSetting -Name sched.cpu.latencySensitivity

If the response is empty, the setting does not exist and, presumably, the default value of “normal” is used.
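The setting can also be created or changed with PowerCLI’s advanced-setting cmdlets. A sketch (the VM name is a placeholder, and as described above, the VM should be shut down first for the change to take effect):

```powershell
# Create the setting if it is missing, otherwise update its value
$vm = Get-VM -Name "VMNAME"
$current = $vm | Get-AdvancedSetting -Name sched.cpu.latencySensitivity
if ($current) {
    $current | Set-AdvancedSetting -Value "normal" -Confirm:$false
} else {
    New-AdvancedSetting -Entity $vm -Name sched.cpu.latencySensitivity `
        -Value "normal" -Confirm:$false
}
```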

To check this setting on all VMs I used this script (adapted from a script I found on this page):
Get-VM | Select-Object Name, @{N="CPU Scheduler priority";E={
    ($_.ExtensionData.Config.ExtraConfig |
        Where-Object {$_.Key -eq "sched.cpu.latencySensitivity"}).Value}}

To fix the setting through PowerCLI I used this script (adapted from a script I found on this page):
Get-VM -Name "VMNAME" | Get-View | ForEach-Object {
    $ConfigSpec = New-Object VMware.Vim.VirtualMachineConfigSpec
    $OValue = New-Object VMware.Vim.OptionValue
    $OValue.Key = "sched.cpu.latencySensitivity"
    $OValue.Value = "normal"
    $ConfigSpec.ExtraConfig += $OValue
    $task = $_.ReconfigVM_Task($ConfigSpec)
    Write-Output "$($_.Name) - changed"
}

We found several other VMs where this setting was set to “low”. We currently have no idea why. There is a VMware Community thread where at least two other people claim to have faced similar issues with this setting.

SSD caching could decrease performance – part 2

In the second part of “SSD caching could decrease performance” I will cover how the IO read/write ratio and IO size affect SSDs. Part 1 is accessible here.

Read IO and write IO ratio

Most real workloads are mixed IO workloads: both disk reads and writes. The read/write ratio describes how the IO is split between disk reads and disk writes. In many cases enterprise SSDs have equally good read and write performance, but lately MLC and especially TLC drives have made their way into the enterprise market, and on some of them read and write performance is not equal. In addition, SSDs may become slower over time due to a small number of free blocks. To mitigate the free-block issue, SSD vendors over-provision the disks with extra capacity; for example, the Intel DC S3700 SSD has about 32% extra capacity.

SSD disks usually handle reads better than writes. Before selecting your SSD vendor and model, I recommend doing some research. If possible, purchase a few different models that would suit your needs and test them in real-life scenarios.

IO size

When it comes to performance, IO size matters. Large IOs could potentially overload a small number of SSD disks and thereby hurt performance. I would avoid caching workloads with IO sizes above 128 KB. I have personally seen a situation where SSD caching was hindering database performance due to the database’s multi-block reads.
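To find out what IO sizes a VM actually issues before deciding whether to cache it, the vscsiStats tool built into ESXi can be used from the ESXi shell. A sketch (the world group ID placeholder comes from the -l listing):

```shell
# List running VMs and their world group IDs
vscsiStats -l
# Start collecting statistics for the chosen VM's world group
vscsiStats -s -w <worldGroupID>
# ... let the workload run for a while, then print the IO length histogram
vscsiStats -p ioLength -w <worldGroupID>
# Stop collection when done
vscsiStats -x -w <worldGroupID>
```

The ioLength histogram shows how much of the workload falls into the large-IO buckets that, as described above, are poor candidates for SSD caching.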

My recommendations for a more successful SSD caching project

  • Determine which VMs would benefit from SSD caching: VMs doing a certain amount of IO or more.
  • Analyze the IO workloads, including IO size: there is no point in read caching when a server is only writing.
  • Check your hardware: controller speeds and capabilities. There is no point in connecting a fast SSD to a weak controller.
  • Find suitable SSD disks for caching: price vs. performance.
  • Talk with your storage admins: SSD in the array might make more sense than SSD in the server.

SSD caching could decrease performance – part 1

Recently we did a small POC for SSD caching and got some interesting results. One would believe that caching on a local SSD inside the server would give a significant improvement in IO. Generally you will see some performance increase, but not always, as we experienced.

There are several things affecting the performance of local SSD. In this post I will cover SSD disks and disk controllers.

SSD disks

The most common SSD disks use the 2.5″ SATA form factor. The SATA interface greatly limits a disk’s throughput. The table below lists throughput speeds of different interfaces.

Storage interface speeds:

Interface     Link speed    Max throughput
SATA1         1.5 Gbit/s    ~150 MB/s
SATA2         3 Gbit/s      ~300 MB/s
SATA3         6 Gbit/s      ~600 MB/s
SAS           3 Gbit/s      ~300 MB/s
SAS 6G        6 Gbit/s      ~600 MB/s
FC 4Gbit      4 Gbit/s      ~400 MB/s (single link)
FC 8Gbit      8 Gbit/s      ~800 MB/s (single link)
iSCSI 10G     10 Gbit/s     ~1250 MB/s (single link)

As you can see, the SATA1–SATA3, SAS and SAS 6G interfaces are slower than the widely adopted dual-port 4Gbit FC, 8Gbit FC or 10G iSCSI (in the table above FC and iSCSI speeds are per single link). Only high-end PCIe/NVMe devices can match or outperform the throughput of FC and modern iSCSI. If a workload generates a large amount of throughput, it could potentially slow down if the caching device’s maximum throughput is lower than that of your original storage.

One way to increase local disk performance is to increase the number of disks: for example, instead of one 800 GB disk, use two 400 GB disks. In the table below I calculated the maximum combined read speed of the disks you would need to create 800 GB of cache, using different sizes of the Intel SSD DC S3710 Series.

Intel SSD DC S3710 read speeds:

Disk size   Disks for 800 GB   Rated read speed   Combined read speed
200 GB      4                  550 MB/s           ~2200 MB/s
400 GB      2                  550 MB/s           ~1100 MB/s
800 GB      1                  550 MB/s           550 MB/s

As you can see, a single 800 GB disk is rated at 550 MB per second, but the same amount of cache (800 GB) built from four 200 GB disks will give us about 2200 MB per second of read speed.

When selecting caching software, make sure to check whether it distributes the cache across all SSD disks to maximize performance. If the software does not, consider using a striped (RAID 0) disk as the cache device.

Most caching software vendors have recommendations for different SSD disks; check them out before purchasing the disks.

Disk controllers

Disk controllers can have a huge effect on SSD performance. Some entry-level controllers may not perform as well as high-end controllers. My advice here is to check your hardware vendor’s recommendations for your disk controller, or, if you have not bought a controller yet, to check both the hardware vendor’s and the caching software vendor’s recommendations.

Disk controller settings can also affect performance; check vendor recommendations. Some keywords: read/write cache, drive write cache, SSD pass-through. In my experience: enable drive write cache, change the read/write cache ratio to 25% read / 75% write, and if possible enable SSD pass-through, which lets the OS bypass the disk controller.

Related posts

SSD caching could decrease performance – part 2

Enabled CPU hot-add setting in vSphere disables vNUMA

A few days ago I was helping a database admin: he was complaining that his large MS SQL server was seeing only one 20-vCPU vNUMA node, and that this was hindering performance.

So I did some research …

  • It’s better to increase the socket count rather than the core count; this allows more flexible vNUMA configurations
  • vNUMA is automatically enabled when a VM has more than 8 vCPUs
  • A vNUMA-aware VM must be hardware version 8 or later
  • Enabling CPU hot-add disables vNUMA (VMware KB2040375)

All four points applied to this database server: 20 vCPUs (20 sockets × 1 core), VM hardware version 8, and CPU hot-add enabled. The enabled CPU hot-add setting disabled vNUMA and forced the VM to use UMA (Uniform Memory Access). When CPU hot-add was disabled, vNUMA kicked in and created two vNUMA nodes with 10 vCPUs each.
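To disable CPU hot-add on a VM with PowerCLI, something like this sketch should work (the VM name is a placeholder, and the VM must be powered off for the change to apply):

```powershell
# Disable CPU hot-add so vNUMA can be exposed to the guest again
$vm = Get-VM -Name "VMNAME"
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.CpuHotAddEnabled = $false
$vm.ExtensionData.ReconfigVM($spec)
```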

To view how many NUMA nodes a server sees, the following tools can be used: on Windows, Coreinfo from Sysinternals (command: coreinfo -n); on Linux, numactl (command: numactl --hardware).

There is a good blog post about the effects of a wrongly configured vNUMA on the VMware vSphere Blog, titled “Does corespersocket Affect Performance?”

It is highly probable that a misconfigured vNUMA topology will have a performance impact. I will disable CPU hot-add on all of the large VMs, and in the future I will not enable CPU hot-add on large VMs, so that they get the full benefits of vNUMA.

More information about vNUMA:
VMware CPU Hot Plug vNUMA Effects on SQL Server
vNUMA: What it is and why it matters
Checking the vNUMA Topology