Error joining ESXi host to Active Directory

I was trying to join an ESXi host to Active Directory using PowerCLI ( Get-VMHost <vmhost> | Get-VMHostAuthentication | Set-VMHostAuthentication -JoinDomain -Domain "<domain>" -User "<username>" -Password "<password>" ) and I was getting an error:

Active directory authentication store is not supported for VMHost <hostname>

I also tried to join it via the old C# client, but that failed as well.

The AD join finally worked via the VMware Host Client running on the host itself. The VMware Host Client can be accessed with a web browser at https://<hostname>/ui/#/login.
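To check the authentication state from PowerCLI (before or after the join), the same cmdlet family can be used read-only; a quick check, with the host name as a placeholder:

Get-VMHost <vmhost> | Get-VMHostAuthentication

The returned object should show the domain and the domain membership status.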

Error during vMotion: The source detected that the destination failed to resume.

During a vMotion one of my VMs refused to move and vCenter was giving the following error:

The VM failed to resume on the destination during early power on.
Module DiskEarly power on failed.
Cannot open the disk '/vmfs/volumes/…./…./???????.vmdk' or one of the snapshot disks it depends on.
Could not open/create change tracking file

I turned the server off and tried to delete the change tracking file. Got an error:

Cannot delete file [<Datastore>] VMFolder/VM-ctk.vmdk

Migrated the VM to another host and tried to power it on. Got an error:

An error was received from the ESXi host while powering on VM VMName.
Failed to start the virtual machine.
Cannot open the disk '/vmfs/volumes/5783364b-08fc763e-1389-00215a9b0098/' or one of the snapshot disks it depends on.
Could not open/create change tracking file

Next I rebooted the ESXi host on which the problematic VM initially resided, and after that I was able to delete the *-ctk.vmdk file and power on the VM. It seems that for some reason there were file locks on the change tracking files, which prevented operations on the VM.
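If you need to hunt for leftover change tracking files yourself, PowerCLI's datastore provider can be used to list them. A minimal sketch, assuming a placeholder datastore name:

# Map the datastore as a PSDrive and search it for change tracking files
New-PSDrive -Name ds -PSProvider VimDatastore -Root "\" -Datastore (Get-Datastore "<Datastore>") | Out-Null
Get-ChildItem -Path ds:\ -Recurse | Where-Object { $_.Name -like "*-ctk.vmdk" }
Remove-PSDrive -Name ds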

No coredump target has been configured

I was doing a hardware upgrade for some of my VMware hosts in a lazy fashion: I pulled the disk (a small SATA SSD) from the old host (HP BL460c Gen8) and inserted it into the new host (HP BL460c Gen9). The old hosts used a SAS HBA and the new hosts use the integrated SATA controller.

This way I avoided an unnecessary re-installation of the hosts, but when they booted I got a warning message: "No coredump target has been configured. Host core dumps cannot be saved."

Update 12.08.2016: starting from ESXi 5.5 U3 build 4179633 this issue fixes itself automatically. You can read more about it here.

I turned to VMware KB article 2004299 to fix it, but I had to do a little more than what is described there.

My fix process was the following:

  • Logged in to ESXi via SSH
  • esxcli system coredump partition get returned: "Not a known device: naa.xxxxxxxxxxxxxxxxxxxxxxxxxxx"
  • Executed the following command to list all the disks and find the SSD disk I was booting from: esxcli storage core path list | more
  • My SSD disk was "Runtime Name: vmhba0:C0:T0:L0", "Device: t10.ATA_____MO0100EBTJT_____________________________S0RFNEAC603385______"
  • Executed the following command to list the partitions on the disk: esxcli storage core device partition list -d t10.ATA_____MO0100EBTJT_____________________________S0RFNEAC603385______
  • Identified that my coredump partition is number 7 (type fc)
  • Executed the following command to set the partition: esxcli system coredump partition set --partition="t10.ATA_____MO0100EBTJT_____________________________S0RFNEAC603385______:7"
  • Executed the following command to activate the partition: esxcli system coredump partition set --enable true
  • esxcli system coredump partition get now returns the values below, and the warning message disappeared:
    Active: t10.ATA_____MO0100EBTJT_____________________________S0RFNEAC603385______:7
    Configured: t10.ATA_____MO0100EBTJT_____________________________S0RFNEAC603385______:7
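The same fix can also be applied remotely through PowerCLI's esxcli wrapper. A rough sketch, assuming the device ID from above (with the V2 interface the argument names mirror the esxcli long options):

$esxcli = Get-EsxCli -VMHost (Get-VMHost "<hostname>") -V2
# Show the current coredump partition configuration
$esxcli.system.coredump.partition.get.Invoke()
# Point the coredump target at partition 7 of the boot SSD
$esxcli.system.coredump.partition.set.Invoke(@{partition = "t10.ATA_____MO0100EBTJT_____________________________S0RFNEAC603385______:7"})
# Activate the configured partition
$esxcli.system.coredump.partition.set.Invoke(@{enable = $true})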


128GB DDR4-2400 Memory Kit available for HPE servers

HPE has made available a 128GB (1x128GB) Octal Rank x4 DDR4-2400 CAS-20-18-18 Load Reduced Memory Kit (HPE info page). The price as of writing this (13.07.2016) is quite high: 9499 USD.

The 128GB RAM modules open up the possibility of building single-socket systems with up to 1.5TB of RAM (12 DIMM slots per socket x 128GB = 1536GB).

Example HP ProLiant DL380 Gen9 config:

  • 1 x Intel Xeon E5-2699v4 (2.2GHz/22-core/55MB/145W)
  • 12 x HPE 128GB (1x128GB) Octal Rank x4 DDR4-2400 CAS-20-18-18 Load Reduced Memory Kit

Reducing the number of sockets can lower the number of software licenses needed, e.g. for VMware, which is licensed per socket.

Virtual Machine network latency issue

Recently we were debugging an issue where network latency was higher than usual on some VMs as soon as the vCPU was utilized. When the CPU was loaded we were seeing ping response times of up to 30ms within the same VLAN, while the normal value is usually below 0.5ms. After several failed attempts one of my colleagues found a thread on the SUSE Forums which described the issue we were having and pointed to a VM advanced setting called "sched.cpu.latencySensitivity". It was exactly the issue in our environment as well: all problematic VMs had this setting set to "low". We shut down the VMs, changed the "sched.cpu.latencySensitivity" value to "normal", and the issue was fixed. Now the latency is consistently below 0.5ms.

To check the value for an individual VM you can use the Web Client or the following command:
Get-VM -Name <VMNAME> | Get-AdvancedSetting -Name sched.cpu.latencySensitivity

If the response is empty, the setting does not exist and I assume the default value of "normal" is used.

To check this setting on all VMs I used this script (adapted from a script I got from this page):
Get-VM | Select Name, @{N="CPU Scheduler priority";E={
($_.ExtensionData.Config.ExtraConfig | `
where {$_.Key -eq "sched.cpu.latencySensitivity"}).Value}}
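To list only the VMs where the value is actually set to "low", the same lookup can be wrapped in a filter; a small variation of the script above:

Get-VM | Where-Object {
($_.ExtensionData.Config.ExtraConfig | `
where {$_.Key -eq "sched.cpu.latencySensitivity"}).Value -eq "low"
} | Select Name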

To fix the setting through PowerCLI (with the VM powered off) I used this script (adapted from a script I got from this page):
Get-VM -Name "VMNAME" | Get-View | foreach {
$ConfigSpec = New-Object VMware.Vim.VirtualMachineConfigSpec
$OValue = New-Object VMware.Vim.OptionValue
$OValue.Key = "sched.cpu.latencySensitivity"
$OValue.Value = "normal"
$ConfigSpec.ExtraConfig += $OValue
$task = $_.ReconfigVM_Task($ConfigSpec)
Write-Output "$($_.Name) - changed"
}

We found several other VMs where this setting was set to "low". We currently have no idea why. There is a VMware Communities thread where at least two other people claim to have faced similar issues with this setting.

Corrupted server profile in HP blade server after firmware upgrade

Recently we were applying SPP 2016.04 to some of our blade servers. After the upgrade one of the servers did not have network connectivity. From the ESXi console everything looked OK. Tried a cold boot: nothing. Tried a downgrade of the Emulex CNA firmware: nothing. Tried the latest Emulex firmware again: nothing. Finally I turned off the server, went to VCEM (Virtual Connect Enterprise Manager) and edited the faulty profile by just clicking edit and then saving the profile again. Powered up the server and everything was OK. I guess the firmware update somehow damaged the profile, and re-applying it through VCEM fixed it.

Change tracking target file already exists

After upgrading to VMware ESXi 5.5 U3 we started seeing random snapshot errors during backups with the following message: "An error occurred while saving the snapshot: Change tracking target file already exists." The issue is caused by a leftover CBT file that is not deleted when the snapshot is removed by the backup software.

After we submitted several logs and traces to VMware, they acknowledged that the issue exists and said it will be fixed for ESXi 5.5 in the June patch release and for ESXi 6.0 in the July patch release.

For now, when we detect a problematic VM, we browse the datastore and delete the leftover CBT file.
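Another possible workaround, which we have not used ourselves, would be to reset Changed Block Tracking on the affected VM through PowerCLI instead of deleting the file by hand. A sketch of the idea; the VM must have no snapshots, the change typically needs a power cycle (or a snapshot create/remove) to take effect, and the next backup will do a full read:

$vm = Get-VM -Name "<VMNAME>"
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
# Disable CBT...
$spec.ChangeTrackingEnabled = $false
$vm.ExtensionData.ReconfigVM($spec)
# ...then re-enable it so fresh -ctk.vmdk files get created
$spec.ChangeTrackingEnabled = $true
$vm.ExtensionData.ReconfigVM($spec)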