VMware vCenter 6.5 U2 and SRM 8.1 – Server certificate chain is not trusted and thumbprint verification is not configured

Recently, during an upgrade, we stumbled on an issue with SRM not being able to work with a vSphere vCenter 6.5 U2 which had been migrated from vSphere 5.5. SRM 8.1 went into an error loop after creating a site pair. Looking into different SRM log files we discovered an error in dr.log in the “C:\ProgramData\VMware\VMware vCenter Site Recovery Manager\runtime\srm-client\logs” folder. The error was: com.vmware.vim.vmomi.client.exception.SslException: Failed to connect to Lookup Service at https://<vcenterhostname>/lookupservice/sdk. Reason: com.vmware.vim.vmomi.core.exception.CertificateValidationException: Server certificate chain is not trusted and thumbprint verification is not configured

After a few days and no usable help from VMware support, we decided to try the process described in a couple of blog posts and KB articles:

https://vlenzker.net/2016/11/vcenter-6-5-srm-vsphere-replication-nsx-problems-after-ssl-change-ls_update_certs-py/
https://theithollow.com/2017/03/13/nsx-issues-replacing-vmware-self-signed-certs/
http://www.vjenner.com/2015/12/nsx-manager-6-2-lookup-service-error-when-using-vmca-enterprise-mode/
http://martinsvspace.blogspot.com/2015/08/srm-6-nightmare.html
https://kb.vmware.com/s/article/2121701
https://kb.vmware.com/kb/2121689

Before we did anything, we created snapshots of the vCenter servers while they were both powered off at the same time.
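Since the vCenter VMs were powered off, the snapshots have to be taken by connecting straight to the ESXi hosts. A minimal pyVmomi sketch of that step, with placeholder host names, credentials and VM names:

from pyVim.connect import SmartConnect
from pyVmomi import vim
import ssl

# Connect directly to the ESXi host that runs the (powered off) vCenter VM
si = SmartConnect(host="esxi01.example.com", user="root", pwd="***",
                  sslContext=ssl._create_unverified_context())

# A standalone ESXi connection exposes a single datacenter in the inventory
datacenter = si.content.rootFolder.childEntity[0]
vm = next(v for v in datacenter.vmFolder.childEntity if v.name == "vcenter01")

# The VM is powered off, so no memory snapshot or quiescing is needed
vm.CreateSnapshot_Task(name="pre-cert-fix",
                       description="Before Lookup Service certificate fix",
                       memory=False, quiesce=False)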

After determining that the issue was one certificate which had not been updated, we performed the fix against both vCenters; in one of them 7 services were updated by ls_update_certs.py. After that SRM worked correctly.
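As a quick check before (and after) running ls_update_certs.py, you can look at the certificate the vCenter endpoint actually presents and compare its SHA-1 thumbprint with what the services expect. A small Python sketch, assuming a placeholder vCenter FQDN:

import hashlib
import ssl

host = "vcenter.example.com"   # placeholder vCenter FQDN
port = 443

# Fetch the certificate presented on port 443 (the Lookup Service sits behind it)
pem = ssl.get_server_certificate((host, port))
der = ssl.PEM_cert_to_DER_cert(pem)

# Print the SHA-1 thumbprint in the familiar colon-separated form
thumbprint = hashlib.sha1(der).hexdigest().upper()
print(":".join(thumbprint[i:i + 2] for i in range(0, len(thumbprint), 2)))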

 


“Invalid configuration for device 0” when removing a virtual disk

I tried to remove an RDM disk from a VM but it failed with the error “Invalid configuration for device 0”. I have usually seen this message related to a vNIC, but this time it was the disk.

After some searching I found a solution: I changed the SCSI ID from 0:1 to 0:2 for the disk I wanted to remove. After that, the remove operation worked.
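The same workaround can also be scripted with pyVmomi: first reconfigure the disk onto a different SCSI unit number, then remove it in a second reconfigure. A rough sketch, assuming vm is an already-located vim.VirtualMachine object and the disk label is a placeholder:

from pyVmomi import vim

# Locate the disk to be removed by its label (placeholder)
disk = next(d for d in vm.config.hardware.device
            if isinstance(d, vim.vm.device.VirtualDisk)
            and d.deviceInfo.label == "Hard disk 2")

# Step 1: move the disk from SCSI 0:1 to 0:2
disk.unitNumber = 2
edit = vim.vm.device.VirtualDeviceSpec(
    operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=disk)
vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[edit]))

# Step 2: once the first task has completed, remove the disk
remove = vim.vm.device.VirtualDeviceSpec(
    operation=vim.vm.device.VirtualDeviceSpec.Operation.remove, device=disk)
vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[remove]))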

Extended my home lab

I’ve recently extended my home lab with additional capacity. In addition to my Windows Server + VMware Workstation setup (info here), I’ve added a refurbished HPE DL380 G7 server with the following configuration:

1 x Intel Xeon Processor X5650 2.66 GHz
96GB RAM
1TB HDD
will add an SSD in the future

The added server is running VMware ESXi 6.7. It hosts the vCenter 6.7 appliance and also a few virtual ESXi 6.7 instances. HPE G7 series servers are not officially supported by VMware for ESXi 6.7, but it seems to be working for now.

I found my refurbished HPE G7 server on eBay.

The required VMware Tools ISO image does not exist or is inaccessible.

Recently I deployed some VMs from a template which I had created some years ago. When I tried to update VMware Tools I received the error “The required VMware Tools ISO image does not exist or is inaccessible.”

After some digging around on Google I found a thread in the VMware forums which pointed me in the right direction. I had set the following advanced option in the VM configuration: isolation.tools.autoInstall.disable = TRUE.

To allow VMware Tools ISO mounting I changed the value from TRUE to FALSE.
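The same change can be applied without editing the .vmx or the advanced options dialog by hand, for example with a small pyVmomi reconfigure. A sketch, again assuming vm is an already-located vim.VirtualMachine object:

from pyVmomi import vim

# Allow the VMware Tools ISO to be mounted again
opt = vim.option.OptionValue(key="isolation.tools.autoInstall.disable", value="FALSE")
vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(extraConfig=[opt]))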

Modify VMware Update Manager host reboot timeouts in vSphere vCenter 6.5 appliance

I recently changed from the Windows-based VMware Update Manager (VUM) to the Update Manager embedded in the vCenter appliance. In the old VUM I had increased the host reboot timeouts so that host firmware patching during reboot would not time out the remediation job. In the appliance, the vci-integrity.xml file is located in “/usr/lib/vmware-updatemgr/bin”. You need to restart the VUM service or the appliance after the change.

The lines which need to be changed are the following:

<HostRebootWaitMaxSeconds>1800</HostRebootWaitMaxSeconds>
<HostRebootWaitMinSeconds>600</HostRebootWaitMinSeconds>

I changed the values to:

<HostRebootWaitMaxSeconds>5400</HostRebootWaitMaxSeconds>
<HostRebootWaitMinSeconds>1800</HostRebootWaitMinSeconds>

This change allows me to patch an ESXi host and install new firmware with the same reboot, with as few operations as possible.
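If you prefer not to edit the file by hand, the two values can also be bumped with a short Python script on the appliance. A sketch, assuming the default file location mentioned above (back the file up first and restart the Update Manager service afterwards):

import xml.etree.ElementTree as ET

path = "/usr/lib/vmware-updatemgr/bin/vci-integrity.xml"
tree = ET.parse(path)

# Raise both reboot wait timeouts
for tag, value in (("HostRebootWaitMaxSeconds", "5400"),
                   ("HostRebootWaitMinSeconds", "1800")):
    for elem in tree.getroot().iter(tag):
        elem.text = value

tree.write(path)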

Illegal OpCode while booting an HPE ProLiant server

I was installing a new ESXi host and after some steps I got an “Illegal OpCode” error while booting. It happened after patching ESXi with VUM. After some debugging I found the issue.

The server had local storage on which I had created a VMFS datastore before patching. In the BIOS, the boot order was CD/DVD ROM, Hard Disk and USB, and ESXi was installed onto USB. The error happened when the server tried to boot from the disk which contained the VMFS datastore. After I moved USB before Hard Disk in the boot order, the server booted correctly.

 

ScaleIO software no longer available for download

I saw an article in The Register reporting that Dell EMC will discontinue the software-only version of ScaleIO and that you can only get it if you buy it together with hardware (VxRack Flex). Today I tried searching the Dell EMC website for ScaleIO downloads, but all the links redirected to Dell EMC’s Converged Infrastructure homepage. It seems Dell EMC has removed the possibility to download ScaleIO from their website.

ScaleIO is a software-defined storage product. It converts direct-attached storage into shared block storage over the LAN.

More info: Wikipedia