Upgrading to ESXi 7.0 – The upgrade has VIBS that are missing dependencies

I was upgrading a cluster of HPE servers from ESXi 6.7 to ESXi 7.0 when one host complained about missing VIB dependencies – “The upgrade has VIBS that are missing dependencies”. That is normal when you upgrade ESXi hosts that have third-party tools and drivers installed. I encountered similar issues while upgrading my home lab (blog post about it). The easy fix is to remove those VIBs. What threw me off in this case was that the VIBs reported as missing dependencies were dell-configuration-vib and dellemc_osname_idrac.

Since it was an HPE server, the Dell VIBs made no sense. I could not remove them: they did not appear in the list of installed VIBs, and the remove command also failed.

What had happened is that I had mistakenly installed this server using the custom ISO for Dell servers instead of the custom ISO for HPE servers.

To fix the issue I am going to perform a clean installation of ESXi 7.0, this time using the custom HPE ISO.

Home lab upgraded to vSphere 7 … almost. Updated!!

I have updated my home lab to vSphere 7 with the exception of one host. Currently I have the following hardware in my home lab – two HPE ProLiant DL380 Gen8 (E5-2600 v2 series CPU), a Supermicro SYS-2028R-C1R4+ (E5-2600 v3 series CPU) and an HPE ProLiant DL380 G7 (X5600 series CPU). I used the original VMware ISO to perform the upgrades.

Supermicro SYS-2028R-C1R4+

I started with the Supermicro. It complained about unsupported devices – “Unsupported devices [8086:10a7 15d9:10a7] [8086:10a7 15d9:10a7] found on the host.” During remediation I checked “Ignore warnings about unsupported hardware devices” and after some time the host was upgraded.

HPE ProLiant DL380 Gen8

The HPE ProLiant DL380 Gen8 servers also had unsupported devices detected – “Unsupported devices [8086:105e 103c:7044] [8086:105e 103c:7044] found on the host.”
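To work out which physical device a warning like this refers to, it helps to split the bracketed values. Here is a minimal Python sketch. It assumes the brackets hold PCI IDs in the form vendor:device subvendor:subdevice (the message itself does not say so), and the names PciId and unsupported_devices are made up for this example.

```python
import re
from typing import List, NamedTuple


class PciId(NamedTuple):
    """One bracketed entry, assumed to be vendor:device subvendor:subdevice."""
    vendor: str
    device: str
    subvendor: str
    subdevice: str


# Matches entries such as "[8086:105e 103c:7044]" (four hex digits per field).
_ENTRY = re.compile(
    r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4}) ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]"
)


def unsupported_devices(message: str) -> List[PciId]:
    """Return the distinct IDs found in a warning, in order of first appearance."""
    found: List[PciId] = []
    for match in _ENTRY.finditer(message):
        pci_id = PciId(*(group.lower() for group in match.groups()))
        if pci_id not in found:  # the warning repeats the same device
            found.append(pci_id)
    return found


if __name__ == "__main__":
    warning = (
        "Unsupported devices [8086:105e 103c:7044] "
        "[8086:105e 103c:7044] found on the host."
    )
    for pci_id in unsupported_devices(warning):
        print(pci_id)
```

The resulting vendor and device values can then be looked up in a PCI ID database or in the VMware compatibility guide.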

They also had some VIBs installed that were missing dependencies:

QLC_bootbank_qfle3f_1.0.68.0-1OEM.670.0.0.8169922
HPE_bootbank_scsi-hpdsa_5.5.0.68-1OEM.550.0.0.1331820
QLC_bootbank_qedi_2.10.15.0-1OEM.670.0.0.8169922
HPE_bootbank_scsi-hpdsa_5.5.0.68-1OEM.550.0.0.1331820
QLC_bootbank_qedf_1.3.36.0-1OEM.600.0.0.2768847
QLC_bootbank_qedf_1.3.36.0-1OEM.600.0.0.2768847
QLC_bootbank_qedf_1.3.36.0-1OEM.600.0.0.2768847
QLC_bootbank_qedi_2.10.15.0-1OEM.670.0.0.8169922

I used the following commands to remove them:

esxcli software vib remove --vibname qedf
esxcli software vib remove --vibname qedi
esxcli software vib remove --vibname qfle3f
esxcli software vib remove --vibname scsi-hpdsa
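The short names passed to --vibname can be read straight out of the reported identifiers. The Python sketch below shows one way to do that. It assumes every identifier has the form vendor_bootbank_name_version, which holds for the entries listed above but is not something I have checked against every VIB. The helper names vib_name and removal_commands are made up for this example.

```python
from typing import Iterable, List

# Identifiers exactly as reported by the upgrade check (duplicates included).
REPORTED = [
    "QLC_bootbank_qfle3f_1.0.68.0-1OEM.670.0.0.8169922",
    "HPE_bootbank_scsi-hpdsa_5.5.0.68-1OEM.550.0.0.1331820",
    "QLC_bootbank_qedi_2.10.15.0-1OEM.670.0.0.8169922",
    "HPE_bootbank_scsi-hpdsa_5.5.0.68-1OEM.550.0.0.1331820",
    "QLC_bootbank_qedf_1.3.36.0-1OEM.600.0.0.2768847",
    "QLC_bootbank_qedf_1.3.36.0-1OEM.600.0.0.2768847",
    "QLC_bootbank_qedf_1.3.36.0-1OEM.600.0.0.2768847",
    "QLC_bootbank_qedi_2.10.15.0-1OEM.670.0.0.8169922",
]


def vib_name(identifier: str) -> str:
    """Extract <name> from '<vendor>_bootbank_<name>_<version>' (assumed layout)."""
    _vendor, separator, rest = identifier.partition("_bootbank_")
    name, underscore, _version = rest.rpartition("_")
    if not separator or not underscore or not name:
        raise ValueError(f"unexpected VIB identifier: {identifier!r}")
    return name


def removal_commands(identifiers: Iterable[str]) -> List[str]:
    """Build one remove command per distinct VIB, in order of first appearance."""
    names: List[str] = []
    for identifier in identifiers:
        name = vib_name(identifier)
        if name not in names:
            names.append(name)
    return [f"esxcli software vib remove --vibname {name}" for name in names]


if __name__ == "__main__":
    # Only prints the commands; they still have to be run on the ESXi host.
    for command in removal_commands(REPORTED):
        print(command)
```

For the list above this yields the same four VIB names I removed by hand, just in a different order.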

After this I upgraded the hosts while again checking the “Ignore warnings about unsupported hardware devices” option.

HPE ProLiant DL380 G7

The HPE ProLiant DL380 G7 has an unsupported X5650 CPU and I was not able to update it. I guess it needs to be replaced with something newer.

Update: I used the “AllowLegacyCPU=true” option to enable the upgrade on the HPE DL380 G7 with the X5650 CPU. More info – https://www.virtuallyghetto.com/2020/04/quick-tip-allow-unsupported-cpus-when-upgrading-to-esxi-7-0.html

HPE ProLiant Gen9 servers lose connection to SD-card

In recent months we have had several issues with different HPE ProLiant BL460c Gen9 servers where ESXi reported errors when it needed to access the OS disk, which in this case is an SD-card. In some cases the server no longer booted after an ESXi restart, because the OS SD-card was no longer visible to the BIOS. Initially we thought that our SD-cards were dead, but when we replaced some of them and checked the failed cards, they appeared to be OK. So the next time an SD-card failed, we did an e-fuse reset of the server through Onboard Administrator. The SD-card was again visible to the BIOS and ESXi booted correctly.

Command to perform an e-fuse reset from the Onboard Administrator CLI – reset server <bay_number>

HPE iLO problem with Embedded Flash/SD-CARD

Some time ago I discovered two HPE BL490c Gen9 servers with iLO in “Degraded” status. The diagnostics page showed that the error was related to the Embedded Flash/SD-CARD – “Embedded media manager failed initialization”. The login banner also showed a warning.

With iLO 4 firmware 2.61 or newer there is a “Format” button to format the embedded Flash/SD-CARD. If you format it, the iLO resets and hopefully the error is fixed. It worked on one of my servers. The other one still showed the error after the iLO reset, so I power-cycled the blade server using the e-fuse process: I logged into Onboard Administrator and issued “reset server <bay_number>”. After the server restarted, the iLO error disappeared.

Advisory from HPE regarding the issue – https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04996097

 

Dell software repository for VMware Update Manager

I have written in the past about HPE software and driver repositories – “Automatically download HP drivers to VMware Update Manager” and “New URLs for HP(E) Online Depot for VMWare”.

There is also a similar repository for Dell – https://vmwaredepot.dell.com/index.xml. When you add it to Update Manager, it downloads additional Dell software – Dell EMC OpenManage Server Administrator and Dell EMC iDRAC Service Module.

We have found the Dell EMC iDRAC Service Module very useful. It provides OS information to iDRAC and also installs binaries that you can use to reset iDRAC from ESXi if it becomes unresponsive (command: /opt/dell/srvadmin/iSM/bin/Invoke-iDRACHardReset -f).

Home lab update – 2018

For the past couple of months I have been working on updating and upgrading my home lab.

My lab now includes:

3-node vSAN cluster with HPE DL360 G7 SFF, HPE DL380 G7 SFF and HPE DL380 G7 LFF.
A standalone ESXi host on an HPE DL380 G7 for running vCenter 6.7 U1 and other supporting services.
A standalone Windows Server on an HPE DL380 Gen8 to run VMs in VMware Workstation and to provide the file server service.

Currently the network is 1G. I am planning to upgrade to 10G in the future.

Some things I discovered while building the lab:
HPE DL380 G7 LFF with HP P410i also accepts 8TB disks. The HPE QuickSpecs only list disks up to 4TB.
HPE DL380 Gen8 also works with DDR3 16GB 1067MHz quad-rank RDIMM memory modules. I was able to install 128GB per CPU. The operating frequency was reduced to 800MHz.

Extended my home lab

I’ve recently extended my home lab with additional capacity. In addition to my Windows Server + VMware Workstation (info here) I’ve added a refurbished HPE DL380 G7 server with the following configuration:

1 x Intel Xeon Processor X5650 2.66GHz
96GB RAM
1TB HDD
will add SSD in the future

The added server is running VMware ESXi 6.7. It hosts the vCenter 6.7 appliance and also a few virtual ESXi 6.7 instances. HPE G7 series servers are not officially supported by VMware to run ESXi 6.7, but it seems to be working for now.

I found my refurbished HPE G7 server on eBay.

Modify VMware Update Manager host reboot timeouts in vSphere vCenter 6.5 appliance

I recently changed from the Windows-based VMware Update Manager (VUM) to the Update Manager that is embedded in the vCenter appliance. In the old VUM I had increased the host reboot timeouts to allow host firmware patching during reboot without the remediation job timing out. In the appliance, the vci-integrity.xml file is located in “/usr/lib/vmware-updatemgr/bin”. You need to restart the VUM service or the appliance after the change.

The lines that need to be changed are the following:

<HostRebootWaitMaxSeconds>1800</HostRebootWaitMaxSeconds>
<HostRebootWaitMinSeconds>600</HostRebootWaitMinSeconds>

I changed the values to:

<HostRebootWaitMaxSeconds>5400</HostRebootWaitMaxSeconds>
<HostRebootWaitMinSeconds>1800</HostRebootWaitMinSeconds>

This change allows me to patch an ESXi host and install new firmware in the same reboot, with as few operations as possible.
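If you need to make the same change on several appliances, the edit can be scripted. Below is a minimal Python sketch using only the standard library. It works on XML text rather than on the real file, and the vci_config wrapper element in the sample is made up, since I am only showing the two lines quoted above and not the full structure of vci-integrity.xml. The function name set_reboot_timeouts is also just for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the two timeout elements come from the real file,
# the <vci_config> wrapper is an assumption made for this example.
SAMPLE = """<vci_config>
  <HostRebootWaitMaxSeconds>1800</HostRebootWaitMaxSeconds>
  <HostRebootWaitMinSeconds>600</HostRebootWaitMinSeconds>
</vci_config>"""


def set_reboot_timeouts(xml_text: str, min_seconds: int, max_seconds: int) -> str:
    """Return xml_text with both host reboot timeout values replaced."""
    if min_seconds > max_seconds:
        raise ValueError("min_seconds must not exceed max_seconds")
    root = ET.fromstring(xml_text)
    wanted = {
        "HostRebootWaitMinSeconds": min_seconds,
        "HostRebootWaitMaxSeconds": max_seconds,
    }
    for tag, value in wanted.items():
        elements = list(root.iter(tag))  # finds the tag at any depth
        if not elements:
            raise KeyError(f"<{tag}> not found in the configuration")
        for element in elements:
            element.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_reboot_timeouts(SAMPLE, min_seconds=1800, max_seconds=5400))
```

Note that ElementTree drops XML comments when it parses with the default settings, so keep a backup copy of the original file and compare the result before restarting the service.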

Illegal OpCode while booting an HPE ProLiant server

I was installing a new ESXi host and after some steps I got the error “Illegal OpCode” while booting. It happened after patching ESXi with VUM. After some debugging I found the issue.

The server had local storage on which I had created a VMFS datastore before patching. The boot order in the BIOS was CD/DVD ROM, Hard Disk and USB. ESXi was installed on USB. The error happened when the server tried to boot from the disk that contained the VMFS datastore. After I moved USB ahead of Hard Disk in the boot order, the server booted correctly.

 

Firmware update fails on HPE server when Serial Number and Product ID are missing

Recently I was having issues updating an HPE ProLiant BL460c G7 with the latest SPP (2016.10). The firmware update just stopped at Step 1. The HPE custom ESXi ISO also failed to work.

After some digging around I discovered that the server Serial Number and Product ID were missing. I went into the BIOS and filled in the correct Serial Number and Product ID. After that the firmware update worked and I was also able to install the HPE custom ESXi.

I suspect that the Serial Number and Product ID were lost when this blade server was removed from one Virtual Connect infrastructure and placed into another.