I am running ESXi 5.5 on a Dell R620. I have an issue with my PERC battery, which I am researching in another post on the Dell forum:

(Controller event log: Battery charging was suspended due to high battery temperature: Controller 0 (PERC H710P Mini))

But in the midst of all this I suddenly got an alert from vCenter:

Health of SEL changed from gray to red. Sensor name: IPMI SEL.

So I researched that, and VMware said to clear the event log and then reset the sensors.

So I did that, and suddenly new alerts were popping up all over the place:

Processor 2 Status- Thermal Trip Alert

Processor 2 M23 VDDQ PG 0 Voltage Warning

and Add-in Card 4 ROMB Battery 0 Failed.

Instrumentation Service: Automatic System Recovery (ASR) action was performed

Ok, so now I am a little worried, since besides the battery issue I have no idea what these mean. So I immediately logged into my iDRAC card: everything green. Then I logged into OpenManage Server Administrator for my Dell R620, which I installed on ESXi, and it is also all green.

Should I reset these hardware sensors again in vSphere? I didn't like what happened when I did it last time! I really have no clue if this is something to worry about. The machine is 40 days old and I am wondering if I can get some sleep or not.

@Dell_Technologies

Yikes. This is why I like having multiple hosts.

vMotion the VMs off, then let the fun begin.

If it is only 40 days old, call Dell! They have VMware techs, and that is going to be your best option!

Have you actually checked the server's location to make sure the server is physically sound and not overheating, and that the room is cool?

Run the Dell diagnostic tools to check the hardware.

Phone Dell and get them onto it ASAP.

Make sure you have recent good backups of the VMs.

Thanks guys. I called Dell, and they thought it might just be related to needing a critical iDRAC 7 firmware update. So far so good: the alarms are all back in the green. That doesn't mean it won't come back, I guess, but it's good enough for a Sunday morning.

All of that for an iDRAC update? I think Dell may be blowing smoke…

Some of this was part of the 1.9.x update for the iDRAC and the 1.6 Lifecycle Controller firmware.

Make sure that you enable the system inventory scan on restart so that the components are ID'd properly.

Just a note on this for future reference.

The VMware alerts just pass on what the firmware for the devices is saying. If your RAID controller says that there is a predictive failure, then so will vCenter.

So if alerts start popping up, it is best to check the firmware first to see what is up. If there is nothing there, then, like you did, it is best to give your vendor a call.

Well, so far so good: everything is still in the green. According to the Dell tech I talked to, the iDRAC cards handle the hardware status (IPMI etc.). The firmware upgrade seems to have fixed the problem. I have another Dell, an R320 (no ESXi, just running Windows), with the same iDRAC 7 firmware and exactly the same problem. I looked through its event logs and found something I had missed before because it was from weeks ago. It has EXACTLY the same errors:

Controller event log: Battery charging was suspended due to high battery temperature: Controller 0 (PERC H710 Mini)

It also has some voltage errors. So this actually makes me feel a little more at ease: it looks like it was truly a firmware issue rather than Dell batteries hating me.

I of course immediately updated the iDRAC firmware on it as well, and everything seems to be going well on both machines (fingers crossed).

So, to anyone with a new Dell iDRAC 7 card and firmware > 1.50.50: I urge you to do the critical update to 1.56.55.

For the first time in my life, when support told me to update firmware, it might actually have fixed something.
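If you have several servers to check, comparing firmware versions by eye gets error-prone, and plain string comparison gives wrong answers for dotted versions (as text, "1.9.0" sorts after "1.56.55"). Below is a minimal sketch of a numeric comparison. The function names `parse_version` and `needs_update` are made up for illustration, and the only value taken from this thread is the 1.56.55 target. It assumes you have already read the installed version from the iDRAC yourself; it does not talk to any hardware.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.56.55' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed, fixed="1.56.55"):
    """Return True when the installed version is older than the fixed one.

    The default of 1.56.55 is the update mentioned in this thread; treat it
    as an assumption and check Dell's current release notes for your model.
    """
    return parse_version(installed) < parse_version(fixed)


if __name__ == "__main__":
    # Hypothetical installed versions, for illustration only.
    for version in ("1.50.50", "1.56.55", "1.9.0"):
        print(version, "needs update:", needs_update(version))
```

Tuples of integers compare element by element, which is why this handles "1.9.0" versus "1.56.55" correctly where comparing the raw strings would not.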
