The Story
A dual-controller FAS3140 (HA pair) with 11 x DS14 shelves in 5 loops of:
2 x Mk2-AT (Single Path), 2 x Mk4-FC (Single Path), 3 x Mk2-AT (MPHA), 2 x Mk2-AT (Single Path), 2 x Mk2-FC (Single Path)
The unused onboard FC target ports were disabled and their type changed from target to initiator with the commands below. The ports are enabled automatically on reboot and cannot be enabled before then (this is 7DOT 8.0.2P5).
*This is so we can later use them for MPHA cabling of the fibre-cabled shelves, since an FC HBA needs to be removed to make way for a SAS one!
fcadmin config -d 0a
fcadmin config -d 0a
fcadmin config -d 0a
fcadmin config -d 0a
fcadmin config -t initiator 0a
fcadmin config -t initiator 0a
fcadmin config -t initiator 0a
fcadmin config -t initiator 0a
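The order matters: every port is disabled before its type is changed. As a rough sketch of how that command list could be generated for several ports, here is a small Python helper. The function name `fcadmin_retype_commands` and the port names in the usage lines are hypothetical; only the two `fcadmin config` forms shown above are used.

```python
def fcadmin_retype_commands(ports, new_type="initiator"):
    """Return the fcadmin lines to disable each port, then change its type.

    All disable (-d) lines come first, followed by all type (-t) lines,
    mirroring the order of the commands listed above.
    """
    # This sketch only handles the two types mentioned in the post.
    if new_type not in ("initiator", "target"):
        raise ValueError(f"port type not handled by this sketch: {new_type}")
    disable = [f"fcadmin config -d {port}" for port in ports]
    retype = [f"fcadmin config -t {new_type} {port}" for port in ports]
    return disable + retype


if __name__ == "__main__":
    # Hypothetical port names, for illustration only.
    for line in fcadmin_retype_commands(["0a", "0b"]):
        print(line)
```

The output is plain text to paste into the console, so nothing here talks to the controller directly.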
The controllers were shut down with the commands below. The same commands were run on each controller, except that cf disable was run on only one of them, and snapmirror off was additionally run on any remote controller with a SnapMirror pull* relationship with this pair:
*In 7-Mode SnapMirror is pull; in C-Mode it is push!
options autosupport.doit "Maintenance window!"
options autosupport.enable off
snapmirror off
cifs terminate
cf disable
halt
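The rules for that sequence (same commands on both controllers, cf disable on only one, snapmirror off also on any remote SnapMirror partner) can be written down as a small runbook builder. This is only a sketch in Python: the function names and the `run_cf_disable` flag are hypothetical, and the command strings are exactly those listed above.

```python
def shutdown_commands(run_cf_disable):
    """Build the ordered shutdown sequence for one controller of the HA pair.

    run_cf_disable should be True for exactly one of the two controllers,
    since cf disable only needs to be issued once for the pair.
    """
    commands = [
        'options autosupport.doit "Maintenance window!"',
        "options autosupport.enable off",
        "snapmirror off",
        "cifs terminate",
    ]
    if run_cf_disable:
        commands.append("cf disable")
    commands.append("halt")  # halt is always the last step
    return commands


def remote_partner_commands():
    """Commands for a remote controller that pulls SnapMirror from this pair."""
    return ["snapmirror off"]


if __name__ == "__main__":
    for label, cmds in [
        ("controller 1", shutdown_commands(run_cf_disable=True)),
        ("controller 2", shutdown_commands(run_cf_disable=False)),
        ("remote SnapMirror partner", remote_partner_commands()),
    ]:
        print(f"--- {label} ---")
        print("\n".join(cmds))
```

Keeping cf disable behind a flag makes the one difference between the two controllers explicit instead of relying on memory during the maintenance window.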
Both controllers’ hardware was modified by removing the quad-port FC HBA from slot 1 and replacing it with a quad-port SAS HBA (as per the Hardware Universe slot recommendations).
The 11 x DS14 shelves were re-cabled in 3 loops (5 loops consolidated into 3) of:
3 x Mk2-AT (MPHA), 4 x Mk2-AT (MPHA), 4 x Mk2-FC (MPHA)
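A quick sanity check before re-cabling is to confirm that the new loop layout still accounts for every shelf. A minimal sketch in Python, using the shelf counts quoted in this post (the variable and function names are made up for illustration):

```python
# (shelf count, shelf type) per loop, as listed in the post.
LOOPS_BEFORE = [(2, "Mk2-AT"), (2, "Mk4-FC"), (3, "Mk2-AT"), (2, "Mk2-AT"), (2, "Mk2-FC")]
LOOPS_AFTER = [(3, "Mk2-AT"), (4, "Mk2-AT"), (4, "Mk2-FC")]


def total_shelves(loops):
    """Sum the shelf counts across all loops."""
    return sum(count for count, _shelf_type in loops)


if __name__ == "__main__":
    before = total_shelves(LOOPS_BEFORE)
    after = total_shelves(LOOPS_AFTER)
    print(f"{len(LOOPS_BEFORE)} loops, {before} shelves before")
    print(f"{len(LOOPS_AFTER)} loops, {after} shelves after")
    if before != after:
        raise SystemExit("shelf count changed: re-check the cabling plan")
```

Both layouts add up to 11 shelves (2 + 2 + 3 + 2 + 2 and 3 + 4 + 4).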
Two new DS4243 shelves were cabled in an MPHA stack.
Shelf IDs set.
Ready to power up!
Shelves powered up (a few power-cycles on some of the old DS14s to get the shelf ID to stay solid)!
Controllers powered up (after a 5-minute wait following the shelves)!
Controller 1 came up fine and we ran:
aggr status
vol status
storage show disk -p
Which showed all aggregates online, all volumes online, and all disks as multipathed via A and B paths.
Controller 2 Failed to Boot
Had we done something wrong?
After looping through the boot process a couple of times it hit a:
PANIC
Uncorrectable Machine Check Error CPU0
And came to a final resting place at the LOADER-A> prompt.
Running the below from the LOADER prompt:
boot_diags
The first thing we see is a message stating:
Failed NVRAM module. Powercycle system. If message persists replace motherboard.
Further digging into boot_diags > mb > NVRAM showed: “NVRAM IB0 failed to initialize / uninitialized”
So, We Did a Few Things
Powercycled the controller.
Shut down the controller, removed the NVRAM battery, reseated the NVRAM memory.
But all to no avail!
So a replacement motherboard tray (X3540-R5) was ordered from support and later fitted.
In the meantime half our services were down, since this dual-controller HA pair was running on one controller with cf disabled. The solution was to run:
cf forcetakeover
So, all services on the downed partner controller became available on the one surviving controller.
END OF STORY!
Actually, not quite the end of the story! After doing the cf forcetakeover and getting some services back up, alas, there was a bit of a comedy of errors: the maintenance engineers, who had to move 2 PDUs to make room for the failed controller to slide out, managed to pull out both power cables from the FAS3140A! Fortunately, the surviving controller survived its unplanned hard reset and came back up again without any complaint.
END OF STORY!!
A bit of background to the story:
These controllers and shelves had not been powered off in a good couple of years, and were over 4 years old. It’s almost to be expected that machinery that runs happily when nice, warm and cosy, then is shut down and gets cold, might have a few grumbles on power-up (thermal expansion > contraction > expansion again).