Shelfchk is your Friend!

The Story

An invaluable learning experience over the weekend!

So, my task was pretty simple: hot-add two DS14mk2 AT-FCX shelves to one loop, and hot-add another two DS14mk2 AT-FCX shelves to another loop. Unfortunately, my diligence let me down: I failed to see that the loops continued into cabinets further away, and twice managed to hot-add shelves into the middle of a loop. That this didn't panic the controllers is testimony to the resilience of NetApp FAS-Series controllers (they are nearly idiot-proof)!

I had run the following command, but not given it the attention it deserved:
fcstat device_map
Note: The command for SAS shelves is
sasadmin expander_map

Also, the following command showed the secondary paths going down and then coming up as expected during the hot adds:
storage show disk -p

And, I was even able to assign the disks after each shelf hot-add (only doing one at a time with the FC shelves - okay to do more with SAS):
disk assign all
Note 1: To see all unassigned disks, use
disk assign -n
Note 2: I recommend turning disk auto-assign off before you start adding shelves
options disk.auto_assign off

Strangely, I ran Config Advisor and it said everything was okay!

Alas, I should have paid attention to these errors:
filera: ses.access.sesUnavailable:CRITICAL: Enclosure Services unavailable for one or more shelves on channel 1d.
DBG: ses_incorporate_path(): Enclosure Logical Identifier Mismatch.
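
Messages like these are easy to miss in a busy console session, so here is a minimal sketch of how saved console or syslog output could be scanned for them afterwards. It matches only the two message fragments quoted above; the function name and the idea of feeding it a list of saved lines are my own assumptions, not a NetApp tool.

```python
# Message fragments quoted in this post. Anything else about the log
# format is an assumption, so we only do plain substring matching.
WARNING_FRAGMENTS = (
    "ses.access.sesUnavailable",
    "Enclosure Logical Identifier Mismatch",
)


def find_ses_warnings(lines):
    """Return (line_number, text) for each line containing a known fragment.

    `lines` is any iterable of strings, e.g. an open file of saved
    console output (hypothetical input; collect it however suits you).
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(fragment in line for fragment in WARNING_FRAGMENTS):
            hits.append((number, line.rstrip()))
    return hits


if __name__ == "__main__":
    sample = [
        "filera: ses.access.sesUnavailable:CRITICAL: Enclosure Services "
        "unavailable for one or more shelves on channel 1d.",
        "an unrelated line",
        "DBG: ses_incorporate_path(): Enclosure Logical Identifier Mismatch.",
    ]
    for number, text in find_ses_warnings(sample):
        print(f"line {number}: {text}")
```

If either fragment shows up during a hot-add, that is the cue to stop and look at the loop layout before carrying on.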

When it dawned on me that the job had gone wrong -
shelfchk
- was invaluable in visualizing the loops (stacks), and leading to a resolution.
Note: Shelfchk will turn on the amber LEDs for all disks in a loop/stack.

The fix required downing both heads and correcting the shelf IDs. Alas, this is the only way to resolve this!
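
With hindsight, the underlying check is simple: no shelf ID should appear twice in the same loop. Below is a minimal sketch of that check, assuming you have written down by hand which shelf IDs sit on which loop (for example while walking the cabinets). The loop names and shelf IDs in the example are made up for illustration, and nothing here parses real command output.

```python
from collections import Counter


def find_shelf_id_conflicts(loops):
    """Return {loop_name: [duplicated shelf IDs]} for loops with conflicts.

    `loops` maps a loop name to the list of shelf IDs physically cabled
    into that loop. This is hand-entered data, a hypothetical format.
    """
    conflicts = {}
    for loop_name, shelf_ids in loops.items():
        counts = Counter(shelf_ids)
        duplicated = sorted(sid for sid, seen in counts.items() if seen > 1)
        if duplicated:
            conflicts[loop_name] = duplicated
    return conflicts


if __name__ == "__main__":
    # Made-up inventory: two new shelves with IDs 2 and 3 were added to
    # a loop that already had shelves 2 and 3 in a cabinet further away.
    inventory = {
        "loop_a": [1, 2, 3, 2, 3],
        "loop_b": [1, 2, 3, 4],
    }
    print(find_shelf_id_conflicts(inventory))
```

Running a check like this against the planned layout before the change window would have flagged the clash before any shelf was cabled in.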

Graphical Representation

Here’s a graphical representation of what happened:

Image: Key

Image: Pre hot-adding shelves

Image: Post hot-adding shelves (notice the shelf ID conflicts in red)

Image: Corrected shelf IDs
