Saturday, 26 October 2013

“Aggregate Relocate” with Data ONTAP 8.1 7-Mode

UPDATE: As pointed out by Mizen (thank you very much, Mizen), there is a NetApp KB article, “How to move an aggregate between software disk-owned HA pairs”, that covers this procedure in much more detail if you have an HA pair. It also includes the following to offline an aggregate that still contains volumes:
FILER1>priv set diag
FILER1*>aggr offline -a


Aggregate Relocate (ARL) was introduced in Clustered Data ONTAP 8.2. It allows a storage admin to non-disruptively relocate aggregate ownership within an HA pair; the main use case is nondisruptive controller upgrades.

The Clustered ONTAP CLI command for ARL is*:
storage aggregate relocation start
*Check out the “Clustered Data ONTAP 8.2 High-Availability Configuration Guide” for more information.

This post, though, is about 7-Mode. ARL is not a feature of Data ONTAP 8.2 7-Mode, and it is extremely unlikely that it ever will be. Search for “relocate” in the “Data ONTAP 8.2 Release Notes For 7-Mode” or the “Data ONTAP 8.2 High Availability and MetroCluster Configuration Guide For 7-Mode” and you will find no mention of it, since it does not exist in 7-Mode!

So, what is the title getting at?

Well, one feature of 7-Mode that is not a feature of Clustered ONTAP** is that you can offline an aggregate, remove ownership of its disks, assign those disks to another 7-Mode controller, online the aggregate, and have all the volumes, LUNs, etcetera return. Of course, this operation is disruptive.

**Why is it not possible in Clustered ONTAP? Clustered ONTAP has four replicated database (RDB) units, one of which is the VLDB (volume location database). If disks holding a foreign aggregate and its volumes are added to a controller running Clustered ONTAP, those volumes do not exist in the VLDB and so cannot be brought back!

Walkthrough: How to Disruptively Relocate an Aggregate from one 7-Mode Controller to Another

In the following lab illustration, we do not have a physical HA-Pair to play with, just the 8.1.2 7-Mode Simulator. We have a controller (ctra) with two aggregates - aggr0 (3 disks) and aggr1 (24 disks); aggr0 has the root volume (vol0) and will not be touched, aggr1 has 3 volumes and each volume has a LUN. To demonstrate the process, we will offline aggr1 and remove ownership of all the disks from ctra, then do a 4a wipe of ctra and bring this up as ctrb, then assign aggr1’s disks to ctrb, bring aggr1 online, and verify the volumes and LUNs are still there!

Note: A fuller CLI output is contained in the Appendix - here we essentially just record the commands.

# Verify Aggregates, Volumes, LUNs and Disks #

ctra> aggr status
          aggr1 online
          aggr0 online

ctra> vol status
           vol0 online
           vol1 online 
           vol2 online
           vol3 online

ctra> aggr status -v aggr1
        Volumes: vol1, vol2, vol3

ctra> lun show
        /vol/vol1/lun1   (r/w, online, mapped)
        /vol/vol2/lun2   (r/w, online, mapped)
        /vol/vol3/lun3   (r/w, online, mapped)

ctra> aggr status -r aggr0
Aggregate aggr0 (online, raid_dp)
v5.16 v5.17 v5.18

ctra> aggr status -r aggr1
Aggregate aggr1 (online, raid_dp)
v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29  
v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32  

ctra> aggr status -s
Pool0 spare disks
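
Since every later step needs the exact disk list for aggr1, it can help to pull the names out of a saved copy of the `aggr status -r aggr1` listing rather than retyping them. The following is a minimal Python sketch, not a NetApp tool: the `extract_disks` helper and the regular expression are assumptions written for the simulator-style names (`v4.16` and so on) and for the abbreviated listing shown above; real output has more columns and physical disks are named differently.

```python
import re

# Matches simulator-style disk names such as v4.16. Physical disk names
# follow other patterns, so this regex is an assumption tied to the lab.
DISK_NAME = re.compile(r"\bv\d+\.\d+\b")


def extract_disks(text):
    """Return disk names in order of first appearance, without duplicates."""
    disks = []
    for name in DISK_NAME.findall(text):
        if name not in disks:
            disks.append(name)
    return disks


# Abbreviated listing copied from the walkthrough.
AGGR1_LISTING = """\
Aggregate aggr1 (online, raid_dp)
v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29
v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32
"""

aggr1_disks = extract_disks(AGGR1_LISTING)
print(len(aggr1_disks), "disks:", " ".join(aggr1_disks))
```

For the lab listing this reports 24 disks, which matches the size of aggr1 described earlier.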

# Unmap LUNs, Offline LUNs, Offline Volumes, and try to Offline aggr1 #

ctra> lun unmap /vol/vol1/lun1 igroup1
ctra> lun unmap /vol/vol2/lun2 igroup1
ctra> lun unmap /vol/vol3/lun3 igroup1

ctra> lun offline /vol/vol1/lun1
ctra> lun offline /vol/vol2/lun2
ctra> lun offline /vol/vol3/lun3

ctra> vol offline vol1
ctra> vol offline vol2
ctra> vol offline vol3
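
With more than a handful of LUNs, typing the unmap and offline commands by hand gets error-prone. As a rough sketch, the commands above could be generated from a list of LUN paths. The `quiesce_commands` helper is hypothetical, and it assumes every LUN is mapped to the same igroup and lives at `/vol/<volume>/<lun>`, as in this lab; the output is meant to be reviewed and pasted into the console, not run blindly.

```python
def quiesce_commands(lun_paths, igroup):
    """Build the 7-Mode commands to unmap and offline LUNs, then offline
    their volumes, in the same order as the walkthrough."""
    volumes = []
    for path in lun_paths:
        # "/vol/vol1/lun1".split("/") -> ["", "vol", "vol1", "lun1"]
        volume = path.split("/")[2]
        if volume not in volumes:
            volumes.append(volume)

    commands = [f"lun unmap {path} {igroup}" for path in lun_paths]
    commands += [f"lun offline {path}" for path in lun_paths]
    commands += [f"vol offline {volume}" for volume in volumes]
    return commands


luns = ["/vol/vol1/lun1", "/vol/vol2/lun2", "/vol/vol3/lun3"]
quiesce = quiesce_commands(luns, "igroup1")
for command in quiesce:
    print(command)
```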

ctra> priv set advanced
ctra*> aggr offline aggr1
aggr offline: Cannot offline aggregate 'aggr1' because it contains one or more flexible volumes.

Note 1: aggr1 could not be set offline!
Note 2: If you try to remove_ownership of a disk from an online aggregate you get:

ctra*> disk remove_ownership v4.16 -f
Disk v4.16 will have its ownership removed
Ownership Remove request failed for disk v4.16. Reason:Disk is part of an online aggregate or volume. Changing its owner is not allowed, because that may cause aggregate, volume, or filer outage.
Disk ownership remove request failed.

# For a physical HA-pair disable controller failover #

ctra> cf disable

# Disable disk autoassign #

ctra> options disk.auto_assign off
ctrb> options disk.auto_assign off

# Boot into maintenance mode #

ctra> reboot

Press Ctrl-C for Boot Menu.
Selection (5) for Maintenance mode boot

# Offline aggregate aggr1 and remove disk ownership of its disks #

*> aggr status
*> aggr offline aggr1
*> disk remove_ownership -f v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29
*> disk remove_ownership -f v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32
*> disk show -a
*> halt
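
The two long `disk remove_ownership` lines above, and the matching `disk assign` lines used later on the destination, follow the same pattern: one command per adapter prefix (v4, v5). A small Python sketch of that pattern follows. The helper names are made up for illustration, and the command syntax simply mirrors what this post uses; check it against your own ONTAP release before pasting.

```python
from itertools import groupby


def by_adapter(disks):
    """Group disk names by the part before the dot (v4, v5, ...).
    groupby only merges neighbours, so the list is expected to be
    ordered by adapter already, as it is in the walkthrough."""
    return [list(group) for _, group in
            groupby(disks, key=lambda d: d.split(".")[0])]


def remove_ownership_commands(disks):
    """Commands for the source controller (maintenance mode in this post)."""
    return ["disk remove_ownership -f " + " ".join(group)
            for group in by_adapter(disks)]


def assign_commands(disks):
    """Commands for the destination controller."""
    return ["disk assign " + " ".join(group) for group in by_adapter(disks)]


AGGR1_DISKS = (
    "v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 "
    "v4.28 v4.29 v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 "
    "v5.29 v5.32"
).split()

release = remove_ownership_commands(AGGR1_DISKS)
claim = assign_commands(AGGR1_DISKS)
for command in release + claim:
    print(command)
```

Generating both sets from one list keeps the disks released on the source and the disks claimed on the destination identical.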

# Reassign the disks to ctrb and bring aggregates, volumes, and LUNs online #

ctrb> aggr status -r
ctrb> disk assign v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29
ctrb> disk assign v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32
ctrb> aggr status
ctrb> aggr status -r
ctrb> aggr online aggr1
ctrb> vol status
ctrb> vol online vol1
ctrb> vol online vol2
ctrb> vol online vol3
ctrb> lun show
ctrb> lun online /vol/vol1/lun1
ctrb> lun online /vol/vol2/lun2
ctrb> lun online /vol/vol3/lun3

ctrb> lun show
        /vol/vol1/lun1   (r/w, online)
        /vol/vol2/lun2   (r/w, online)
        /vol/vol3/lun3   (r/w, online)
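
To check that nothing went missing in the move, the `lun show` listing saved from ctra before the move can be compared with the one taken on ctrb afterwards. The sketch below is one way to do that in Python. The `parse_lun_show` helper is hypothetical and written for the abbreviated listings in this post; it assumes each LUN line starts with the path and ends with the parenthesised state list.

```python
import re

# Path at the start of the line, state list in the last pair of parentheses.
LUN_LINE = re.compile(r"^\s*(/vol/\S+).*\(([^()]*)\)\s*$")


def parse_lun_show(text):
    """Map each LUN path to its list of states, e.g. ['r/w', 'online']."""
    luns = {}
    for line in text.splitlines():
        match = LUN_LINE.match(line)
        if match:
            luns[match.group(1)] = [s.strip() for s in match.group(2).split(",")]
    return luns


BEFORE = """\
        /vol/vol1/lun1   (r/w, online, mapped)
        /vol/vol2/lun2   (r/w, online, mapped)
        /vol/vol3/lun3   (r/w, online, mapped)
"""
AFTER = """\
        /vol/vol1/lun1   (r/w, online)
        /vol/vol2/lun2   (r/w, online)
        /vol/vol3/lun3   (r/w, online)
"""

before, after = parse_lun_show(BEFORE), parse_lun_show(AFTER)
missing = set(before) - set(after)
not_online = [path for path, states in after.items() if "online" not in states]
unmapped = [path for path, states in after.items() if "mapped" not in states]

print("missing:", sorted(missing))
print("not online:", not_online)
print("not mapped:", unmapped)
```

In this lab all three LUNs are present and online on ctrb. None of them shows as mapped, which is consistent with the LUNs having been unmapped on ctra before the move.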

# Re-enable disk autoassign #

ctra> options disk.auto_assign on
ctrb> options disk.auto_assign on

# For a physical HA-pair re-enable controller failover #

ctrb> cf enable

THE END: 7-Mode Disruptive Aggregate Relocate is done!
