UPDATE: As pointed out by Mizen (thank you very much, Mizen), there's a NetApp KB article - "How to move an aggregate between software disk-owned HA pairs" - that covers doing this in much better detail if you've got an HA-pair. It also includes the following to offline an aggregate that contains volumes:
FILER1> priv set diag
FILER1*> aggr offline <aggr_name>
Introduction
Aggregate Relocate (ARL) was introduced in Clustered Data ONTAP 8.2, allowing a storage admin to non-disruptively relocate aggregate ownership within an HA-pair. The main use case is nondisruptive controller upgrade operations.
The Clustered ONTAP CLI command for ARL is*:
storage aggregate relocation start
*Check out the "Clustered Data ONTAP 8.2 High-Availability Configuration Guide" for more information.
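As a rough sketch of how that might be invoked (the cluster, node, and aggregate names here are placeholders - check the guide for the exact options on your version):
cluster1::> storage aggregate relocation start -node node1 -destination node2 -aggregate-list aggr1
cluster1::> storage aggregate relocation show
The relocation show command lets you monitor the progress of a running relocation job.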
This post, though, is about 7-Mode. ARL is not a feature of Data ONTAP 8.2 7-Mode, and it is extremely unlikely it will ever be a feature of Data ONTAP 7-Mode (do a search for "relocate" in the "Data ONTAP 8.2 Release Notes For 7-Mode" or the "Data ONTAP 8.2 High Availability and MetroCluster Configuration Guide For 7-Mode" and you will find no mention of it, since it does not exist!)
So, what is the title getting at?
Well, one feature of 7-Mode that is not a feature of Clustered ONTAP** is that you can offline an aggregate, remove ownership of its disks, assign those disks to another 7-Mode controller, online the aggregate, and have all the volumes, LUNs, etcetera return. Of course, this operation is disruptive (an outline of the command sequence follows the footnote below).
**Why is it not possible in Clustered ONTAP? Well, Clustered ONTAP has four replicated database (RDB) units, and one of these is the VLDB (Volume Location Database). If disks holding a foreign aggregate and volumes are added to a controller running Clustered ONTAP, those volumes do not exist in the VLDB and so cannot be brought back!
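In outline, the sequence looks like this (generic placeholder names only - the walkthrough below records the real commands, and the Appendix has the fuller output):
ctra> vol offline <each_volume>             (after unmapping and offlining any LUNs)
*> aggr offline <aggr_name>                 (in maintenance mode on ctra)
*> disk remove_ownership -f <disk_list>     (in maintenance mode on ctra)
ctrb> disk assign <disk_list>
ctrb> aggr online <aggr_name>
ctrb> vol online <each_volume>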
Walkthrough: How to Disruptively Relocate an Aggregate from one 7-Mode Controller to Another
In the following lab illustration, we do not have a physical HA-pair to play with, just the 8.1.2 7-Mode Simulator. We have a controller (ctra) with two aggregates - aggr0 (3 disks) and aggr1 (24 disks); aggr0 has the root volume (vol0) and will not be touched, aggr1 has 3 volumes, and each volume has a LUN. To demonstrate the process, we will offline aggr1 and remove ownership of all its disks from ctra, then do a 4a wipe of ctra and bring it up as ctrb, then assign aggr1's disks to ctrb, bring aggr1 online, and verify the volumes and LUNs are still there!
Note: A fuller CLI output is contained in the Appendix - here we essentially just record the commands.
# Verify Aggregates, Volumes, LUNs and Disks #
ctra> aggr status
aggr1 online
aggr0 online
ctra> vol status
vol0 online
vol1 online
vol2 online
vol3 online
ctra> aggr status -v aggr1
Volumes: vol1, vol2, vol3
ctra> lun show
/vol/vol1/lun1 (r/w, online, mapped)
/vol/vol2/lun2 (r/w, online, mapped)
/vol/vol3/lun3 (r/w, online, mapped)
ctra> aggr status -r aggr0
Aggregate aggr0 (online, raid_dp)
Device
------
v5.16 v5.17 v5.18
ctra> aggr status -r aggr1
Aggregate aggr1 (online, raid_dp)
Device
------
v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29
v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32
ctra> aggr status -s
Pool0 spare disks
Device
------
v4.32
# Unmap LUNs, Offline LUNs, Offline Volumes, and Try to Offline aggr1 #
ctra> lun unmap /vol/vol1/lun1 igroup1
ctra> lun unmap /vol/vol2/lun2 igroup1
ctra> lun unmap /vol/vol3/lun3 igroup1
ctra> lun offline /vol/vol1/lun1
ctra> lun offline /vol/vol2/lun2
ctra> lun offline /vol/vol3/lun3
ctra> vol offline vol1
ctra> vol offline vol2
ctra> vol offline vol3
ctra> priv set advanced
ctra*> aggr offline aggr1
aggr offline: Cannot offline aggregate 'aggr1' because it contains one or more flexible volumes.
Note 1: aggr1 could not be set offline!
Note 2: If you try to remove_ownership of a disk from an online aggregate you get:
ctra*> disk remove_ownership v4.16 -f
Disk v4.16 will have its ownership removed
Ownership Remove request failed for disk v4.16. Reason: Disk is part of an online aggregate or volume. Changing its owner is not allowed, because that may cause aggregate, volume, or filer outage.
Disk ownership remove request failed.
# For a physical HA-pair, disable controller failover #
ctra> cf disable
# Disable disk autoassign #
ctra> options disk.auto_assign off
ctrb> options disk.auto_assign off
# Boot into maintenance mode #
ctra> reboot
Press Ctrl-C for Boot Menu.
Selection (5) for Maintenance mode boot
# Offline aggregate aggr1 and remove ownership of its disks #
*> aggr status
*> aggr offline aggr1
*> disk remove_ownership -f v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29
*> disk remove_ownership -f v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32
*> disk show -a
*> halt
# Reassign the disks to ctrb and bring the aggregate, volumes, and LUNs online #
ctrb> aggr status -r
ctrb> disk assign v4.16 v4.17 v4.18 v4.19 v4.20 v4.21 v4.22 v4.24 v4.25 v4.26 v4.27 v4.28 v4.29
ctrb> disk assign v5.19 v5.20 v5.21 v5.22 v5.24 v5.25 v5.26 v5.27 v5.28 v5.29 v5.32
ctrb> aggr status
ctrb> aggr status -r
ctrb> aggr online aggr1
ctrb> vol status
ctrb> vol online vol1
ctrb> vol online vol2
ctrb> vol online vol3
ctrb> lun show
ctrb> lun online /vol/vol1/lun1
ctrb> lun online /vol/vol2/lun2
ctrb> lun online /vol/vol3/lun3
ctrb> lun show
/vol/vol1/lun1 (r/w, online)
/vol/vol2/lun2 (r/w, online)
/vol/vol3/lun3 (r/w, online)
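Note that the LUNs come back online but unmapped - the mappings were removed earlier and the rebuilt controller ctrb has no igroups. To present them to a host again you'd need something along these lines (the igroup type and initiator name below are illustrative placeholders only):
ctrb> igroup create -i -t vmware igroup1 iqn.1998-01.com.vmware:esx1
ctrb> lun map /vol/vol1/lun1 igroup1
ctrb> lun map /vol/vol2/lun2 igroup1
ctrb> lun map /vol/vol3/lun3 igroup1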
# Re-enable disk autoassign #
ctra> options disk.auto_assign on
ctrb> options disk.auto_assign on
# For a physical HA-pair, re-enable controller failover #
ctrb> cf enable
THE END: 7-Mode Disruptive Aggregate Relocate is done!
Hello:
I have a question, if you can help me: in the case where one controller is broken down, can we migrate its aggregates to the partner controller with this method in 7-Mode?
Regards
Sofane
Hello Sofane,
Did the storage takeover work?
If cf was enabled ("cf enable" in 7-Mode) then one controller failing shouldn't be a problem; its partner will take it over.
If you halt the whole system and boot into maintenance mode, then you should be able to assign all the disks to the one working node (no guarantees there), but you'd need to recreate CIFS shares/NFS exports etc. (for the imported aggregates) once the system comes up.
Once you assign all the disks, check you can see all the aggregates ("aggr status" in maintenance mode). If you can, all good.
Cheers, VC
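(For reference, a maintenance-mode reassignment of a dead partner's disks would look roughly like the sketch below - the system IDs are placeholders you'd read from the ownership output, and exact behaviour depends on your setup:)
*> disk show -v
*> disk reassign -s <dead_node_sysid> -d <surviving_node_sysid>
*> aggr status
*> halt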
Hi VC,
Thanks for your reply and for your time.
Yes, the second node is in takeover mode and the first node is down, and I want to manage its disks from the 2nd node. It doesn't matter if I lose the volumes - I don't have shares, just VMware datastores that I can recreate.
My question: if I enter the second node into maintenance mode, can I then manage the disks of the broken node and assign them to the 2nd node as you said? Because in normal mode there is no way to get access to them.
My 2nd question, if you don't mind: if I have a node whose hardware is OK but I think there is a problem with the firmware/software, is there a way to get it back? When I connect by console nothing is shown.
Regards
Sofiane
Hi VC,
Thanks for your time and reply.
So, as I understand it, while in takeover mode I have to go into maintenance mode on the surviving node and assign the dead partner node's disks to it. For me it doesn't matter if I lose the data on those disks.
My 2nd question, if you don't mind: is there a way to bring a node up if it is a firmware problem? The node is up but there is no signal when I connect the console.
Regards
Sofiane