Performing a non-disruptive ARL (Aggregate Relocate) Headswap for NetApp Clustered Data ONTAP - Step-by-Step Walkthrough Series:
Caveat Lector: Unofficial information!
18) Replace NODE-B with NODE-D
18.1) Cable NODE-D
Move the connections from NODE-B to NODE-D.
18.2) Boot NODE-D to the Boot Menu
Power on NODE-D (if not already) and press Ctrl-C to access the boot loader environment.
At the LOADER> prompt type:
boot_ontap prompt
Interrupt the boot sequence by pressing Ctrl-C to get to the Boot Menu.
18.3) Enter Maintenance Mode
At the Boot Menu, select (5) Maintenance mode boot.
18.4) Verify MPHA *>
storage show disk -p
18.5) Acquire the System ID of NODE-D *>
sysconfig
disk show -a
18.6) Assign NODE-B’s Disks to NODE-D
Note: Only root aggr and spare disks get assigned to NODE-D. Check with ‘disk show -a’.
(8.2.x) Reassign disks from NODE-B to NODE-D *>
disk reassign -s NODE-B_SYSID -d NODE-D_SYSID
(8.3.x) Reassign disks from NODE-B to NODE-D *>
disk reassign -s NODE-B_SYSID -d NODE-D_SYSID -p NODE-C_SYSID
Note: The -p option is only required in maintenance mode when shared disks are present.
Enter n to ‘Abort reassignment (y/n)?’
Enter y to ‘After the node becomes operational ... Do you want to continue (y/n)?’
Enter y to ‘Disk ownership will be updated ... Do you want to continue (y/n)?’
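Since the 8.2.x and 8.3.x forms differ only by the -p option, it can help to build the command text from the three system IDs before pasting it into maintenance mode. The following is a minimal sketch, not a NetApp tool: the function name, the numeric-ID check, and the example IDs are all assumptions made for illustration. It only produces a string and never talks to a controller.

```python
from typing import Optional


def build_disk_reassign(old_sysid: str, new_sysid: str,
                        partner_sysid: Optional[str] = None) -> str:
    """Return the maintenance-mode 'disk reassign' command as text.

    Pass partner_sysid only for the 8.3.x form (the -p option).
    Assumption: system IDs are purely numeric strings.
    """
    ids = {"old_sysid": old_sysid, "new_sysid": new_sysid}
    if partner_sysid is not None:
        ids["partner_sysid"] = partner_sysid
    for name, value in ids.items():
        if not value.isdigit():
            raise ValueError(f"{name} must be numeric, got {value!r}")
    if old_sysid == new_sysid:
        raise ValueError("old and new system IDs must differ")

    command = f"disk reassign -s {old_sysid} -d {new_sysid}"
    if partner_sysid is not None:
        command += f" -p {partner_sysid}"
    return command


if __name__ == "__main__":
    # Hypothetical system IDs, for illustration only.
    print(build_disk_reassign("1111111111", "2222222222"))
    print(build_disk_reassign("1111111111", "2222222222", "3333333333"))
```

Rejecting non-numeric input catches the common slip of leaving a placeholder such as NODE-B_SYSID in the pasted command.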
18.7) Verify ha setting shows “ha” *>
ha-config show
Note: To modify use *>
ha-config modify controller ha
ha-config modify chassis ha
18.8) Destroy mailbox disks *>
mailbox destroy local
mailbox destroy partner
18.9) Verify personality of FC/UTA ports *>
ucadmin show
ucadmin modify
18.10) Exit maintenance mode *>
halt
This returns us to the LOADER> prompt.
18.11) Check Date
On NODE-C check the date in Clustershell::>
date
On NODE-D check the date and time at LOADER>
show date
show time
And if necessary>
set date mm/dd/yyyy
set time hh:mm:ss
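The mm/dd/yyyy and hh:mm:ss formats are easy to get wrong by hand, so here is a small sketch that renders the two LOADER commands from a reference timestamp. The function name is hypothetical, and the choice of reference time (for example, the value read from NODE-C, in whatever timezone your LOADER expects) is left to the caller as an assumption.

```python
from datetime import datetime
from typing import List


def loader_clock_commands(reference: datetime) -> List[str]:
    """Render 'set date' and 'set time' in the formats shown above."""
    return [
        f"set date {reference.strftime('%m/%d/%Y')}",  # mm/dd/yyyy
        f"set time {reference.strftime('%H:%M:%S')}",  # hh:mm:ss, 24-hour
    ]


if __name__ == "__main__":
    # Hypothetical reference time, for illustration only.
    for line in loader_clock_commands(datetime(2015, 3, 7, 9, 5, 30)):
        print(line)
```

For the example timestamp this prints `set date 03/07/2015` followed by `set time 09:05:30`.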
18.12) Boot>
boot_ontap prompt
18.13) UPDATE FLASH FROM BACKUP CONFIG
IMPORTANT - DO NOT LET THE NODE BOOT WITHOUT DOING OPTION 6 FIRST!
Interrupt the boot by pressing Ctrl-C.
At the boot menu, select option (6) Update flash from backup config.
This will replace all flash-based configurations with the last backup to disks. Are you sure you want to continue? y
Enter y
The boot proceeds normally and the system then asks you to confirm the system ID mismatch. Confirm the mismatch.
Example:
WARNING: System id mismatch. This usually occurs when replacing CF or NVRAM cards! Override system id (y|n) ? [n] y
The node might go through one round of reboot before booting normally.
18.14) Verify NODE-D
Log in to NODE-D and from Clustershell::>
cluster show
event log show -messagename scsiblade.*
storage aggregate show -owner-name NODE-C
19) Return data LIFs to NODE-D
19.1) Broadcast-Domains, Failover-Groups, VLANs, IFGRPs
Handle as required.
19.2) Return data LIFs to NODE-D
Typical commands::>
net int show -curr-node NODE-D
net int modify -vserver VSERVER -lif LIF_NAME -home-port PORT -home-node NODE-D
net int revert -vserver VSERVER -lif LIF_NAME
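With more than a handful of data LIFs, typing the modify/revert pair for each one gets tedious. As a rough sketch, the pairs can be generated from a list of LIF records. The function name, the dictionary keys (vserver, lif, port), and the sample values are assumptions for illustration. The output is plain text to review and paste into the Clustershell.

```python
from typing import Dict, List


def lif_return_commands(lifs: List[Dict[str, str]],
                        home_node: str = "NODE-D") -> List[str]:
    """Build one modify + revert pair per LIF, in the order given."""
    commands: List[str] = []
    for lif in lifs:
        commands.append(
            f"net int modify -vserver {lif['vserver']} -lif {lif['lif']} "
            f"-home-port {lif['port']} -home-node {home_node}"
        )
        commands.append(
            f"net int revert -vserver {lif['vserver']} -lif {lif['lif']}"
        )
    return commands


if __name__ == "__main__":
    # Hypothetical LIF inventory, for illustration only.
    inventory = [
        {"vserver": "svm1", "lif": "nfs_lif1", "port": "e0d"},
        {"vserver": "svm1", "lif": "cifs_lif1", "port": "e0f"},
    ]
    print("\n".join(lif_return_commands(inventory)))
```

Keeping the revert directly after its modify means each LIF is fully returned home before the next one is touched.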
19.3) Restore SAN LIFs
Typical commands::>
net int show -curr-node NODE-D
net int modify -vserver VSERVER -lif LIF_NAME -status-admin up
20) Handle NODE-D’s Other LIFs
20.1) Cluster LIFs (Node Local)
Re-home as required (one LIF at a time).
20.2) Node-Mgmt LIF (Node Local)
Re-home as required.
20.3) Intercluster LIF (Node Local)
Re-home as required.
21) Use ARL to Relocate (Selected) Aggregates from NODE-C to NODE-D
21.1) Perform Aggregate Relocation
Note: “-override-vetoes true” may be required.
Relocate only aggregates that originally belonged to NODE-B back to NODE-D::>
set adv
aggr show -owner-name NODE-D -root false
aggr relocation start -node NODE-C -destination NODE-D -aggregate-list AGGR_NAME -ndo-controller-upgrade true
## REPEAT UNTIL NODE-D’s DATA AGGREGATES HAVE BEEN RELOCATED BACK HOME ##
storage aggregate relocation show -node NODE-C
Proceed once all data aggregates originally owned by NODE-B have been successfully relocated to NODE-D.
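The "repeat until done" loop above can be written out ahead of time as one relocation command per aggregate, followed by the status check. This is only a sketch under stated assumptions: the function name and the sample aggregate names are hypothetical, and the override flag is appended exactly as quoted in the note, only when you ask for it.

```python
from typing import List


def relocation_commands(aggregates: List[str],
                        source: str = "NODE-C",
                        destination: str = "NODE-D",
                        override_vetoes: bool = False) -> List[str]:
    """Build one 'aggr relocation start' per aggregate, then a status check."""
    if not aggregates:
        raise ValueError("at least one aggregate name is required")
    commands: List[str] = []
    for aggr in aggregates:
        command = (
            f"aggr relocation start -node {source} -destination {destination} "
            f"-aggregate-list {aggr} -ndo-controller-upgrade true"
        )
        if override_vetoes:
            command += " -override-vetoes true"
        commands.append(command)
    commands.append(f"storage aggregate relocation show -node {source}")
    return commands


if __name__ == "__main__":
    # Hypothetical aggregate names, for illustration only.
    print("\n".join(relocation_commands(["aggr1_b", "aggr2_b"])))
```

Listing the aggregates explicitly, rather than relocating everything NODE-C owns, matches the rule that only aggregates originally belonging to NODE-B go to NODE-D.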
21.2) Verify Aggregates are Online and Check for Offline Volumes::>
storage aggregate show -nodes NODE-D -root false
volume show -node NODE-D -state offline
Note: If any non-root aggregates are offline or foreign, online the aggregate(s).
21.3) Verify the output does not say “Node owns partner aggregates as part of the non-disruptive head upgrade procedure”::>
storage failover show
22) Re-Enable SFO
22.1) Enable Storage Failover::>
storage failover show
storage failover modify -enabled true -node NODE-C
22.2) (2-Node Cluster) Verify cluster ha has re-enabled::>
cluster ha show
22.3) Re-enable AUTOBOOT
If you disabled AUTOBOOT, re-enable::>
set -c off; set d
debug kenv show -node * -variable AUTOBOOT
debug kenv modify -node NODE-C -variable AUTOBOOT -value true -persist true
debug kenv modify -node NODE-D -variable AUTOBOOT -value true -persist true
22.4) Checks::>
cluster show
storage failover show -fields local-missing-disks,partner-missing-disks
network interface show
storage aggregate show -owner-name NODE-C
storage aggregate show -owner-name NODE-D
volume show -state offline
22.5) Send AutoSupports::>
system node autosupport invoke -node NODE-C -type all -message "ARL Process Completed!"
system node autosupport invoke -node NODE-D -type all -message "ARL Process Completed!"
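The same AutoSupport command recurs throughout this walkthrough with only the node and message changing, so a tiny generator keeps the quoting consistent. A minimal sketch follows; the function name is an assumption, and it simply refuses messages containing a double quote rather than guessing at how the Clustershell would escape one.

```python
from typing import List


def autosupport_commands(nodes: List[str], message: str) -> List[str]:
    """Build one 'autosupport invoke' command per node with a shared message."""
    if '"' in message:
        raise ValueError("message must not contain double quotes")
    return [
        f'system node autosupport invoke -node {node} -type all '
        f'-message "{message}"'
        for node in nodes
    ]


if __name__ == "__main__":
    for line in autosupport_commands(["NODE-C", "NODE-D"],
                                     "ARL Process Completed!"):
        print(line)
```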
23) ARL Finishing Touches
23.1) Configure Service Processor IPs
Typical Commands::>
service-processor network modify -node NODE-C -address-type IPv4 -enable true -dhcp none -ip-address NODE-C_SP_IP -netmask SP_NETMASK -gateway SP_GATEWAY
service-processor network modify -node NODE-D -address-type IPv4 -enable true -dhcp none -ip-address NODE-D_SP_IP -netmask SP_NETMASK -gateway SP_GATEWAY
23.2) Tidy up licenses::>
license clean-up -unused true -simulate
license clean-up -unused true
23.3) (As required) Correct Cluster Ports for the New Platform
Example Commands::>
net port show -role cluster
net int show -home-node NODE-C -home-port e0a
net int show -home-node NODE-D -home-port e0a
net port modify -node NODE-C -port e0a -role cluster -mtu 9000 -flowcontrol-admin none
net port modify -node NODE-D -port e0a -role cluster -mtu 9000 -flowcontrol-admin none
net port show -role cluster
net int show -role cluster
net int modify -vserver NODE-C -lif clus1 -home-port e0a -home-node NODE-C
## Move cable from NODE-C:e0c to e0a ##
net int modify -vserver NODE-C -lif clus2 -home-port e0c -home-node NODE-C
## Move cable from NODE-C:e0e to e0c ##
net int modify -vserver NODE-D -lif clus1 -home-port e0a -home-node NODE-D
## Move cable from NODE-D:e0c to e0a ##
net int modify -vserver NODE-D -lif clus2 -home-port e0c -home-node NODE-D
## Move cable from NODE-D:e0e to e0c ##
net port modify -port e0e -node NODE-C -role data
net port modify -port e0e -node NODE-D -role data
24) Test Failover
24.1) Send AutoSupports and wait for AutoSupports to send::>
system node autosupport invoke -node NODE-C -type all -message "TESTING FAILOVER"
system node autosupport invoke -node NODE-D -type all -message "TESTING FAILOVER"
system node autosupport history show -node NODE-C
system node autosupport history show -node NODE-D
24.2) Failover/Giveback NODE-C::>
storage failover show
storage failover takeover -ofnode NODE-C
storage failover show-takeover
storage failover giveback -ofnode NODE-C
storage failover show-giveback
cluster show
net int show -is-home false
net int revert *
24.3) Failover/Giveback NODE-D::>
storage failover show
storage failover takeover -ofnode NODE-D
storage failover show-takeover
storage failover giveback -ofnode NODE-D
storage failover show-giveback
cluster show
net int show -is-home false
net int revert *
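The failover test is the same eight-command sequence for each node, and the only part that changes is the -ofnode argument. A short sketch that stamps the node name into the sequence helps avoid pasting a takeover or giveback with the argument left blank. The function name is hypothetical, and the commands are exactly those listed in steps 24.2 and 24.3.

```python
from typing import List


def failover_test_commands(node: str) -> List[str]:
    """Return the takeover/giveback test sequence for a single node."""
    node = node.strip()
    if not node:
        raise ValueError("a node name is required for -ofnode")
    return [
        "storage failover show",
        f"storage failover takeover -ofnode {node}",
        "storage failover show-takeover",
        f"storage failover giveback -ofnode {node}",
        "storage failover show-giveback",
        "cluster show",
        "net int show -is-home false",
        "net int revert *",
    ]


if __name__ == "__main__":
    for target in ("NODE-C", "NODE-D"):
        print(f"## {target} ##")
        print("\n".join(failover_test_commands(target)))
```

The sequence is meant to be run one command at a time, checking the output of each show command before moving on. It is not a script to paste in one go.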
24.4) Send AutoSupports::>
system node autosupport invoke -node NODE-C -type all -message "FINISHED TESTING"
system node autosupport invoke -node NODE-D -type all -message "FINISHED TESTING"