ARL Headswap: Part 4/4 - Node D and Finish

Performing a non-disruptive ARL (Aggregate Relocation) Headswap for NetApp Clustered Data ONTAP - Step-by-Step Walkthrough Series:


Caveat Lector: Unofficial information!

18) Replace NODE-B with NODE-D

18.1) Cable NODE-D
Move the cable connections from NODE-B across to NODE-D.

18.2) Boot NODE-D to the Boot Menu
Power on NODE-D (if it is not already powered on) and press Ctrl-C to access the boot loader environment.
At the LOADER> prompt type:

boot_ontap prompt

Interrupt the boot sequence by pressing Ctrl-C to get to the Boot Menu

18.3) Enter Maintenance Mode
At the Boot Menu, select (5) Maintenance mode boot

18.4) Verify MPHA (Multipath HA) *>

storage show disk -p


18.5) Acquire the System ID of NODE-D *>

sysconfig
disk show -a


18.6) Assign NODE-B’s Disks to NODE-D
Note: Only NODE-B’s root aggregate and spare disks get reassigned to NODE-D at this point (its data aggregates remain owned by NODE-C until the ARL in step 21). Check with ‘disk show -a’.
(8.2.x) Reassign disks from NODE-B to NODE-D *>

disk reassign -s NODE-B_SYSID -d NODE-D_SYSID

(8.3.x) Reassign disks from NODE-B to NODE-D *>

disk reassign -s NODE-B_SYSID -d NODE-D_SYSID -p NODE-C_SYSID

Note: The -p option is only required in maintenance mode when shared disks are present.
Enter n to ‘Abort reassignment (y/n)?’
Enter y to ‘After the node becomes operational ... Do you want to continue (y/n)?’
Enter y to ‘Disk ownership will be updated ... Do you want to continue (y/n)?’

18.7) Verify the HA setting shows “ha” *>

ha-config show

Note: To modify, use *>

ha-config modify controller ha
ha-config modify chassis ha


18.8) Destroy mailbox disks *>

mailbox destroy local
mailbox destroy partner


18.9) Verify personality of FC/UTA ports *>

ucadmin show
ucadmin modify
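
Note: A sketch of the modify syntax (the adapter 0e and the fc/target personality below are placeholders to match your required configuration; the change takes effect on the next boot) *>

ucadmin modify -m fc -t target 0e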


18.10) Exit maintenance mode *>

halt

This returns us to the LOADER> prompt.

18.11) Check Date
On NODE-C check the date in Clustershell::>

date

On NODE-D check the date and time at LOADER>

show date
show time

And if necessary, set them>

set date mm/dd/yyyy
set time hh:mm:ss


18.12) Boot>

boot_ontap prompt


18.13) UPDATE FLASH FROM BACKUP CONFIG
IMPORTANT - DO NOT LET THE NODE BOOT WITHOUT DOING OPTION 6 FIRST!
Interrupt the boot by pressing Ctrl-C.
At the boot menu, select option (6) Update flash from backup config.

This will replace all flash-based configurations with the last backup to disks. Are you sure you want to continue? y

Enter y

The boot proceeds normally and the system then asks you to confirm the system ID mismatch. Confirm the mismatch.
Example:

WARNING: System id mismatch. This usually occurs when replacing CF or NVRAM cards! Override system id (y|n) ? [n] y

The node might go through one round of reboot before booting normally.

18.14) Verify NODE-D
Log in to NODE-D and from Clustershell::>

cluster show
event log show -messagename scsiblade.*
storage aggregate show -owner-name NODE-C


19) Return data LIFs to NODE-D

19.1) Broadcast-Domains, Failover-Groups, VLANs, IFGRPs
Handle as required for the new platform (port names and layout may differ from the old controller).
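Example commands (a sketch only; 8.3.x syntax shown, and the broadcast-domain, IFGRP, VLAN and port names are placeholders for your environment; 8.2.x uses failover-groups rather than broadcast-domains)::>

network port broadcast-domain show
network port ifgrp create -node NODE-D -ifgrp a0a -distr-func ip -mode multimode_lacp
network port ifgrp add-port -node NODE-D -ifgrp a0a -port e0d
network port vlan create -node NODE-D -vlan-name a0a-100
network port broadcast-domain add-ports -broadcast-domain BD_NAME -ports NODE-D:a0a-100
network interface failover-groups show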

19.2) Return data LIFs to NODE-D
Typical commands::>

net int show -curr-node NODE-D
net int modify -vserver VSERVER -lif LIF_NAME -home-port PORT -home-node NODE-D
net int revert -vserver VSERVER -lif LIF_NAME


19.3) Restore SAN LIFs
Typical commands::>

net int show -curr-node NODE-D
net int modify -vserver VSERVER -lif LIF_NAME -status-admin up


20) Handle NODE-D’s Other LIFs

20.1) Cluster LIFs (Node Local)
Re-home as required (one LIF at a time).
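Typical commands (a sketch; the clus1 LIF name and port e0a are placeholders, and in 8.3.x the cluster LIFs live on the “Cluster” vserver rather than the node vserver)::>

net int show -role cluster -curr-node NODE-D
net int modify -vserver NODE-D -lif clus1 -home-node NODE-D -home-port e0a
net int revert -vserver NODE-D -lif clus1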

20.2) Node-Mgmt LIF (Node Local)
Re-home as required.
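Typical commands (a sketch; the vserver, LIF, and port names are placeholders)::>

net int show -role node-mgmt -curr-node NODE-D
net int modify -vserver VSERVER -lif LIF_NAME -home-node NODE-D -home-port PORT
net int revert -vserver VSERVER -lif LIF_NAME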

20.3) Intercluster LIF (Node Local)
Re-home as required.
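Typical commands (a sketch; the vserver, LIF, and port names are placeholders; skip if the node has no intercluster LIFs)::>

net int show -role intercluster -curr-node NODE-D
net int modify -vserver VSERVER -lif LIF_NAME -home-node NODE-D -home-port PORT
net int revert -vserver VSERVER -lif LIF_NAME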

21) Use ARL to Relocate (Selected) Aggregates from NODE-C to NODE-D

21.1) Perform Aggregate Relocation
Note: “-override-vetoes true” may be required
Relocate only aggregates that originally belonged to NODE-B back to NODE-D::>

set adv
aggr show -owner-name NODE-D -root false

aggr relocation start -node NODE-C -destination NODE-D -aggregate-list AGGR_NAME -ndo-controller-upgrade true
## REPEAT UNTIL NODE-D’s DATA AGGREGATES HAVE BEEN RELOCATED BACK HOME ##

storage aggregate relocation show -node NODE-C

Proceed once all data aggregates originally owned by NODE-B have been successfully relocated to NODE-D.

21.2) Verify Aggregates are Online and Check for Offline Volumes::>

storage aggregate show -nodes NODE-D -root false
volume show -node NODE-D -state offline

Note: If any non-root aggregates are offline or foreign, online the aggregate(s).
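
If needed, a sketch of the online command (AGGR_NAME is a placeholder)::>

storage aggregate online -aggregate AGGR_NAME
storage aggregate show -aggregate AGGR_NAME -fields state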

21.3) Verify the output does not say “Node owns partner aggregates as part of the non-disruptive head upgrade procedure”::>

storage failover show


22) Re-Enable SFO

22.1) Enable Storage Failover::>

storage failover show
storage failover modify -enabled true -node NODE-C


22.2) (2-Node Cluster) Verify cluster HA has been re-enabled::>

cluster ha show


22.3) Re-enable AUTOBOOT
If you disabled AUTOBOOT earlier, re-enable it::>

set -c off; set d
debug kenv show -node * -variable AUTOBOOT
debug kenv modify -node NODE-C -variable AUTOBOOT -value true -persist true
debug kenv modify -node NODE-D -variable AUTOBOOT -value true -persist true


22.4) Checks::>

cluster show
storage failover show -fields local-missing-disks,partner-missing-disks
network interface show
storage aggregate show -owner-name NODE-C
storage aggregate show -owner-name NODE-D
volume show -state offline


22.5) Send AutoSupports::>

system node autosupport invoke -node NODE-C -type all -message "ARL Process Completed!"
system node autosupport invoke -node NODE-D -type all -message "ARL Process Completed!"


23) ARL Finishing Touches

23.1) Configure Service Processor IPs
Typical Commands::>

service-processor network modify -node NODE-C -address-type IPv4 -enable true -dhcp none -ip-address NODE-C_SP_IP -netmask SP_NETMASK -gateway SP_GATEWAY
service-processor network modify -node NODE-D -address-type IPv4 -enable true -dhcp none -ip-address NODE-D_SP_IP -netmask SP_NETMASK -gateway SP_GATEWAY


23.2) Tidy up licenses::>

license clean-up -unused true -simulate
license clean-up -unused true


23.3) (As required) Correct Cluster Ports for the New Platform
Example Commands::>

net port show -role cluster
net int show -home-node NODE-C -home-port e0a
net int show -home-node NODE-D -home-port e0a
net port modify -node NODE-C -port e0a -role cluster -mtu 9000 -flowcontrol-admin none
net port modify -node NODE-D -port e0a -role cluster -mtu 9000 -flowcontrol-admin none
net port show -role cluster
net int show -role cluster
net int modify -vserver NODE-C -lif clus1 -home-port e0a -home-node NODE-C
## Move cable from NODE-C:e0c to e0a ##
net int modify -vserver NODE-C -lif clus2 -home-port e0c -home-node NODE-C
## Move cable from NODE-C:e0e to e0c ##
net int modify -vserver NODE-D -lif clus1 -home-port e0a -home-node NODE-D
## Move cable from NODE-D:e0c to e0a ##
net int modify -vserver NODE-D -lif clus2 -home-port e0c -home-node NODE-D
## Move cable from NODE-D:e0e to e0c ##
net port modify -port e0e -node NODE-C -role data
net port modify -port e0e -node NODE-D -role data


24) Test Failover

24.1) Send AutoSupports and wait for AutoSupports to send::>

system node autosupport invoke -node NODE-C -type all -message "TESTING FAILOVER"
system node autosupport invoke -node NODE-D -type all -message "TESTING FAILOVER"
system node autosupport history show -node NODE-C
system node autosupport history show -node NODE-D


24.2) Failover/Giveback NODE-C::>

storage failover show
storage failover takeover -ofnode NODE-C
storage failover show-takeover
storage failover giveback -ofnode NODE-C
storage failover show-giveback
cluster show
net int show -is-home false
net int revert *


24.3) Failover/Giveback NODE-D::>

storage failover show
storage failover takeover -ofnode NODE-D
storage failover show-takeover
storage failover giveback -ofnode NODE-D
storage failover show-giveback
cluster show
net int show -is-home false
net int revert *


24.4) Send AutoSupports::>

system node autosupport invoke -node NODE-C -type all -message "FINISHED TESTING"
system node autosupport invoke -node NODE-D -type all -message "FINISHED TESTING"

