I’ve done a few ARL headswaps and have always worked out a strategy to ensure at least one cluster port maps correctly to the new head. For example, in a FAS3220 to FAS6220 ARL headswap, I moved one of the FAS3220’s 10 GbE cards to slot 3, moved the cluster ports onto it, took the card across to slot 3 of the FAS6220, and corrected the cluster ports post-ARL to follow best practice. I was wondering what I’d do if I absolutely couldn’t come up with a cunning plan to map a cluster port from the old controller to the new controller. Here, using a 2-node SIM cluster, I demonstrate what I’d do.
Note: This is, of course, all unofficial stuff. On production systems, only NetApp support/personnel with a valid support case should be using commands to manipulate the CDB like we do here. Only the absolute minimum amount of CDB modification is done to get the node back into quorum.
To demonstrate this, I have a 2-node cluster (C91) with nodes C91-01 and C91-02. I ensure epsilon is on node 2 (C91-02), then shut down C91-01 and remove the two network adapters (5 and 6, which map to e0e and e0f) used for cluster ports.
Image: Removing cluster ports e0e and e0f from the simulator
1) Gathering a few outputs and halting node 1
C91::*> version
NetApp Release 9.1: Thu Dec 22 23:05:58 UTC 2016
C91::*> cluster show
Node   Health Eligibility Epsilon
------ ------ ----------- -------
C91-01 true   true        false
C91-02 true   true        true
C91::*> net int show -role cluster
Logical      Status     Network          Current Current Is
Interface    Admin/Oper Address/Mask     Node    Port    Home
------------ ---------- ---------------- ------- ------- ----
C91-01_clus1 up/up      169.254.94.74/16 C91-01  e0e     true
C91-01_clus2 up/up      169.254.94.84/16 C91-01  e0f     true
C91-02_clus1 up/up      169.254.47.70/16 C91-02  e0e     true
C91-02_clus2 up/up      169.254.47.80/16 C91-02  e0f     true
C91::*> net port show -role cluster
Node: C91-01
                                 Speed(Mbps) Health
Port IPspace Broadcast Link MTU  Admin/Oper  Status
---- ------- --------- ---- ---- ----------- -------
e0e  Cluster Cluster   up   1500 auto/1000   healthy
e0f  Cluster Cluster   up   1500 auto/1000   healthy
Node: C91-02
                                 Speed(Mbps) Health
Port IPspace Broadcast Link MTU  Admin/Oper  Status
---- ------- --------- ---- ---- ----------- -------
e0e  Cluster Cluster   up   1500 auto/1000   healthy
e0f  Cluster Cluster   up   1500 auto/1000   healthy
C91::*> halt -node C91-01
2) Remove network adapters 5 and 6 from the simulator
3) Power up node 1
4) Check cluster quorum
Notice that node 1 (C91-01) is out of quorum (cluster health = false).
C91::*> node show local -fields node
node
------
C91-01
C91::*> cluster show
Node   Health Eligibility Epsilon
------ ------ ----------- -------
C91-01 false  true        false
C91-02 false  true        true
Notice that node 2 (C91-02) is in quorum (cluster health = true).
C91::*> node show local -fields node
node
------
C91-02
C91::*> cluster show
Node   Health Eligibility Epsilon
------ ------ ----------- -------
C91-01 false  true        false
C91-02 true   true        true
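If you capture the "cluster show" output to a file or variable, the quorum check above can be scripted. The following is only a minimal sketch: the function names (parse_cluster_show, out_of_quorum) are hypothetical helpers, not part of any NetApp tooling, and it assumes the four-column layout shown above.

```python
def parse_cluster_show(text):
    """Parse 'cluster show' text into {node: {health, eligibility, epsilon}}."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields, and the last three are true/false.
        # Header and separator rows fail this test and are skipped.
        if len(fields) == 4 and all(f in ("true", "false") for f in fields[1:]):
            name, health, eligibility, epsilon = fields
            nodes[name] = {
                "health": health == "true",
                "eligibility": eligibility == "true",
                "epsilon": epsilon == "true",
            }
    return nodes


def out_of_quorum(nodes):
    """Return the sorted names of nodes whose Health column is false."""
    return sorted(name for name, state in nodes.items() if not state["health"])


# Sample text copied from node 2's view in step 4.
SAMPLE = """\
Node   Health Eligibility Epsilon
------ ------ ----------- -------
C91-01 false  true        false
C91-02 true   true        true
"""

if __name__ == "__main__":
    print(out_of_quorum(parse_cluster_show(SAMPLE)))
```

Run against node 2's view, this reports C91-01 as the only unhealthy node. Run against node 1's view, it would list both nodes, since node 1 cannot see its partner.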
5) Fix the problem
We modify ports e0a and e0b to be cluster ports.
Then we modify the cluster LIFs to be on e0a and e0b.
Finally we reboot node 1.
C91::*> net port show -role cluster
There are no entries matching your query.
C91::*> net int show -role cluster
Logical      Status     Network          Current Current Is
Interface    Admin/Oper Address/Mask     Node    Port    Home
------------ ---------- ---------------- ------- ------- ----
C91-01_clus1 up/down    169.254.94.74/16 C91-01  e0e     true
C91-01_clus2 up/down    169.254.94.84/16 C91-01  e0f     true
C91::*> broadcast-domain show
Error: show failed: Cannot run this command because the system is not fully initialized. Wait a few minutes, and then try the command again.
C91::*> set diag
C91::*> network ipspace cdb show
IPspace ID
------- ----------
Cluster 4294967294
Default 4294967295
C91::*> network port cdb show
                           Auto-Neg Duplex Speed Flowcontrol
Node   Port Role      MTU  Admin    Admin  Admin Admin
------ ---- --------- ---- -------- ------ ----- -----------
C91-01
       e0a  data      1500 true     auto   auto  full
       e0b  data      1500 true     auto   auto  full
       e0c  node-mgmt 1500 true     auto   auto  full
       e0d  data      1500 true     auto   auto  full
Warning: Unable to list entries on node C91-02. RPC: Couldn't make connection
C91::*> network interface cdb show
                          Status Network                   Valid
Node   ID   Name          Admin  Address       Netmask     Id
------ ---- ------------- ------ ------------- ----------- -----
C91-01
       1023 C91-01_clus2  up     169.254.94.84 255.255.0.0 true
       1024 C91-01_clus1  up     169.254.94.74 255.255.0.0 true
Warning: Unable to list entries on node C91-02. RPC: Couldn't make connection
C91::*> net port cdb modify -port e0a -node C91-01 -role cluster -mtu 1500 -flowcontrol-admin none -ipspace-id 4294967294
C91::*> net port cdb modify -port e0b -node C91-01 -role cluster -mtu 1500 -flowcontrol-admin none -ipspace-id 4294967294
C91::*> net port cdb show
                           Auto-Neg Duplex Speed Flowcontrol
Node   Port Role      MTU  Admin    Admin  Admin Admin
------ ---- --------- ---- -------- ------ ----- -----------
C91-01
       e0a  cluster   1500 true     auto   auto  none
       e0b  cluster   1500 true     auto   auto  none
       e0c  node-mgmt 1500 true     auto   auto  full
       e0d  data      1500 true     auto   auto  full
Warning: Unable to list entries on node C91-02. RPC: Couldn't make connection
C91::*> net int cdb show
                          Status Network                   Valid
Node   ID   Name          Admin  Address       Netmask     Id
------ ---- ------------- ------ ------------- ----------- -----
C91-01
       1023 C91-01_clus2  up     169.254.94.84 255.255.0.0 true
       1024 C91-01_clus1  up     169.254.94.74 255.255.0.0 true
Warning: Unable to list entries on node C91-02. RPC: Couldn't make connection
C91::*> net int cdb modify -lif-id 1024 -node C91-01 -home-port e0a -home-node C91-01 -curr-port e0a -curr-node C91-01
C91::*> net int cdb modify -lif-id 1023 -node C91-01 -home-port e0b -home-node C91-01 -curr-port e0b -curr-node C91-01
C91::*> net int show -role cluster
Logical      Status     Network          Current Current Is
Interface    Admin/Oper Address/Mask     Node    Port    Home
------------ ---------- ---------------- ------- ------- ----
C91-01_clus1 up/down    169.254.94.74/16 C91-01  e0a     true
C91-01_clus2 up/down    169.254.94.84/16 C91-01  e0b     true
C91::*> node show local -fields node
node
------
C91-01
C91::*> reboot local
Notice above that even with the cluster LIFs on the correct ports, they are still marked as operationally down; this is why we have to do the reboot, so the CDB can reload correctly.
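To recap the CDB changes made in step 5, here is a minimal sketch that only builds the command strings for a given LIF-to-port mapping; it does not connect to or run anything on a cluster. The flags are copied from the transcript above, the function names are hypothetical, and the IPspace ID is the Cluster value reported by "network ipspace cdb show" on this simulator (check it on your own system rather than assuming it). The same caveat as the note at the top applies.

```python
# Cluster IPspace ID as reported by "network ipspace cdb show" in this demo.
CLUSTER_IPSPACE_ID = 4294967294


def port_cmd(node, port, mtu=1500):
    """Build the command that turns a port into a cluster port in the CDB."""
    return (
        f"net port cdb modify -port {port} -node {node} -role cluster "
        f"-mtu {mtu} -flowcontrol-admin none -ipspace-id {CLUSTER_IPSPACE_ID}"
    )


def lif_cmd(node, lif_id, port):
    """Build the command that re-homes a cluster LIF onto the given port."""
    return (
        f"net int cdb modify -lif-id {lif_id} -node {node} "
        f"-home-port {port} -home-node {node} "
        f"-curr-port {port} -curr-node {node}"
    )


def build_commands(node, lif_to_port):
    """Return the ordered command list: diag mode, ports first, then LIFs."""
    cmds = ["set diag"]
    for port in sorted(set(lif_to_port.values())):
        cmds.append(port_cmd(node, port))
    for lif_id, port in lif_to_port.items():
        cmds.append(lif_cmd(node, lif_id, port))
    return cmds


if __name__ == "__main__":
    # LIF IDs 1024 (clus1) and 1023 (clus2) come from "net int cdb show".
    for cmd in build_commands("C91-01", {1024: "e0a", 1023: "e0b"}):
        print(cmd)
```

Printing the commands rather than executing them keeps a human in the loop, which seems appropriate given how sensitive CDB edits are. The reboot of the affected node still has to follow.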
6) Check everything is okay
C91::*> net int show -role cluster
Logical      Status     Network          Current Current Is
Interface    Admin/Oper Address/Mask     Node    Port    Home
------------ ---------- ---------------- ------- ------- ----
C91-01_clus1 up/up      169.254.94.74/16 C91-01  e0b     true
C91-01_clus2 up/up      169.254.94.84/16 C91-01  e0a     true
C91-02_clus1 up/up      169.254.47.70/16 C91-02  e0e     true
C91-02_clus2 up/up      169.254.47.80/16 C91-02  e0f     true
C91::*> cluster show
Node   Health Eligibility Epsilon
------ ------ ----------- -------
C91-01 true   true        false
C91-02 true   true        true
C91::*> network port broadcast-domain show -ipspace Cluster
IPspace Broadcast                   Update
Name    Domain Name MTU  Port List  Status
------- ----------- ---- ---------- --------
Cluster Cluster     1500 C91-01:e0a complete
                         C91-01:e0b complete
                         C91-02:e0e complete
                         C91-02:e0f complete
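The LIF part of the final check can also be scripted from captured "net int show -role cluster" output. This is a rough sketch with hypothetical helper names (parse_lifs, lif_problems); it assumes the six-column layout shown in this post, without a Vserver column, and flags any LIF that is not up/up or not on its home port.

```python
def parse_lifs(text):
    """Parse 'net int show -role cluster' rows into a list of dicts."""
    lifs = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: six fields, an Admin/Oper pair in column 2,
        # and true/false in the Is Home column. Headers are skipped.
        if (
            len(fields) == 6
            and "/" in fields[1]
            and fields[5] in ("true", "false")
        ):
            lifs.append(
                {
                    "name": fields[0],
                    "status": fields[1],
                    "address": fields[2],
                    "node": fields[3],
                    "port": fields[4],
                    "is_home": fields[5] == "true",
                }
            )
    return lifs


def lif_problems(lifs):
    """Return names of LIFs that are not up/up or are away from home."""
    return [l["name"] for l in lifs if l["status"] != "up/up" or not l["is_home"]]


# Two illustrative rows: one taken from before the reboot, one from after.
SAMPLE = """\
Logical      Status     Network          Current Current Is
Interface    Admin/Oper Address/Mask     Node    Port    Home
------------ ---------- ---------------- ------- ------- ----
C91-01_clus1 up/down    169.254.94.74/16 C91-01  e0a     true
C91-02_clus1 up/up      169.254.47.70/16 C91-02  e0e     true
"""

if __name__ == "__main__":
    print(lif_problems(parse_lifs(SAMPLE)))
```

An empty list means every cluster LIF in the captured output is up/up and home, which matches the healthy state shown in step 6.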
THE END
UPDATE!
This should be easier in ONTAP 9.0 and greater, but the above still works. Check out this KB article:
Headswap mapping ports steps fail with "Error: Cannot run this command because the system is not fully initialized."
https://kb.netapp.com/app/answers/answer_view/a_id/1084223/loc/en_US
Also related:
Cluster LIFs not visible after headswap
https://kb.netapp.com/app/answers/answer_view/a_id/1087737/loc/en_US