In this post we demonstrate how to fix an interesting problem that’s not likely to crop up much!
Problem
The setup is a NetApp FAS/V-Series controller running Clustered ONTAP (version 8.1.2 here). VLANs are created on a physical network adapter that is subsequently removed from the controller. The output of the command (see output in APPENDIX A) -
clust::> network port show -node clust-01
- shows the VLANs and old physical ports still there. But attempting the -
clust::> network port vlan delete -node clust-01 -vlan-name e0g-80
- results in -
Error: command failed: invalid operation
Going into the advanced privilege level using -
clust::> set -privilege advanced
- and trying the -
clust::*> network port delete -node clust-01 -port e0g
- results in -
Error: command failed: Cannot delete a port that has a vlan on it.
Resolution
IMPORTANT NOTE: This is a suggested fix for a specific problem. Before trying this on a production system, please engage with NetApp Global Support first! This blog (like all blogs) comes with a health warning:
Caveat lector (let the reader beware) - this is not official vendor literature and comes with no warranty!
To clear the ghost VLAN ports from Clustered ONTAP, go into the diagnostic privilege level using -
clust::*> set -privilege diagnostic
Warning: These diagnostic commands are for use by NetApp personnel only.
Do you want to continue? {y|n}: y
Then we delete the VLAN from the cdb (configuration database) and recreate it using -
clust::*> net port cdb delete -node clust-01 -port e0g-80
clust::*> net port cdb create -node clust-01 -port e0g-80 -role data -mtu 1500 -autonegotiate-admin false -duplex-admin full -speed-admin auto -up-admin false -type vlan -flowcontrol-admin none -aggr-node clust-01 -aggr-port e0a -vlan-node clust-01 -vlan-port e0g-80 -vlan-tag 80
Note i: The net port cdb delete removes the VLAN port from the output of net port cdb show, but the port still appears in net port show.
Note ii: The aggr-port (Interface Group Parent Port) must be an existing port or this won’t work (here we use e0a).
Note iii: After running the above, the output of net port show will show Auto-Negot Admin = false for the VLAN ports (see output in APPENDIX B).
Note iv: Pretty much all the switches above are required inputs for a net port cdb create (gotta love Clustered ONTAP CLI tab completion!)
Repeat this for all the rogue VLAN ports.
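Since the same delete/create pair has to be typed for every ghost VLAN, a small helper can generate the command lines for copy and paste. The following is only a sketch: the function name cdb_recreate_commands is made up for illustration, the switches are copied verbatim from the example above, and the script only prints text (it does not talk to the cluster). Check each generated line against your own system before using it.

```python
def cdb_recreate_commands(node, parent_port, vlan_tags, aggr_port):
    """Build the 'net port cdb delete' / 'net port cdb create' pair per VLAN tag.

    Hypothetical helper: it only formats strings using the switches shown
    in the post and makes no attempt to validate them.
    """
    commands = []
    for tag in vlan_tags:
        vlan_port = f"{parent_port}-{tag}"
        commands.append(f"net port cdb delete -node {node} -port {vlan_port}")
        commands.append(
            f"net port cdb create -node {node} -port {vlan_port} "
            "-role data -mtu 1500 -autonegotiate-admin false "
            "-duplex-admin full -speed-admin auto -up-admin false "
            "-type vlan -flowcontrol-admin none "
            f"-aggr-node {node} -aggr-port {aggr_port} "
            f"-vlan-node {node} -vlan-port {vlan_port} -vlan-tag {tag}"
        )
    return commands


if __name__ == "__main__":
    # Values from the post: node clust-01, removed port e0g, VLANs 80-84,
    # and e0a as the existing port used for -aggr-port.
    for line in cdb_recreate_commands("clust-01", "e0g", range(80, 85), "e0a"):
        print(line)
```

Run as-is, it prints ten lines: one delete and one create for each of e0g-80 through e0g-84.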
Then reboot the controller!
clust::*> system reboot -node clust-01
After the reboot, if we run a net port show we will see that the link status is down for all the VLAN ports (see output in APPENDIX C). Now we can delete our VLANs:
clust::> network port vlan delete -node clust-01 -vlan-name e0g-80
Repeat the above line for all the rogue VLAN ports.
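If you have saved the post-reboot net port show output to a text file, the delete lines can be generated from it. This is a rough sketch under a few assumptions: the function name vlan_delete_commands is hypothetical, each port row is expected to look like the rows in APPENDIX C (port name, role, link, MTU, auto-negotiate), and only VLAN ports whose link column reads "down" are selected, matching the state described above.

```python
def vlan_delete_commands(show_output, node, parent_port):
    """Return 'network port vlan delete' lines for VLAN ports that are link down.

    Hypothetical helper: assumes whitespace-separated rows of the form
    '<port> <role> <link> <mtu> <admin/oper>' as in the post's appendices.
    """
    commands = []
    for line in show_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # header, separator, or node-name line
        name, link = fields[0], fields[2]
        # Only VLANs on the removed port (e.g. e0g-80), not the port itself.
        if name.startswith(parent_port + "-") and link == "down":
            commands.append(
                f"network port vlan delete -node {node} -vlan-name {name}"
            )
    return commands


if __name__ == "__main__":
    sample = """\
clust-01
       e0a    data         up    1500 true/true
       e0g    data         -        - true/-
       e0g-80 data         down     - false/false
       e0g-81 data         down     - false/false
"""
    for line in vlan_delete_commands(sample, "clust-01", "e0g"):
        print(line)
```

A VLAN port that still shows "-" in the link column is skipped on purpose, since in the post the delete only succeeded once the ports reported down.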
Finally, enter the advanced privilege level and delete the rogue port.
clust::> set -priv advanced
clust::*> network port delete -node clust-01 -port e0g
Verify that the VLANs and the port have indeed gone using the following command (see output in APPENDIX D):
clust::*> net port show -node clust-01
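The final check can also be done on saved output rather than by eye. The sketch below assumes the net port show text has been captured to a string and that the port name is the first whitespace-separated token on each port row, as in the appendices; leftover_ports is a name invented for this example.

```python
def leftover_ports(show_output, prefix):
    """List port names equal to prefix or of the form '<prefix>-<tag>'.

    Hypothetical helper for checking captured 'net port show' text.
    """
    found = []
    for line in show_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if name == prefix or name.startswith(prefix + "-"):
            found.append(name)
    return found


if __name__ == "__main__":
    after_cleanup = """\
clust-01
       e0a    data         up    1500 true/true
       e0b    data         up    1500 true/true
"""
    remaining = leftover_ports(after_cleanup, "e0g")
    print("clean" if not remaining else "still present: " + ", ".join(remaining))
```

An empty result for the removed port's name means neither the port nor any of its VLANs is listed any more.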
THE END
APPENDIX A: ‘net port show -node clust-01’ at the start
clust::> net port show -node clust-01
                                      Auto-Negot
Node   Port   Role         Link   MTU Admin/Oper
------ ------ ------------ ---- ----- -----------
clust-01
       e0a    data         up    1500 true/true
       e0b    data         up    1500 true/true
       e0e    cluster      up    1500 true/true
       e0f    cluster      up    1500 true/true
       e0g    data         -        - true/-
       e0g-80 data         -        - true/-
       e0g-81 data         -        - true/-
       e0g-82 data         -        - true/-
       e0g-83 data         -        - true/-
       e0g-84 data         -        - true/-
APPENDIX B: ‘net port show -node clust-01’ after net port cdb create
clust::*> net port show -node clust-01
                                      Auto-Negot
Node   Port   Role         Link   MTU Admin/Oper
------ ------ ------------ ---- ----- -----------
clust-01
       e0a    data         up    1500 true/true
       e0b    data         up    1500 true/true
       e0e    cluster      up    1500 true/true
       e0f    cluster      up    1500 true/true
       e0g    data         -        - true/-
       e0g-80 data         -        - false/-
       e0g-81 data         -        - false/-
       e0g-82 data         -        - false/-
       e0g-83 data         -        - false/-
       e0g-84 data         -        - false/-
APPENDIX C: ‘net port show -node clust-01’ after reboot
clust::> net port show -node clust-01
                                      Auto-Negot
Node   Port   Role         Link   MTU Admin/Oper
------ ------ ------------ ---- ----- -----------
clust-01
       e0a    data         up    1500 true/true
       e0b    data         up    1500 true/true
       e0e    cluster      up    1500 true/true
       e0f    cluster      up    1500 true/true
       e0g    data         -        - true/-
       e0g-80 data         down     - false/false
       e0g-81 data         down     - false/false
       e0g-82 data         down     - false/false
       e0g-83 data         down     - false/false
       e0g-84 data         down     - false/false
APPENDIX D: ‘net port show -node clust-01’ after deleting VLANs
clust::*> net port show -node clust-01
                                      Auto-Negot
Node   Port   Role         Link   MTU Admin/Oper
------ ------ ------------ ---- ----- -----------
clust-01
       e0a    data         up    1500 true/true
       e0b    data         up    1500 true/true
       e0e    cluster      up    1500 true/true
       e0f    cluster      up    1500 true/true
APPENDIX E: Some Advanced Troubleshooting Commands
set diag
network port cdb show
net port show -smftrace
net port vlan show -smftrace
debug smdb table netport_vlans show
debug smdb table netport_vlans_byname show
debug smdb table vifmgr_netport_vlans show
debug smdb table bootconf_netports show
debug smdb table mgwd_cdb_netports show
debug smdb table netports show
debug smdb table netports_byname show
debug smdb table vifmgr_netports show
To collect some output from the “vifmgrclient”:
security login unlock -username diag
security login password -username diag
systemshell -node nodename
{login as diag user}
vifmgrclient -debug
Are you sure you want to risk a full-reinstall and continue? y
At the "Command:" prompt, enter "show" and it will dump a lot of output to the screen.
When the "Command:" prompt returns, you can enter “quit” to exit the systemshell.
A much safer resolution can be had by using
network interface show -home-port e0g-80
followed by the appropriate network interface modify
and BING, Bob's your uncle.
I wish it were that easy. If your e0g's disappeared from your machine (say someone took out a NIC for whatever reason) and you try the -
net int show -home-port e0g-80
- it comes back with -
There are no entries matching your query.
Actually if someone removes a card, the -home-port will still report e0g, but the -curr-port will have changed (hopefully, if failover-groups...). That is after all the info in the cdb. It is a bit messy as the error message is misleading.
If nothing shows on the net int show -home-port e0g-80, then try net int show -curr-port e0g-80. But that is not often overlooked. Most often, the LIF has been migrated but not modified, thus locking the delete and giving misleading errors.
In this article, e0g-80 is a VLAN and not a LIF. VLANs are associated with a fixed physical port or ifgrp, they have no home-port attribute. Actually, the example was taken from a SIM (where you can recreate this problem), in reality you'd never see this problem with an e0?-VLAN since you can't remove an onboard port, but can see this problem if someone removes a physical Ethernet IO card without deleting the VLANs that were on it first.