Clustered ONTAP Storage Admins' Health Checks

Here is a rough-and-ready list of health-check commands for the Clustered Data ONTAP storage admin. Of course, with OnCommand Unified Manager there shouldn’t be much need to run manual checks; still, it’s nice to keep a list up your sleeve. The commands are grouped by type of object. Some of them require the advanced or diag privilege level (set -privilege advanced or set -privilege diag).
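For checks that need a higher privilege level, one option is to put the privilege change and the check on a single line. The Python sketch below only builds such a line as a string. It rests on two assumptions that should be verified on your own release: that the clustershell accepts several commands separated by semicolons, and that set takes -confirmations off to suppress the privilege warning prompt. The helper name with_privilege is made up for illustration.

```python
# Hypothetical helper: wrap a check in the privilege change it needs.
# Assumptions (verify on your release): commands can be chained with ';'
# and 'set' accepts '-confirmations off' to skip the warning prompt.
VALID_LEVELS = ("admin", "advanced", "diag")


def with_privilege(command, level="advanced"):
    """Return one line that raises the privilege level, runs the command,
    and then drops back to admin with confirmations switched back on."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown privilege level: {level}")
    if level == "admin":
        # Nothing to wrap: admin is the default level.
        return command
    return (
        f"set -privilege {level} -confirmations off; "
        f"{command}; "
        "set -privilege admin -confirmations on"
    )


print(with_privilege("cluster ring show"))
```

The command passed in here is just one entry from the list; any other command from this post can be wrapped the same way.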

Note: Some of these commands only work in Clustered ONTAP 8.2 or later.

Dashboard Show Commands

dashboard alarm show

dashboard health vserver show

dashboard performance show

dashboard storage show

Cluster Health Check Commands

cluster show

cluster ring show

cluster ha show

cluster ping-cluster -node NODENAME

date {or} cluster date show

system license show

system license show -fields expiration-date

debug vreport show

event log show -severity EMERGENCY

event log show -severity ALERT

event log show -severity CRITICAL

event log show -severity ERROR

event log show -severity WARNING
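The five event log commands above differ only in the severity value, so they can be generated rather than typed. A minimal Python sketch, assuming you just want the command strings to paste into the clustershell (the function name event_log_commands is hypothetical):

```python
# Severities taken from the event log commands listed in this post,
# in the same order (most to least severe).
SEVERITIES = ["EMERGENCY", "ALERT", "CRITICAL", "ERROR", "WARNING"]


def event_log_commands(severities=SEVERITIES):
    """Build one 'event log show' command per severity level."""
    return [f"event log show -severity {sev}" for sev in severities]


for cmd in event_log_commands():
    print(cmd)
```

Passing a shorter list, for example only the top three severities, trims the output to the events you care about most.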

Node Health Check Commands

node show

storage failover show

storage failover show -instance

storage failover show -fields hwassist,hwassist-partner-ip,hwassist-partner-port,hwassist-health-check-interval,hwassist-retry-count,hwassist-status

system node image show

event route show

event destination show

system node autosupport show

ndmpd status

spm show -node * -state !running

system health ?

system health node-connectivity shelf show -node NODENAME

system health node-connectivity disk show -node NODENAME -status !OK

system node run -node NODENAME storage show acp

system node run -node NODENAME sysconfig -c

system node run -node NODENAME sysconfig -V

system node run -node NODENAME netstat -s

system node run -node NODENAME netstat -p tcp

system node run -node NODENAME fru_led status

system environment sensors show -node NODENAME
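Many of the node checks above take a NODENAME placeholder and have to be repeated for every node. A rough Python sketch of that substitution follows; the node names cluster1-01 and cluster1-02 are invented for the example, and the template list is only a subset of the commands in this section.

```python
# A few of the per-node checks from this section, with the NODENAME
# placeholder exactly as written in the post.
NODE_TEMPLATES = [
    "system health node-connectivity shelf show -node NODENAME",
    "system health node-connectivity disk show -node NODENAME -status !OK",
    "system node run -node NODENAME sysconfig -c",
    "system environment sensors show -node NODENAME",
]


def expand_for_nodes(templates, nodes):
    """Return every template with NODENAME replaced, grouped node by node."""
    return [t.replace("NODENAME", node) for node in nodes for t in templates]


# Hypothetical node names; use the names reported by 'node show'.
for cmd in expand_for_nodes(NODE_TEMPLATES, ["cluster1-01", "cluster1-02"]):
    print(cmd)
```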

Aggregate Health Check Commands

storage aggregate show

storage aggregate show -state !online

storage aggregate show -aggregate * -percent-used >75

storage aggregate show -aggregate * -raidstatus !"raid_dp,normal"

storage aggregate show -fields free-space-realloc

storage aggregate show -fields percent-snapshot-space

system node run -node NODENAME snap sched -A

system node run -node NODENAME snap reserve -A

system node run -node NODENAME snap list -A

Disk Health Check Commands

storage disk show -state broken

storage disk show -state reconstructing

storage disk show -state spare

storage disk show -average-latency >10 -fields average-latency,aggregate

Volume Health Check Commands

vol show

vol show -state !online

vol show -vserver * -volume * -percent-used >79

vol show -vserver * -volume * -percent-used <60

vol show -snapshot-policy none -type !DP

vol show -fields percent-snapshot-space

vol show -snapshot-space-used >80 -fields percent-snapshot-space,snapshot-space-used

vol show -percent-snapshot-space 0

df -gigabyte -volume USING_OUTPUT_FROM_THE_ABOVE {to check for volumes consuming a lot of snapshot space but with no snapshot reserve}

vol show -space-guarantee volume

vol show -is-sis-logging-enabled false

vol show -fields read-realloc

vol show -snapshot-count 0

vol show -snapshot-count >200

vol snap show -create-time <"Sun Dec 29 00:00:00 2013"

vol snap show -snapshot *snapmirror* -create-time <"Sun Dec 29 00:00:00 2013"
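The date in the two commands above is hard-coded, so it needs updating each time you hunt for old snapshots. A small Python sketch that computes the cutoff instead, assuming the day-month-time-year layout used above is what the CLI expects (the function names and the 30-day default are illustrative only):

```python
from datetime import datetime, timedelta


def snapshot_cutoff(days_old, now=None):
    """Return midnight 'days_old' days before 'now', formatted like the
    create-time strings in this post, e.g. weekday, month, day, time, year."""
    now = now or datetime.now()
    cutoff = (now - timedelta(days=days_old)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    # Assumption: the default C/English locale, so %a and %b give
    # English abbreviations.
    return cutoff.strftime("%a %b %d %H:%M:%S %Y")


def old_snapshot_command(days_old=30, now=None):
    """Build the 'vol snap show' query for snapshots older than the cutoff."""
    return f'vol snap show -create-time <"{snapshot_cutoff(days_old, now)}"'


print(old_snapshot_command(30))
```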

vol snap show -snap !hourly.*,!weekly.*,!daily.*

df -i -vserver * -volume * -percent-inodes-used >79

vol efficiency show

vol efficiency show -fields schedule,last-op-end

Network Port/Interface Health Check Commands

net port show

net int show

net int show -is-home false

net int show -status-oper !up

net int failover-groups show

net int show -fields failover-policy,failover-group,use-failover-group

LUN Health Check Commands

lun show -mapped unmapped

lun show -lun *.rws

lun show -lun *.aux

SnapMirror Health Check Commands

snapmirror show

snapmirror show -healthy false

snapmirror show -status !Idle

snapmirror check -destination-path PATH -foreground true

snapmirror show -schedule "-"

snapmirror show -state !Snapmirrored

Performance Health Check Commands

statistics show -object ?

statistics show -node ? -object ? -instance ?

statistics periodic

statistics periodic -object lif -node NODENAME -instance NODENAME:LIFNAME

::> system node run -node NODENAME

> sysstat -x 1

> sysstat -M 1

> stats

> stats start

> stats stop

Cosonok's IT Blog
