Clustered ONTAP Daily Health Checks Script

The following post is partly based on the earlier post from 1st January: Clustered ONTAP Storage Admins' Health Checks. Here we present a set of daily health checks for Clustered ONTAP storage admins (really, there's too much here for a daily checklist). Please feel free to modify as you see fit. Some of the commands quite nicely display the power of the Clustershell CLI.

Note:

## Two hashes or more is a comment

# One hash marks an optional command you can unhash (it is for Data ONTAP 8.2+ only, or it needs a date, or it needs the local cluster name)
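If you would rather feed the script to a cluster from a tool than paste it by hand, the hash convention in the note can be applied mechanically. What follows is only a minimal Python sketch and not part of the original script: the names `classify` and `active_commands` are made up for illustration, and it assumes one command per line.

```python
def classify(line: str) -> str:
    """Classify one script line using the hash convention from the note."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped.startswith("##"):
        return "comment"   # two hashes or more: always a comment
    if stripped.startswith("#"):
        return "optional"  # one hash: a command you may choose to unhash
    return "command"


def active_commands(script_text: str, enable_optional: bool = False) -> list[str]:
    """Return the commands that would run, optionally unhashing one-hash lines."""
    commands = []
    for line in script_text.splitlines():
        kind = classify(line)
        if kind == "command":
            commands.append(line.strip())
        elif kind == "optional" and enable_optional:
            # Drop the single leading hash and the space that follows it.
            commands.append(line.strip().lstrip("#").strip())
    return commands
```

Lines enabled with `enable_optional=True` still carry their example dates and the LOCAL placeholder, so they need editing before they go anywhere near a cluster.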

##############################

## CDOT DAILY CHECKS SCRIPT ##

##############################

rows 0

set diag

###########################

## Analyze The Event Log ##

###########################

event log show -severity emergency

event log show -severity alert

event log show -severity critical

event log show -severity error

event log show -severity warning

## example for the last 24 hours

# event log show -time "01/21/2014 09:00:00".."01/22/2014 09:00:00" -severity !informational,!notice,!debug

#############################

## Display Some Dashboards ##

#############################

dashboard alarm show

dashboard performance show

####################

## Cluster Checks ##

####################

cluster show

storage failover show

## 2-Node Clusters

# cluster ha show

date

## CDOT 8.2+

# cluster date show

###############################################

## License Checks (not really a daily check) ##

###############################################

system license show -fields expiration-date

#################

## Node Checks ##

#################

node show -fields health

system health alert show -fields indication-time

## ... and if they're old alerts you can delete them with

# system health alert delete -node * -monitor * -alert-id * -alerting-resource *

system node run -node * -command fru_led status

###############################################

## NDMPD check for jobs running and snapshot ##

###############################################

ndmpd status -fields data-state,data-operation,mover-state,mover-mode

snapshot show -snapshot snapshot_for_backup.* -fields create-time

########################

## Autosupport checks ##

########################

system node autosupport show -state !enable

system node autosupport history show -status !ignore -fields status,last-update

###############################

## Aggregate and Disk Checks ##

###############################

storage aggregate show -state !online

storage aggregate show -aggregate * -percent-used >75

storage aggregate show -aggregate * -raidstatus !"raid_dp,normal"

storage disk show -state broken

storage disk show -container unassigned

storage disk show -container-type aggregate -average-latency >20 -fields average-latency,aggregate

###################

## Volume Checks ##

###################

vol show -state !online

vol show -vserver * -volume * -percent-used >79 -fields state,size,available,percent-used,space-guarantee -type RW

vol show -vserver * -volume * -percent-used <33 -fields state,size,available,percent-used,space-guarantee -type RW

vol show -snapshot-policy none -type RW -fields volume,size,available,used

vol show -snapshot-space-used >99 -type RW -fields percent-snapshot-space,snapshot-space-used

vol show -space-guarantee volume -type RW -fields volume,size,available,used

vol show -is-sis-logging-enabled true -type RW -fields volume,sis-space-saved-percent

vol show -is-sis-logging-enabled false -type RW -volume !vol0 -fields volume,size,used

df -i -vserver * -volume * -percent-inodes-used >79

vol efficiency show -fields progress,schedule,policy,last-op-end,state

#####################

## Snapshot Checks ##

#####################

## CDOT 8.2+

# vol show -snapshot-count 0

## CDOT 8.2+

# vol show -snapshot-count >200

## CHANGE THE DATE - use http://www.timeanddate.com/date/dateadd.html

# vol snap show -create-time <"Wed Oct 09 00:00:00 2013" -fields state,size,create-time,owners

vol snap show -snap !hourly.*,!weekly.*,!daily.*,!snapmirror.*,!*smvi*,!eloginfo*,!exchsnap* -fields state,size,create-time,owners

####################

## Network Checks ##

####################

net port show -link !up

net int show -is-home false

net int show -status-oper !up

################

## SAN Checks ##

################

lun show -mapped unmapped -lun !*rws,!*aux

lun show -lun *.rws # Example - SMBR flexclones

lun show -lun *.aux # Example - failed SME jobs

fcp int show -status-oper !up

iscsi int show -status-oper !up

#######################

## SnapMirror Checks ##

#######################

snapmirror show -healthy false

snapmirror show -status !Idle

snapmirror show -state !snapmirrored

## CHANGE LOCAL to the name of the CLUSTER you're running the command from

## ... for the snapmirror command below since schedule displays on destination cluster only!

# snapmirror show -schedule "-" -fields state,status -source-cluster !LOCAL

## Compare the following two outputs: the number of non-RW vols should roughly match the number of snapmirrors to this cluster

# snapmirror show -destination-cluster LOCAL -fields destination-volume

# vol show -type !RW
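The one-hash commands that need a date or the local cluster name can be generated rather than edited by hand each day. Below is a minimal Python sketch of that idea, offered as an illustration only: the function names and the example cluster name `cluster1` are hypothetical, the date formats are simply copied from the examples in the script, and it assumes the workstation clock and time zone match the cluster's and that the default (English) locale is in use for day and month names.

```python
from datetime import datetime, timedelta


def event_log_last_24h(now: datetime) -> str:
    """Build the event log query covering the 24 hours up to `now`."""
    start = now - timedelta(hours=24)
    fmt = "%m/%d/%Y %H:%M:%S"  # same layout as the example in the script
    return (
        f'event log show -time "{start.strftime(fmt)}".."{now.strftime(fmt)}" '
        "-severity !informational,!notice,!debug"
    )


def old_snapshots(now: datetime, days: int) -> str:
    """Build the query for snapshots created before midnight `days` days ago."""
    cutoff = (now - timedelta(days=days)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    stamp = cutoff.strftime("%a %b %d %H:%M:%S %Y")  # e.g. Wed Oct 09 00:00:00 2013
    return f'vol snap show -create-time <"{stamp}" -fields state,size,create-time,owners'


def snapmirror_checks(local_cluster: str) -> list[str]:
    """Fill the LOCAL placeholder in the two snapmirror commands."""
    return [
        f'snapmirror show -schedule "-" -fields state,status -source-cluster !{local_cluster}',
        f"snapmirror show -destination-cluster {local_cluster} -fields destination-volume",
    ]


if __name__ == "__main__":
    now = datetime.now()
    print(event_log_last_24h(now))
    print(old_snapshots(now, days=30))
    # "cluster1" is a made-up name; use your own cluster's name here.
    for line in snapmirror_checks("cluster1"):
        print(line)
```

The printed lines can then be pasted into the Clustershell in place of the hashed examples. The 30-day cutoff is an arbitrary choice for the demonstration, so pick whatever retention window suits your environment.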

Cosonok's IT Blog
