Sunday, 1 December 2019

Troubleshooting SDU 5.3.1 and ONTAP 9.3 error 0001-242


Two slightly different errors depending on whether SDU is configured to use https or http.

0001-242 Admin error: Unable to connect using https to storage system: 192.168.1.16...
0001-242 Admin error: Unable to connect using http to storage system: 192.168.1.16...

... Ensure that 192.168.1.16 is a valid storage system name/address, if the storage system that you configure is running on a Data ONTAP operating in 7-mode, add the host to the trusted hosts (options trusted.hosts) and enable SSL on the storage system 192.168.1.16 or modify the snapdrive.conf to use http for communication and restart the snapdrive daemon. If the storage system that you configure is running on a clustered Data ONTAP, ensure that the Vserver name is mapped to IP address of the Vserver's management LIF.

I don’t do much with SDU, so an opportunity for me to learn something new. We kept hitting the 0001-242 error with SDU trying to connect to an ONTAP 9.3 SVM. Everything seemed fine with the configuration but a few things it could be:

1) ONTAP 9.3 needs SDU 5.3.1P1 or above. See note below from the IMT:
8783: SDU version 5.3.1P1 or above need to be installed to use ONTAP 9.2, ONTAP 9.3, ONTAP 9.4 and ONTAP 9.5, ONTAP 9.6

2) Using the wrong SVM management IP.
If you use the wrong SVM IP (i.e. an IP that doesn’t even exist), you still get the 0001-242 error.

3) Firewall policy on the SVM management LIF needs to allow HTTP/HTTPS.
The standard firewall policy mgmt will work.

4) SDU is not connecting to the IP you think it should be connecting to on the SVM.
You can set the mgmt and data LIFs to use the mgmt firewall policy, then even if SDU is connecting to the wrong LIF, it should still work.

5) SDU is not using the port on the Linux server that you think it should be using.
The is an option ‘blacklist-interfaces’ in the SDU snapdrive.conf file that allows you to control this. See:
Also SDU 5.2.2 installation and admin guide: https://library.netapp.com/ecmdocs/ECMP1552621/html

Testing Examples
                                                                                                        
EXAMPLE: Successful telnet test
A telnet test to port 80 (port 443 for HTTPS) of the mgmt. IP/SVM DNS name, should be successful. Of course, it depends on telnet running from the interface on the Linux box that you want to talk to the ONTAP SVM, and this may not be the case.

[root@rhel1 ~]# telnet 192.168.1.16 80
Trying 192.168.1.16...
Connected to 192.168.1.16.


EXAMPLE: snapdrive config list

[root@rhel1 ~]# snapdrive config list
username   appliance name   appliance type
------------------------------------------
vsadmin    192.168.1.16     StorageSystem


EXAMPLE: 0008-500 Admin error

[root@rhel1 ~]# snapdrive config delete 192.168.1.16
0008-500 Admin error: Datapath has been configured for the storage system 192.168.1.16.
Please delete it using snapdrive config delete -mgmtpath command and retry.


EXAMPLE: snapdrive config list -mgmtpath

[root@rhel1 ~]# snapdrive config list -mgmtpath
system name   management interface   datapath interface
-------------------------------------------------------
db            192.168.1.16           192.168.1.10|192.168.1.12

Note: db is the name of the SVM.

EXAMPLE: snapdrive config delete -mgmtpath

[root@rhel1 ~]# snapdrive config delete -mgmtpath db
Deleted configuration for appliance: db


EXAMPLE: snapdrive config delete

[root@rhel1 ~]# snapdrive config delete 192.168.1.16
Deleted configuration for appliance: 192.168.1.16


EXAMPLE: snapdrive config set (SUCCESSFUL)

[root@rhel1 ~]# snapdrive config set vsadmin 192.168.1.16
Password for vsadmin:
Retype password:
[root@rhel1 ~]#


EXAMPLE: snapdrive config set -mgmtpath (SUCCESSFUL)

[root@rhel1 ~]# snapdrive config set -mgmtpath 192.168.1.16 192.168.1.10 192.168.1.16 192.168.1.12
[root@rhel1 ~]# snapdrive config list
username   appliance name   appliance type
------------------------------------------
vsadmin    192.168.1.16     StorageSystem
[root@rhel1 ~]# snapdrive config list -mgmtpath
system name   management interface   datapath interface
-------------------------------------------------------
db            192.168.1.16           192.168.1.12|192.168.1.10


After ‘config delete -mgmtpath’ and config delete, we try again but this time with the wrong firewall policy on the mgmt LIF (data firewall policy), and it fails as expected.

EXAMPLE: snapdrive config set (0001-242 error https)

[root@rhel1 ~]# snapdrive config set vsadmin 192.168.1.16
Password for vsadmin:
Retype password:
0001-242 Admin error: Unable to connect using https to storage system: 192.168.1.16.


EXAMPLE: snapdrive config show (See Appendix 2)

[root@rhel1 ~]# snapdrive config show


After editing the /opt/NetApp/snapdrive/snapdrive.conf file (with vi), you need to restart the snapdrive daemon.

EXAMPLE: snapdrived restart

[root@rhel1 ~]# snapdrived restart
Successfully stopped daemon
WARNING!!! Unable to find a SAN storage stack. Please verify that the appropriate transport protocol, volume manager, file system and multipathing type are installed and configured in the system. If NFS is being used, this warning message can be ignored.
Successfully started daemon


Still with the management LIF firewall policy set incorrectly (data) but having edited the snapdrive.conf file to use http, we try snapdrive config set again.

EXAMPLE: snapdrive config set (0001-242 error http)

[root@rhel1 ~]# snapdrive config set vsadmin 192.168.1.16
Password for vsadmin:
Retype password:
0001-242 Admin error: Unable to connect using http to storage system: 192.168.1.16.


EXAMPLE: Viewing the sd-audit log

[root@rhel1 ~]# cat /var/log/sd-audit.log


EXAMPLE: Viewing the sd-client-trace.log

[root@rhel1 ~]# cat /var/log/sd-client-trace.log


EXAMPLE: Viewing interfaces (with a view to blacklisting the ones we don’t want to be used for SDU mgmt connect)

[root@rhel1 ~]# ifconfig


EXAMPLE: snapdrive config set to just a random IP, and it gives the same error!

[root@rhel1 ~]# snapdrive config set vsadmin 192.168.1.77
Password for vsadmin:
Retype password:
0001-242 Admin error: Unable to connect using http to storage system: 192.168.1.77.


EXAMPLE: snapdrive version
The version in my lab is 5.2.2.

[root@rhel1 ~]# snapdrive version
snapdrive Version 5.2.2
Snapdrive Daemon Version 5.2.2


EXAMPLE: Running snapdrive config wizard (see Appendix 3)

[root@rhel1 ~]# /opt/NetApp/snapdrive/setup/config_wizard


APPENDIXES

APPENDIX 1: Snapdrive: Supported commands and operations are:
APPENDIX 2: snapdrive config show
APPENDIX 3: /opt/NetApp/snapdrive/setup/config_wizard

APPENDIX 1: Snapdrive: Supported commands and operations are:

snapdrive  snap     show
snapdrive  snap     list
snapdrive  snap     create
snapdrive  snap     delete
snapdrive  snap     rename
snapdrive  snap     connect
snapdrive  snap     disconnect
snapdrive  snap     restore
snapdrive  snap     wizard    connect
snapdrive  snap     wizard    disconnect
snapdrive  snap     wizard    restore
snapdrive  storage  show
snapdrive  storage  list
snapdrive  storage  create
snapdrive  storage  delete
snapdrive  storage  resize
snapdrive  storage  connect
snapdrive  storage  disconnect
snapdrive  storage  wizard    create
snapdrive  storage  wizard    delete
snapdrive  host     connect
snapdrive  host     disconnect
snapdrive  version
snapdrive  config   access
snapdrive  config   prepare
snapdrive  config   check
snapdrive  config   show
snapdrive  config   set
snapdrive  config   delete
snapdrive  config   list
snapdrive  config   migrate   set
snapdrive  config   migrate   delete
snapdrive  config   migrate   list
snapdrive  clone    split   estimate
snapdrive  clone    split   start
snapdrive  clone    split   status
snapdrive  clone    split   stop
snapdrive  clone    split   result
snapdrive  portset  add
snapdrive  portset  delete
snapdrive  portset  list
snapdrive  igroup   add
snapdrive  igroup   delete
snapdrive  igroup   list
snapdrive  lun      showpaths
snapdrive  lun      fixpaths


APPENDIX 2: snapdrive config show

[root@rhel1 ~]# snapdrive config show
#
# Snapdrive Configuration
#   file:   /opt/NetApp/snapdrive/snapdrive.conf
#   Version 5.2.2 (Change 2571785 Built 'Wed Oct 15 08:01:01 PDT 2014')
#
# Default values are shown by lines which are commented-out in this file.
# If there is no un-commented-out line in this file relating to a particular value, then
# the default value represented in the commented-out line is what SnapDrive will use.
#
# To change a value:
#
#     -- copy the line that is commented out to another line
#     -- Leave the commented-out line
#     -- Modify the new line to remove the '#' and to set the new value.
#     -- Save the file and exit
#     -- Also remember to restart the snapdrive daemon by issuing 'snapdrived restart'
#
#
PATH="/etc/vx/bin:/sbin:/usr/sbin:/bin:/usr/bin:" #toolset search path
all-access-if-rbac-unspecified="on" #Allows access to all filer operations if the RBAC permissions file is missing in filer volume
allow-partial-clone-connect="off" #Allows to connect to a subset of file system or host volume of the cloned diskgroup
audit-log-file="/var/log/sd-audit.log" #Audit Log File Path
audit-log-max-size=20480 #Maximum size (in bytes) of audit log file
audit-log-save=2 #Number of historical audit log file to save
autosupport-enabled="off" #Enable autosupport messages to be sent to the configured Storage System(s)
available-lun-reserve=8 #Number of LUNs for which to reserve host resources
blacklist-interfaces="" #Ignore interfaces in the blacklist list
bypass-snapdrive-clone-generated-check="off" #Deletes volume clone eventhough if it is not snapdrive-generated
check-export-permission-nfs-clone="on" #Checks if the host has nfs export permissions for resource being connected
client-trace-log-file="/var/log/sd-client-trace.log" #client trace log file (Probably never used or useful)
cluster-operation-timeout-secs=600 #Cluster Operation timeout in seconds (Useful only on SFRAC Environments). Increase this value if you frequent failures in SFRAC environments
contact-http-dfm-port=8088 #HTTP server port to contact to access the DFM (Change this only if you have modified DFM Server settings)
contact-http-port=80 #HTTP port to contact to access the filer (This should not be changed most of the time)
contact-http-port-sdu-daemon=4094 #HTTP port on which sdu daemon will bind
contact-https-port-sdu-daemon=4095 #HTTPS port on which sdu daemon will bind
contact-ssl-dfm-port=8488 #SSL server port to contact to access the DFM
contact-ssl-port=443 #SSL port to contact to access the filer
contact-viadmin-port=8043 #HTTP/HTTPS port to contact to access the virtual interface admin
daemon-trace-log-file="/var/log/sd-daemon-trace.log" #daemon trace log file
datamotion-cutover-wait=120 #Wait time in seconds during data motion
default-noprompt="off" #A default value for -noprompt option in the command line
default-transport="iscsi" #Transport type to use for storage provisioning, when a decision is needed
deferred-logical-volume-start="off" #Enable to start the volume in background
device-retries=3 #Number of retries on Ontap filer LUN device inquiry (This is no longer useful or used)
device-retry-sleep-secs=1 #Number of seconds between Ontap filer LUN device inquiry retries (This is no longer useful or used)
dfm-api-timeout=180 #Timeout in seconds for calling DFM API
dfm-rbac-retries=12 #Number of access retries until DFM Refreshes (Increase this value if DFM is unable to discover newly created Volumes)
dfm-rbac-retry-sleep-secs=15 #Number of seconds between DFM rbac access retries(Increase this value if DFM is unable to discover the Volume)
do-lunclone="on" #Lunclone for Dataset mount_backup if readonly qtree is detected
enable-alua="on" #Enable ALUA for the igroup
enable-fcp-cache="on" #Enable FCP Cache in Assistants
enable-implicit-host-preparation="on" #Enable implicit host preparation for LUN creation
enable-migrate-nfs-version="off" #Enable snap connect or restore to use higher version of NFS version when NFS version used during snapshot currently is not enabled on the storage system
enable-mount-with-netdev="off" #Adds the _netdev file system option while mounting for iscsi transport protocol in Linux environment.
enable-mountguard-support="off" #Enable mountguard support (Useful only in AIX environments)
enable-parallel-operations="on" #Enable support for parallel operations
enable-ping-to-check-filer-reachability="on" #Use Ping Method to check the filer is responding or not
enable-split-clone="off" #Enable split clone volume or lun during connnect/disconnect
enforce-strong-ciphers="off" #SDU daemon will enforce TLSv1 and strong ciphers
filer-restore-retries=1440 #Number of retries while doing lun restore
filer-restore-retry-sleep-secs=15 #Number of secs between retries while restoring lun
filesystem-freeze-timeout-secs=300 #File system freeze timeout in seconds
flexclone-writereserve-enabled="off" #Enable space reservations during FlexClone creation
fstype="ext3" #File system to use when more than one file system is available
igroup-file="/opt/NetApp/snapdrive/.igroupfile" #location of igroup configuration file
loadsharing-update-sleep-interval=30 #Number of seconds SnapDrive for UNIX waits after load-sharing volume is updated.
lun-onlining-in-progress-retries=40 #Number of retries when lun onlining in progress after VBSR
lun-onlining-in-progress-sleep-secs=3 #Number of secs between retries when lun onlining in progress after VBSR
lunpath-monitor-frequency=24 #Number of hours after which SnapDrive for UNIX automatically fix path for LUNs.
mgmt-retries=2 #Number of retries on ManageONTAP control channel
mgmt-retry-sleep-long-secs=90 #Number of seconds between retries on ManageONTAP control channel (failover error)
mgmt-retry-sleep-secs=2 #Number of seconds between retries on ManageONTAP control channel
migrate-file="/opt/NetApp/snapdrive/.migfile" #Location of Migrate File
multipathing-type="none" #Multipathing software to use when more than one multipathing solution is available. Possible values are 'NativeMPIO' or 'DMP' or 'none'
override-vbsr-snapmirror-check="off" #If DFM is not configured, makes VBSR snapmirror check an overridable check.This may disrupt SnapMirror relationship when snapshot to be restored is older than SnapMirror baseline snapshot. This configuration variable is not applicable for Clustered Data ONTAP 8.2 and above
override-vbsr-snapvault-check="off" #If DFM is not configured, makes VBSR snapvault check an overridable check.This may disrupt SnapVault relationship when snapshot to be restored is older than SnapVault baseline snapshot. This configuration variable is applicable to Data ONTAP 7-Mode only
password-file="/opt/NetApp/snapdrive/.pwfile" #location of password file
ping-interfaces-with-same-octet="off" #SDU only pings through host interface which has first 3 octets of IPV4 in common with filer interface
portset-file="/opt/NetApp/snapdrive/.portset" #location of portset configuration file
prefix-clone-name="" #Prefix string for naming FlexClone
prefix-filer-lun="" #Prefix for all filer LUN names internally generated by storage create
prepare-lun-count=16 #Number of LUNs for which to request host preparation
rbac-cache="off" #Use RBAC cache when all DFM servers are down. Active only when rbac-method is dfm.
rbac-cache-timeout=24 #Number of hours the RBAC cache is valid
rbac-method="native" #Role Based Access Control(RBAC) methods
recovery-log-file="/var/log/sd-recovery.log" #recovery log file
recovery-log-save=20 #Number of old copies of recovery log file to save
san-clone-method="lunclone" #Clone methods for snap connect: unrestricted, optimal or lunclone
sdu-daemon-certificate-path="/opt/NetApp/snapdrive/snapdrive.pem" #location of https server certificate
sdu-password-file="/opt/NetApp/snapdrive/.sdupw" #location of SDU Daemon and DFM password file
secure-communication-among-cluster-nodes="off" #Enable Secure Communication (Useful only on SFRAC environments)
sfsr-polling-frequency=10 #Sleep for the given amount of seconds before attempting SFSR
snapconnect-nfs-removedirectories="off" #NFS snap connect cleaup unwanted dirs;
snapcreate-cg-timeout="relaxed" #Timeout type used in snapshot creation with Consitency Groups.
snapcreate-check-nonpersistent-nfs="on" #Check that entries exist in persistent filesystem file for specified nfs fs.
snapcreate-consistency-retries=3 #Number of retries on best-effort snapshot consistency check failure
snapcreate-consistency-retry-sleep=1 #Number of seconds between best-effort snapshot consistency retries
snapcreate-must-make-snapinfo-on-qtree="off" #snap create must be able to create snapinfo on qtree
snapdelete-delete-rollback-with-snap="off" #Delete all rollback snapshots related to specified snapshot
snapmirror-dest-snap-support-enabled="on" #Enables snap restore and snap connect commands to deal with snapshots which were moved to another filer volume (e.g. via SnapMirror)
snaprestore-delete-rollback-after-restore="on" #Delete rollback snapshot after a successfull restore
snaprestore-make-rollback="on" #Create snap rollback before restore
snaprestore-must-make-rollback="on" #Do not continue 'snap restore' if rollback creation fails
snaprestore-snapmirror-check="on" #Enable snapmirror destination volume check in snap restore
space-reservations-enabled="on" #Enable space reservations when creating new luns
space-reservations-volume-enabled="snapshot" #Enable space reservation over volume.
trace-enabled="on" #Enable trace
trace-level=7 #Trace levels: 1=FatalError; 2=AdminError; 3=CommandError; 4=warning, 5=info, 6=verbose, 7=full
trace-log-file="/var/log/sd-trace.log" #trace log file
trace-log-max-size=10485760 #Maximum size of trace log file in bytes; 0 means one trace log file per command
trace-log-save=100 #Number of old copies of trace log file to save
use-efi-label="off" #Enables use of EFI labels on Solaris which is required for lun size > 1 TB
use-https-to-dfm="on" #Communication with DFM done via HTTPS instead of HTTP
use-https-to-filer="on" #Communication with filer done via HTTPS instead of HTTP
use-https-to-sdu-daemon="off" #Communication with daemon done via HTTPS instead of HTTP
use-https-to-viadmin="on" #Specifies if HTTPS must be used to communicate with SMVI Product
vif-password-file="/opt/NetApp/snapdrive/.vifpw" #location of Virtual Interface Server password file
virtualization-operation-timeout-secs=600 #Virtualization Operation timeout in seconds
vmtype="lvm" #Volume manager to use when more than one volume manager is available
vol-restore="execute" #Method of restoring a volume. Possible values execute, preview and off
volmove-cutover-retry=3 #Number of retries during volume migration
volmove-cutover-retry-sleep=3 #Number of seconds between retries during volume migration cutover phase
volume-clone-retry=3 #Number of retries during flex-clone create
volume-clone-retry-sleep=3 #Number of seconds between retries during flex-clone create
volume-destroy-retry=5 #Number of retries during volume destroy in case of jumpahead
volume-destroy-retry-sleep=3 #Number of seconds between retries during volume destroy in case of jumpahead
volume-offline-retry=3 #Number of retries during volume offline. Data ONTAP 7-Mode only
volume-offline-retry-sleep=3 #Number of seconds between retries during volume-offline. Data ONTAP 7-Mode only


APPENDIX 3: /opt/NetApp/snapdrive/setup/config_wizard

[root@rhel1 ~]# /opt/NetApp/snapdrive/setup/config_wizard


*** SCRIPT FOR SETTING CONFIGURATION VARIABLES ***

This script helps you in setting up the configuration variables.
Enter the appropriate values.

To exit the wizard, enter "exit".



Select the profile for which snapdrive should be configured for [nfs, san, mixed]
Please enter your choice:nfs


Enable Protection Manager Integration? [yes, no]
Please enter your choice:no


Select the rbac-method [native, dfm]
Please enter your choice:native


SOAP/ZAPI access preference to storage system [https, http]
Please enter your choice:http


The following configuration variables will be changed

rbac-method=native
use-https-to-filer=off
Do you want to continue? [yes, no]:yes
Do you want the snapdrive daemon to be restarted now? [yes, no]:yes


Restarting the snapdrive daemon...
Successfully stopped daemon
WARNING!!! Unable to find a SAN storage stack. Please verify that the appropriate transport protocol, volume manager, file system and multipathing type are installed and configured in the system. If NFS is being used, this warning message can be ignored.
Successfully started daemon


No comments:

Post a Comment