*Credits to CG, ES, SH, JS, AN
Here the focus is on specific questions pertaining to a mostly RedHat and NFS on NetApp environment, with a little VMware!
## Contents ##
- NetApp Links
- NFS Best Practices - Linux Mount Options
- NFS Best Practices - More Linux Mount Options
- Some Random Questions
## NetApp Links ##
*Credit to CG
NetApp Thin-Provisioned LUNs on RHEL 6.2 Deployment Guide
*Rishikesh Boddu, Martin George - July 2012
Red Hat Enterprise Linux 6, KVM, and NetApp Storage: Best Practices Guide for Clustered Data ONTAP
*Joe Benedict - November 2012
## NFS Best Practices - Linux Mount Options ##
*Credit to JS, CG
mount -t nfs -o rw,bg,hard,intr,rsize=65536,wsize=65536,vers=3,proto=tcp,timeo=600 NFS_SERVER_IP_ADDRESS:/PATH /CLIENT_PATH
Note: The vers and proto options vary in syntax from OS to OS. (An example /etc/fstab entry follows the exceptions list below.)
Why?
bg - so if you have a problem at boot you don't sit there forever
hard - to prevent loss of data if the system crashes
intr - to be able to ctrl-c out of an error (Oracle doesn't like this, but you're unlikely to see a problem)
rsize=65536/wsize=65536 - 64K for maximum efficiency
timeo=600 - a reasonable value that probably doesn't matter anyway, because the main retry handling is done at the TCP layer
Exceptions?
noac/actimeo=0
- use if you must have multiple servers with a single consistent image
(depending on the OS)
llock - if you
have Solaris; this will probably help performance if you have a lot of high IO
ro - use if
you want read-only
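For reference, the same recommended options expressed as an /etc/fstab entry might look like this (the server address and paths are placeholders, as above):
NFS_SERVER_IP_ADDRESS:/PATH /CLIENT_PATH nfs rw,bg,hard,intr,rsize=65536,wsize=65536,vers=3,proto=tcp,timeo=600 0 0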
“TCP slot tables should be 128; 16 will hurt you bad. Also, there was a bug in many Linux OS's that prevented the setting from taking effect when placed in /etc/sysctl.conf. You may need to edit /etc/init.d/netfs to call /sbin/sysctl -p in the first line of the script, so that sunrpc.tcp_slot_table_entries is set before NFS mounts any file systems (if NFS mounts the file systems before this parameter is set, the default value of 16 will be in force).” “UDP slot tables are irrelevant.”
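As a minimal sketch of the above on a RHEL-style client (the value of 128 comes from the quote; verify the netfs workaround against your particular distribution):
# In /etc/sysctl.conf:
sunrpc.tcp_slot_table_entries = 128
# Workaround for the bug: near the top of /etc/init.d/netfs, apply sysctl
# settings before any NFS file systems are mounted:
/sbin/sysctl -p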
## NFS Best Practices - More Linux Mount Options ##
*Credit to AN, CG
enable “forcedirectio” - if the application maintains an internal cache (e.g. databases)
Why? Under certain circumstances, performance may be enhanced by purposely setting up the NFS client not to cache any data. When this option is used on a Solaris client, system memory is not used for file-system data; instead, only the Oracle buffer cache is used. If data is not in the buffer cache, it is fetched from the storage system.
Note: Not all databases, or all instances of the same database, benefit from disabling the client cache. The performance impact is deployment specific.
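For illustration, on a Solaris client the option might be added to the mount command like so (server address and paths are placeholders):
mount -F nfs -o rw,bg,hard,vers=3,proto=tcp,forcedirectio NFS_SERVER_IP_ADDRESS:/PATH /CLIENT_PATH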
enable “llock” - if the application performs file locking in a single-host environment
Why? Unfortunately, some NFS clients take a brute-force approach to maintaining coherency of locked data. Specifically, on some platforms, locking a file or data region results in all data associated with the file being invalidated from cache, and all operations then go “over the wire,” resulting in higher I/O latencies. Depending on the application requirements, llock helps to take advantage of file-system caching to improve performance.
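Again as a Solaris sketch (server address and paths are placeholders):
mount -F nfs -o rw,bg,hard,vers=3,proto=tcp,llock NFS_SERVER_IP_ADDRESS:/PATH /CLIENT_PATH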
Disable WCC (Weak Cache Consistency) - if the workload contains significant write traffic in a single-host environment, test for the weak cache consistency issue.
Why? The NFSv3 protocol allows for weak cache consistency. Basically, the protocol has attributes associated with each file that contain timestamps and the file size from the last access. The theory is that two instances can read-share the file with no problem. The moment one instance writes to the file, the other instances see the attributes change (for example, last modified time) and invalidate any cached copies of the file - that's how it was designed to work.
Note: NFSv4 no longer has the concept of WCC for file-modifying operations.
set nfs:nfs3_bsize=65536
Why? This parameter controls the logical block size used by the NFSv3 client. Usually nfs:nfs3_max_transfer_size is tuned to have the same value as nfs:nfs3_bsize. Do not set this parameter too high, because it might cause the system to hang while waiting for memory allocations to be granted.
set nfs:nfs3_max_threads=64
Why? This parameter controls the number of kernel threads that perform asynchronous I/O for the NFSv3 client. The appropriate value depends on the network bandwidth (for a very low-bandwidth network, you might want to decrease this value so that the NFS client does not overload the network.)
set nfs:nfs3_nra=? (value depends on the network bandwidth)
Why? This parameter controls the number of read-ahead operations that are queued by the NFSv3 client when sequential access to a file is detected. You can increase or reduce the number of read-ahead requests that are outstanding for a specific file at any given time.
- For a very low-bandwidth network, or on a low-memory client, you might want to decrease this value so that the NFS client does not overload the network or the system memory.
- If the network is very high bandwidth and the client and server have sufficient resources, you might want to increase this value.
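These nfs:* parameters are Solaris kernel tunables; assuming a Solaris client, they would typically be set in /etc/system and take effect after a reboot, for example:
* /etc/system (reboot required; the nfs3_nra value below is illustrative only -
* tune it to your network bandwidth)
set nfs:nfs3_bsize=65536
set nfs:nfs3_max_transfer_size=65536
set nfs:nfs3_max_threads=64
set nfs:nfs3_nra=16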
/dev/tcp tcp_recv_hiwat
Why? Determines the maximum size of the TCP receive buffer - the amount of buffer space allocated for TCP receive data. The default value is 49152 in Solaris 10 (setting it to 65535 is recommended.)
/dev/tcp tcp_xmit_hiwat
Why? Determines the maximum size of the TCP transmit buffer - the amount of buffer space allocated for TCP transmit data. The default value is 49152 in Solaris 10 (setting it to 65535 is recommended.)
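On Solaris 10 these can be set with the ndd command; note that ndd settings do not persist across reboots, so they are usually re-applied from a startup script:
ndd -set /dev/tcp tcp_recv_hiwat 65535
ndd -set /dev/tcp tcp_xmit_hiwat 65535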
## Some Random Questions ##
*Credit to ES, SH, CG
Q1) When a SATA disk reaches 90% capacity, does it suffer a serious slowdown?
No! The slowdown at 90% is more to do with WAFL than the disk, and it also depends on where the data on the disk is located and the manner in which it is being accessed - sequential or random, read or write. The same applies to SAS as well as SATA. It's layout rather than utilisation, although a full disk “may” mean a poorer layout.
Q2) With RedHat 5.8, what are the recommended volume options?
Depends! The obvious one is the security style (UNIX). If the volume is being used for NFS exports, there are some NFS tuning settings for performance (see above.) Generally, language settings - allowing UTF-8 support with vol lang volname en_US.UTF-8 - are particularly important, along with vol options volname convert_ucode on and vol options volname create_ucode on. More information on what the volume is being used for is needed to decide what to set.
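Put together, for a hypothetical volume named volname, that would be:
FILER1> vol lang volname en_US.UTF-8
FILER1> vol options volname convert_ucode on
FILER1> vol options volname create_ucode on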
Q3) With RedHat and NetApp, what's the recommendation for flow control?
With all OS's: on the NetApp, the switch, and the host, where possible set flow control to none. With some HBAs, CNAs or NICs this is not something that can be set - in that case, ensure that flow control is set to none on all parts of the infrastructure that do support it.
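As a sketch, assuming a 7-Mode controller interface e0a and a Linux host interface eth0 (both interface names are hypothetical):
FILER1> ifconfig e0a flowcontrol none
LINUX# ethtool -A eth0 autoneg off rx off tx off
On 7-Mode, add the ifconfig line to /etc/rc for the setting to persist across reboots.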
Q4) With RedHat, they notice a benefit from mounting the NFS volumes with different IPs to get around the serialization that happens when using just one IP for the NFS server (NetApp) - would VMware similarly benefit from NFS volumes being mounted on different IPs?
Maybe! There have been reported benefits from using such configurations. Whether it will be a benefit or not will depend on the environment.
Q5) How to detect pause frames?
ifstat -a
Q6) On a FAS32XX we can see the c0a and c0b ports, but on all of them c0a is disabled - is this normal behaviour?
Yes! Normally one cluster connection is passive and the other is active. If c0b fails then c0a should take over and become active. Double-check with the output of a “cf monitor” command.
Q7) Is it possible to transfer an aggregate, with all the volumes and data inside it, to another controller by simply un-assigning the disks and then re-assigning them?
Yes! The disks will be seen as a “foreign” aggregate, which will be taken offline once the disks are assigned to the new controller. You will need to bring that aggregate online to make it, and the volumes inside it, active (it's a good idea to rename the aggregate too.)
See: How to move an aggregate between software disk-owned HA pairs
For an aggregate with FlexVols:
Method 1 (XXX YYY ZZZ are placeholders for the disks in the aggregate; aggrToMove/aggrMoved is the aggregate before and after renaming):
FILER1> priv set diag
FILER1*> aggr offline aggrToMove -a
FILER1*> priv set admin
FILER1> aggr assign XXX YYY ZZZ -o FILER2 -f
FILER1> aggr status -r aggrToMove
FILER2> aggr online aggrMoved
Method 2:
To un-own a disk (don't worry about the errors regarding a failed aggregate on the source):
FILER1> disk assign XXX -s unowned -f
To assign a disk:
FILER2> disk assign XXX
Then destroy the aggregate on the source:
FILER1> aggr offline aggrOldLocation
FILER2> aggr destroy aggrOldLocation
Q8) Is there a head swap/upgrade process for, say, a FAS2240-4 to a FAS3250?
See: https://support.netapp.com/NOW/knowledge/docs/hardware/NetApp/cs_migration/index.htm?isLegacy=true