Microsoft SQL on ONTAP with SnapCenter

** Work in progress **

A place to put some researches.

Key sources of information:

  1. TR-4590 | Best practice guide for Microsoft SQL Server with ONTAP
  2. TR-4714: Best Practices Guide for Microsoft SQL Server using NetApp SnapCenter
  3. FlexPod Datacenter for Microsoft SQL Server 2019 and VMware vSphere 6.7 - Cisco
  4. Maximum supported volumes reached (1020652) (vmware.com)
  5. NetApp Hardware Universe
  6. ESXi/ESX Configuration Maximums (1003497) (vmware.com)
Also see:
Some maximums to consider (in the below I focused on NFS because I'm researching a deployment of MSSQL using VMDKs on NFS datastores):
  • [4] (Maximum Number of NFS Datastores) The default value (32 in ESXi 7.0) can can be increased to a maximum value of 256
    • Note: This maximum applies since ESXi 5.x
    • [6] 256 NFS3 datastores per host as well as 256 NFS4.1 datastores per host are supported.
  • [5] Examples with AFF A400 on ONTAP 9.13.1:
    • Max number of flexible volumes - NAS:
      • Node: 2500 (much higher than 256)
      • HA Pair: 5000
      • Cluster: 30000
    • NFS Cluster Limits:
      • Max number of export rules - Cluster: 200000
    • WAFL Cluster Limits:
      • Max size of 64-bit aggregate (TiB): 800
      • Max size of 64-bit volume (TiB): 300
      • Max file size in a 64-bit volume (TiB): 128
      • Max number of volume snapshot copies:
        • Volume: 1023
        • Node: 1023000
        • HA-Pair: 2046000
  • [6] VMFS5 / VMFS-6: Maximum File size 62TB (less than the ONTAP max file size)

[2][3] Microsoft SQL Server Database Storage Layout with SnapCenter

SnapCenter best practice considerations for Microsoft SQL Server database layout are aligned with the suggested Microsoft SQL Server deployment. SnapCenter supports backup only of user databases that reside on a NetApp storage system. Along with the performance benefit of segregating user database layout into different volumes, SnapCenter also has a large influence on the time required to back up and restore. Separate volumes for data and log files significantly improve the restore time as compared to a single volume hosting multiple user data files. Similarly, user databases with I/O-intensive applications might experience increased backup time.

When backing-up databases with SnapCenter, take the following considerations into account:
  • Databases with I/O intensive queries throughout the day should be isolated in different volumes and eventually have separate jobs to back them up.
  • Large databases and databases that have minimal RTO should be placed in separate volumes for faster recovery.
  • Small to medium-size databases that are less critical or that have fewer I/O requirements should be consolidated into a single volume. Backing up many databases residing in the same volume results in fewer Snapshot copies to be maintained. NetApp also recommends consolidating Microsoft SQL Server instances to use the same volumes to control the number of backup Snapshot copies taken.
  • Create separate LUNs to store full text-related files and file-streaming-related files.
  • Assign a separate LUN for each instance to store Microsoft SQL server log backups. The LUNs can be part of the same volume.
  • System databases store database server metadata, configurations, and job details; they are not updated frequently. System databases and tempdb should be placed in separate drives or LUNs. Do not place system databases in same volume as user databases. User databases have different backup policies and the frequency of user database backups is not same as for system databases.
  • With Microsoft SQL Server AG setup, the data and log files for replicas should be placed in an identical folder structure on all nodes.
[2][3] Best Practices

Note: Not listing all the best practices from [3].

The following are (some) NetApp recommendations on volume design for optimal performance:
  • Use flexible volumes to store Microsoft SQL Server database files and do not share volumes between hosts.
  • Use NTFS mount points instead of drive letters to avoid the 26-drive letter limitation in Microsoft Windows Server. When using volume mount points, NetApp recommends giving the volume label the same name as the mount point.
  • Disable storage Snapshot copy schedules and retention policies. Instead, use the SnapCenter for SQL Server plug-in to coordinate Snapshot copies of the Microsoft SQL Server data volumes.
  • Microsoft SQL Server uses the system database tempdb as a temporary workspace, especially for I/O intensive database consistency checker (DBCC) CHECKDB operations. Therefore, place this database on a dedicated volume with a separate set of spindles. In large environments where volume count is a challenge, you can consolidate tempdb into fewer volumes and store it in the same volume as other system databases. This procedure requires careful planning. Data protection for tempdb is not a high priority because this database is re-created every time the SQL Server is restarted.
  • Place user data files (.mdf) on separate volumes because they are random read/write workloads. It is common to create transaction log backups more frequently than database backups. For this reason, place transaction log files (.ldf) on a separate volume or VMDK from the data files so that independent backup schedules can be created for each. This separation also isolates the sequential write I/O of the log files from the random read/write I/O of data files and significantly improves Microsoft SQL Server performance.
Database layout considerations and best practices:

1 - Database layout considerations

  • Having data and log on same volume reduces administrator complexity.
    • Data and log should still be places on separate LUNs (or VMDKs)
  • For mission-critical database, spread multiple data files and log files across individual volumes; this increases overall IOPS and reduces latency.
  • Having dedicated volumes helps faster restore with ONTAP Snapshot-based backup/recovery.
  • Log backup directory should be placed on separate volume.
2 - Database layout best practices

  • Data File and Log File on different flexvols = Faster restores + eases disaster recovery and cloning purpose.
  • Data File and Log File on different LUNs in same flexvol - Faster restores + fast disaster recovery and cloning purpose & better consolidation
  • Data Files on same LUN and Log Files on same LUN in same flexvol - Better manageability and consolidation + not recommended for large production databases + in-place restore is not as fast.
Old Articles of Small Interest
My Take

If you're going to end up with one datafile in one VMDK in one VMware NFS datastore, consuming 1 of your 256 datastores, it seems a bit pointless, and using in-guest iSCSI would make more sense.

Comments