Problem: High iSCSI NIC Utilization on Windows Server 2008 R2 Hyper-V Host

Scenario:
Poor performance is reported on a business application not long after a virtualization initiative has been implemented, and it is noticed that, on one of the hosts in a Hyper-V failover cluster, the utilization on an iSCSI NIC is very high.

Initial Diagnosis:
The cause of the excessive iSCSI traffic is traced to the business application's backend SQL Server and, initially, the proverbial finger is pointed at a recent VMware Capacity Planning report, which had noted that the SQL Server system was not a recommended candidate for virtualization. Under 'System Exceptions', the report flagged the SQL server as generating excessive disk I/O of 102.92 MBps. Now, a 1 Gbps NIC has a theoretical maximum throughput of 125 MBps (1000 Mbps / 8 bits per byte) before protocol overhead, so disk I/O at that level would take up roughly 82% of the link, which fits closely with a near maxed out 1 Gbps NIC.
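The back-of-envelope check above can be sketched in a few lines of Python. This is only an illustration: it assumes the decimal convention of 1 Gbps = 1000 Mbps and ignores iSCSI/TCP/Ethernet overhead, and the function names are made up for the example.

```python
# Rough check: how much of a 1 Gbps link would the reported disk I/O consume?
# Assumptions: decimal units (1 Gbps = 1000 Mbps), no protocol overhead.
LINK_MBITS_PER_SEC = 1000      # 1 Gbps NIC
BITS_PER_BYTE = 8
REPORTED_IO_MBYTES = 102.92    # MBps figure from the Capacity Planning report


def link_capacity_mbytes(link_mbits: float) -> float:
    """Theoretical line rate in megabytes per second."""
    return link_mbits / BITS_PER_BYTE


def utilization_percent(io_mbytes: float, link_mbits: float) -> float:
    """Share of the link that the given I/O rate would occupy."""
    return 100.0 * io_mbytes / link_capacity_mbytes(link_mbits)


capacity = link_capacity_mbytes(LINK_MBITS_PER_SEC)
util = utilization_percent(REPORTED_IO_MBYTES, LINK_MBITS_PER_SEC)
print(f"{capacity:.0f} MBps capacity, {util:.1f}% utilized")
```

In practice usable iSCSI throughput is somewhat below the line rate, so the real utilization for the same I/O would be higher than this figure.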

Solution:
It turned out that the SQL server had never been running with an optimal configuration; it just happened that the excessive disk I/O had not been noticed as a problem in its physical incarnation (running on RAID 5 across 4 x Ultra320 (320 MBps) SCSI disks). In SQL Server Management Studio, the Server Properties -> Memory page showed that the Maximum server memory for the SQL server was set to only 1000 MB. Raising this to 3000 MB (the underlying OS was Windows Server 2003 Standard 32-bit, and the virtual machine was configured with the OS's maximum memory of 4 GB) caused the iSCSI NIC utilization to drop from near 100% to below 5%. The excessive disk I/O had been caused by the SQL server not having enough memory and having to page excessively to disk.
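The fix here was made through the Management Studio GUI, but the same setting can also be changed with the sp_configure stored procedure. As a hedged sketch (not what was actually run in this case), the hypothetical Python helper below just builds the T-SQL batch for a given limit so it can be reviewed before being pasted into a query window; it does not connect to any server.

```python
def max_memory_tsql(max_mb: int) -> str:
    """Build a T-SQL batch that sets 'max server memory (MB)'.

    Illustrative helper only; the name and structure are assumptions
    made for this example. 'max server memory (MB)' is an advanced
    option, so 'show advanced options' is enabled first.
    """
    if max_mb <= 0:
        raise ValueError("max_mb must be a positive number of megabytes")
    return "\n".join([
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max server memory (MB)', {int(max_mb)};",
        "RECONFIGURE;",
    ])


# The value used in this scenario: raise the cap from 1000 MB to 3000 MB.
print(max_memory_tsql(3000))
```

Whatever value is chosen should leave enough memory for the operating system itself; 3000 MB was the figure used on this 4 GB virtual machine.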


SQL Server Revisited:
Revisiting the SQL Server a short time later: in Windows Task Manager, the 'Mem Usage' of sqlservr.exe is 2,739,252 KB (about 2675 MB), which is under the 3000 MB Maximum server memory allocated to SQL Server. Hence SQL Server has no need to page to disk, and iSCSI utilization is fine.
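For anyone wanting to repeat the sanity check, the conversion from the Task Manager figure to megabytes, and the headroom left under the configured limit, works out as in this small sketch (variable names are just for illustration; Task Manager reports KB where 1 MB = 1024 KB):

```python
# Convert the Task Manager 'Mem Usage' figure to MB and compare it with
# the configured Maximum server memory.
KB_PER_MB = 1024

mem_usage_kb = 2_739_252        # sqlservr.exe 'Mem Usage' from Task Manager
max_server_memory_mb = 3000     # SQL Server Maximum server memory setting

mem_usage_mb = mem_usage_kb / KB_PER_MB
headroom_mb = max_server_memory_mb - mem_usage_mb

print(f"sqlservr.exe is using about {mem_usage_mb:.0f} MB")
print(f"headroom under the limit: about {headroom_mb:.0f} MB")
```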

Note: My involvement in this particular scenario was only on the fringes, and I can take no credit for the solution. It was an interesting problem, though, and worthy of a write-up.
