Patrick Purviance – SQL Server DBA Blog
Blogging impactful solutions for SQL Server DBAs worldwide

Archive for the ‘Performance’ Category

Windows 2003 SP2, the Scalable Networking Pack, and SQL Server – can be a bad combo

Friday, October 1st, 2010

Learned something new this week (and kind of the hard way). Hopefully this will help someone else out there.

This past Sunday morning, the Windows server group applied monthly Windows updates to our SQL Servers. This is a monthly process and typically no issues arise as we have them apply the updates to our development and test servers the week prior. However, this month, immediately after the updates were applied to the servers hosting our SQL instances and the servers restarted, performance between the servers (a lot of ETL processes here) became dismal. Jobs/Processes that normally would run in seconds or minutes wouldn’t finish at all. Seemed as though any SQL calls to our SQL 2005 Servers (all running Windows 2003 SP2) would just be sucked into the great SQL black hole.

We had’t had problems with these scheduled ETL processes earlier that morning, prior to the patches. But the following simple Linked Server select statement would take minutes from the affected servers and would return in less than 50ms from an unaffected server:

1
SELECT 1 FROM [LinkSvrA].[ApplDon].[dbo].[Customer] WHERE [CUST_NO_PK] = 1

Where LinkSvrA is the linked server running Windows 2003 SP2 (plus some Windows updates applied Sunday morning). The column, [CUST_NO_PK] is the only column in the Primary Key, Clustered Index. This was executed from a Windows 2008 server running SQL Server 2008 SP1.

After hours and hours of troubleshooting the performance problems, a couple of key indicators to me were pointing at network connectivity and throughput. The SQL Server wait types that were dominating the wait pools were OLEDB and ASYNC_NETWORK_IO. Ultimately indicating the provider (SQL Native Client in this case) is waiting on the network.

In the end, the case was escalated to Microsoft and it was determined that the Network Card settings on LnkSvrA were the root cause and after following the steps below, the issue was resolved.

Changing settings at TCP level are often not enough and the settings need to be disabled at the NIC driver level as well. You need to go into NCPA.cpl, then navigate to the Advanced network settings for your network card(s) and disable the “Chimney Offload” and Receive Side Scaling settings.

Steps to disable RSS, TCP Offloading (this is specific to the card but will usually look something like this):

• Open your network connection properties and click on “Configure”.

• In the advanced tab, set Large Offload to Disabled for all IP versions.

• Ensure Receive Side Scaling is Disabled.

• One additional recommendation we did not end up changing was the TCP and UDP Checksum Offload option. Having this set to Rx and Tx Enabled is working fine for us.

• Restart the server to ensure these settings take effect and retest your processes.

Take Away:
The SNP (Scalable Network Pack) features added and, by default, enabled by SP2 for Windows 2003 caused the performance of normally well performing processes to become abysmal. This has been a well known issue since the introduction of SP2 for Windows 2003 and Microsoft later released a patch to disable the SNP features due to the problems encountered by SQL Server and Exchange Server post SP2 implementation. In your Windows 2008 servers, the SNP features are disabled by default, so this shouldn’t be a problem. 

Update:  MS has a KB Article with a patch of sorts for this:  http://support.microsoft.com/kb/948496

 

“Indexes? We don’t need no stinking indexes!”

Tuesday, September 28th, 2010

Yeah, right…..Alright, now that I’ve got your attention (or at least I better have if you’re a DBA worth your weight in any known unit of measure), below is a quick query for finding AND generating a script to create missing indexes for SQL Server 2005+. Now, this is just my version of a way to do it, and you may have opinions which you’re welcome to share if it will improve the query in some meaningful way for myself and everyone else, but I’ll throw out the normal disclaimer: You MUST review the recommended indexes and determine for yourself if they are appropriate for your database/workload. They may make sense, they may not, so TEST, TEST, TEST!

Have fun and good luck optimizing!

UPDATED! Now includes better formatting in the output to more closely mimic the missing index feature recommendation in SSMS execution plans.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
SELECT  Obj.name as [Table Name]
	, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) AS [Relative Impact]
	,  'CREATE NONCLUSTERED INDEX [IX_' + rtrim(Obj.name) + 
		CASE WHEN mid.equality_columns IS NULL THEN ''
			ELSE '_' + replace(replace(replace(replace(IsNull(mid.equality_columns, ''), ',', '_'), '[', ''), ']', ''), ' ', '') 
		END + 
		CASE WHEN mid.inequality_columns IS NULL THEN ''
			ELSE '_' + replace(replace(replace(replace(IsNull(mid.inequality_columns, ''), ',', '_'), '[', ''), ']', ''), ' ', '') 
		END + 
		'] ON [' + Sch.name + '].[' + Obj.name COLLATE DATABASE_DEFAULT + '] (' + 
		CASE WHEN mid.equality_columns IS NULL THEN '' 
			ELSE mid.equality_columns 
		END + 
		CASE WHEN (mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL) THEN ', '
			ELSE ''
		END + 
		CASE WHEN mid.inequality_columns IS NULL THEN ') '
			ELSE mid.inequality_columns + ') ' 
		END + 
        CASE WHEN mid.included_columns IS NULL THEN ') '  
		ELSE 'INCLUDE (' + mid.included_columns + ')' 
		END + ';' 
		AS [Create Index Statement]
	, mid.equality_columns as [Equality Columns]
	, mid.inequality_columns as [Inequality Columns]
	, mid.included_columns  as [Included Columns]
FROM sys.dm_db_missing_index_group_stats AS migs 
	INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle 
	INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle AND mid.database_id = DB_ID() 
	INNER JOIN sys.objects Obj ON mid.OBJECT_ID = Obj.OBJECT_ID 
	INNER JOIN sys.schemas Sch ON Obj.schema_id = Sch.schema_id
WHERE (migs.group_handle IN 
		( 
		SELECT TOP (500) group_handle 
		FROM  sys.dm_db_missing_index_group_stats WITH (nolock) 
		ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC))  
					AND OBJECTPROPERTY(Obj.OBJECT_ID, 'isusertable')=1 
ORDER BY 2 DESC , 3 DESC

Check “Long Running Operations” Stored Procedure

Monday, June 7th, 2010

I threw together a SProc for returning useful information about long running operations for SQL Server 2005+ and thought I’d share it. The procedure uses the dynamic management objects, dm_exec_requests, dm_exec_connections, dm_exec_sessions, and dm_exec_sql_text to tell you how much longer you may have to wait on processes that SQL Server categorizes as “long running”. Typically I use this SP to find out how much longer a backup or a restore process is going to take. And it’s usually within a few minutes of being correct, so it’s worlds better than starting a restore on a SQL 2000 box and just telling everyone it will be done when it’s done and there’s little way to tell other than experience. 😉

Let me know what you think…

-Patrick