All About and For Microsoft SQL Server DBA: BLUEPRINT:- Common scenarios to avoid in OLTP

Database Design

Rule	Description	Value	Source	Problem Description
1	High Frequency queries having a high number of table joins.	>4	Sys.dm_exec_sql_text Sys.dm_exec_cached_plans	High frequency queries with lots of joins can be too normalized for high OLTP scalability.
2	Frequently updated tables having # indexes.	>3	Sys.indexes sys.dm_db_operational_index_stats	Excessive index maintenance for OLTP.
3	Big IOs Table Scans Range Scans	>1	Perfmon object SQL Server Access Methods Sys.dm_exec_query_stats	A missing index flushes the cache.
4	Unused Indexes.	Index not in Sys.dm_db_index_usage_stats. If an index is NEVER used, it will not appear in the DMV sys.dm_db_index_usage_stats		Avoid Index maintenance for unused indexes.

CPU

Rule	Description	Value	Source	Problem Description
1	Signal Waits	>25%	Sys.dm_os_wait_stats	Time in runnable queue is pure CPU wait.
2	Plan reuse	<90%	Perfmon object SQL Server Statistics	OLTP identical transactions should ideally have >95% plan reuse.
3	Parallelism: Cxpacket waits	>5%	Sys.dm_os_wait_stats	Parallelism reduces OLTP throughput. CXPACKET indicates that multiple CPUs are working in parallel, dividing up the query in smaller pieces. Ordinarily a well tuned OLTP application would not parallelize unless an index is missing, there is an incomplete WHERE clause, or the query is not a true OLTP transaction.

Memory

Rule	Description	Value	Source	Problem Description
1	Page life expectancy	<300 sec	Perfmon object SQL Server Buffer Manager SQL Server Buffer Nodes	Page life expectancy is the average number of seconds a data page stays in cache. Low values could indicate a cache flush that is caused by a big read. Pure OLTP workloads do NOT issue big reads, thus possible missing index.
2	Page life expectancy	Drops by 50%	Perfmon object SQL Server Buffer Manager	Page life expectancy is the average number of seconds a data page stays in cache. Low values could indicate a cache flush that is caused by a big read. Pure OLTP workloads do NOT issue big reads, thus possible missing index.
3	Memory Grants Pending	>1	Perfmon object SQL Server Memory Manager	Current number of processes waiting for a workspace memory grant.
4	SQL cache hit ratio	<90%	SQL cache hit ratio falls under 90% for sustained periods of time greater than 60 sec.	It is likely that large scans have to be performed, which in turn flushes out the buffer cache.

Rule	Description	Value	Source	Problem Description
1	Average Disk sec/read	>20 ms	Perfmon object Physical Disk	Reads should take 4-8 ms without any IO pressure.
2	Average Disk sec/write	>20 ms	Perfmon object Physical Disk	Writes (sequential) can be as fast as 1 ms for transaction log.
3	Big IOs Table Scans Range Scans	>1	Perfmon object SQL Server Access Methods	A missing index flushes the cache.
4	If Top 2 values for wait stats are any of the following: ASYNCH_IO_COMPLETION IO_COMPLETION LOGMGR WRITELOG PAGEIOLATCH_x	Top 2	Sys.dm_os_wait_stats	If top 2 wait_stats values include IO, there is an IO bottleneck.
5	Low bytes per sec.		Perfmon object Physical Disk

Blocking

Rule	Description	Value	Source	Problem Description
1	Block percentage	>2%	Sys.dm_db_index_operational_stats	Frequency of blocks.
2	Block process report	30 sec	Sp_configure profiler	Report of statements.
3	Average Row Lock Waits	>100ms	Sys.dm_db_index_operational_stats	Duration of blocks.
4	If Top 2 values for wait stats are any of the following: LCK_M_BU LCK_M_IS LCK_M_IU LCK_M_IX LCK_M_RIn_NL LCK_M_RIn_S LCK_M_RIn_U LCK_M_RIn_X LCK_M_RS_S LCK_M_RS_U LCK_M_RX_S LCK_M_RX_U LCK_M_RX_X LCK_M_S LCK_M_SCH_M LCK_M_SCH_S LCK_M_SIU LCK_M_SIX LCK_M_U LCK_M_UIX LCK_M_X	Top 2	Sys.dm_os_wait_stats	If top 2 wait_stats values include locking, there is a blocking bottleneck.
5	High number of deadlocks	>5 per hour	Trace flag 1204 to display in the errorlog and or the profiler deadlock graph.	If the deadlock occurs with the same participant SQL commands or operations multiple times, it is likely that there is a locking problem.

Network

Rule	Description	Value	Source	Problem Description
1	High network latency coupled with an application that has many round trips to the database.	Output queue length >2	Perfmon object: Network Interface	Indicates that the latency between the application server and the database is high. Could be caused by significant network infrastructure between the application and the instance of SQL Server.
2	Network bandwidth is used up.	Packets Outbound Discarded Packets Outbound Errors Packets Received Discarded Packets Received Errors	Perfmon object: Network Interface	Dropped packets are detected.

In summary, given the high volume of identical small transactions that characterize OLTP, transactions per second and resource usage can be improved as follows:

1. Database designs usually keep the number of indexes to a functional minimum as every insert, update, and delete incurs index maintenance.

2. CPU can be reduced with plan reuse and join reduction.

3. IO performance can be reduced with good indexing, join reduction, and high page life expectancy.

4. Memory is optimal when there are no sudden drops in Page Life Expectancy.

5. Sorts can be limited with index usage. That is, a certain sort order is supported by an index that is sorted the same way, either ascending or descending.

6. Blocking can be reduced with index design and short transactions.

...Happy SQLing
Thanks for Visiting and Sharing your Views

All About and For Microsoft SQL Server DBA

Friday, July 1, 2011

BLUEPRINT:- Common scenarios to avoid in OLTP

No comments:

Post a Comment