Friday, June 5, 2009

Update of Books Online SQL 2008 available

Last month an update of the Books Online (BOL) of SQL 2008 is published.

Books Online is the primary documentation for SQL Server 2008 and includes the following types of information:
  • Setup and upgrade instructions.
  • Information about new features and backward compatibility.
  • Conceptual descriptions of the technologies and features in SQL Server 2008.
  • Procedural topics describing how to use the various features in SQL Server 2008.
  • Tutorials that guide you through common tasks.
  • Reference documentation for the graphical tools, command prompt utilities, programming languages, and application programming interfaces (APIs) that are supported by SQL Server 2008.
  • Descriptions of the sample databases and applications that are available with SQL Server 2008. You can download the sample databases from the SQL Server Community Projects and Samples page on CodePlex.

Tuesday, June 2, 2009

SP_HelpIndexALL procedure with INCLUDE and Filtered columns

As of SQL 2005 you can create indexes with the INCLUDE option. As of SQL 2008 you can create filtered indexes. Both options are a nice way to improve your index structure. To retrieve the index structure on your table, you can make use of SP_HELPINDEX </ tablename>. However this stored procedure does not display the INCLUDE columns and do the display the WHERE clause of your Filtered index.  On the blog of Kimberly L. Tripp I found a nice script which displayed these INCLUDE columns and WHERE Clause. However I found one bug on the Include columns. Assume Indexid 43 has an include columns. Indexid 44 has NO Include column. For IndexID 44 the first INCLUDE column of IndexID 43 was displayed as INCLUDE column for IndexID44.

With next script this bug is fixed.

CREATE PROCEDURE Sp_helpindexall
                @objname NVARCHAR(776)  
                -- the table to check for indexes  

AS
  -- Based on the stored procedure sp_helpindex2 
  -- of Kimberly L.Tripp,

  -- http://www.SQLskills.com/blogs/Kimberly  
  -- Only fixed a bug for the Include Columns
  --
2 June 2009 
  -- There for renamed the stored procedure 
  -- to SP_HelpindexALL
to add included columns 
  -- AND filter_definition to the output.   

  -- See http://www.KeepItSimpleAndFast.com 
  -- blog of André van de Graaf  

  -- PRELIM
  SET nocount  ON
  
  DECLARE  @objid             INT,   
            -- the object id of the table  

           @indid             SMALLINT, 
            -- the index id of an index  

           @groupid           INT,    
            -- the filegroup id of an index  

           @indname           SYSNAME,
           @groupname         SYSNAME,
           @status            INT,
           @keys              NVARCHAR(2126),   
           @inc_columns       NVARCHAR(MAX),
           @inc_Count         SMALLINT,
           @loop_inc_Count    SMALLINT,
           @dbname            SYSNAME,
           @ignore_dup_key    BIT,
           @is_unique         BIT,
           @is_hypothetical   BIT,
           @is_primary_key    BIT,
           @is_unique_key     BIT,
           @auto_created      BIT,
           @no_recompute      BIT,
           @filter_definition NVARCHAR(MAX)
  
  -- Check to see that the object names are 
  -- local to the current database.

  SELECT @dbname = Parsename(@objname,3)
  
  IF @dbname IS NULL
    SELECT @dbname = Db_name()
  ELSE
    IF @dbname <> Db_name()
      BEGIN
        RAISERROR (15250,-1,-1)
        
        RETURN (1)
      END
  
  -- Check to see the the table exists and initialize @objid.
  SELECT @objid = Object_id(@objname)
  
  IF @objid IS NULL
    BEGIN
      RAISERROR (15009,-1,-1,@objname,@dbname)
      
      RETURN (1)
    END
  
  -- OPEN CURSOR OVER INDEXES (skip stats: bug shiloh_51196)
  DECLARE ms_crs_ind CURSOR LOCAL STATIC FOR
  SELECT i.index_id,
         i.data_space_id,
         i.name,
         i.ignore_dup_key,
         i.is_unique,
         i.is_hypothetical,
         i.is_primary_key,
         i.is_unique_constraint,
         s.auto_created,
         s.no_recompute,
         i.filter_definition
  FROM   sys.indexes i
         JOIN sys.stats s
           ON i.object_id = s.object_id
              AND i.index_id = s.stats_id
  WHERE  i.object_id = @objid
  
  OPEN ms_crs_ind
  
  FETCH  ms_crs_ind
  INTO @indid,
       @groupid,
       @indname,
       @ignore_dup_key,
       @is_unique,
       @is_hypothetical,
       @is_primary_key,
       @is_unique_key,
       @auto_created,
       @no_recompute,
       @filter_definition
  
  -- IF NO INDEX, QUIT
  IF @@FETCH_STATUS < 0
    BEGIN
      DEALLOCATE ms_crs_ind
      RAISERROR (15472,-1,-1,@objname) 
        -- Object does not have any indexes.  

      RETURN (0)
    END
  
  -- create temp tables
  CREATE TABLE #spindtab (
    index_name SYSNAME 
        COLLATE
 database_default NOT NULL,
    index_id          INT,
    ignore_dup_key    BIT,
    is_unique         BIT,
    is_hypothetical   BIT,
    is_primary_key    BIT,
    is_unique_key     BIT,
    auto_created      BIT,
    no_recompute      BIT,
    groupname SYSNAME 
        COLLATE
 database_default NULL,
    index_keys NVARCHAR(2126)
        COLLATE database_default   NOT NULL, 
    -- see @keys above for length descr  

    filter_definition NVARCHAR(MAX),
    inc_count         SMALLINT,
    inc_columns       NVARCHAR(MAX))
  
  CREATE TABLE #includedcolumns (
    rownumber SMALLINT,
    [Name]    NVARCHAR(128))
  
  -- Now check out each index, 
  -- figure out its type and keys and  

  -- save the info in a temporary table 
  -- that we'll print out at the end.

  WHILE @@FETCH_STATUS >= 0
    BEGIN
      -- First we'll figure out 
      -- what the keys are.

      DECLARE  @i       INT,
               @thiskey NVARCHAR(131)   
      
      SELECT @keys = Index_col(@objname,@indid,1),
             @i = 2
      
IF (Indexkey_property(@objid,@indid,1,'isdescending') = 1)
 SELECT @keys = @keys + '(-)'
   
 SELECT @thiskey = Index_col(@objname,@indid,@i)
      
IF ((@thiskey IS NOT NULL)
 AND (Indexkey_property(@objid,@indid,@i,'isdescending') = 1))
 SELECT @thiskey = @thiskey + '(-)'
      
WHILE (@thiskey IS NOT NULL)
 BEGIN
 SELECT @keys = @keys + ', ' + @thiskey,
        @i = @i + 1
     
 SELECT @thiskey = Index_col(@objname,@indid,@i)
          
 IF ((@thiskey IS NOT NULL)
 AND (Indexkey_property(@objid,@indid,@i,'isdescending') = 1))
   SELECT @thiskey = @thiskey + '(-)'
        END
      
      -- Second, we'll figure out 
      -- what the included columns are.

      SELECT @inc_Count = Count(* )
      FROM   sys.tables AS tbl
             INNER JOIN sys.indexes AS si
               ON (si.index_id > 0
                   AND si.is_hypothetical = 0)
                  AND (si.object_id = tbl.object_id)
             INNER JOIN sys.index_columns AS ic
               ON (ic.column_id > 0
                   AND (ic.key_ordinal > 0
                         OR ic.partition_ordinal = 0
                         OR ic.is_included_column != 0))
                  AND (ic.index_id = Cast(si.index_id AS INT)
                       AND ic.object_id = si.object_id)
             INNER JOIN sys.columns AS clmns
               ON clmns.object_id = ic.object_id
                  AND clmns.column_id = ic.column_id
      WHERE  ic.is_included_column = 1
             AND (si.index_id = @indid)
             AND (tbl.object_id = @objid)
      
      -- ADD Set included columns 
      -- to NULL to script of Kimberly 

      -- otherwise an index without 
      --
an include column 
      -- will get the first include 
      -- of the previous index

      SET @inc_columns = NULL
      
      IF @inc_Count > 0
        -- Add begin to script of Kimberly
        BEGIN
          INSERT #includedcolumns
          SELECT Row_number()
                   OVER(ORDER BY clmns.column_id),
                 clmns.name
          FROM   sys.tables AS tbl
           INNER JOIN sys.indexes AS si
            ON (si.index_id > 0
                AND si.is_hypothetical = 0)
                AND (si.object_id = tbl.object_id)
           INNER JOIN sys.index_columns AS ic
            ON (ic.column_id > 0
                AND (ic.key_ordinal > 0
                OR ic.partition_ordinal = 0
                OR ic.is_included_column != 0))
                AND (ic.index_id = Cast(si.index_id AS INT)
                AND ic.object_id = si.object_id)
           INNER JOIN sys.columns AS clmns
            ON clmns.object_id = ic.object_id
                AND clmns.column_id = ic.column_id
   WHERE  ic.is_included_column = 1
         AND (si.index_id = @indid)
         AND (tbl.object_id = @objid)
          
  SELECT @inc_columns = [Name]
  FROM   #includedcolumns
  WHERE  rownumber = 1
      
  SET @loop_inc_Count = 1
      
  WHILE @loop_inc_Count < @inc_Count
     BEGIN
     SELECT @inc_columns = @inc_columns + ', ' + [Name]
     FROM   #includedcolumns
     WHERE  rownumber = @loop_inc_Count + 1
            
     SET @loop_inc_Count = @loop_inc_Count + 1
    END
 -- Add END to script of Kimberly
 END
      
      SELECT @groupname = NULL
      
      SELECT @groupname = name
      FROM   sys.data_spaces
      WHERE  data_space_id = @groupid
      
      -- INSERT ROW FOR INDEX
      INSERT INTO #spindtab
      VALUES     (@indname,
                  @indid,
                  @ignore_dup_key,
                  @is_unique,
                  @is_hypothetical,
                  @is_primary_key,
                  @is_unique_key,
                  @auto_created,
                  @no_recompute,
                  @groupname,
                  @keys,
                  @filter_definition,
                  @inc_Count,
                  @inc_columns)
      
      -- Next index
      FETCH  ms_crs_ind
      INTO @indid,
           @groupid,
           @indname,
           @ignore_dup_key,
           @is_unique,
           @is_hypothetical,
           @is_primary_key,
           @is_unique_key,
           @auto_created,
           @no_recompute,
           @filter_definition
    END
  
  DEALLOCATE ms_crs_ind
  
  -- DISPLAY THE RESULTS
  SELECT 'index_name' = index_name,
    'index_description' = Convert(VARCHAR(210), 
 
   CASE WHEN index_id = 1 THEN 'clustered' 
      ELSE
 'nonclustered' END +
    CASE WHEN ignore_dup_key <> 0 
      THEN
 ', ignore duplicate keys' 
      ELSE
 '' END +
    CASE WHEN is_unique <> 0 
      THEN
 ', unique' 
      ELSE
 '' END +
    CASE WHEN is_hypothetical <> 0 
      THEN
 ', hypothetical' 
      ELSE
 '' END +
    CASE WHEN is_primary_key <> 0 
      THEN
 ', primary key' 
      ELSE
 '' END +
    CASE WHEN is_unique_key <> 0 
      THEN
 ', unique key' 
      ELSE
 '' END +
    CASE WHEN auto_created <> 0 
      THEN
 ', auto create' 
      ELSE
 '' END +
    CASE WHEN no_recompute <> 0 
      THEN
 ', stats no recompute' 
      ELSE
 '' END + ' located on ' + groupname),
     'index_keys' = index_keys,
     --'num_included_columns' = inc_Count,
     'included_columns' = inc_columns,
     'filter_definition' = filter_definition
  FROM     #spindtab
  ORDER BY index_name
  
  RETURN (0) -- sp_helpindexwinc2

Thursday, May 7, 2009

How to find the bottleneck in your application with a SQL Profiler trace file?

Introduction

Performance bottlenecks can have different causes. Every cause can be recognized on its own way and will have its own solution. To solve your performance issue in your application you need to know on which part you need to focus. We have 4 different types of performance bottlenecks. This blog will describe what kind of performance bottlenecks we can have in applications and how we can recognize them. If you find situations as described in this document, it is not said that this is wrong. It will help you to analyze very critical what is happening in your application and maybe you can rewrite a part of your application to avoid specific behavior. It is like rebuilding your house, it will cost you a lot of headache but at the end you’re happy with the result.

Types of performance bottlenecks

1) Long running queries
  • Incomplete WHERE clauses. For instance an MRP record can be recognized by column1 = 'MRP' and Column2 = ‘B’. So both fields should be specified in the WHERE clause. In that case indexes created for MRP records will be used by the optimizer. In SQL Profiler you will see queries with a big amount of reads. Amount of reads depends on your database size. In a database smaller than 5 GB 100.000 reads is high. For bigger databases higher values are acceptable however 1 million reads are always big and should be analyzed.


  • Wrong data types in the query. For instance Column Periode in table Transactions is of type INT. Query syntax should be Transactions.Periode = 5 instead of Transactions.Periode = ‘ 5’ First syntax is correct. When wrong datatypes are used SQL server need to convert this data to the types which are used in the datamodel. This will cost CPU resources and not efficient query plans. Always double check the data type of a field with next command SP_HELP Transactions


  • Unneccassry JOINS. Check for every JOIN if it is used in the SELECT or WHERE part of your query. If this only applies for the SELECT part of your query check if this column is used for a calculation or displayed to the user. If not remove this part of the query.

    SELECT TOP 100 TR.FinYear, TR.Periode, TR.Journalnr, TR.Description, TR.Amount
    FROM Transactions TR
    INNER JOIN Journals JR ON JR.Journalnr = TR.Journalnr
    WHERE TR.FinYear = 2009


  • Queries which retrieve a lot of data without using it. For instance retrieve 50 columns: SELECT Column1, Columns2, Column3.....,Column 50 FROM Transactions. but show only 6 columns to the user. In this situation 44 columns of data are not used. Do you buy all newspapers in the store and select at home which one your are going to read and throw all other newspapers in the bin? Data transfer of 50 columns will takes more time than data transfer of 6 columns. When profiling the Amount of reads in SQL Profiler the profiled values can be the same because SQL already has done a bookmark lookup to retrieve the data of the 6 columns. However the physical data transferred to the user is much lower. This can be measured with the Query Analyzer of SQL 2005. (Include Client Statistics SHIFT+ALT+S) After completing the query an extra tab Client Statistics is presented.

    Trial 1 (SELECT TOP 100 * FROM Transactions with all records already in memory) The 50 milliseconds were used to transfer all data to the client. The time needed for SQL server is the difference between Total execution time and the client processing time. In this case 50 - 50 = 0 milliseconds. (Not measurable)

    Trial 2 (SELECT TOP 1000 * FROM Transactions with NOT all records in memory). The 1101 milliseconds where used to transfer all data to the user. Only 10 milliseconds where used by SQL Server to retrieve the requested data from the database. (792Kb)

    Trial 3 (select top 1000 Finyear, Periode, Journalnr, Description, Amount from Transactions) 5 colomns of data will now only take 94Kb bytes of data
  • More tips can be found on my blog: Guidelines to write well performing queries



2) Big amount of (recurring) queries.


  • Every time the same query. For instance retrieve the default currency in every order line. One time when starting a new order is enough. Every query which can be avoided should be avoided.
    After you have executed your application stop the SQL profile trace and save this trace to a file. Open this trace in SQL profiler and order the traces lines via the organize columns. Group on the column Textdata.



  • All recurring queries will be ordered together. In next example you see that exec efwGetOrkrg ‘ 5555’ is executed 4 times with one second.



  • A lot of queries with almost the same syntax. For instance

UPDATE Transactions
SET SYSMODIFIED = '2006-03-10 09:00:00.000'
WHERE ID = 21436

UPDATE Transactions
SET SYSMODIFIED = '2006-03-10 09:00:00.001'
WHERE ID = 21437

UPDATE Transactions
SET SYSMODIFIED = '2006-03-10 09:00:00.002'
WHERE ID = 21438

These queries can be combined in one query:
UPDATE Transactions

SET SYSMODIFIED = '2006-03-10 09:00:00.001'
WHERE ID IN (21436,21437,21438)
This will save a lot of round trips from your application to the server. What is faster: Buying one box on beer with 24 bottles with your bike or buying 24 times with a Ferrari one bottle of beer?

3) High resource usage at the client side. (CPU and or memory)



  • Application can retrieve a lot of data from the database and do a lot off calculations with it to present a result to the user. (High CPU usage) Try to rebuild your queries so SQL server can do these calculations. This will save the transfer of a lot of data from the server to the client.


  • Memory leaks. Memory used by the application is growing and only a part of the memory is released. After a while the application will crash. Use Windows Performance monitor to see if this happens.


  • High memory usage. For instance more than 250 Mb. On a client with a total memory of 512 Mb this will result in disk swapping. Because the operating itself and other open applications (like Outlook, Internet Explorer etc) need more than 262 Mb. If disk swapping occurs the performance will drop enormously. Reading from disk is 10.000 times slower than reading from memory.


4) Time gaps between the end time of query X and the start time of query X+1.
In this period SQL server is not busy with your application. At the client site, your application is doing something. For instance building the spread for an entry application. Debug your application to look why this takes so much time. See the end of this blog for a script to find the time gaps between queries.

The tools I use to check my application?

There are 2 tools I uses to identify your bottleneck.
a. SQL Profiler, to analyze all queries which are executed.
b. Windows Performance monitor, to analyze the resource usage at the client and server side.

Both tools should be used at the same time.

SQL Profiler: Create a trace file with a filter on your hostname, so you get a trace file with only the queries which are executed by your application.


Performance monitor: With the performance monitor it is easy to see which resources are used on client and server side. To have a clear view between client side and server do not use SQL Server and Globe on one machine. Use SQL server on a separate machine. Performance monitor can be started via START, RUN, Perfmon. Add next counters to the performance monitor for the client computer:

Performance object: Processor Counter: % Processor Time Instance: Total
Process Page File Bytes (Your application)
Process Working Set (Your application)

To be able to select your application in the performance monitor the application should be started otherwise it can’t be selected.

For the server side add next counters to the performance monitor:
Performance object: Processor Counter: % Processor Time Instance: Total
Physical Disk Counter:
Avg. Disk Read Queue Length Instance: (Your database)
Avg. Disk Write Queue Length (Your database)

To be able to select performance counters of a machine you should be a member of the local administrators group on that machine. If so you can add counters of both SQL server and Globe client computer in one performance monitor.

Start the performance monitor and the SQL profiler as described and start your application which you want to improve.
After profiling your application. Stop the trace and save it. Reload the trace file and add filter to your trace. For instance on CPU to find CPU intensive queries, on Reads to find queries which retrieve a lot of data.
To find the time gaps between queries store your trace file in a SQL table.
Use next script to retrieve statistics about your traced application:

DECLARE @AnalyzeTime VARCHAR(20)
SET @AnalyzeTime = '2009-05-07 11:16:06'
-- Profile statistics of application
SELECT SUM(DATEDIFF(ms, A.Starttime,A.Endtime)) AS SUMQueryTime,
SUM(DATEDIFF(ms, B.Endtime,A.Starttime)) AS SUMTimeDiffPrevQuery
,DATEDIFF(ms, MIN(A.Starttime),Max(A.Endtime)) AS TimeSpend, COunt(*) As NumberOfQueries
FROM [Profilehome2] A
INNER JOIN [Profilehome2] B ON A.Rownumber = B.Rownumber + 1
WHERE A.TEXTDATA IS NOT NULL
AND A.starttime > @AnalyzeTime
Use next script to find the time gaps between queries:

DECLARE @AnalyzeTime VARCHAR(20)
SET @AnalyzeTime = '2008-09-01 11:16:06'
-- Time difference between previous query
SELECT A.Textdata,DATEDIFF(ms, B.Endtime,A.Starttime) AS TimeDiffPrevQuery,A.Starttime, A.Endtime
FROM [Profilehome2] A
INNER JOIN [Profilehome2] B ON A.Rownumber = B.Rownumber + 1
Where A.starttime > @AnalyzeTime

Sunday, May 3, 2009

Performance improvements with filtered indexes in SQL 2008

In SQL 2008 filtered indexes are introduced. A filtered index allows us to create a filter to index a subset of rows within a table. A filtered index will:
  1. Improve query performance. Statistics are more accurate which can result in better query plans. 
  2. Reduce index maintenance costs. An index is only maintained when the data in the index is changed.
  3. Reduce index storage costs.
Filtered indexes can be very useful if you have one of the following situations:
  1. Columns which contains mostly NULL values and where queries retrieve only the rows where the data is NOT NULL. 
  2. When you want to enforce uniqueness for a set of values. With a filtered index you can create a unique index for the set of values. For instance
    CREATE UNIQUE NONCLUSTERED INDEX CustomerID_NOTNULL ON Accounts (CustomerID) WHERE CustomerID IS NOT NULL 
  3. When analyzing data on a specific data range. For instance you have a table with records for the last 4 years (2006,2007,2008,2009). You are analyzing only the year 2009. A filtered index will look like
    CREATE INDEX YEAR2009 ON Transactions (Year,Period, Amount) WHERE Year = 2009
To see the difference in index sizes on a table create a normal index and a filtered index. For instance:
  1. CREATE INDEX YEAR ON Transactions (Year,Period, Amount)
  2. CREATE INDEX YEAR2009 ON Transactions (Year,Period, Amount) WHERE Year = 2009
Use next script to see the difference in index size:


DECLARE @TableName VARCHAR(100)

SET @TableName = 'fill in the tablename'

SELECT db.index_id,si.name, IndexSizeKB = SUM(page_count * 8)
FROM sys.dm_db_index_physical_stats(db_id(),object_id(@TableName),NULL,NULL,'DETAILED') db
INNER JOIN sys.indexes SI ON SI.index_id = db.index_id AND SI.object_id = object_id(@TableName)
GROUP BY db.index_id ,si.name
ORDER BY db.index_id desc,si.name