Showing posts with label SQL performance. Show all posts
Showing posts with label SQL performance. Show all posts

Tuesday, July 20, 2010

How to retrieve numbers of records in a table with high performance without index or tablescans?


Photo Credit: dodge challenger1

The most common way to retrieve the number of records in table is done with a query like:
select count(*) from tableX where column1 = 'value'

Using this kind of queries will result in a (clustered) index scan. On tables with millions of records this can still take a while to execute and can result in unnecessary overhead on the SQL server. SQL server stores a lot of statistics of your tables. These statistics are used by the optimizer to generate the most efficient queryplan. You can use these statistics to retrieve the number of records in your database. This is always faster then to query the table itself.
SQL server uses his own mechanisme to update the statistics, so it can happen that there is a small mismatch between the number of records in a table and the number of records stored in the statistics. For big tables this is not always so important. For instance to display the growth of your tables.

You can use next query to retrieve the number of records per tables via the statistics:

SELECT sysobjects.name,rowcnt
FROM sysobjects
INNER JOIN sysindexes ON sysobjects.id=sysindexes.id
WHERE xtype='U' AND indid IN (0,1)
AND sysobjects.name = 'Your table'

To check for the number of records for a column with a specific value use next query. This query wil return the number of records per column value. Available columns are columns which exists as the first column in an index. Example you have a table: X with the column: Type. The column type is used as the first column in an index. Next query will return all existing values for the column type with the number of records.

DECLARE @TableName Nvarchar(200)
SET @TableName = 'Your Table'

DECLARE ColumnStatistics CURSOR FOR
SELECT Tbl.name AS TableName, SSID.name as IndexName ,
         INDEX_COL( tbl.[name], idk.indid, 1 ) AS IndexColumn
FROM SYSINDEXkeys idk
INNER JOIN SYSOBJECTS tbl ON idk.[id] = tbl.[id]
INNER JOIN SYSINDEXES SSID on SSID.[ID] = TBL.[id] and ssid.indid = IDK.indid
WHERE tbl.ID = (SELECT ID FROM SYSOBJECTS WHERE name = @Tablename)
           AND keyno = 1

OPEN ColumnStatistics

DECLARE @IndexName Nvarchar(200)
DECLARE @IndexColumn Nvarchar(200)

FETCH NEXT FROM ColumnStatistics INTO @TableName,@IndexName, @IndexColumn
WHILE @@FETCH_STATUS = 0
BEGIN

CREATE TABLE #TempTable (ColumnValue nvarchar(200),INT1 INT,EQ_ROWS INT,
    INT3 INT,INT4 INT,TableName nvarchar(200))
INSERT INTO #TempTable (ColumnValue, int1,EQ_ROWS,int3,int4)
EXEC ('dbcc show_statistics(' +@TableName+','+@IndexName+') with histogram')

SELECT @TableName AS Tablename,@IndexColumn AS IndexColumn,ColumnValue, EQ_ROWS AS Records

FROM #Temptable
DROP TABLE #TempTable

FETCH NEXT FROM ColumnStatistics INTO @TableName,@IndexName, @IndexColumn
END
CLOSE ColumnStatistics
DEALLOCATE ColumnStatistics

How to check if and when the statistics are updated?

execute next query:
EXEC sp_autostats 'Your table'



Check for the column AUTOSTATS. The value should be set to ON.
Check for the column Last Updated to see when the statistics for the index is updated for the last time.


Enjoy using the statistics.

Tuesday, July 14, 2009

Book review: SQL Server 2008 Internals

One of the topics I'm interested in is performance of a database. Based on this topic I read the book SQL Server 2008 Internals. I enjoyed to read this book. It is easy to read and usefull for people with and without knowledge of previous versions of SQL Server. After reading this book it will help you to better understand the performance of your SQL Server configuration and databases. I can imagine that everybody has it's own favorite topics when reading this book. This is the list of my  favorite topics:

  • SQL Server Resource Governor
  • Indexes: Internals and Management
    • Tools for analyzing indexes.
    • Data modification Internals
    • Managing index structures
  • Special Storage
    • Sparse columns
    • Data compression
  • The query optimizer
    • Optimizer architecture
    • Statistics, Cardinality Estimation and Costing
    • Index selection
    • Plan Hinting
  • Plan Caching and Recompilation
    • Cache mechanisme
    • Plan cache internals.
  • DBCC Internals.
I think it is a usefull investment of only 52 Euro (60 US Dollar). It is a must to have for every DBA'er who wants to deliver the best performance of his SQL server environment to his users. This also applies to software engineers who want to build well performing applications on SQL Server.

Happy reading.