Keep It Simple and Fast: query

Showing posts with label query. Show all posts

Sunday, June 14, 2009

How to debug your SQL server query?

Have you ever wondered why your query is not doing as what you expected?
Have you ever add extra lines in your query to print parameter values, so you better understand what is happening?

If you answer one of these questions with No, you are a really good query builder or you never build queries. If you answer one of these questions with Yes, this blog post can be something for you.

Within SQL 2008 a debug option is introduced in the SQL Server Management Studio (SSMS).

It is the green arrow. In previous versions of SSMS, this green button is the execute button.
To demonstrate the debug option I will use a simple query:

DECLARE @Rownumber INT = 0

WHILE @Rownumber < 10
BEGIN
SELECT @@SERVERNAME
END

This query will never end. This is because @Rownumber does not change.
Now we will start debugging this query.

Set your cursor on SELECT @@SERVERNAME and
Pres F9 to set a breakpoint. A red circle will appear.
Press ALT F5. In your 'locals' window the @Rownumber will appear without any value.
Press ALT F5. Debugging is started and will stop on the breakpoint. The @Rownumber has now a value of 0. This is what you expect.

Press ALT F5 to continue. The servername will be returned.
Press ALT F5 to continue. Another time the servername will be returned. However the value of @Rownumber is still on zero. This explains why the query never stops. The @Rownumber will never get 10.
Stop debugging with SHIFT F5
Add next statement SET @Rownumber = @Rownumber + 1
So we get next script:

DECLARE @Rownumber INT = 0

WHILE @Rownumber < 10
BEGIN
SELECT @@SERVERNAME
SET @Rownumber = @Rownumber + 1
END
Set breakpoint on the same line: SELECT @@SERVERNAME
Start debugging by pressing ALT F5 a few times. You will see that the value of @Rownumber is growing and the query will stop when the value reaches 9.

Enjoy this feature when you do not understand what your query is doing. Unfortunately, this debug feature is only available in SQL Server 2008. Maybe another reason to upgrade to SQL Server 2008.

Friday, March 6, 2009

What is the syntax of the query which is locking data?

Image Amagill

In live SQL environments queries can be locked by other queries. In the tables master..sysprocesses you can see by which SPID is blocking your query:
SELECT Blocked FROM Master..Sysprocesses WHERE Blocked <> Spid and Blocked <> 0

However it is nice to know which SPID is blocking, but it more interesting to know what the syntax is of this blocking query. With next script you can retrieve the query syntax of the Spid which is locking the data.

SET NOCOUNT ON

DECLARE DB CURSOR FOR -- Select all SPID's which creates a lock
SELECT SP.blocked, SB.Name
FROM Master..Sysprocesses SP
INNER JOIN Master..Sysdatabases SB ON SP.DBID = SB.DBID
WHERE SP.blocked <> 0 AND SP.blocked <> SP.spid

OPEN DB
DECLARE @LOCKSPID Nvarchar (10); -- Spid of locking transaction
DECLARE @DBNAME Nvarchar(256); -- Dbname in which a locked transaction is.
DECLARE @Handle Nvarchar(256);
DECLARE @SQL nvarchar(4000); -- Generated SQL Script which will be executed against all databases

FETCH NEXT FROM DB INTO @LOCKSPID,@DBNAME
WHILE @@FETCH_STATUS = 0
BEGIN

-- Generate SQL script for the first database in paramater @SQL
SET @Handle = '(Select sql_handle FROM Master..Sysprocesses WHERE Spid = ' + @LOCKSPID +')'
SET @SQL = 'SELECT Text FROM ::fn_get_sql('+ @Handle +')'
EXEC SP_EXECUTESQL @SQL -- Execute the generated script.

FETCH NEXT FROM DB INTO @LOCKSPID,@DBNAME -- Generate and execute script for the next locked query

END
CLOSE DB
DEALLOCATE DB

Previous script will work on SQL 2005 and SQL 2008. For SQL 2000 a hotfix is available.

Wednesday, February 4, 2009

Guidelines to write well performing queries.

During my daily work I have done a lot of performance analyzes at customer sites. One part of a performance analyze is analyzing resource intensive queries on the database. For instance CPU usage, disk I/O. Sometimes these queries can be improved by rewriting the query without losing functionality. This blog contains the most common mistakes I have seen in bad performing queries.

Image credit: Dave Ward

Avoid use of SELECT * in SQL queries. Retrieve only the columns you need. This will avoid bookmark lookups and will result in less data traffic from SQL server to client. Do you buy everything in the supermarket and at home throw away what you do not need?
Example:
SELECT ResourceID, FullName
FROM Resources

Instead of
SELECT *
FROM Resources
Create only result sets when necessary. This will avoid unnecessary data retrieval. If you want to know if something exists it is not interesting to know how much time it exists.
Example:
IF EXISTS (SELECT 1 FROM Resources WHERE RESID > 1005)

Instead of
IF EXISTS (SELECT COUNT(*) FROM Resources WHERE RESID > 1005)

Look to the execution plan of your query. It should contain an Index Seek instead of an Index Scan or Table Scan.
Use parameterized queries for queries which will be used a lot. This will save unnecessary query plan generations and the Procedure cache is used more efficient.
Example:
SELECT FullName, PhoneNumber
FROM Resources
WHERE RESID=@P1

Instead of
SELECT FullName, PhoneNumber
FROM Resources
WHERE RESID=1005
Write query with correct data type. Check data type of a column via sp_help <tablename>
Example:Column ID Datatype INT
SELECT Fullname
FROM Resources
WHERE RESID = 1005

Instead of
SELECT Fullname
FROM Resources
WHERE RESID = '1005'
Avoid SQL roundtrips. Combine queries into one query. Buying a bottle of beer and chips bag at once is always faster than first buying a bottle of beer and go home and then go back to the supermarket to buy a chips bag.
Example:
Query 1: SELECT FullName FROM Resources WHERE ID = 1005
Query 2: SELECT PhoneNumber FROM Resources WHERE ID = 1005

Combined Query: SELECT FullName, PhoneNumber FROM Resources WHERE ID = 1005
Remove JOINS which are not part of the SELECT or WHERE clause.

SELECT A.Column2, A.Column3
FROM Table1 A
INNER JOIN Table2 B ON A.Column1 = B.Column5
WHERE A.Column4 = 20

Remove INNER JOIN Table2 B ON A.Column1 = B.Column5

New query:
SELECT A.Column2, A.Column3
FROM Table1 A
WHERE A.Column4 = 20
JOIN a table on columns which are part of an index. In most cases one of the 2 columns is the primary key column.

SELECT C.Name
FROM Activities A
INNER JOIN Contacts C ON C.ID A.ContactID
WHERE A.TYPE = 10010002

C.ID is the primary key column of table Contacts.
A.ContactID is part of an index on table Activities.
Avoid a lot of OR statements in the WHERE CLAUSE. Use a UNION. Because of the 3 different columns in the WHERE clause the clustered index will be used.

Example: Table with Clustered index on ID and 3 indexes on: City, Type and the last one on Birthday. Next query will result in a clustered table scan.
SELECT FullName
FROM Resources
WHERE City = 'Delft'
OR Type = 1
OR BirthDay > '2000-01-01'

Next query will use the 3 indexes:
SELECT FullName
FROM Resources
WHERE City = 'Delft'
UNION ALL
SELECT FullName
FROM Resources
WHERE Type = 1
UNION ALL
SELECT FullName
FROM Resources
WHERE BirthDay > '2000-01-01'
Avoid ORDER BY columns which does not change the ORDER BY. This is unnecessary overhead.
Example: Tabel Resources with Primary key on ID and Index 1 on RESID and Index 2 on FullName
SELECT FullName FROM Resources
WHERE City = 'Delft'
ORDER BY RESID, FullName

Remove the FullName in the ORDER BY. This will avoid an Index Scan on index FullName and a Hash Match
Configure your SQL Server management Studio to connect Read Uncommitted to the database. By default SQL Server management Studio will connect Read Committed. By connecting Read Uncommitted, you will avoid that you lock colleagues or will be locked by other colleagues. In SQL Server management Studio select: Tools, Options, Query Execution, SQL Server, Advanced, SET TRANSACTION ISOLATION LEVEL: Read Uncommitted).
Test your queries on a database with enough content in it. Performance of a query on a table with only 10 records is always fast. It will only guarantee that the syntax of the query is correct. You can't make any conclusions of this query regarding performance. If your application should perform on a table with 1 million records. Test your queries on a table with 1 million records.
Avoid using temporary tables. Creation of temporary tables can cost a lot of disk I/O.
Use SET NOCOUNT ON at the beginning of SQL batches, Stored Procedures and Triggers.
Define on every table a primary key. If you do not configure, SQL Server will do it in the background. This is additional overhead.
Incomplete or over complete WHERE CLAUSES. Read the database documentation of the application to understand the business logic. Knowing the business logic you can define the correct WHERE clause.
UPDATE multiple records in one query. For instance one update statement for 10 order lines:

UPDATE OrderLines
SET Modified = '2009-02-02 10:00:00.002'
WHERE ID in(14235,14236,14237,14238,14239,14240,14241,14242,14243,14244)

Instead of
UPDATE OrderLines
SET Modified = '2009-02-02 10:00:00.002'
WHERE ID = 14235

UPDATE OrderLines
SET Modified = '2009-02-02 10:00:00.003'
WHERE ID = 14236
.
.
.
UPDATE OrderLines
SET Modified = '2009-02-02 10:00:00.013'
WHERE ID = 14244

Wednesday, December 24, 2008

Identifying the performance bottleneck is already 50% of the solution.

In my previous blog post I'm in favor to visit customers as a product manager. You get useful feedback how customers are using your product. This week I did, as Principal Research Engineering, a consult to find the bottleneck in the Globe environment of a customer. It was an interesting consult with useful feedback. I will summarize some experiences:

SQL 2000 (X86) was installed on a Windows Server 2003 (X64). The server has 8 GB memory. However only 2 GB was used by the 32 bits version of SQL Server. It is possible to allocate more memory for SQL Server (X86) but only on a 32 bits operating system. To use all memory on Windows Server 2003 (X64), install an X64 version of SQL Server. After installing SQL 2005 (X64) all server memory was used. The more memory SQL server can use, the better is the performance.
An application took a while to display the information. With SQL profiler, one query was identified as the bottleneck. Two suggestions to improve the query will be investigated. Hopefully are these suggestions possible. The development organization is currently analyzing these suggestions.
The Intrastat application resulted almost always in a query time out. Also in this case, SQL Profiler was used to identify the slow query. The query itself was correct however in the query plan; a clustered index scan was executed on the biggest transaction table GBKMUT. After creating an index, the application executed within 2 minutes. This index will be added to the index structure of GBKMUT for our logistic Globe customers.

Conclusion: Identifying the performance bottleneck is already 50% of the solution. In this case, a wrong SQL Server installation, a query which can be improved and an improvement of the standard index structure of Globe. This consult resulted in a win-win situation. For the customer, the performance of his Globe environment is improved. For Exact, our customers can benefit of the improvements in the application and index structure.

Keep It Simple and Fast

Sunday, June 14, 2009

How to debug your SQL server query?

Friday, March 6, 2009

What is the syntax of the query which is locking data?

Wednesday, February 4, 2009

Guidelines to write well performing queries.

Wednesday, December 24, 2008

Identifying the performance bottleneck is already 50% of the solution.

About Me

Search This Blog

Popular Posts

Blog Archive

Followers of my blog

Blogs I Read

My Places