Tuesday, June 5, 2012

SQL Azure connection termination causes.


Microsoft SQL Azure database is a clud based relational database service that's hosted and maintained by Microsoft. Multiple databases from different customers are hosted on the same server. To garantuee performance for all these different customers, the SQL server should not come in an overload state. SQL Azure has the ability to monitor and rebalance active and online user databases. To achieve this, SQL Azure continuously gathers and analyze database usage statistics and will terminate connections when necessary. Throttling is a mechanism used by SQL Azure to prevent machines from becoming overloaded and unresponsive. Throttling can be divided in:
  1. Soft throttling. This applies to the database consuming the most resources on the box. Soft throttling happens when a physical machine seems to be on the way of beig overloaded, unless its workload is reduced.
  2. Hard throttling. This is the final stage of throttling. It happens when the machine is critically impacted due to overload. It terminates existing operations and prevents new ones until the metric returns below expected thershold.

This blog will summarize all connection terminations which can happen:
  • Worker threads. When soft throttling limit for worker threads on a machine is exceeded, the database with the highest requests per second is throttled. Existing connections to that database are terminated and new connections to the database are denied, until number of workers drops below soft limit
  • Database size. When the database space allocatted to user db is full, the user gets a db full error.
  • Physical database space. When total database size on a machine exceeds 90% of total space available on machine, all databases become read-only. Load balancer ensures the situation is resolved by balancing databases across machines.
  • Log bytes used. SQL Azure supports transactions generating log of up to 2 GB in size.
  • Transaction log length. Uncommitted transactions can block the truncation of log files. To prevent this, the distance from the oldest active transaction log sequence number (LSN) to the tail of the log (current LSN) cannot exceed 20% of the size of the log file. When violated, the offending transaction is terminated and rolled back so that the log can be truncated.
  • Transaction Lock Count. Sessions consuming greater than one million locks are terminated.
  • Blocking System tasks. Transactions request locks on resources like rows, pages, or tables, on which the transaction is dependent and then free the locks when they no longer have a dependency on the locked resources. Due to these locks, some transactions might block resources required by system sessions. If a transaction locks a resource required by an underlying system operation for more than 20 seconds, it is terminated. In addition, any transaction that runs for more than 24 hours is terminated.
  • Tempdb usage. When a session uses more than 5 GB of tempdb space, the session is terminated.
  • Memory use. When there are sessions waiting on memory for 20 seconds or more, sessions consuming greater than 16 MB for more than 20 seconds are terminated in the descending order of time the resource has been held, so that the oldest session is terminated first. Termination of sessions stops as soon as the required memory becomes available.
  • Too many requests. If number of concurrent requests made to a database exceed 400, all transactions that have been running for 1 minute or more are terminated.
  • Idle connections. Connections to SQL Azure database that are idle for 30 minutes or longer will be terminated.
  • Denial of service. When there are a high number of login failures from a particular source internet protocol (IP) address, SQL Azure will block the connections from that IP address for a period of time. The connection is terminated and no error is returned.
  • Fail over issues. SQL Azure is flexible to cope with any variations in usage and load. The service replicates multiple redundant copies of data to multiple physical servers to maintain data availability and business continuity. In case of a hardware failure, SQL Azure provides automatic failover to optimize availability for your application. Currently, some failover actions may result in an abrupt termination of a session.
More details about all these termination causes and returned error messages can be found here.