The only way to capture timeouts is via SQL Trace – you need to know what was running/executing at the time of the timeout and SQL Trace is the tool to capture this. Do not confuse SQL Trace with SQL Profiler – SQL Trace is the collection of stored procedures that are used by profiler for capturing events. SQL Profiler is a Client Side Trace – SQL Trace is a Server Side Trace. Never, ever, run an extended duration trace using SQL Profiler. Most modern servers will handle it but for a busy server it can be enough to cause a blue screen and a server restart. Doing this is what I call “DBA’s behaving badly!”
Setup a server-side trace by using SQL Profiler starting with a blank template and add the following items…
Errors and Warnings:
*Attention – this is the actual timeout event. This only tells you a timeout has occurred. You need the following and associate this event with what was running when the timeout occurred.
Stored Procedures:
*RPC:Starting
*SP:StmtStarting
TSQL:
*SQL:BatchStarting
*SQL:StmtStarting
Normally we would use the completed objects but remember with a timeout they don’t complete so we want the starting objects. If you want to filter the trace for a particular database then add this in now. Click Run and then Stop the trace. Go Export > Script Trace Definition and save the script to a file. Using SSMS, open the script you just created and connect to the server concerned. Edit the parameters for @maxfilesize and the sp_trace_create command. I would use the file rollover option to keep a predetermined number of files that you can overwrite on a daily basis without the files getting too large and also consuming all your disk space. Once running, it is possible to query the live trace file by using one of the built-in SQL functions… fn_trace_gettable()
You pass in the filename and then the number of files to read, or DEFAULT to read all files. e.g. SELECT * FROM fn_trace_gettable(‘D:\MSSQL\Trace\MyTraceFile.trc’, DEFAULT)
The EventID for a timeout is 16 – this can be found by querying sys.trace_events
After the trace has been running a while, run a query to see if there have been any timeouts. For example…
SELECT *
FROM fn_trace_gettable('D:\MSSQL\Trace\MyTraceFile.trc', DEFAULT)
WHERE EventClass = 16
if you get a result set back, note the times and correlate that with a particular file range so that the can query just that file and run something like the following…
SELECT b.* FROM
FN_TRACE_GETTABLE('E:\MSSQL\TimeoutTrace_30.trc', 1) AS a
INNER JOIN FN_TRACE_GETTABLE('E:\MSSQL\TimeoutTrace_30.trc', 1) AS b ON a.SPID = b.SPID
WHERE a.EventClass = 16
AND b.StartTime > DATEADD(ss, -30, a.StartTime)
AND b.StartTime <= a.StartTime
What is happening here is we are performing a self join on one of the trace files. In the first result set is the Attention event (16) and joining on the SPID of that event to the same file getting all the events in the last 30 seconds on the same SPID. This should reveal what command was running when the timeout occurred. You can use this information to diagnose further what the problem is but at least we now what was running at the time 