SQL Server: Diagnosing Intermittent Connectivity Issues

Common causes of intermittent SQL Server connectivity issues:

  • Queries/transactions holding long-running exclusive locks on tables. Queries waiting on these locks may be timing out.
  • A faulty switch, network card, cable or other network infrastructure.
  • Another device on the network with the same IP address.
  • Incorrect connection protocol being used.
  • Virus scanners and incorrect exclusion lists.
  • Network/Port security scanners.
  • VM snapshot backups.
  • An under-powered SQL Server, transaction log (TLog) backups over an under-powered network connection, or large TLog file growth settings.
  • SQL Server under extreme memory pressure.
  • An application doing slow row-by-row processing while SQL Server is sending the data fast; this often shows up as a high value for ASYNC_NETWORK_IO waits.
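
If you see these symptoms, a quick first check is the cumulative wait statistics since the last restart. The query below is only a rough sketch: it looks at a few of the wait types mentioned above (lock waits, ASYNC_NETWORK_IO and WRITELOG), and the numbers are cumulative, so compare snapshots taken before and after a problem period.

SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    avg_wait_ms = wait_time_ms * 1.0 / NULLIF(waiting_tasks_count, 0)
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('LCK_M_X', 'LCK_M_S', 'ASYNC_NETWORK_IO', 'WRITELOG')
ORDER BY wait_time_ms DESC;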

The obvious question is “What changed?”!

When different parts of the system are managed by different people, the culprit can be a seemingly innocuous change that is not well documented or communicated to the various stakeholders.

Is a server alias being used?

Have dynamic ports been enabled for the TCP/IP protocol?

See How to Configure a SQL Server Alias for a Named Instance on a Development Machine
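
If you want to confirm which transport and port a session is actually using (for example, while testing from the affected application server), sys.dm_exec_connections exposes that; local_tcp_port will be NULL for Shared Memory or Named Pipes connections:

SELECT session_id, net_transport, local_tcp_port
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;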

Is the SQL Server a named instance?

If you don’t specify the protocol in the connection string, then it will use the protocol specified for the database driver. The protocol order, by default, is Shared Memory, TCP/IP and then Named Pipes.

To list connections and the protocol they are using:

SELECT * FROM sys.dm_exec_connections

--WHERE session_id = @@SPID;

To specify which protocol to use in a connection string, add tcp: before the server name:

var connectionString = @"Data Source=tcp:Server\Instance; …";

Is IPv6 being used?

Was the server upgraded?

If so, there might be queries that are timing out due to optimiser/cardinality estimator changes.
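
If you suspect plan regressions from the new cardinality estimator, one way to test (a sketch only; YourDb and compatibility level 110 are placeholders) is to temporarily revert the estimator or the compatibility level and see whether the timeouts stop:

-- Check the current compatibility level
SELECT name, compatibility_level FROM sys.databases WHERE name = 'YourDb';

-- Option 1: keep the new compatibility level but use the legacy cardinality estimator
-- (SQL Server 2016 and later; run this in the context of the affected database)
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

-- Option 2: set the database back to the pre-upgrade compatibility level (e.g. 110 = SQL Server 2012)
ALTER DATABASE YourDb SET COMPATIBILITY_LEVEL = 110;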

Are all applications experiencing intermittent connection problems?

If you have a monitoring tool connected to the SQL Server, is it able to connect even when the problem is occurring?

Are there any errors in the SQL Server or application server logs?

If all application servers lose network connectivity regularly at the same time, it might be a hardware issue such as a faulty switch or network card.

Does it happen to all application servers?

If you have several application or web servers, are they all affected?

[As an aside, if you’ve only got one, you might want to consider setting up another for load balancing and troubleshooting.]

Are all queries in the application affected, or just some queries?

Some queries/transactions may be holding long-running exclusive locks on tables. Queries waiting on these locks might be timing out.
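
To catch this while it is happening, a minimal sketch that lists blocked sessions, who is blocking them, and what they are waiting on:

SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,              -- milliseconds waited so far
    t.text AS sql_text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;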

Do timeouts occur at regular days/times?

Could it be regularly scheduled port security scans, VM backups, or large transaction log file growths? Use a fixed transaction log file growth setting (128MB is commonly used) rather than a percentage.
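
To check the current growth settings and switch a log file to a fixed increment (YourDb and YourDb_log are placeholder names; use the logical file name reported by sys.master_files):

-- is_percent_growth = 1 means the file grows by a percentage of its current size
SELECT DB_NAME(database_id) AS database_name, name, type_desc, growth, is_percent_growth
FROM sys.master_files
WHERE type_desc = 'LOG';

ALTER DATABASE YourDb
MODIFY FILE (NAME = YourDb_log, FILEGROWTH = 128MB);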

During the timeouts, is the application server able to ping the SQL Server?

Set up a script to ping the SQL Server every 10-15 seconds. If the application server cannot ping the SQL Server the next time the application has query timeouts, then the problem is in the network rather than in SQL Server.

Faster I/O Subsystem Causing More WRITELOG Waits?

Paul Randal explains how WRITELOG waits can increase with a faster I/O subsystem:

I was contacted last week by someone who was asking how are faster disks causing more WRITELOG waits. They were seeing lots of these waits, with an average wait time of 18ms. The log was stored on a Raid-1 array, using locally-attached spinning disks in the server. They figured that by moving the log to Raid-1 array of SSDs, they’d reduce the WRITELOG wait time and get better workload throughput.

They did so and got better performance, but were very surprised to now see WRITELOG as the most frequent wait type on the server, even though the average wait time was less than 1ms, and asked me to explain…
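
To tell whether log writes are genuinely slow or WRITELOG is simply frequent but cheap, one rough check (a sketch, not taken from Paul's post) is the average write latency against each transaction log file:

SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.name                  AS log_file,
    vfs.num_of_writes,
    avg_write_ms = vfs.io_stall_write_ms * 1.0 / NULLIF(vfs.num_of_writes, 0)
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf
    ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
WHERE mf.type_desc = 'LOG'
ORDER BY avg_write_ms DESC;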

Azure SQL DB Storage Bottleneck

If you are using Azure SQL Databases, you should definitely read this post by Brent Ozar: There’s a bottleneck in Azure SQL DB storage throughput. The bottom line: the transaction log throughput currently appears to bottleneck at 16 cores!

The bit where he compares AWS costs/relative performance is also an eye opener:

  • 8 cores, $1,991 per month: 64 minutes
  • 16 cores, $3,555 per month: 32 minutes (and interestingly, it’s the same speed with zone redundancy enabled)
  • 80 cores, $18,299 per month: 32 minutes
  • Just for reference: 8-core AWS EC2 i3.2xl VM, $1,424 per month with SQL Server Standard Edition licensing: 2 minutes (and I don’t put that in to tout AWS, I just happen to have most of my lab VMs there, so it was a quick comparison)

TransactionScope Default Isolation Level

When you create a new TransactionScope, you might be surprised to find that the default isolation level is Serializable.

TransactionScope’s default constructor defaults the isolation level to Serializable and the timeout to 1 minute. In SQL Server, SERIALIZABLE transactions are rarely useful and prone to deadlocks.

If you are using a TransactionScope and getting deadlocks in your application (and hopefully you have a retry mechanism for deadlock error 1205), it might be due to the Serializable isolation level.

Instead of using the default constructor:

using (var scope = new TransactionScope())
{
    // .... Do Stuff ....

    scope.Complete();
}

Use this:

using (var scope = TransactionUtils.CreateTransactionScope())
{
    // .... Do Stuff ....

    scope.Complete();
}

.....

using System.Transactions;

public class TransactionUtils
{
    public static TransactionScope CreateTransactionScope()
    {
        var transactionOptions = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.MaximumTimeout
        };

        return new TransactionScope(TransactionScopeOption.Required, transactionOptions);
    }
}

[The other gotcha relates to SQL Server 2014 and below: TransactionScope and Connection Pooling]

SQL Agent Jobs Starting at the Same Time

The following query will list SQL Agent jobs that are due to start execution at the same time:

;with cteNextRunDatetime as
(
    SELECT job_id, next_scheduled_run_date = max(next_scheduled_run_date)
    FROM msdb.dbo.sysjobactivity 
    WHERE next_scheduled_run_date IS NOT NULL
    GROUP BY job_id 
)
, cteSimultaneousJobsDates as
(
    SELECT next_scheduled_run_date 
    FROM cteNextRunDatetime
    GROUP BY next_scheduled_run_date 
    HAVING COUNT(*) > 1
)
SELECT 
    sj.name, 
    sj.description, 
    c.next_scheduled_run_date
FROM 
    msdb.dbo.sysjobs sj
    JOIN cteNextRunDatetime sja ON sj.job_id = sja.job_id
    JOIN cteSimultaneousJobsDates c on c.next_scheduled_run_date = sja.next_scheduled_run_date
WHERE
    sj.enabled = 1
ORDER BY 
    sja.next_scheduled_run_date

When Did My Azure SQL Database Server Restart?

Getting the server restart time for an on-premises SQL Server is simple; in fact, there are several ways, using sys.dm_os_sys_info, sys.dm_exec_sessions, sys.traces, or sys.databases.
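
For example, on a full SQL Server instance, either of these works:

SELECT sqlserver_start_time FROM sys.dm_os_sys_info;

-- tempdb is recreated at every restart, so its create_date is the last restart time
SELECT create_date FROM sys.databases WHERE name = 'tempdb';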

In an Azure SQL Database, you don’t get access to those system objects.

Brent Ozar posted a way to get the approximate Azure SQL Database restart date/time, but I found that some of the wait types can produce large outliers and skew the result.

Instead, I’ve modified it to use a standard statistics technique: reject outlier values that fall outside 1.5 times the interquartile range:

;with cte as
(
    SELECT wait_time_ms 
    FROM sys.dm_os_wait_stats w with(nolock)
    WHERE wait_type IN 
    (
        'BROKER_TASK_STOP',
        'DIRTY_PAGE_POLL',
        'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
        'LAZYWRITER_SLEEP',
        'LOGMGR_QUEUE',
        'REQUEST_FOR_DEADLOCK_SEARCH',
        'XE_DISPATCHER_WAIT',
        'XE_TIMER_EVENT'
    )
)
select 
    approx_ms_since_restart = AVG(wait_time_ms), 
    approximate_restart_date = DATEADD(s, AVG(-wait_time_ms)/1000, GETDATE())
from 
cte
cross join
     (select 
         q1 = min(wait_time_ms), 
         q3 = max(wait_time_ms), 
         iqr = max(wait_time_ms) - min(wait_time_ms)
      from (select 
               wait_time_ms,
               row_number() over (order by wait_time_ms) as seqnum,
               count(*) over (partition by null) as total
            from cte
           ) t
      where seqnum = cast(total * 0.25 as int) or seqnum = cast(total * 0.75 as int)
     ) qts
 where (wait_time_ms >= q1 - 1.5 * iqr) AND (wait_time_ms <= q3 + 1.5 * iqr)

This tells me my Azure server restarted on the 21st Sept 2018 around 23:46.

It’s not as complicated as it first looks:

The first part is obviously the same as Brent Ozar’s query: it gets a list of waits and their respective cumulative wait times. We generate a row number ordered by wait time, along with the total number of rows, and then pick two values using the row numbers at positions one quarter and three quarters of the way along that list (the first and third quartile values). We then reject any value smaller than the first quartile minus 1.5 times the interquartile range, and any value larger than the third quartile plus 1.5 times the interquartile range.

Note: DATEADD() only accepts an integer for its second parameter, and using milliseconds (ms) overflows for even short periods of server uptime. In fact, since it’s approximate, it might be better to use:

approximate_restart_date = DATEADD(minute, AVG(-wait_time_ms)/60000, GETDATE())

SQL Server 2019 …

If anyone needed a single concrete reason to upgrade to SQL Server 2019 (when it’s released), ignoring all the new features, here it is:

2628 – String or binary data would be truncated in table ‘%.*ls’, column ‘%.*ls’. Truncated value: ‘%.*ls’.

That’s right! After something like 25+ years, Microsoft are finally fixing the cursed “STRING OR BINARY DATA WOULD BE TRUNCATED”!

Seen via Brent Ozar’s post: What’s New in SQL Server 2019’s sys.messages: More Unannounced Features