SQL Server Availability Groups: Add ‘Check if Primary’ Step to Existing Agent Jobs

I’ve recently been helping a client set up and configure SQL Server Always On Availability Groups. In addition to deploying your SQL Server Agent jobs to all the secondary replicas, you also have to implement a mechanism to make the jobs aware of which node they are running on (the primary or a secondary replica). You would think this would be available out of the box, right? I guess there are so many permutations (especially when you have multiple availability groups on the same server) that it was left to end users to implement.

When SQL Server completes a primary <=> secondary replica role change it raises error 1480, and you can create an alert on this error to notify you of the role change and act upon it.

You have 2 choices:

  1. Respond to the role change alert (error 1480) and run a task to disable the jobs related to the databases in the availability group that has failed over (a sketch of wiring up the alert follows this list).
  2. Add a new first step to every SQL Agent job which checks whether it is running on the primary replica. If it is, move on to the next step and continue executing the job; if it is running on a secondary replica, stop the job.
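
For option 1, here’s a minimal sketch of creating the alert with sp_add_alert and attaching a notification. The operator name ‘DBA Team’ is an assumption, substitute your own; you can also attach a response job via the @job_name parameter:

-- Create an alert on error 1480 (AG role change) and notify an operator
-- Assumes an operator named 'DBA Team' already exists in msdb
EXEC msdb.dbo.sp_add_alert
      @name = N'AG Role Change'
    , @message_id = 1480
    , @severity = 0
    , @enabled = 1
    , @include_event_description_in = 1;  -- include the event text in the notification

EXEC msdb.dbo.sp_add_notification
      @alert_name = N'AG Role Change'
    , @operator_name = N'DBA Team'
    , @notification_method = 1;           -- 1 = email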

To check whether the primary replica of a given AG database is currently hosted on a particular server, you can check based either on the name of the database or on the name of the availability group the database is in (a database can only be in one availability group).

My scenario had a single availability group, with not all databases on the server in the AG (a common scenario). After a couple of unsatisfactory attempts at option 1, I decided to opt for option 2.

Not wanting to make this change manually for every agent job, I’ve created a stored procedure that adds a new first step to any agent job to check if the job is running on the primary (sp_add_jobstep inserts at the given @step_id and renumbers the existing steps). If the job is not on the primary, it silently quits with success. There’s also a short snippet of TSQL to run the proc for every SQL Agent job.

use master
go

-- Adds a first step to specified job, which checks whether running on Primary replica

create procedure AddAGPrimaryCheckStepToAgentJob
    @jobname nvarchar(128)
as

set nocount on;

-- Do nothing if Always On Availability Groups are not enabled on this instance
IF SERVERPROPERTY ('IsHadrEnabled') = 1
begin
    declare @jobid uniqueidentifier = (select sj.job_id from msdb.dbo.sysjobs sj where sj.name = @jobname)

    if not exists(select * from msdb.dbo.sysjobsteps where job_id = @jobid and step_name = 'Check If AG Primary' )
    begin
        -- Add new first step: on success go to next step, on failure quit reporting success
        exec msdb.dbo.sp_add_jobstep 
          @job_id = @jobid
        , @step_id = 1
        , @cmdexec_success_code = 0
        , @step_name = 'Check If AG Primary' 
        , @on_success_action = 3  -- On success, go to Next Step
        , @on_success_step_id = 2
        , @on_fail_action = 1     -- On failure, Quit with Success  
        , @on_fail_step_id = 0
        , @retry_attempts = 0
        , @retry_interval = 0
        , @os_run_priority = 0
        , @subsystem = N'TSQL' 
        , @command=N'IF (SELECT ars.role_desc
        FROM sys.dm_hadr_availability_replica_states ars
        JOIN sys.availability_groups ag ON ars.group_id = ag.group_id
        WHERE ars.is_local = 1) <> ''PRIMARY''  -- assumes a single AG on this instance, so one local row
    BEGIN
       -- Secondary replica: severity must be >= 11 for the step to register as failed
       raiserror (''Not the AG primary'', 11, 1)
    END'
        , @database_name=N'master'
        , @flags=0
    end
end
GO
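
Usage for a single job (the job name here is hypothetical):

exec AddAGPrimaryCheckStepToAgentJob @jobname = N'Nightly ETL Load'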

-------------------

-- Run AddAGPrimaryCheckStepToAgentJob for each agent job

DECLARE @jobName NVARCHAR(128)

DECLARE jobCursor CURSOR LOCAL FAST_FORWARD
FOR
    SELECT j.name FROM msdb.dbo.sysjobs j
    --WHERE ??? -- filter out any jobs here

OPEN jobCursor 
FETCH NEXT FROM jobCursor INTO @jobName

WHILE @@FETCH_STATUS = 0
BEGIN
    exec AddAGPrimaryCheckStepToAgentJob @jobName

    FETCH NEXT FROM jobCursor INTO @jobName
END

CLOSE jobCursor
DEALLOCATE jobCursor
GO

----------------------

-- Removes the first job step 'Check If AG Primary', should you want to undo the change made by the snippet above.

DECLARE @jobName NVARCHAR(128)

DECLARE jobCursor CURSOR LOCAL FAST_FORWARD
FOR
    SELECT j.name FROM msdb.dbo.sysjobs j
    join msdb.dbo.sysjobsteps js on js.job_id = j.job_id
    where js.step_name = 'Check If AG Primary' and js.step_id = 1

OPEN jobCursor 
FETCH NEXT FROM jobCursor INTO @jobName

WHILE @@FETCH_STATUS = 0
BEGIN
    
    EXEC msdb.dbo.sp_delete_jobstep  
        @job_name = @jobName,  
        @step_id = 1 ;  
    
    FETCH NEXT FROM jobCursor INTO @jobName
END

CLOSE jobCursor
DEALLOCATE jobCursor
GO

Note: From SQL Server 2014 onwards you can use the built-in function sys.fn_hadr_is_primary_replica('dbname'):

IF sys.fn_hadr_is_primary_replica(@dbname) <> 1
BEGIN
    -- This is not the primary replica; exit without error
    RETURN
END
-- This is the primary replica; continue to run the job...

C#: Simple Read/Write Application Settings

I needed a simple C# class for saving writable application settings. My first attempt was to use Project -> Properties -> Settings, but only user-scoped settings are writable. Here’s a simple solution: you will need to add Newtonsoft’s Json.NET (NuGet package Json.NET) to your project, if you are not already using it.

using System.IO;
using Newtonsoft.Json;
...

// Generic base class: T is the derived settings type itself
public class AppSettingsBase<T> where T : new()
{
    private const string DefaultSettingsFilename = "settings.json";

    // Serialize this instance to JSON and write it to disk
    public void Save(string fileName = DefaultSettingsFilename)
    {
        File.WriteAllText(fileName, JsonConvert.SerializeObject(this));
    }

    public static void Save(T settings, string fileName = DefaultSettingsFilename)
    {
        File.WriteAllText(fileName, JsonConvert.SerializeObject(settings));
    }

    // Load settings from disk, or return a default instance if no file exists yet
    public static T Load(string fileName = DefaultSettingsFilename)
    {
        return File.Exists(fileName)
            ? JsonConvert.DeserializeObject<T>(File.ReadAllText(fileName))
            : new T();
    }
}

Add your application-defined settings to your settings class, derived from the generic base class (note that AppSettings passes itself as the type parameter, the curiously recurring template pattern):

public class AppSettings : AppSettingsBase<AppSettings>
{
    public Guid ClientId = Guid.Empty;
    public List<string> ServerList = new List<string>();
    public List<string> DatabasesToExclude = new List<string>();

    public bool IsAutoUpdate = true;
    public int AutoUpdaterCheckIntervalHours = 6; 
    public DateTime LastUpdateCheckTime = DateTime.MinValue;

    ...
}

Using your settings is simple:

    AppSettings settings = AppSettings.Load();

    settings.LastUpdateCheckTime = DateTime.UtcNow;
    settings.Save();

Create a Single .NET Executable with Multiple Assemblies

If you want to create a single .NET executable there are several possibilities. In the past I’ve used Jeffrey Richter’s technique of embedding DLLs as resources and loading them into memory from the manifest resource stream at startup. While this works well, it does require a bit of extra code to avoid touching anything in an embedded assembly until after it has been loaded at startup.

In the interest of cleaner code, I decided to give ILMerge a go.

It’s simple to install ILMerge into your application using NuGet:

PM> Install-Package ilmerge -Version 2.14.1208

To run ILMerge on your release builds, add an AfterBuild target to your project’s .csproj file:

  <Target Name="AfterBuild" Condition=" '(Configuration)' == 'Release' ">     <CreateItem Include="@(ReferenceCopyLocalPaths)" Condition="'%(Extension)'=='.dll'">       <Output ItemName="AssembliesToMerge" TaskParameter="Include" />     </CreateItem>     <PropertyGroup>       <ReferenceAssemblies>C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.5.2</ReferenceAssemblies>     </PropertyGroup>     <Message Importance="high" Text="Executing ILMerge...with target platform from(ReferenceAssemblies)" />
    <Exec Command="&quot;(SolutionDir)\packages\ILMerge.2.14.1208\tools\ILMerge.exe" /target:exe /out:@(MainAssembly) /internalize /targetplatform:v4,"(ReferenceAssemblies)&quot; &quot;@(IntermediateAssembly)&quot; @(AssembliesToMerge->'&quot;%(FullPath)&quot;', ' ')" />
    <Delete Files="@(ReferenceCopyLocalPaths->'$(OutDir)%(DestinationSubDirectory)%(Filename)%(Extension)')" />
  </Target>

Note: If you receive this error from ILMerge when building your release executable:

An exception occurred during merging:
ILMerge.Merge: There were errors reported in AWSSDK.Core’s metadata.
Array dimensions exceeded supported range.
at ILMerging.ILMerge.Merge()
at ILMerging.ILMerge.Main(String[] args)

It’s because ILMerge doesn’t currently support the new portable PDB format.
https://github.com/Microsoft/ILMerge/issues/29
https://github.com/Microsoft/ILMerge/issues/11

If you can live without the .pdb files being merged into a single PDB file, add /ndebug to the Exec command above. This is a temporary workaround until ILMerge can process the new portable .NET PDB format.

An excellent resource on PDB files is John Robbins’s PDB Files: What Every Developer Must Know:

A .NET PDB only contains two pieces of information, the source file names and their lines and the local variable names. All the other information is already in the .NET metadata so there is no need to duplicate the same information in a PDB file.

How to send large amounts of data securely

RSA public/private key cryptography is limited in the amount of data it can encrypt. With the commonly used PKCS#1 v1.5 padding and a 2048-bit RSA key, the maximum size of data that can be encrypted is 245 bytes (the 256-byte modulus less 11 bytes of padding overhead).

If you want to encrypt and securely send more data than that you need to use a combination of asymmetric and symmetric encryption algorithms.

In practice, RSA is used to exchange an encrypted symmetric key between the communicating endpoints, and that key is then used to symmetrically encrypt/decrypt the large data.

Encryption by sender:

  1. Generate a cryptographically strong random key, K, of the length required for a symmetric encryption algorithm such as Rijndael (maximum key size 256 bits).
  2. Symmetrically encrypt your data with Rijndael, using the random key generated in step 1.
  3. Using RSA, asymmetrically encrypt the random key generated in step 1 with the public part of the recipient’s RSA key.
  4. Send the RSA-encrypted random key AND the encrypted data from steps 2 and 3 to the recipient (together with the symmetric algorithm’s IV, which does not need to be kept secret).

Decryption by recipient:

  1. Decrypt the encrypted key using your private RSA key.
  2. Decrypt the original data using the RSA-decrypted symmetric key from step 1.
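
Here’s a minimal C# sketch of that flow. The HybridCrypto class and its method shapes are my own, not a standard API; it uses the Aes class (Rijndael with a fixed 128-bit block size) for the symmetric part:

using System.Security.Cryptography;

// Minimal sketch of hybrid encryption: a random AES-256 key encrypts the data,
// and the recipient's RSA public key encrypts that AES key.
public static class HybridCrypto
{
    // Sender: returns the ciphertext; outputs the RSA-encrypted AES key and the IV,
    // both of which must travel with the message (the IV need not be secret)
    public static byte[] Encrypt(byte[] data, string recipientPublicKeyXml,
                                 out byte[] encryptedKey, out byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var rsa = new RSACryptoServiceProvider())
        {
            aes.KeySize = 256;                 // a random key and IV are generated for us
            rsa.PersistKeyInCsp = false;
            rsa.FromXmlString(recipientPublicKeyXml);

            encryptedKey = rsa.Encrypt(aes.Key, false);  // false = PKCS#1 v1.5 padding
            iv = aes.IV;

            using (var encryptor = aes.CreateEncryptor())
            {
                return encryptor.TransformFinalBlock(data, 0, data.Length);
            }
        }
    }

    // Recipient: recover the AES key with the RSA private key, then decrypt the data
    public static byte[] Decrypt(byte[] cipherText, byte[] encryptedKey, byte[] iv,
                                 string privateKeyXml)
    {
        using (var aes = Aes.Create())
        using (var rsa = new RSACryptoServiceProvider())
        {
            rsa.PersistKeyInCsp = false;
            rsa.FromXmlString(privateKeyXml);

            aes.Key = rsa.Decrypt(encryptedKey, false);  // recover the symmetric key
            aes.IV = iv;

            using (var decryptor = aes.CreateDecryptor())
            {
                return decryptor.TransformFinalBlock(cipherText, 0, cipherText.Length);
            }
        }
    }
}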

Here’s how to generate a 2048-bit public/private key pair in C#:

    // Note: the file name variables are assumed context; substitute your own paths
    string publicKey, privateKey;

    using (var rsaProvider = new RSACryptoServiceProvider(2048))
    {
        rsaProvider.PersistKeyInCsp = false;

        // Export public key to file
        publicKey = rsaProvider.ToXmlString(false);
        using (var publicKeyFile = File.CreateText(publicKeyFileName))
        {
            publicKeyFile.Write(publicKey);
        }

        // Export private/public key pair to file
        privateKey = rsaProvider.ToXmlString(true);
        using (var privateKeyFile = File.CreateText(privateKeyFileName))
        {
            privateKeyFile.Write(privateKey);
        }
    }

The Fall of RNN / LSTM

A good article on “hierarchical neural attention encoders”, the next evolution in neural network design:

Then in the following years (2015–16) came ResNet and Attention. One could then better understand that LSTM were a clever bypass technique. Also attention showed that MLP network could be replaced by averaging networks influenced by a context vector.

SQL Server Unindexed Foreign Keys

I saw this, DMV To List Foreign Keys With No Index, via Brent Ozar’s weekly links email.

Unindexed foreign key columns might not be captured by the sys.dm_db_missing_index_details DMV because of their relatively small size. A lack of indexes on foreign keys might have only a small performance impact during reads, but it can lead to lock escalations during heavy write loads, causing excessive blocking and possibly deadlocks.

I’ve updated the originally posted query to generate the TSQL to create the missing indexes (which you should compare to the existing index landscape, to see if any indexes can be consolidated, before running it in).

[Note: if you are unfortunate enough to have spaces in your table/column names, you’ll need to replace them with an underscore ‘_’ (or another character) in the index name.]

;with cte_fk as 
( 
    select   
        fk_table_schema = OBJECT_SCHEMA_NAME(fk.parent_object_id),
        fk_table = OBJECT_NAME(fk.parent_object_id),
        fk_column = c.name,
        fk_name   = fk.name,
        fk_has_index = CASE WHEN i.object_id IS NOT NULL THEN 1 ELSE 0 END,
        is_fk_a_pk_also = i.is_primary_key,
        is_index_on_fk_unique = i.is_unique,
        index_def = 'create index NC_' + OBJECT_NAME(fk.parent_object_id) + '_' + c.name + 
           ' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id)) + '.' + QUOTENAME(OBJECT_NAME(fk.parent_object_id)) + '(' + QUOTENAME(c.name) + ')',
        pk_table_schema = OBJECT_SCHEMA_NAME(fk.referenced_object_id),
        pk_table = OBJECT_NAME(fk.referenced_object_id),
        pk_column = c2.name,
        pk_index_name = kc.name,
        fk.*
    from     
        sys.foreign_keys fk
        join sys.foreign_key_columns fkc ON fkc.constraint_object_id = fk.object_id
        join sys.columns c ON c.object_id = fk.parent_object_id AND c.column_id = fkc.parent_column_id
        left join sys.columns c2 ON c2.object_id = fk.referenced_object_id AND c2.column_id = fkc.referenced_column_id
        left join sys.key_constraints kc ON kc.parent_object_id = fk.referenced_object_id AND kc.type = 'PK'
        left join sys.index_columns ic ON ic.object_id = c.object_id AND ic.column_id = c.column_id
        left join sys.indexes i ON i.object_id = ic.object_id AND i.index_id = ic.index_id
)
select  
    * 
from    
    cte_fk c
    left join sys.dm_db_partition_stats ps on ps.object_id = c.parent_object_id and ps.index_id <= 1
where   
    fk_has_index = 0 
    -- and fk_table = 'mytablename'
order by 
    used_page_count desc
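
For a hypothetical unindexed foreign key column Orders.CustomerId, the generated index_def column would contain something like:

create index NC_Orders_CustomerId ON [dbo].[Orders]([CustomerId])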

Some TSQL Anti-patterns: ‘DISTINCT’ column list, Scalar valued Functions…

I’ve been looking through some code that I last worked on over a year ago. Since then, several people have contributed to the project. The first thing I noticed was a sprinkling of DISTINCT added to SELECT queries where it was originally totally unnecessary! Paul White puts it better than I could:

“SELECT DISTINCT is sometimes an indication that someone has ‘fixed’ a query that returns duplicate rows in a naive way. This (mis-)use is probably more common among those with relatively little database experience.” https://dba.stackexchange.com/questions/139687/when-to-use-distinct

If you’re a dev or DBA and you’re about to fix a problem by adding DISTINCT to a query, ask yourself “Am I really fixing the real problem?”. And if DISTINCT really is the answer, an equivalent GROUP BY might be more efficient (see the example below).
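
As a trivial illustration (the table and column names are hypothetical), these two queries return the same result, but the GROUP BY form can give the optimizer more rewrite options, such as aggregating before the join:

-- De-duplicating with DISTINCT
SELECT DISTINCT c.CustomerId, c.CustomerName
FROM dbo.Customers c
JOIN dbo.Orders o ON o.CustomerId = c.CustomerId;

-- Equivalent GROUP BY form
SELECT c.CustomerId, c.CustomerName
FROM dbo.Customers c
JOIN dbo.Orders o ON o.CustomerId = c.CustomerId
GROUP BY c.CustomerId, c.CustomerName;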

The next thing I noticed was the addition of scalar-valued functions, unsurprisingly causing less than stellar performance. An experienced DBA (or developer) should be fully aware of the performance implications of scalar-valued functions, and should avoid them where possible: they execute once per row, and they prevent parallelism in the query plan.
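
A hypothetical example (the function and table names are made up): the scalar UDF below is evaluated once per row, whereas the set-based rewrite lets the optimizer perform a single join and aggregate:

-- Scalar UDF: looks harmless, but executes per row
CREATE FUNCTION dbo.fn_OrderTotal (@OrderId int)
RETURNS money
AS
BEGIN
    RETURN (SELECT SUM(Quantity * Price) FROM dbo.OrderLines WHERE OrderId = @OrderId);
END
GO

-- Called once per row of dbo.Orders: slow
SELECT o.OrderId, dbo.fn_OrderTotal(o.OrderId) AS Total
FROM dbo.Orders o;

-- Set-based equivalent: the same logic inlined as a join and aggregate
SELECT o.OrderId, SUM(ol.Quantity * ol.Price) AS Total
FROM dbo.Orders o
JOIN dbo.OrderLines ol ON ol.OrderId = o.OrderId
GROUP BY o.OrderId;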


SQL Server 2012 Error: No catalog entry found for partition ID xxx in database N

If you are running SQL Server 2012 and you see this error:

DESCRIPTION: No catalog entry found for partition ID xxx in database 2.
The metadata is inconsistent. Run DBCC CHECKDB to check for a metadata
corruption.

a quick check reveals that database 2 is tempdb:

SELECT name FROM sys.databases WHERE database_id = 2;

If you run a DBCC CHECKDB against tempDB:

DBCC CHECKDB ('tempdb') WITH NO_INFOMSGS, TABLERESULTS;

it returns no results, indicating there are no issues with tempdb.

This is a known issue with a fix:

FIX: “No catalog entry found for partition ID in database” error when you use SQL Server 2012

Assume that you query the tempdb.sys.allocation_units table in Microsoft SQL Server 2012. When you use the NOLOCK hint in the query, or the query is under the READ UNCOMMITTED transaction isolation level, you receive the following intermittent 608 error message:

Error: 608 Severity: 16 State: 1
No catalog entry found for partition in database. The metadata is inconsistent. Run DBCC CHECKDB to check for a metadata corruption

Note The DBCC CHECKDB command does not show any sign of database corruption.

Fixed in: