Configurable Retry Logic in Microsoft.Data.SqlClient

Microsoft has recently released a long-awaited retry mechanism for the .NET SqlClient.

I’m a fan of Polly for retry logic:

Polly is a library that allows developers to express resilience and transient fault handling policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and Fallback in a fluent and thread-safe manner.

It will be interesting to see how they compare in terms of ease of use.

Configurable retry logic in SqlClient introduction
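For anyone new to the idea, here is a minimal sketch of what a retry policy does, written in Python purely for illustration. It is not the SqlClient or Polly API; the function name `run_with_retry` and its parameters are my own assumptions.

```python
import time


def run_with_retry(operation, max_attempts=3, base_delay=0.5,
                   is_transient=lambda exc: True, sleep=time.sleep):
    """Call operation(); on a transient failure, wait and try again.

    Hypothetical helper for illustration only, not a real library API.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt, or if the error is not transient.
            if attempt == max_attempts or not is_transient(exc):
                raise
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A real policy would normally pass an `is_transient` check that only retries errors known to be temporary (timeouts, dropped connections), rather than retrying everything as the default here does.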

Windows File Splitter

Unlike Linux systems, Windows doesn’t have a built-in command line file splitter. Splitting large files into smaller chunks is something you often want to do with data warehouses (such as Snowflake), in order to be able to use multiple threads for better bulk loading performance.

I saw a post by Greg Low, “SDU_FileSplit – Free utility for splitting CSV and other text files in Windows”, in which he had created one for Windows, but it wasn’t open source.

I decided to create an open-source file splitter for Windows.

It’s a standalone .NET 5.0 executable that supports wildcards, can recurse sub-folders (careful with that option!) and can automatically gzip-compress output files.

Example use:

FileSplitter.exe -i c:\temp\*.csv -m 100000 -c -o c:\temp\TestFileSplitter

This splits every file with the extension .csv in the folder c:\temp into files of at most 100,000 lines each, storing the output files in the folder c:\temp\TestFileSplitter.
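The core idea is simple enough to sketch in a few lines. The following Python is only an illustration of line-based splitting with optional gzip output, not the actual FileSplitter source; the function name `split_file` and the `_0001` style output naming are assumptions of mine.

```python
import gzip
from pathlib import Path


def split_file(source, out_dir, max_lines, compress=False):
    """Split a text file into numbered parts of at most max_lines lines each.

    Illustrative sketch only; the output naming scheme is hypothetical.
    """
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    opener = gzip.open if compress else open
    suffix = source.suffix + (".gz" if compress else "")
    parts, out = [], None
    # newline="" keeps the original line endings untouched.
    with open(source, "rt", encoding="utf-8", newline="") as src:
        for index, line in enumerate(src):
            if index % max_lines == 0:
                # Start a new part every max_lines lines.
                if out is not None:
                    out.close()
                part = out_dir / f"{source.stem}_{len(parts) + 1:04d}{suffix}"
                out = opener(part, "wt", encoding="utf-8", newline="")
                parts.append(part)
            out.write(line)
    if out is not None:
        out.close()
    return parts
```

Note that this sketch treats every line the same, so a CSV header row only ends up in the first part.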

Date and Time Dimension

Almost every fact table in a data warehouse uses a date (or calendar) dimension, because most measurements are defined at specific points in time. A flexible calendar date dimension is at the heart of most data warehouse systems; it provides easy navigation of a fact table through user-familiar dates, such as weeks, months, fiscal periods and special days (today, weekends, holidays etc.).

I’ve created a date dimension generator, available on GitHub.

It targets SQL Server, but should be easy to convert to other RDBMS.

It features:

  • User defined start and end dates
  • Computed Easter dates (for years 1901 to 2099)
  • Computed Chinese New Year dates (for years 1971 to 2099)
  • Computed public holidays for US, UK, Canada, Ireland, Malta, Philippines, Australia (with state specific for WA, NSW, QLD, SA, VIC).
  • Date labels in US, UK and ISO formats.
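As an example of what a "computed" holiday involves, here is a sketch of one well-known way to calculate Western (Gregorian) Easter Sunday, often called the anonymous Gregorian algorithm. It is written in Python for illustration and is not necessarily the method the generator itself uses; the function name `easter_sunday` is my own.

```python
from datetime import date


def easter_sunday(year):
    """Return the date of Western Easter Sunday for a Gregorian year.

    Sketch of the anonymous Gregorian algorithm; not taken from the generator.
    """
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)
```

Holidays that hang off Easter (Good Friday, Easter Monday) can then be derived by adding or subtracting days from the result.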

Things to Note:

  1. The [TodayFlag] needs to be updated once per day by a scheduled task (timezone dependent: might need a flag for each timezone).

  2. If you use an unusual Fiscal year (say 5-4-4), it will need to be loaded from an external source (such as an Excel/Google spreadsheet).

  3. The precise start date of the month of Ramadan is set by proclamation, so these dates need to be added year by year. It is possible to calculate the date, but the result can be a day out and can vary by region.

    https://travel.stackexchange.com/questions/46148/how-to-calculate-when-ramadan-finishes

    https://en.wikipedia.org/wiki/Ramadan_%28calendar_month%29

Babelfish for PostgreSQL

This has the capacity to be huge:

Babelfish for PostgreSQL is an Apache-2.0 open source project that adds a Microsoft SQL Server-compatible end-point to PostgreSQL to enable your PostgreSQL database to understand the SQL Server wire protocol and commonly used SQL Server commands. With Babelfish, applications that were originally built for SQL Server can work directly with PostgreSQL, with little to no code changes, and without changing database drivers.

Do You Name All Your SQL Server Database Constraints?

If you define a constraint without explicitly giving it a name, SQL Server will generate one for you.
You know the ones; they look something like this: PK__MY_TABLE__3213E83FA7739BB4.

Why might that be a bad thing? It makes writing deployment scripts harder because you won’t know up front the names of constraints you might want to refer to.

Michael J Swart describes a query to discover the system-generated names in your databases (shown here with a small modification):

SELECT 
    [Schema] = SCHEMA_NAME(o.schema_id),
    [System Generated Name] = OBJECT_NAME(o.object_id),
    [Parent Name] = OBJECT_NAME(o.parent_object_id),
    [Object Type] = o.type_desc
FROM 
    sys.objects o
    JOIN sys.sysconstraints c ON o.object_id = c.constid
WHERE 
    (c.status & 0x20000) > 0
    AND o.is_ms_shipped = 0

According to the sys.sysconstraints documentation page:

This SQL Server 2000 system table is included as a view for backward compatibility. We recommend that you use the current SQL Server system views instead. To find the equivalent system view or views, see Mapping System Tables to System Views (Transact-SQL). This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.

You can query the same information by using the individual views unioned together:


SELECT 
    [Schema] = SCHEMA_NAME(schema_id),
    [System Generated Name] = OBJECT_NAME(object_id),
    [Parent Name] = OBJECT_NAME(parent_object_id),
    [Object Type] = type_desc
FROM sys.check_constraints 
WHERE is_system_named = 1

UNION ALL

SELECT 
    [Schema] = SCHEMA_NAME(schema_id),
    [System Generated Name] = OBJECT_NAME(object_id),
    [Parent Name] = OBJECT_NAME(parent_object_id),
    [Object Type] = type_desc
FROM sys.default_constraints 
WHERE is_system_named = 1

UNION ALL

SELECT 
    [Schema] = SCHEMA_NAME(schema_id),
    [System Generated Name] = OBJECT_NAME(object_id),
    [Parent Name] = OBJECT_NAME(parent_object_id),
    [Object Type] = type_desc
FROM sys.key_constraints 
WHERE is_system_named = 1

UNION ALL

SELECT 
    [Schema] = SCHEMA_NAME(schema_id),
    [System Generated Name] = OBJECT_NAME(object_id),
    [Parent Name] = OBJECT_NAME(parent_object_id),
    [Object Type] = type_desc
FROM sys.foreign_keys  
WHERE is_system_named = 1

SQL Server Error Code 4815 Bulk Insert into Azure SQL Database

If you receive error code 4815 while doing a Bulk Insert into an Azure SQL Database (including SqlBulkCopy()), it’s likely you are trying to insert a string that is too long into a (n)varchar(x) column.

The unhelpful error message does not contain any mention of overflow, or the column name! Posting in the hope it will save someone some time.
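Since the error doesn’t tell you which column is at fault, one workaround is to check string lengths yourself before loading. Here is a rough Python sketch of that idea; the function `find_oversized_values` and the shape of its inputs (rows as dictionaries, plus a column-to-limit map you supply) are assumptions for illustration.

```python
def find_oversized_values(rows, max_lengths):
    """Yield (row_index, column, length, limit) for each string that is too long.

    rows: iterable of dicts mapping column name to value.
    max_lengths: dict mapping column name to its declared (n)varchar length.
    Hypothetical helper for illustration only.
    """
    for row_index, row in enumerate(rows):
        for column, limit in max_lengths.items():
            value = row.get(column)
            # Only strings can overflow; NULLs and other types are skipped.
            if isinstance(value, str) and len(value) > limit:
                yield row_index, column, len(value), limit
```

Bear in mind that `len()` counts characters, whereas SQL Server measures varchar lengths in bytes and nvarchar lengths in byte-pairs, so treat a clean result as a first pass rather than a guarantee.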

PostgreSQL: Find Users With Weak Passwords

A while back I wrote a short post that checks for SQL Server SQL logins with weak passwords. Here’s the equivalent for PostgreSQL (it only checks passwords stored with the MD5 hash algorithm at present):

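-- Run the whole script inside a single transaction:
-- ON COMMIT DROP removes the temporary table at commit.
CREATE TEMPORARY TABLE temp_CommonPasswords
(
	Password varchar(30) not null primary key
) 
ON COMMIT DROP;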

INSERT INTO temp_CommonPasswords(Password) VALUES 
(''),
('123'),
('1234'),
('12345'),
('123456'),
('1234567'),
('12345678'),
('123456789'),
('1234567890'),
('987654321'),
('123qwe'),
('mynoob'),
('18atcskd2w'),
('55555'),
('555555'),
('3rjs1la7qe'),
('google'),
('zxcvbnm'),
('000000'),
('1q2w3e'),
('1q2w3e4r5t'),
('1q2w3e4r'),
('qwerty'),
('qwerty123'),
('password'),
('p@ssword'),
('p@ssw0rd'),
('password1'),
('p@ssword1'),
('password123'),
('passw0rd'),
('111111'),
('1111111'),
('abc123'),
('666666'),
('7777777'),
('654321'),
('123123'),
('123321'),
('iloveyou'),
('admin'),
('nimda'),
('welcome'),
('welcome!'),
('!@#$%^&*'),
('aa123456'),
('lovely'),
('sunshine'),
('shadow'),
('princess'),
('solo'),
('football'),
('monkey'),
('Monkey'),
('charlie'),
('donald'),
('Donald'),
('dragon'),
('Dragon'),
('trustno1'),
('letmein'),
('whatever'),
('hello'),
('freedom'),
('master'),
('starwars'),
('qwertyuiop'),
('Qwertyuiop'),
('qazwsx'),
('corona'),
('woke'),
('batman'),
('superman'),
('login');


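-- Users whose password is in the common passwords list.
-- PostgreSQL stores MD5 passwords as 'md5' || md5(password || username).
SELECT 
	s.usename
FROM 
	pg_shadow s
	CROSS JOIN temp_CommonPasswords c
WHERE
	s.passwd = 'md5' || md5(c.Password || s.usename)

UNION ALL

-- Users whose password is the same as their user name.
SELECT 
	s.usename 
FROM 
	pg_shadow s
WHERE 
	s.passwd = 'md5' || md5(s.usename || s.usename);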
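The query works because PostgreSQL’s MD5 password format is the string 'md5' followed by the MD5 hex digest of the password concatenated with the user name. Here is the same check sketched in Python, which can be handy for testing a candidate list offline; the function names and the `stored_hashes` dictionary are hypothetical stand-ins for the contents of pg_shadow.

```python
import hashlib


def pg_md5_hash(password, username):
    """Build the hash PostgreSQL stores for an MD5 password."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


def find_weak_users(stored_hashes, candidates):
    """Return user names whose stored hash matches a candidate password.

    stored_hashes: dict mapping user name to stored hash (hypothetical input).
    candidates: iterable of common passwords to try.
    """
    weak = []
    for username, stored in stored_hashes.items():
        # The user name itself is tried as a password as well.
        for password in list(candidates) + [username]:
            if pg_md5_hash(password, username) == stored:
                weak.append(username)
                break
    return weak
```

Because the user name acts as a salt, each candidate has to be hashed once per user; you can’t precompute a single table of hashes.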

.NET Core Standalone Executable

.NET Core 1.0 came out on June 27, 2016. Four years later, and after who knows how many hundreds of thousands of person-hours of development, I figured it would be quite mature.

On that premise, feeling quite hopeful, I decided to see what’s involved in converting a .NET 4.7.1 standalone console application to .NET Core 3.1, which you’d think would be relatively straightforward.

Three hours later, my 5MB standalone console application has ballooned to 74MB! If you select ‘PublishTrimmed=true’, then the size drops to 44MB but then the application doesn’t work. Apparently, trimming is not able to work out what’s needed, even when reflection isn’t involved.

It turns out that even the untrimmed 74MB app still doesn’t work, as you can’t use the built-in encrypted connection strings section in the app.config file. (It hasn’t currently been implemented in .NET Core, along with DbProviderFactory and a few other surprises…)

I went looking for resources and other people’s experiences converting to .NET Core.

https://docs.microsoft.com/en-us/dotnet/core/porting/
https://docs.microsoft.com/en-us/dotnet/standard/analyzers/api-analyzer
https://github.com/hvanbakel/CsprojToVs2017
https://ianqvist.blogspot.com/2018/01/reducing-size-of-self-contained-net.html

Scott Hanselman gets really excited about making a 13MB+ “Hello world” .NET Core application. He even calls it tiny!! (And that’s after he got it down from 69MB.) His post starts out with the line “I’ve always been fascinated by making apps as small as possible, especially in the .NET space.” Irony, or what? In what kind of insane world is a “Hello World!” application 13MB!?!

On a tangential side note: I just ditched ILMerge for creating standalone executables. In the past I’ve used Jeffrey Richter’s technique of embedding assemblies in the resource manifest and adding a startup hook to load assemblies into the app domain at runtime, but like a FOOL, I thought that ILMerge was the ‘better’, more .NET way of doing things.

The amount of pain ILMerge has caused me over the last few years is staggering. It has to be one of the most fragile tools out there. If the planets aren’t aligned it spits the dummy. If there’s ever a problem it spits out an unhelpful cryptic “exited with error X” message. Good luck finding the problem!

Just moved over to using Fody/Costura; it uses that same technique of embedding assemblies in the executable.

It worked the very first time! Unlike ILMerge. As an added bonus it automatically compresses/decompresses assemblies, and my .NET 4.7.1 standalone executable is 2 MB smaller!

SQLFrontline: Snapshot SQL Server Configuration

Taking a snapshot of a SQL Server’s configuration enables you to see what changes over time. It can also provide a record of the date changes were made, so that you can correlate if problems occur and determine if a change might be to blame. It’s also a great way to document any fixes you have made.

An example: some months ago I had generated a SQLFrontline report against a server I had been asked to look at and update to industry best practices. Some time after the work had been done, I re-ran the report and discovered that someone had turned on SQL Server’s ‘Priority Boost’ setting since the previous data collection! (You should never turn this setting on):

“Raising the priority too high may drain resources from essential operating system and network functions, resulting in problems shutting down SQL Server or using other operating system tasks on the server.”

https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/configure-the-priority-boost-server-configuration-option

SQLFrontline currently performs 310+ checks looking at reliability, performance, configuration, security and database design, and emails you the results, with clear instructions on what needs attention.