A really clear explanation of Matrix Calculus
This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks.
I’ve recently changed what I was calling the ‘Demo’ version of SQLFrontline to a ‘Freemium’ model. The demo version only displayed one recommendation result in each of the four severity categories (Critical, High, Medium, Info).
Obviously, the free version does not include all the features of the paid premium version, but it still provides useful recommendations, offering advice on 40 checks.
Both use the same lightweight metadata collection.
The Free Version:
The Premium Version:
If you want to try it out, click this link to request a free access token.
Once you have an access token, here’s how to run it: How to run SQLFrontline
Reasons to avoid embedding SQL in SSRS reports (.rdl), and to create stored procedures instead:
All seem fairly obvious, but it’s surprising how many people still embed SQL into SSRS reports.
Joe Albahari’s talk “What I’ve learned from 20 years of programming in C#” from Wednesday is available on YouTube. It starts around 35:13.
Microsoft has recently released a long-awaited retry mechanism for the .NET SqlClient.
I’m a fan of Polly for retry logic:
Polly is a library that allows developers to express resilience and transient fault handling policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and Fallback in a fluent and thread-safe manner.
It will be interesting to see how they compare in terms of ease of use.
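Both Polly and the new SqlClient mechanism are built around the same underlying idea: retry a transient failure a bounded number of times, waiting longer between each attempt. As a rough illustration of that pattern (a minimal sketch in Python, not how either library is actually implemented), the function name and parameters below are hypothetical:

```python
import time

def retry(operation, attempts=3, base_delay=0.1, retryable=(TimeoutError,)):
    """Run operation, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
```

The key design choices a real policy adds on top of this are which exceptions count as transient (Polly lets you express this fluently) and whether to cap or jitter the backoff.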
Unlike Linux systems, Windows doesn’t have a built-in command line file splitter. Splitting large files into smaller chunks is something you often want to do with data warehouses (such as Snowflake), in order to use multiple threads for better bulk loading performance.
I saw a post by Greg Low, “SDU_FileSplit – Free utility for splitting CSV and other text files in Windows”, where he had created one for Windows, but it wasn’t open source.
I decided to create an open-source file splitter for Windows.
It’s a standalone .NET 5.0 executable; it supports wildcards, can recurse sub-folders (careful with that option!), and can automatically gzip-compress output files.
Example use:
FileSplitter.exe -i c:\temp\*.csv -m 100000 -c -o c:\temp\TestFileSplitter
This splits all the files with extension .csv in folder c:\temp into chunks of at most 100,000 lines per file, storing the output files in folder c:\temp\TestFileSplitter.
Almost every fact table in a data warehouse uses a date (or calendar) dimension, because most measurements are defined at specific points in time. A flexible calendar date dimension is at the heart of most data warehouse systems; it provides easy navigation of a fact table through user familiar dates, such as weeks, months, fiscal periods and special days (today, weekends, holidays etc.).
I’ve created a date dimension generator, here on GitHub.
It targets SQL Server, but should be easy to convert to other RDBMS.
It features:
The [TodayFlag] needs to be updated once per day by a scheduled task (timezone dependent: might need a flag for each timezone).
If you use an unusual Fiscal year (say 5-4-4), it will need to be loaded from an external source (such as an Excel/Google spreadsheet).
The precise start date of the month of Ramadan is set by proclamation, so these dates need to be added year by year. It is possible to calculate them, but the result can be a day out and can vary by region.
https://travel.stackexchange.com/questions/46148/how-to-calculate-when-ramadan-finishes
https://en.wikipedia.org/wiki/Ramadan_%28calendar_month%29
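To illustrate the shape of a date dimension, here is a minimal Python sketch that generates one attribute row per calendar day. The column names are illustrative (only [TodayFlag] is mentioned above); the real generator targets SQL Server and carries many more attributes:

```python
from datetime import date, timedelta

def date_dimension(start, end, today=None):
    """Yield one attribute row per calendar day from start to end inclusive."""
    d = start
    while d <= end:
        yield {
            "DateKey": int(d.strftime("%Y%m%d")),       # surrogate key, e.g. 20240131
            "Date": d,
            "Year": d.year,
            "MonthNumber": d.month,
            "DayOfWeek": d.isoweekday(),                # 1 = Monday ... 7 = Sunday
            "WeekdayFlag": d.isoweekday() <= 5,
            "IsoWeek": d.isocalendar()[1],
            "TodayFlag": d == (today or date.today()),  # must be refreshed daily
        }
        d += timedelta(days=1)
```

Generating every day in the range up front, rather than on demand, is what makes fact-table navigation by week, month and fiscal period a simple join.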
This has the capacity to be huge:
Babelfish for PostgreSQL is an Apache-2.0 open source project that adds a Microsoft SQL Server-compatible end-point to PostgreSQL to enable your PostgreSQL database to understand the SQL Server wire protocol and commonly used SQL Server commands. With Babelfish, applications that were originally built for SQL Server can work directly with PostgreSQL, with little to no code changes, and without changing database drivers.
If you define a constraint without explicitly giving it a name, SQL Server will generate one for you.
You know the ones, they look something like this PK__MY_TABLE__3213E83FA7739BB4.
Why might that be a bad thing? It makes writing deployment scripts harder, because you won’t know up front the names of the constraints you might want to refer to.
Michael J Swart describes a query to discover the system-generated names in your databases (shown here with a small modification):
SELECT
[Schema] = SCHEMA_NAME(o.schema_id),
[System Generated Name] = OBJECT_NAME(o.object_id),
[Parent Name] = OBJECT_NAME(o.parent_object_id),
[Object Type] = o.type_desc
FROM
sys.objects o
JOIN sys.sysconstraints c ON o.object_id = c.constid
WHERE
(status & 0x20000) > 0
and o.is_ms_shipped = 0
According to the sys.sysconstraints documentation page:
This SQL Server 2000 system table is included as a view for backward compatibility. We recommend that you use the current SQL Server system views instead. To find the equivalent system view or views, see Mapping System Tables to System Views (Transact-SQL). This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.
You can query the same information by using the individual views unioned together:
SELECT
[Schema] = SCHEMA_NAME(schema_id),
[System Generated Name] = OBJECT_NAME(object_id),
[Parent Name] = OBJECT_NAME(parent_object_id),
[Object Type] = type_desc
FROM sys.check_constraints
WHERE is_system_named = 1
UNION ALL
SELECT
[Schema] = SCHEMA_NAME(schema_id),
[System Generated Name] = OBJECT_NAME(object_id),
[Parent Name] = OBJECT_NAME(parent_object_id),
[Object Type] = type_desc
FROM sys.default_constraints
WHERE is_system_named = 1
UNION ALL
SELECT
[Schema] = SCHEMA_NAME(schema_id),
[System Generated Name] = OBJECT_NAME(object_id),
[Parent Name] = OBJECT_NAME(parent_object_id),
[Object Type] = type_desc
FROM sys.key_constraints
WHERE is_system_named = 1
UNION ALL
SELECT
[Schema] = SCHEMA_NAME(schema_id),
[System Generated Name] = OBJECT_NAME(object_id),
[Parent Name] = OBJECT_NAME(parent_object_id),
[Object Type] = type_desc
FROM sys.foreign_keys
WHERE is_system_named = 1
If you receive error code 4815 while doing a Bulk Insert into an Azure SQL Database (including via SqlBulkCopy()), it’s likely you are trying to insert a string that is too long into an (n)varchar(x) column.
The unhelpful error message makes no mention of truncation, or of the column name! I’m posting this in the hope it will save someone some time.
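Since the error doesn’t name the offending column, one practical workaround is to scan the data before loading and compare the longest value in each column against the destination column sizes. A hypothetical sketch in Python (the function name is made up, and in practice you would read the size limits from the destination table’s schema):

```python
def find_overflows(rows, column_sizes):
    """Return columns whose longest value exceeds the declared (n)varchar size.

    rows: iterable of dicts, one per row to be bulk-inserted
    column_sizes: {column_name: max_chars} for the destination table
    """
    longest = {}
    for row in rows:
        for col, value in row.items():
            if isinstance(value, str):
                longest[col] = max(longest.get(col, 0), len(value))
    # report (actual_max, declared_max) for each overflowing column
    return {
        col: (longest[col], size)
        for col, size in column_sizes.items()
        if longest.get(col, 0) > size
    }
```

One pre-load pass like this turns an anonymous 4815 into a named column and a concrete length to fix.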