Finding Columns with Skewed Data

Queries with parameter sensitive plans can perform poorly when an inappropriate query plan is used.

Even if your statistics are up to date, parameter sensitive plans can be caused by skewed data,
so performing a data skew analysis can identify which filter columns might be involved in poor query plans.

I’ve adapted the code found here into a SQL Server stored procedure that can be run across an entire database, a schema, a single table or just a single column.

It should be relatively easy to convert this to other RDBMS.

Here’s an example of the output when run on the Stackoverflow 2013 downloadable database (approximately 50GB):

3 thoughts on “Finding Columns with Skewed Data”

  1. Added ” and ty.[user_type_id] NOT IN (34,35,99) — image, text, ntext” to the colcur cursor to avoid sorting errors when loading the #tmp_tally_table.

    An amazing piece of work! Spent a day re-learning some statistics topics.

    Reply
    • Thanks, I’ve added the extra check. I don’t come across those column types very often (their use is deprecated, as you know).

      Reply

Leave a Comment