Analyze table usage for present data


dataThe time I’ve spend analyzing tables for effective database modeling is just too much. I don’t like to do things twice and I’d rather spend a little more time to develop something smart than doing things over and over again. This was also the case where I had to analyze table usage for a set of tables in a databases.

The question asked was to see the minimum and the maximum length for all the columns in a table. Doing this is not that hard but making it effective to execute it against more than one column is a bit harder. Still not rocket science but needs a little more actions are needed to get things done.

The script below will take a few arguments

  • table_name
  • sample

The table_name argument is just the table name you want to analyze. You could change the query so it will execute against all the tables but that’s a little too far for the purpose of this script.

The sample argument allows you to take a subset of data instead of the entire table. The reason I did that was because I’m working with tables with more than a 100 million rows and that could take a long time before everything would’ve been read.

The script will take the amount of rows and put it in a temporary table. Than get all the columns from the specified table. After this it will create the queries needed to get all the lengths needed and make up a result.

At the end the result is combined with the information from the columns.

This will create the following result:


In the example above you can see that several columns have a CHAR data type of length 40 but are unused and that costs data. By changing the column data type from CHAR to VARCHAR it was possible to save a lot of data especially if you’re dealing with millions of rows.

Hope this helps you out to analyze your tables a little faster.

Procedure statistics over period of time


procedureRecently I enjoyed a PASS session with Paul Randal where I saw a nice script which took two snapshots of the wait statistics and calculated the delta of the two. If you’re interested you can find the blog post here.

Based on that idea I wanted a script that did the same thing but than for stored procedures.

The script will take a snapshot of the sys.dm_exec_procedure_stats, wait for a set amount of time and take another snapshot. Based on the values of these two snapshots the script will calculate several values.

Thank again Paul for the idea.