Analyze table usage for present data

The time I’ve spent analyzing tables for effective database modeling is just too much. I don’t like to do things twice and I’d rather spend a little more time developing something smart than doing things over and over again. This was also the case when I had to analyze table usage for a set of tables in a database.

The question was to find the minimum and the maximum length for all the columns in a table. Doing this for a single column is not that hard, but doing it efficiently against more than one column is a bit harder. Still not rocket science, but a few more steps are needed to get things done.

The script below takes a few arguments:

  • table_name
  • sample

The table_name argument is just the table name you want to analyze. You could change the query so it will execute against all the tables but that’s a little too far for the purpose of this script.

The sample argument allows you to take a subset of the data instead of the entire table. The reason I added it is that I’m working with tables with more than 100 million rows, and reading everything could take a very long time.

The script takes the specified number of rows and puts them in a temporary table. Then it gets all the columns from the specified table. After this it creates the queries needed to get all the lengths and builds up a result.

At the end the result is combined with the information from the columns.

DECLARE @table_name NVARCHAR(250)
DECLARE @query NVARCHAR(MAX)
DECLARE @sample NVARCHAR(100)

IF OBJECT_ID('tempdb..##TEMP') IS NOT NULL DROP TABLE ##TEMP
IF OBJECT_ID('tempdb..#COLUMNS') IS NOT NULL DROP TABLE #COLUMNS
IF OBJECT_ID('tempdb..#RESULT') IS NOT NULL DROP TABLE #RESULT

-- Replace the values in these variables
SET @table_name = '<table name>'
SET @sample = '<amount of records to test>'

-- Create the table for the results
CREATE TABLE #RESULT
(
	table_name VARCHAR(250)
	, column_name VARCHAR(250)
	, min_value INT
	, max_value INT
)

-- Get the sample from the table
EXECUTE ('SELECT TOP ' + @sample + ' * INTO ##TEMP FROM ' + @table_name +';')

-- Get the information about the columns
SELECT 
	c.name AS [column_name]
	, c.max_length AS [column_max_length]
	, t.name AS [column_type_name]
	, t.max_length AS [type_max_length]
INTO #COLUMNS FROM sys.columns c
INNER JOIN sys.types t ON t.user_type_id = c.user_type_id
WHERE OBJECT_NAME(object_id) = @table_name
ORDER BY column_id

-- Reset the query variable
SELECT @query = ''

-- Make up the queries for getting the lengths of the columns
SELECT @query += '
		INSERT INTO #RESULT 
		SELECT ''' + @table_name 
			+ ''', ''' + column_name + '''
			,ISNULL(MIN(LEN(' + column_name + ')), 0)
			, ISNULL(MAX(LEN(' + column_name + ')), 0) 
		FROM ##TEMP;'
FROM
	#COLUMNS

-- Execute the queries
EXECUTE sp_executesql @query

-- Get the result
SELECT 
	r.table_name
	, r.column_name
	, c.column_type_name
	, c.column_max_length
	, r.min_value
	, r.max_value
	
	, c.type_max_length
FROM 
	#RESULT r
INNER JOIN #COLUMNS c
	ON r.column_name = c.column_name

This will create the following result:

data_usage_result

In the example above you can see that several columns have a CHAR data type of length 40 but are unused, and that costs storage. By changing the column data type from CHAR to VARCHAR it was possible to save a lot of space, especially if you’re dealing with millions of rows.
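
The actual change is a simple ALTER TABLE per column. Below is a minimal sketch, assuming a hypothetical table and column name; check constraints, indexes and other dependencies on the column before changing it.

-- Hypothetical example: change an oversized fixed-length column to variable length
-- (keep the column's original NULL/NOT NULL setting)
ALTER TABLE dbo.MyTable
ALTER COLUMN MyColumn VARCHAR(40) NULL;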

Hope this helps you analyze your tables a little faster.

First version PSSQLLib released

UPDATE:

The scripts for PSSQLLib are now on Git!

It’s been long in the making and I’ve finally made a Powershell module (PSSQLLib) which makes my life a little easier when it comes to getting information from my instances.

The module in question makes it possible to get information like the databases, database users, the privileges of the logins and the database users, the configuration settings and a lot more from any instance.

The module is just a first release and new features can certainly be added. If you’re missing a feature, leave a comment and I’ll get it in there.

More information about the Powershell module can be found here.

Hope you enjoy it!

Restore a database using Powershell

Recently I had a situation where I had to restore several full backups of multiple databases on a SQL Server instance daily. The databases would already be there and a simple restore with the replace option would suffice. As I’m a fan of Powershell I wanted to restore a database using Powershell.

As lazy as I am, I don’t want to make a script for each of the databases; I just want to supply a few parameters and have the script do the rest for me.

I wanted the script to do the following:

  • Get the latest backup file from a directory
  • Restore the database with replace option
  • Be able to use the script in a SQL Server Agent job

The script would have to have the following parameters to do everything:

  • Directory: Directory to search through to get the latest file.
  • SQL Server instance: Which instance to restore to
  • Database: The database to restore to
  • Filter: Which extension to search for like “BAK”

Eventually the script looks like this

<#============================================================================
  File:     RestoreDatabase.ps1
 
  Summary:  Restores databases in SQL Server
 
  SQL Server Versions: 2005 onwards
------------------------------------------------------------------------------
  Written by Sander Stad, SQLStad.nl
 
  (c) 2015, SQLStad.nl. All rights reserved.
 
  For more scripts and sample code, check out http://www.SQLStad.nl
 
  You may alter this code for your own *non-commercial* purposes (e.g. in a
  for-sale commercial tool). Use in your own environment is encouraged.
  You may republish altered code as long as you include this copyright and
  give due credit, but you must obtain prior permission before blogging
  this code.
 
  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================#>


param(
    [string]$d
    , [string]$s
    , [string]$db
    , [string]$f
)


$directory = $d
$serverInstance = $s
$database = $db
$filter = $f

if(Test-Path $directory)
{
    # Get latest file from the directory
    $latestBackup = Get-ChildItem -Path $directory -File -Recurse -Filter $filter | Sort-Object LastAccessTime -Descending | Select-Object -First 1

    # Restore the database from the latest backup file, replacing the existing database
    try{
        Restore-SqlDatabase -ServerInstance $serverInstance -Database $database -BackupFile $latestBackup.FullName -ReplaceDatabase
    } catch{
        #Write-Error ("Restore of database failed." + $_)
    }
}
else{
    #Write-Host -ForegroundColor Red "Directory couldn't be found!"
}

The catch blocks are commented out because the SQL Server Agent job can’t cope with these kinds of error messages. The script assumes that the database is already present. In a later version I will make it possible to adjust this, but for now the functionality is just to execute a restore with replace.

To create the job, go to SQL Server, create a new job with a job step and fill it in like below:

job_restore_step

Make sure the field Type is set to “Operating system (CmdExec)” and that the command is set up like this:

powershell "[folder to script]\RestoreDatabase.ps1 -d '[folder]' -s '[instance]' -db '[database]' -f '*.bak'"

The parameters:

  • -d: Directory to search
  • -s: SQL Server instance
  • -db: Database to replace
  • -f: Filter

I used the “*.bak” filter to get all the bak-files but you could use a different extension if needed as long as it is for a full backup.

Replace the values between the brackets and give the job a spin.
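
If you prefer to script the Agent job instead of clicking it together, a minimal T-SQL sketch could look like the one below. The job name, script path and parameter values are just examples.

USE [msdb]
GO
EXEC dbo.sp_add_job @job_name = N'Restore latest backup - MyDatabase'
EXEC dbo.sp_add_jobstep
	@job_name  = N'Restore latest backup - MyDatabase',
	@step_name = N'Restore database',
	@subsystem = N'CmdExec',
	@command   = N'powershell "C:\Scripts\RestoreDatabase.ps1 -d ''\\backupserver\backups\MyDatabase'' -s ''SQLSRV01'' -db ''MyDatabase'' -f ''*.bak''"'
EXEC dbo.sp_add_jobserver @job_name = N'Restore latest backup - MyDatabase', @server_name = N'(local)'
GO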

 

 

Generate truncate script

Let me first say that I never truncate the files of my production databases! It’s a bad practice and in most cases you don’t win anything. A good article about this: Stop Shrinking Your Database Files. Seriously. Now.

Despite the downsides of truncating your files, I’ve had situations where I had to restore several databases for read-only purposes to a server.

When you deal with a lot of databases and want to shrink all the files, you have to query the values or look at the properties of each database. This is not an efficient way to work, and therefore I’ve created a little script that generates the truncate statements for you.

For SQL Server 2000

SELECT 
	'USE [' + d.name + ']; PRINT ''Truncating ' + af.filename + ' '' DBCC SHRINKFILE (N''' + af.name + ''' , 0, TRUNCATEONLY)'
FROM 
	sysaltfiles af 
INNER JOIN sysdatabases d
	ON af.dbid = d.dbid
WHERE
	d.name NOT IN ('master', 'msdb', 'model', 'tempdb')

For SQL Server 2005 and up

SELECT
	'USE [' + d.name + ']; PRINT ''Truncating ' + mf.physical_name + ' '' DBCC SHRINKFILE (N''' + mf.name + ''' , 0, TRUNCATEONLY)'
FROM 
	sys.master_files mf 
INNER JOIN sys.databases d 
	ON mf.database_id = d.database_id 
WHERE
	d.name NOT IN ('master', 'msdb', 'model', 'tempdb')

I left out the system databases because I didn’t want to mess with them.
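
Each row in the result set is a statement you can copy into a query window and execute. A generated line looks roughly like this (database and file names are made up for the example):

USE [SalesDB]; PRINT 'Truncating D:\Data\SalesDB.mdf ' DBCC SHRINKFILE (N'SalesDB_Data' , 0, TRUNCATEONLY)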

Find columns with special characters

Problem

Imagine a database where you want to know how many times a certain column contains a special character. The reason I had to do this was that I needed to know the impact of an export interface which wasn’t working properly with special characters.

But as lazy as I am when it comes to repetitive tasks, I had to find a way to easily get all the columns with special characters in a data set without a lot of hassle.

The solution

Fortunately I can use the INFORMATION_SCHEMA views of a database in SQL Server to get all the columns with their table, schema and data type. Because this database only used VARCHAR columns for saving text data, I only had to search for that data type. You can easily extend the selection by adding the necessary types like TEXT, NVARCHAR etc.
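
As a sketch, the column selection used later in the script could be widened like this (adjust the list of data types to what your database actually uses):

-- Sketch: widen the data type filter to include more text types 
SELECT table_catalog, 
       table_schema, 
       table_name, 
       column_name, 
       ordinal_position 
FROM   information_schema.columns 
WHERE  data_type IN ( 'varchar', 'nvarchar', 'char', 'nchar', 'text', 'ntext' ) 
ORDER  BY table_catalog, 
          table_name, 
          ordinal_position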

To create the script I had to accomplish the following things:

  1. Retrieve all the columns that could contain text
  2. Set up a pattern to check the columns with
  3. Loop through the columns
  4. In every loop save the results
  5. Make up a data set of the data retrieved

I eventually got to this script:

-- Replace the mydatabase with the name of your database 
USE [mydatabase] 

-- Delete possible present temporary tables 
IF Object_id('tempdb..#columns') IS NOT NULL 
  DROP TABLE #columns 

IF Object_id('tempdb..#results') IS NOT NULL 
  DROP TABLE #results 

IF Object_id('tempdb..#rows') IS NOT NULL 
  DROP TABLE #rows 

-- Create a table for holding all the columns 
CREATE TABLE #columns 
  ( 
     id            INT NOT NULL PRIMARY KEY IDENTITY(1, 1), 
     database_name VARCHAR(100), 
     schema_name   VARCHAR(50), 
     table_name    VARCHAR(200), 
     column_name   VARCHAR(200), 
     position      INT 
  ) 

-- Create a table for holding the results 
CREATE TABLE #results 
  ( 
     database_name VARCHAR(100), 
     schema_name   VARCHAR(50), 
     table_name    VARCHAR(200), 
     column_name   VARCHAR(200), 
     count         INT 
  ) 

-- Declare the variables needed 
DECLARE @max INT, 
        @cnt INT 
DECLARE @sql     NVARCHAR(max), 
        @pattern VARCHAR(200) 

-- Get all the columns 
INSERT INTO #columns 
SELECT table_catalog, 
       table_schema, 
       table_name, 
       column_name, 
       ordinal_position 
FROM   information_schema.columns 
WHERE  data_type = 'varchar' 
ORDER  BY table_catalog, 
          table_name, 
          ordinal_position 

-- Get the minimum id and max that are needed for the loop 
SELECT @cnt = Min(id) 
FROM   #columns 

SELECT @max = Max(id) 
FROM   #columns 

--Setup the pattern 
SELECT @pattern = 'ëïöüäéíóáèìòàËÏÖÄÉÍÓÁÈÌÒÀ' 

-- Make sure the @sql variable is empty 
SELECT @sql = '' 

-- Start loop 
WHILE( @cnt <= @max ) 
  BEGIN 
      -- Setup the sql statement 
      SELECT @sql = 'INSERT INTO #results SELECT ''' + database_name + ''', ''' 
                    + schema_name + ''', ''' + table_name + ''', ''' 
                    + column_name + ''', COUNT(1) AS count FROM [' + database_name + '].[' 
                    + schema_name + '].[' + table_name + '] WHERE [' 
                    + column_name 
                    + '] COLLATE Latin1_General_BIN2 LIKE ''%[' 
                    + @pattern + ']%'' ESCAPE ''E'' ' 
      FROM   #columns 
      WHERE  id = @cnt 

      -- Execute the sql statement 
      EXEC Sp_executesql 
        @sql 

      -- Increase the counter 
      SELECT @cnt += 1 
  END 

-- Select the data 
SELECT results.database_name, 
       results.schema_name, 
       results.table_name, 
       results.column_name, 
       results.count AS 'count_results' 
FROM   #results results 
WHERE  results.count >= 1

This might take a while when you need to search through a lot of columns and records. When the script finishes you will get a result like below:

results specialcharacters

The pattern can contain any character that you want. I only searched for some of the most common special characters but any character will do.

I hope you find this post helpful and I appreciate any comments.

SSRS permissions overview

Problem: SSRS Permissions Overview

I had to create an SSRS permissions overview with all the privileges set for each report, folder and other objects in SQL Server Reporting Services.

When you try to google anything like “export permissions SSRS” you get tons of results where people have issues with the permissions.

I don’t have issues with the permissions; I only want to get an SSRS permissions overview.

Solution

Just a simple T-SQL script which combines some of the tables that hold the values I needed.

The script below will show all the different objects in SSRS with their path and name, together with the user name and the role that user has on the object.

USE [ReportServer] 

go 

SELECT u.username, 
       r.rolename, 
       c.NAME, 
       c.path 
FROM   [dbo].[users] u 
       INNER JOIN [dbo].[policyuserrole] pur 
               ON u.userid = pur.userid 
       INNER JOIN [dbo].[policies] p 
               ON p.policyid = pur.policyid 
       INNER JOIN [dbo].[roles] r 
               ON r.roleid = pur.roleid 
       INNER JOIN [dbo].[catalog] c 
               ON c.policyid = p.policyid 
ORDER  BY u.username, 
          r.rolename, 
          c.NAME

More info about the ReportServer database can be found here.

The image below shows how the query returns the results

results ssrs permissions

Results from SSRS permissions query

This is a fairly simple T-SQL script but it can help you out.
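
If you only need the overview for a specific user or folder, you can narrow the result with a WHERE clause. The sketch below uses an example user and an example folder path; replace them with your own values.

USE [ReportServer] 
go 
SELECT u.username, 
       r.rolename, 
       c.NAME, 
       c.path 
FROM   [dbo].[users] u 
       INNER JOIN [dbo].[policyuserrole] pur 
               ON u.userid = pur.userid 
       INNER JOIN [dbo].[policies] p 
               ON p.policyid = pur.policyid 
       INNER JOIN [dbo].[roles] r 
               ON r.roleid = pur.roleid 
       INNER JOIN [dbo].[catalog] c 
               ON c.policyid = p.policyid 
WHERE  u.username = 'DOMAIN\SomeUser' -- example user 
        OR c.path LIKE '/Finance%'    -- example folder path 
ORDER  BY u.username, 
          r.rolename, 
          c.NAME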

I always appreciate comments so if you have one feel free to share it.

Get ASCII characters from string

I had a situation where I got an error converting a value to a decimal; the value contained some white space and a valid number.

During the conversion I used the LTRIM and RTRIM functions to remove any white space present at the beginning and the end of the value. Unfortunately the characters in the string didn’t get removed by these functions. This led me to create a little script to get the individual characters of the string with their ASCII values.

DECLARE @value VARCHAR(50) 
DECLARE @sql NVARCHAR(max) 
DECLARE @i INT 

SET @value = '  11452.72 ' 
SET @i = 0 
SET @sql = '' 

WHILE @i < Datalength(@value) 
  BEGIN 
      SELECT @sql += 'SELECT ' + Cast(@i AS VARCHAR) 
                     + ' AS ''Position'',' + 'SUBSTRING(''' + @value 
                     + ''', (' + Cast(@i AS NVARCHAR) 
                     + ' + 1), 1) AS ''String Value'',' 
                     + 'CAST(ASCII(SUBSTRING(''' + @value + ''', (' 
                     + Cast(@i AS NVARCHAR) 
                     + ' + 1), 1)) AS VARCHAR) AS ''ASCII value''' 

      SELECT @i += 1 

      IF @i < Datalength(@value) 
        BEGIN 
            SELECT @sql += ' UNION ALL ' 
        END 
  END 

EXEC Sp_executesql @sql

This gave me the following result:

result_asciivalues

I knew that ASCII value 32 is a space and those characters were removed during the process. But as you can see, the values at the end are not spaces, tabs or any other kind of white space character.

This helped me find out that something went wrong with the export of the files, something I would never have found without knowing why the value didn’t convert to a decimal.
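
If you do need to clean such values before converting them, you can strip the offending character once you know its ASCII value. The sketch below assumes the culprit turned out to be CHAR(160), the non-breaking space:

DECLARE @dirty VARCHAR(50) 

-- Example value with a trailing non-breaking space (character 160) 
SET @dirty = '  11452.72' + CHAR(160) 

-- Remove the non-breaking space, trim regular spaces and convert 
SELECT CAST(LTRIM(RTRIM(REPLACE(@dirty, CHAR(160), ''))) AS DECIMAL(18, 2)) AS converted_value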

For more information about the ASCII function you can follow this link

I hope this script can help you too.

Split large data file into smaller data files

Problem

The business uses a database which is 350 GB in size and has only one data file and one transaction log file. The previous DBA didn’t increase the number of data files, which led to one large data file. Due to the size of the data file I got into trouble with performance, and restoring the database wasn’t a walk in the park either. So how can you split a large data file into smaller data files?

What are the options?

  1. Do nothing and let the file grow when needed
  2. Split the file up into multiple data files

Solution number 1 is not a solution and will bite you in the long run because the database will grow so large it will not be easy to maintain anymore.

Solution number 2 will make sure the database has multiple files, which could deliver an increase in performance and will make the database easier to restore. This solution can be done in two ways:

  1. Have one file group with all the data files
  2. Work with multiple file groups and multiple files

In option number one you’ll have a single file group (the PRIMARY file group) which holds all the files. SQL Server will spread the data proportionally over all the files in that file group.

In option number 2 you create one or more file groups, each containing at least one data file, to which you can specifically move tables.

Because I’m not 100% familiar with the performance per table and I didn’t have enough time to investigate this, I chose to use the PRIMARY file group and create several data files. I wanted to spread out the data over multiple data files because I wanted the original file to be emptied and the new files to be filled.

This was not a quick one-off action because I had to face several issues:

  1. When the data from the original file is moved to other files, the original data file maintains the original size.
  2. Because of issue 1 I had to create more space. I didn’t have the space on the original server and had to place the data elsewhere temporarily.
  3. Placing the files on another server or share isn’t possible by default because SQL Server doesn’t allow you to place database files on file shares
  4. Attaching an extra disk wasn’t possible in a timely fashion

Solution

To solve the issue that SQL Server doesn’t allow database files to be placed on file shares, I enabled trace flag 1807. This trace flag bypasses several checks in SQL Server, making it possible to place the files on file shares. See the code below:

DBCC TRACEON(1807)

The information about the trace flag comes from Varun Dhawan’s blog post, which can be found here.

I created a file share on another server and gave the SQL Server service account modify rights on the underlying NTFS folder and full control on the share.

I then created 10 data files so that each data file ends up at around 35 GB, a size that is easy to handle.
TIP: Make sure the growth of the new data files is set to at least 1 GB, because continuous small growth increments will have a negative impact on the performance of the data move.
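
Adding the new files is done with ALTER DATABASE. A sketch for one of the ten files, with example names, path and sizes:

ALTER DATABASE [dbname] ADD FILE 
( NAME = N'dbname_data_02', FILENAME = N'\\fileserver\sqltemp\dbname_data_02.ndf', SIZE = 35GB, FILEGROWTH = 1GB ) 
TO FILEGROUP [PRIMARY]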

To empty the original data file I used the following script:

USE [dbname]
DBCC SHRINKFILE('datafilename', EMPTYFILE)

This code will run for hours! Make sure there is enough space for the new data files to grow, otherwise the process will exit. If the process quits for some reason, just run the SHRINKFILE command again and SQL Server will continue moving the data.

I used the script below to see how the files got filled:

SELECT a.fileid, 
       CONVERT(DECIMAL(12, 2), Round(a.size / 128.000, 2))                                         AS [FILESIZEINMB], 
       CONVERT(DECIMAL(12, 2), Round(Fileproperty(a.NAME, 'SpaceUsed') / 128.000, 2))              AS [SPACEUSEDINMB], 
       CONVERT(DECIMAL(12, 2), Round(( a.size - Fileproperty(a.NAME, 'SpaceUsed') ) / 128.000, 2)) AS [FREESPACEINMB], 
       a.NAME                                                                                      AS [LOGICALNAME], 
       a.filename                                                                                  AS [PHYSICALNAME] 
FROM   dbo.sysfiles a

This returns the following result:

split_large_file_databasefiles

 

As you can see, the new data files are filled proportionally with the data from the original data file. The transaction log file is being used because the data goes through the transaction log and then gets written to the new data files. At the same time the free space in the original data file increases.

When the process finished I was still left with a large data file that had a lot of free space. Because this was the primary file, which contains all the system objects of the database, I first shrunk the primary data file with the following command:

USE [dbname]
GO
DBCC SHRINKFILE (N'datafile' , 0, TRUNCATEONLY)
GO

Because keeping the now nearly empty primary file around didn’t bother me, I adjusted its size to something very small like 100 MB and set its maximum size to 100 MB:

USE [dbname]
GO
DBCC SHRINKFILE (N'datafile' , 100)
GO
USE [master]
GO
ALTER DATABASE [dbname] MODIFY FILE ( NAME = N'datafile', MAXSIZE = 100MB , FILEGROWTH = 1MB )
GO

This will make sure the primary data file will never grow again and all the data that gets written will be written to the new data files.

Although the database worked fine, I didn’t want to leave the files spread across multiple locations. I detached the database to be able to move the data files around:

USE [master]
GO
ALTER DATABASE [dbname] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO
USE [master]
GO
EXEC master.dbo.sp_detach_db @dbname = N'dbname'
GO

… then I moved the files from the file share to the data directory on the database server and attached the database again, where I had to point to the new locations of the data files because they were originally placed elsewhere.
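
Attaching the database with the files in their new location can be done with CREATE DATABASE ... FOR ATTACH. A sketch with example paths and only a couple of the data files:

USE [master]
GO
CREATE DATABASE [dbname] ON
( FILENAME = N'D:\Data\dbname.mdf' ),
( FILENAME = N'D:\Data\dbname_data_02.ndf' ),
( FILENAME = N'E:\Log\dbname_log.ldf' )
FOR ATTACH
GO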

Finally I had to clean up the changes I made by removing the file share and turning off the trace flag. Turning off the trace flag is done with the following command:

DBCC TRACEOFF(1807)

Conclusion

The process took a very long time. In this situation it took me more than 17 hours to get the database configured and to get the data moved.

Besides the fact that this took very long, I’m happy with the result. The next step could be identifying possible performance-killing tables that can be placed in separate files on other file groups, but that is a whole different article altogether.

I hope you can use this information for your own actions and comments are always welcome!