Wednesday, July 2, 2014

Find blank rows in a SQL Server Database

After importing data into a SQL Server database there are sometimes blank rows that get created depending on what your data source looks like. Often when using Excel as a data source extra rows will be created with all blank values. Since the table doesn't by default have a primary key all columns can be null. Find what tables have blank rows and then deleting them can be time consuming. The script here will make this much easier.

Disclaimer
I have used this script successfully on my databases, but please, please, please make a backup of your database BEFORE you execute the following since it can affect all your tables. I am of course not responsible for any data loss caused by this script. 

Executing the script below does NOT actually do the deletes. You will still need to copy and paste the generated SQL into SSMS and execute it. I highly suggest you read the generated SQL to make sure it is doing what you want it to before you do the final execution of the generated sql.

create table #BlankRowCounts(TableName nvarchar(255), NumBlankRows int)
Declare @SQL as nvarchar(MAX)
select
    @SQL = ISNULL(@SQL + ' union ' , '')
    + 'select ' +
'''' + TABLE_NAME + ''' as TableName, ' +
'COUNT(1) as NumBlankRows' +
' from ' +
'[' + TABLE_NAME + ']' +
' where ' +
dbo.GetColumnList(TABLE_NAME, 1, ' is null and ') + ' is null'
from INFORMATION_SCHEMA.TABLES
where TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME

insert into #BlankRowCounts(TableName, NumBlankRows)
exec sp_executesql @SQL


select TableName, NumBlankRows,
'select ' +
'''' + TableName + ''' as TableName, ' +
dbo.GetColumnList(TableName, 1, ', ') + 
' from ' +
'[' + TableName + ']' +
' where ' +
dbo.GetColumnList(TableName, 1, ' is null and ') + ' is null' as SelectStmt,
'delete from ' +
'[' + TableName + ']' +
' where ' +
dbo.GetColumnList(TableName, 1, ' is null and ') + ' is null' as DeleteStmt
from
#BlankRowCounts
order by NumBlankRows desc
drop table #BlankRowCounts

IMPORTANT
You will also need to get the code for GetColumnList() function here.

The results of this SQL are simple. There are four columns:
  • TableName - The table for which the statements will affect
  • NumBlankRows - The number of rows in the table (see TableName) that have all blank columns
  • SelectStmt - The select statement you can copy and paste into SSMS to actually see for yourself that the columns are null. You don't have to execute these, but they are here to convince yourself that the data is blank.
  • DeleteStmt - The delete statement you can copy and paste into SSMS to actually do the deleting of the rows that have all the columns as null. I highly recommend reading this BEFORE you execute it. Also, consider backing up your data if the data is important to you.







No comments: