Thursday, March 29, 2012

count of null values in columns

Hi,

I would like to ask how can I get count of null values in each column together?

I know I can use COUNT(*) - COUNT(<col>), but I need to explicitly name every column in the query.... is it possible without it?

SELECT COUNT(foo) WHERE Foo IS NULL;

GO

|||

Hi!

Thank you for your reply, however, this works only for one column. I meant how to do that if I don't know which columns are present, eg if I use SELECT * command...

|||

Here is my way:

select
coutOfNullsInCol01 = sum(case when col01 is null then 1 else 0 end),
coutOfNullsInCol02 = sum(case when col02 is null then 1 else 0 end)
from
Table

|||

This still requires you to know col01 and col02 names.

imagine table

A B x NULL NULL NULL x NULL

What I'm looking for is something like

SELECT COUNTSofNULLSinCOLUMNS(*) FROM MyTable

which would return

1 3

You know I need to not specify any column names since I don't know them.

|||

If you're looking for an aggregate to count NULLS, you'll need to use a grouping function (such as GROUP BY) for that. Note that this function counts non-NULLS rather than NULLS, and still requires a name of the columns to work.

You could create a cursor of the column names by joining system tables to loop through them, but that sounds pretty un-elegant.

Why do you need this information? Are you looking for data density information?

|||

Hu! I dont know how to do this which lets you use a "SELECT" syntax.

But this can be done with a proc which takes the tables name as an input param and genrates an SQL statement dynamically to replace COL01, COL02... CONNN with acutal columns and give the same result.

Do you know how to do this? If not, wait for few minutes!

|||

Since I know row count I expect that non-NULLS or NULLS are the same for me.

System tables would be the solution, however, if the MyTable is not physically a table that won't help. Like SELECT thisstuff FROM (SELECT ...). And yes, since I'm not so experienced in SQL I just asked in case I'm missing some obvious elegant way. :)

This perhaps answers Laskhmana solution, which I guess is pretty similar idea.

Well... actually I was asking a year ago so I'll have to open some older project to answer your question, however I can't do it now. If I'll find out, I will let you know.

|||

here is the sample code (please note that this can be writen in a better way.):

declare

@.tableName sysname

set @.tableName = 'TableWithFewColumns'

-- you can create a proc with following code with above one as param

declare

@.strSQL varchar(8000)

if object_id('tempdb..#columns') is not null
drop table #columns

create table #columns(colName varchar(255))

insert into #columns
exec('select name from syscolumns (nolock) where id = object_id(''' + @.tableName + ''')')

set @.strSQL = ''

select
@.strSQL = @.strSQL + ',coutOfNulls' + colName + ' = sum(case when ' + colName + ' is null then 1 else 0 end)'
from
#columns

set @.strSQL = 'select ''' + @.tableName + ''' as SourceTableName' + @.strSQL + ' from ' + @.tableName

exec(@.strSQL)

No comments:

Post a Comment