I'm working for a client to import data out of a generic off the shelf application database. If you can imagine a database that is designed by software developers that have no concept of database design, that is what i'm dealing with. I'm selecting from tables that have 50+ columns, where half the columns contain null data for every row (i.e. not used). My question is, how do i identify the columns that are null for EVERY row (i.e. useless) and exclude them from my queries? Thanks for your help! (i want to avoid going column by column saying isnull etc. as that will take forever!)
-
Few interesting ideas on StackOverflowMark Sinkinson– Mark Sinkinson2014年04月02日 10:40:31 +00:00Commented Apr 2, 2014 at 10:40
-
correct me if i'm wrong again, with this query you have to specify each column.jhowe– jhowe2014年04月02日 15:35:41 +00:00Commented Apr 2, 2014 at 15:35
3 Answers 3
You can use this stored procedure to the trick You need to provide the table name you wish to query note that if you'll pass to procedure the @exec parameter = 1 it will execute the select query
SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE PROCEDURE [dbo].[SP_SELECT_NON_NULL_COLUMNS] ( @tablename varchar (100)=null, @exec int =0) AS BEGIN SET NOCOUNT ON IF @tablename IS NULL RAISERROR('CANT EXECUTE THE PROC, TABLE NAME IS MISSING',16 ,1) ELSE BEGIN IF OBJECT_ID('tempdb..#table') IS NOT NULL DROP TABLE #table DECLARE @i VARCHAR (max)='' DECLARE @sentence VARCHAR (max)='' DECLARE @SELECT VARCHAR (max) DECLARE @LocalTableName VARCHAR(50) = '['+@tablename+']' CREATE TABLE #table (ColumnName VARCHAR (max)) SELECT @i+= ' IF EXISTS ( SELECT TOP 1 '+column_name+' FROM ' +@LocalTableName+' WHERE ' +column_name+ ' '+'IS NOT NULL) INSERT INTO #table VALUES ('''+column_name+''');' FROM INFORMATION_SCHEMA.COLUMNS WHERE table_name=@tablename INSERT INTO #table EXEC (@i) SELECT @sentence = @sentence+' '+columnname+' ,' FROM #table DROP TABLE #table IF @exec=0 BEGIN SELECT 'SELECT '+ LTRIM (left (@sentence,NULLIF(LEN (@sentence)-1,-1)))+ +' FROM ' +@LocalTableName END ELSE BEGIN SELECT @SELECT= 'SELECT '+ LTRIM (left (@sentence,NULLIF(LEN (@sentence)-1,-1)))+ +' FROM '+@LocalTableName EXEC (@SELECT) END END END
The easy solution is to check every single column that you care about to see if it is NULL.
select Col1, Col2, Col3
from MyBigTable
where Col1 IS NOT NULL
and Col2 IS NOT NULL
and Col3 IS NOT NULL
A trickier and possibly performance-killing solution would be to add a computed bit column to the table that gets set to true if any column is not null. When any of the columns are modified, this column would get re-calculated. Your query would then look like this:
select Col, Col2, Col3
from MyBigTable
where IsThereAnyData = 1
-
he wants to identify columns that are all null. Not rows that have no nullsPaul– Paul2014年04月02日 13:43:01 +00:00Commented Apr 2, 2014 at 13:43
-
Ahh..yes I see that now. Sounds more like a presentation layer problem.datagod– datagod2014年04月02日 15:02:23 +00:00Commented Apr 2, 2014 at 15:02
-
yes so without checking if each column is null one by one (there are hundreds) i'm looking for a quicker way...jhowe– jhowe2014年04月02日 15:40:39 +00:00Commented Apr 2, 2014 at 15:40
-
You are looking for a quick solution to a design problem. The original database designer took a shortcut. You are now paying the price. There is no quick solution.datagod– datagod2014年04月05日 17:47:41 +00:00Commented Apr 5, 2014 at 17:47
I know this is old, but just in case someone stumbles across this again. Here is an idea to pull row counts for every field in every table in the database. To get the nulls, just "Where RecordCount=0" on the last query. For a little older DBs, one can use STUFF instead of STRING_AGG.
DROP TABLE IF EXISTS #ResultsTbl;
CREATE TABLE #ResultsTbl (
TableName nvarchar(255)
, ColumnName nvarchar(255)
, RecordCount int);
DECLARE @sqlcmd varchar(MAX);
WITH QueryBuilder as (
SELECT ic.TABLE_NAME
, ic.COLUMN_NAME
, CONVERT(varchar(max), -- CONVERT is needed for STRING_AGG function
'INSERT INTO #ResultsTbl(TableName,ColumnName,RecordCount)
SELECT ''' + ic.TABLE_NAME + ''', ''' +
ic.COLUMN_NAME + ''', count(*) from ' +
QUOTENAME(ic.TABLE_SCHEMA) + '.' + QUOTENAME(ic.TABLE_NAME) +
' where ' + QUOTENAME(ic.Column_Name) + ' is NOT NULL') as Command
FROM INFORMATION_SCHEMA.COLUMNS ic
INNER JOIN INFORMATION_SCHEMA.TABLES tbl --This join removes Views
on ic.TABLE_SCHEMA=tbl.TABLE_SCHEMA
and ic.TABLE_NAME=tbl.TABLE_NAME
and tbl.TABLE_TYPE='BASE TABLE'
/* ADD Optional WHERE Clause to pull specific tables or columns */
)
SELECT @sqlcmd=(
SELECT STRING_AGG(Command,';' + CHAR(10))
WITHIN GROUP (ORDER BY Table_Name asc, Column_Name asc) as cmd
FROM QueryBuilder
)
EXEC(@sqlcmd);
SELECT * FROM #ResultsTbl
WHERE RecordCount=0 --OPTIONAL: To only show NULL Columns.