Sound of NOT IN operator
Every time I see or think of using NOT IN operator my head makes loud submarine dive alarm sound. Yours should too!
Despite large number of articles and books that explain how does NOT IN operator work and describe what can go wrong, I can spot it in the wild again and again. It’s easy to explain why it requires special attention, but imaginary cost and energy savings seem to be greater. NOT IN (NULL) is well known anti-pattern. NOT IN (SELECT NotNullColumn FROM …) should be as obvious as above. Business people like their brainchildren to evolve, iterate and grow organically. Usually it means turning things around. Today’s NotNullColumn becomes nullable tomorrow. When it happens query stops working. I’m going to show repro below as query is worth 1000 words. After that there is a helper statement I use to look for NOT IN operator in objects’ definitions.
NOT IN (NULL)
SET ANSI_NULLS ON; --list SELECT 'NOT IN (NULL)' WHERE 1 NOT IN (2, NULL); --subquery SELECT 'NOT IN (NULL)' WHERE 1 NOT IN (SELECT col FROM (VALUES(2),(NULL)) tab(col)); Note - with ANSI_NULLS set to OFF this example does not work. It’s because AND 1 != NULL is no longer unknown. ANSI_NULLS is going to be deprecated at some point after SqlServer 2014. It’s recommended not to use OFF in first place.
Helper
I use following query to look for NOT IN occurrence in objects’ definitions. Two limitations to take into account - query is database scoped, based on INFORMATION_SCHEMA.ROUTINES (and there are other places like dynamic sql, SQLCLRs or jobs to look at). Second - it analysis code statically, without executing it (NOT IN can be in comment section). Instance wide version that checks execution plans has been on my todo list for some time.
;WITH ObjectListCTE AS ( --replace to use preferred method, I recommend INFORMATION_SCHEMA SELECT '[' + ROUTINE_SCHEMA + '].[' + ROUTINE_NAME + ']' AS [Object], OBJECT_ID('[' + ROUTINE_SCHEMA + '].[' + ROUTINE_NAME + ']') [ObjectId] FROM INFORMATION_SCHEMA.ROUTINES ) SELECT [Object], CAST(' - nested CDATA is not allowed REPLACE(OBJECT_DEFINITION([ObjectId]), ']]>', '') + ']]>' AS XML) [Overview], 'EXEC sp_helptext ''' + [Object] + '''' [Definition] FROM ObjectListCTE WHERE REPLACE(OBJECT_DEFINITION([ObjectId]), ' ', '') COLLATE SQL_LATIN1_GENERAL_CP1_CI_AI LIKE '%NOTIN%' AND OBJECTPROPERTY([ObjectId], 'IsMsShipped') = 0 ORDER BY 1;