I was debugging a piece of code today and stumbled upon a non-trivial problem, that IMHO is worth a post about.

Let’s take a “simple” example where you want to find records in one table that are not present in another table. “Present” or “Non-present” is determined by comparision against a single column, in both tables.

Sounds like a query as simple as they come, right? Well, not really. Let’s look at a sample query:

 select * from TMP_USER where UID not in (select EXT_ID from USER )

What does it return? The correct answer is: in MySQL, depends whether there are any null values in EXT_ID. If there are - it returns an empty set!!!

Not too easy to believe, huh? But running the same query with a slight modification:

 select * from TMP_USER where UID not in (select EXT_ID from USER where EXT_ID is not null)

you empirically get 116 results, none of which have UID null. Furthermore:

 select * from HV_TMP_USER where CUID is null

returns empty set. so - believe it or not - that\’s how “select .. not in” works in MYSQL.

Conclusion: watch-out for “not in” queries in MYSQL on a column that may contain null-values. The resul may be empty set, when you least expect it.