Welcome to the Treehouse Community

Want to collaborate on code errors? Have bugs you need feedback on? Looking for an extra set of eyes on your latest project? Get support with fellow developers, designers, and programmers of all backgrounds and skill levels here with the Treehouse Community! While you're at it, check out some resources Treehouse students have shared here.

Looking to learn something new?

Treehouse offers a seven day free trial for new students. Get access to thousands of hours of content and join thousands of Treehouse students and alumni in the community today.

Start your free trial

Databases SQL Basics Finding the Data You Want Review & Practice with SQL Playgrounds

Aden Ziguang Wang
Aden Ziguang Wang
1,307 Points

In the SQL Playground - Stage 3: Practice Question: Movies with any Missing Data

Instruction:

-- Find all movies with any missing data

my answer:

select *from movies where year_released is null or genre is null;

My questions is:

Is there any simpler way to find out whats missing without checking on the table itself?

eg: select * from movies where * is null (I know this is incorrect format but Im just trying to make my point, hope someone will understand);

3 Answers

Steven Parker
Steven Parker
231,271 Points

Note that your query shows ONLY the rows with missing data. But if you are thinking of a case with many columns, it's possible to create a (much MORE verbose) query that would return the ID and the name of each empty column.

SELECT id,
  CASE WHEN year_released IS NULL THEN 'Released  ' ELSE '' END ||
  CASE WHEN genre IS NULL THEN 'Genre' ELSE '' END as missing
FROM movies
WHERE year_released IS NULL OR genre IS NULL;

:point_right: Imagine that for a much larger database:exclamation:

Aden Ziguang Wang
Aden Ziguang Wang
1,307 Points

Thank you Steven for the detail explanation.

Steven Parker
Steven Parker
231,271 Points

It was fun, but I admit that was a lazy column namer, an elegant one would put separators between multiple column names.

Steven Parker
Steven Parker
231,271 Points

Verbosity is a common complaint about SQL. The clearest way to code this test, even with many columns, is pretty much how you show it. You can eliminate the test for id, since you can assume any existing record would have one.

Now depending on which SQL engine you use, there may be more compact ways to do the job, but they may have other drawbacks. For example. in Oracle you can say:

SELECT * FROM movies WHERE NVL2(year_released,1,0)+NVL2(genre,1,0) < 2;

...but that's not a whole lot shorter, it's much harder to understand, and certainly not portable to other SQL engines.

Now, for commonly used queries, it might make sense to create a view. It still retains the verbosity, but moves it out of view in the main query. In that case, you might end up with a short main query like this:

SELECT * FROM movies_with_missing_data;

but behind the scenes you have already done this:

CREATE VIEW movies_with_missing_data AS
SELECT * FROM movies WHERE year_released IS NULL OR genre IS NULL;
Aden Ziguang Wang
Aden Ziguang Wang
1,307 Points

Hi Steven, Im trying to be lazy here. See if the database is so mass that it is kind of impossible to scroll down each row to see if theres a missing data. In that case, how can I check what column / row is missing data?

Hi Aden,

I'm not sure if I understand but I think you looked at the movies table first and noticed that the year and genre columns were the only ones that had a null value and so you only included those in your query.

Now you are wondering how you would do that with a massive table?

You wouldn't normally be checking the table in advance. You have to assume any column could have a null value and you need to include all columns. In this case, the title should have been included in the check too. As Steven mentioned, you could skip the id because it's guaranteed to be there and not null.

When creating a table you can give columns a NOT NULL requirement. Any columns with that requirement would not need to be checked in the query. So you could possibly reduce the query by first checking the schema and seeing if any columns have a non null requirement.

Absolutely none of these answers are relevant to the SQL course that was taught up to this point. These other methods work but have not been introduced. This is why I am going to the forums here with the same question.

based on what was taught in the beginners SQL track up to exactly this point, If I have a form database with 1,00 columns and 2 Billion rows and for whatever reason, the form is allowed to leave blanks but I need the data, how would I retrieve all rows that have NULL in them?

Please, anyone, answer me within the confines of the course/track teachings up to this point.

Do I sit there for the next two days writing a very long SQL statement?

Alan McClenaghan
seal-mask
.a{fill-rule:evenodd;}techdegree seal-36
Alan McClenaghan
Full Stack JavaScript Techdegree Graduate 56,501 Points

Exactly. What I'm expecting to see is something like this:

SELECT * FROM movies WHERE ANY CELL IS NULL

What SQL keyword would replace ANY CELL in this query?

Otherwise you need to write something like this:

SELECT * FROM movies WHERE id IS NULL OR title IS NULL OR year_released IS NULL OR genre IS NULL

There must be something more elegant than this monstrosity!