Ben Gorman

Ben Gorman

Life's a garden. Dig it.

Challenge

The public table bigquery-public-data.ncaa_basketball.mbb_games_sr contains the results of NCAA mens basketball games. Use this table to determine the most "unfair" venues during the 2016 season, where the number of personal fouls called on the away team was significantly higher than the home team. Specifically,

  • Filter the games for season 2016
  • Group the data by venue_name
  • Exclude venues with fewer than 10 games
  • Sort the result by the average difference in personal fouls (home - away, descending)

Expected Result

unfair-venues

Bonus

Save the result as a view.

What's a view?

A view is like a table whose data is based upon a query. For example, if you have a big table of sales that you want to aggregate as monthly_sales. Two options you might consider are

  1. Schedule a daily query to update your monthly_sales table
  2. Make monthly_sales a view (as opposed to a table)

Pro & Cons

  1. The view is always up-to-date, whereas the table can be ~24 hours stale.
  2. The table takes up storage. The view doesn't.
  3. Accessing data from the table is be faster than from the view.
  4. The view executes the query as needed whereas the table executes the query once per day.

Solution

SELECT 
  venue_name,
  COUNT(*) AS games,
  ROUND(AVG(h_personal_fouls), 2) as home_personal_fouls,
  ROUND(AVG(a_personal_fouls), 2) as away_personal_fouls,
  ROUND(AVG(a_personal_fouls) - AVG(h_personal_fouls), 2) as diff_personal_fouls
FROM `bigquery-public-data.ncaa_basketball.mbb_games_sr`
WHERE season = 2016
GROUP BY venue_name
HAVING games >= 10
ORDER BY diff_personal_fouls DESC

Explanation

Nothing special here - just a plethora of basic SQL functions and clauses used together.

Bonus

Creating a view from a query in BigQuery is easy enough!

  1. Click Save > Save View make-view-1

  2. Select a project and dataset, and enter a "table" (view) name make-view-2

Access Denied Error 🤦‍♂️

If you get an error like this one,

make-view-3

It's probably because you're attempting to save the view in a different data location than the source table. In BigQuery, views must reside in the same location as their source table. In this case, the source data lives in the US region, so you'll need to save the view in a dataset that also lives in the US region.

make-view-4