1.2.6 - SQL Refresher

Last updated Jan 22, 2025

Youtube Video | ~16min

✍️ In this video we go through a SQL refresher with a focus on JOIN and GROUP BY. This example using the yellow_taxi_data and zones is helpful for the homework set.

SQL Code difference between using JOIN vs. WHERE

SELECT 
	tpep_pickup_datetime, 
	tpep_dropoff_datetime, 
	total_amount, 
	CONCAT(zpu."Borough", ' / ', zpu."Zone") AS "pick_up_loc",
	CONCAT(zdo."Borough", ' / ', zdo."Zone") AS "drop_off_loc"
FROM 
	yellow_taxi_data t JOIN zones zpu 
		ON t."PULocationID"=zpu."LocationID" 
	JOIN zones zdo 
		ON t."DOLocationID"=zdo."LocationID"
LIMIT 100;

SELECT 
	tpep_pickup_datetime, 
	tpep_dropoff_datetime, 
	total_amount, 
	CONCAT(zpu."Borough", ' / ', zpu."Zone") AS "pick_up_loc",
	CONCAT(zdo."Borough", ' / ', zdo."Zone") AS "drop_off_loc"
FROM 
	yellow_taxi_data t, 
	zones zpu, 
	zones zdo
WHERE 
	t."PULocationID"=zpu."LocationID" AND 
	t."DOLocationID"=zdo."LocationID"
LIMIT 100;

GROUP BY 1 column vs. multi-column

SELECT 
	CAST(tpep_dropoff_datetime AS DATE) as "day", 
	COUNT(1) as "count",
	MAX(total_amount),
	MAX(passenger_count)
FROM 
	yellow_taxi_data t 
GROUP BY
	CAST(tpep_dropoff_datetime AS DATE)
ORDER BY "count" DESC;

SELECT 
	CAST(tpep_dropoff_datetime AS DATE) as "day", 
	"DOLocationID",
	COUNT(1) as "count",
	MAX(total_amount),
	MAX(passenger_count)
FROM 
	yellow_taxi_data t 
GROUP BY
	1, 2
ORDER BY "day" ASC, "DOLocationID" ASC;

Resources

🔗 Use an app like Mimo to review SQL or Python basics

PreviousDocker-Compose Summary NextOptional Docker Video

Last updated 5 months ago