SQL (Structured Query Language) is one of the most important skills for us data people. This article and the accompanying video cover the SQL skills you need for data analysis work.
Step 0: Install MySQL software
I am using the FREE MySQL Community Edition software to learn & practice SQL at home. You can get it from here.
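If you have any other database software available (such as SQL Server or Oracle), you can use it to follow this tutorial. Just note that a few functions used below, such as weekday(), are MySQL-specific and may need an equivalent in other databases.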
Step 1: Import Awesome Chocolates Dataset
You need some data to practice SQL. So I prepared a sample dataset for a fictional (but yummy) company called Awesome Chocolates.
Download the .SQL file from here.
After you have the file,
- Open MySQL Workbench, login if necessary
- Click on the “server administration” tab
- Click on “Data Import/Restore”
- Select the option “Import from self-contained file”
- Specify the path of the downloaded awesome-chocolates-data.sql file
- Start import
At the end of these steps, your MySQL server should have the Awesome Chocolates database. Congratulations!
You can see it in the “Schemas” tab of the Workbench.
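If you prefer to confirm the import with a query, here is a small sketch. It assumes MySQL and the table names used later in this article (sales, people, products). I am not assuming what the imported schema is called, so the query looks the tables up by name instead.

```sql
-- MySQL: find which schema now holds the imported tables.
-- The table names are the ones queried later in this article.
select table_schema, table_name
from information_schema.tables
where table_name in ('sales', 'people', 'products')
order by table_schema, table_name;
```

If the import worked, each table should show up once, all under the same schema.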
Step 2: Learn SQL for Data Analysis with this video
Everything is ready. Time to learn SQL.
I made an hour-long tutorial to explain all the necessary SQL concepts for you. In this video, you will learn:
- How to use SELECT statement to answer business questions
- Working with WHERE clause
- Using AND, OR, NOT and combining them to create complex queries.
- Sorting query results using ORDER BY
- Combining data from two or more tables using JOINS
- Creating reports with GROUP BY
- More than 50 example queries, tips and ideas
Please watch the video below or on my YouTube Channel.
The Queries
Here are some of the example queries covered in the video lesson. Feel free to copy and paste them into the SQL console to see how they work.
-- Select everything from sales table
select * from sales;
-- Show just a few columns from sales table
select SaleDate, Amount, Customers from sales;
select Amount, Customers, GeoID from sales;
-- Adding a calculated column with SQL
Select SaleDate, Amount, Boxes, Amount / boxes from sales;
-- Naming a field with AS in SQL
Select SaleDate, Amount, Boxes, Amount / boxes as 'Amount per box' from sales;
-- Using WHERE Clause in SQL
select * from sales
where amount > 10000;
-- Showing sales data where amount is greater than 10,000 by descending order
select * from sales
where amount > 10000
order by amount desc;
-- Showing sales data where geography is g1 by product ID &
-- descending order of amounts
select * from sales
where geoid='g1'
order by PID, Amount desc;
-- Working with dates in SQL
Select * from sales
where amount > 10000 and SaleDate >= '2022-01-01';
-- Using year() function to select all data in a specific year
select SaleDate, Amount from sales
where amount > 10000 and year(SaleDate) = 2022
order by amount desc;
-- A range condition written with comparison operators
select * from sales
where boxes > 0 and boxes <= 50;
-- Using the between operator in SQL
-- (between is inclusive at both ends, so this one also returns rows with 0 boxes)
select * from sales
where boxes between 0 and 50;
-- Using weekday() function in SQL (in MySQL, 0 = Monday, so 4 = Friday)
select SaleDate, Amount, Boxes, weekday(SaleDate) as 'Day of week'
from sales
where weekday(SaleDate) = 4;
-- Working with People table
select * from people;
-- OR operator in SQL
select * from people
where team = 'Delish' or team = 'Jucies';
-- IN operator in SQL
select * from people
where team in ('Delish','Jucies');
-- LIKE operator in SQL
select * from people
where salesperson like 'B%';
select * from people
where salesperson like '%B%';
select * from sales;
-- Using CASE to create branching logic in SQL
select SaleDate, Amount,
    case when amount < 1000 then 'Under 1k'
         when amount < 5000 then 'Under 5k'
         when amount < 10000 then 'Under 10k'
         else '10k or more'
    end as 'Amount category'
from sales;
-- GROUP BY in SQL
select team, count(*) from people
group by team;
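The queries listed here stop at GROUP BY on a single table, while the video also covers JOINs. Here is a minimal sketch of a join combined with GROUP BY. It uses two tiny temporary tables (demo_people and demo_sales, with made-up rows) as stand-ins, so it runs even before you import the real dataset. The column names SPID, Salesperson and Amount follow the ones used elsewhere in this article; everything else is an assumption for illustration.

```sql
-- Toy stand-ins for the people and sales tables (hypothetical rows, not the real data)
create temporary table demo_people (SPID varchar(10), Salesperson varchar(50));
create temporary table demo_sales (SPID varchar(10), Amount int, Boxes int);

insert into demo_people values ('SP01', 'Asha'), ('SP02', 'Ben');
insert into demo_sales values ('SP01', 1200, 40), ('SP01', 800, 10), ('SP02', 5000, 200);

-- Match each sale to its salesperson on SPID, then summarize per person
select p.Salesperson, count(*) as Shipments, sum(s.Amount) as TotalAmount
from demo_sales s
join demo_people p on s.SPID = p.SPID
group by p.Salesperson
order by p.Salesperson;
-- Asha | 2 | 2000
-- Ben  | 1 | 5000
```

The same pattern should carry over to the real tables: join sales to people on spid, then group by the salesperson.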
SQL Practice Problems
Once you understand the concepts I’ve demoed in the video, try to solve the homework problems below.
If you want to cheat, use the solutions tab to see the answers.
INTERMEDIATE PROBLEMS
You need to combine various concepts covered in the video to solve these
1. Print details of shipments (sales) where amounts are > 2,000 and boxes are <100?
2. How many shipments (sales) did each of the salespersons have in the month of January 2022?
3. Which product sells more boxes? Milk Bars or Eclairs?
4. Which product sold more boxes in the first 7 days of February 2022? Milk Bars or Eclairs?
5. Which shipments had under 100 customers & under 100 boxes? Did any of them occur on Wednesday?
HARD PROBLEMS
These require concepts not covered in the video
1. What are the names of salespersons who had at least one shipment (sale) in the first 7 days of January 2022?
2. Which salespersons did not make any shipments in the first 7 days of January 2022?
3. How many times did we ship more than 1,000 boxes in each month?
4. Did we ship at least one box of ‘After Nines’ to ‘New Zealand’ in every month?
5. India or Australia? Who buys more chocolate boxes on a monthly basis?
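As a hint rather than a solution: questions of the "who did not ..." kind usually need a LEFT JOIN that keeps unmatched rows. Below is a rough sketch of that idea on two tiny temporary tables (hint_people and hint_sales, with made-up rows). It is only an illustration of the technique under those assumptions, not my official answer to any problem above.

```sql
-- Toy tables with hypothetical rows, only to illustrate the technique
create temporary table hint_people (SPID varchar(10), Salesperson varchar(50));
create temporary table hint_sales (SPID varchar(10), SaleDate date, Boxes int);

insert into hint_people values ('SP01', 'Asha'), ('SP02', 'Ben'), ('SP03', 'Chen');
insert into hint_sales values ('SP01', '2022-01-03', 40), ('SP02', '2022-01-20', 15);

-- Keep every person, attach only their sales from the chosen date window,
-- then keep the people for whom nothing was attached
select p.Salesperson
from hint_people p
left join hint_sales s
  on s.SPID = p.SPID
 and s.SaleDate between '2022-01-01' and '2022-01-07'
where s.SPID is null
order by p.Salesperson;
-- Ben   (his only sale falls outside the window)
-- Chen  (no sales at all)
```

Note that the date filter sits in the ON clause. If you move it to the WHERE clause, the unmatched people disappear and the query returns nothing.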
INTERMEDIATE PROBLEMS:
-- 1. Print details of shipments (sales) where amounts are > 2,000 and boxes are <100?
select * from sales where amount > 2000 and boxes < 100;
-- 2. How many shipments (sales) each of the sales persons had in the month of January 2022?
select p.Salesperson, count(*) as 'Shipment Count'
from sales s
join people p on s.spid = p.spid
where SaleDate between '2022-1-1' and '2022-1-31'
group by p.Salesperson;
-- 3. Which product sells more boxes? Milk Bars or Eclairs?
select pr.product, sum(boxes) as 'Total Boxes'
from sales s
join products pr on s.pid = pr.pid
where pr.Product in ('Milk Bars', 'Eclairs')
group by pr.product;
-- 4. Which product sold more boxes in the first 7 days of February 2022? Milk Bars or Eclairs?
select pr.product, sum(boxes) as 'Total Boxes'
from sales s
join products pr on s.pid = pr.pid
where pr.Product in ('Milk Bars', 'Eclairs')
and s.saledate between '2022-2-1' and '2022-2-7'
group by pr.product;
-- 5. Which shipments had under 100 customers & under 100 boxes? Did any of them occur on Wednesday?
select * from sales
where customers < 100 and boxes < 100;
select *,
case when weekday(saledate)=2 then 'Wednesday Shipment'
else ''
end as 'W Shipment'
from sales
where customers < 100 and boxes < 100;
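If the weekday numbers feel arbitrary, this quick check may help. It is MySQL-specific and uses a fixed date (5 January 2022, a Wednesday) instead of the sales table, so you can run it on its own.

```sql
-- MySQL: three ways of asking "which day of the week is this?"
select weekday('2022-01-05')   as weekday_number,   -- 2 (counts from Monday = 0)
       dayofweek('2022-01-05') as dayofweek_number, -- 4 (counts from Sunday = 1)
       dayname('2022-01-05')   as day_name;         -- Wednesday
```

So weekday(saledate) = 2 picks Wednesdays. If you find the numbers hard to remember, dayname(saledate) = 'Wednesday' is a more readable way to write the same filter.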
HARD PROBLEMS:
Check in next week for solutions.
Resources to Learn More
SQL is a great skill to have if you work with data. Please use the courses, books, articles & websites below to learn more.
SQL BOOKs
I recommend getting these SQL books.
- SQL Quick Start Guide by Walter Shields
- SQL for Data Analysis by Cathy T
- SQL All-in-One for Dummies by Allen G. Taylor
SQL COURSEs
I recommend trying out these courses on SkillShare academy.
- SQL 101 by Alvin Wan
- SQL Database & Queries
SQL WEBSITEs
Do check out these helpful websites to learn and understand various SQL concepts.
- W3Schools SQL
- Introduction to SQL by Khan Academy
- Introduction to T-SQL by Microsoft Docs
If you use my links to purchase the books or courses, I get a small affiliate commission.
There is no extra cost to you, obviously.
SQL Alternatives
If you want an alternative to SQL, consider learning Power Query.
Here is an article and here is a video to help you with that.
All the best
I wish you all the best with your SQL learning. Do let me know in the comments below if you have enjoyed this article and the video.
The post Learn SQL for Data Analysis in one hour appeared first on Chandoo.org – Learn Excel, Power BI & Charting Online.
Original source: https://chandoo.org/wp/learn-sql-for-data-analysis/