Power Query (Get & Transform data in Excel) is a true game changer. It can simplify and automate various data activities. The key benefits of Power Query are:
- Connect to many types of data sources and fetch data
- Define a process for data cleansing and transformation
- Combine datasets (table joins, appends)
- Automate data collection & management tasks (refresh etc.)
In this Power Query Tutorial, learn what it does, how to use it with 4 practical examples, tips & tricks to work with Power Query better.
What is Power Query?
Power Query is a feature of Microsoft Excel and Power BI. It is used to:
- Set up connections to various data sources
- Pre-process data, e.g. cleanup, adding columns, filtering, sorting, mashing up
- Publish the finalized data to the host application (Excel or Power BI)
Think of Power Query as your own SQL (querying) tool built into Excel (or Power BI). Use it to manipulate, mash up and manage your data.
How does Power Query look?
The Power Query window looks almost like an Excel workbook. This is why most beginners get confused. The key thing to remember is that Power Query is for connecting, cleaning and manipulating data. You do this by applying steps to your data.
Here is an example of the Power Query editor window with 6 key areas highlighted.
Power Query Window – 6 Important Areas
- Ribbon: You will see the ribbons on top of Power Query editor UI. There is home ribbon where common query options like column / row adjustments, clean-up and joins are found. The other ribbons are transform, add column and view.
- Queries Pane: This is where all the queries (or connections) you have in the workbook are listed. You can select a query to preview its results and work on it.
- Data view: You find the preview of current step of selected query here. As you make changes to the data, this preview is updated. This data view looks almost like Excel spreadsheet view.
- Query settings: You can adjust query name, see all the applied steps here. We use this to make changes to the query or delete any steps.
- Formula bar (optional): The optional formula bar shows corresponding M Language code for current step. You can enable / disable this formula bar from “View ribbon”.
- Close & Load button: This button, shown as “Close & Apply” in Power BI, is what we use to close Power Query and return to the main application (Excel or Power BI).
How to activate Power Query?
The process for activating or launching Power Query is slightly different in Excel vs. Power BI. Refer to below steps to launch Power Query.
Launching Power Query from Excel
- Go to the Data ribbon in Excel
- Use buttons in “Get & transform data” area to make a new connection.
- Or click on buttons in “Queries & Connections” area to refresh or view the existing queries.
Power Query from Power BI
- Go to the Home ribbon in Power BI
- Use buttons in “data” area to make a new connection.
- Or click on buttons in “Queries” area to refresh or view the existing queries.
Notes on Power Query availability
- Excel: Power Query is built into Excel 2016 and later (as Get & Transform). For Excel 2010 and 2013, it is available as a free add-in from Microsoft.
- Power BI: Power Query is available in all versions of Power BI.
Power Query Tutorial – Video
I made a 1+hour tutorial on Power Query explaining every aspect of it, along with 4 full examples. Please watch it to understand and master this powerful & time-saving technology.
You can watch it below or head over to my YouTube channel.
Power Query as a Mind-map
Power Query is packed with features to clean, process and manage data. That is why it is tricky to comprehend the overall picture of it. I made a mind-map of Power Query so you can get a holistic view of this life-saving tool. See it below.
How to use Power Query?
Four Examples
In this section, I will demonstrate how to use Power Query with four full-length examples. Each example has sample data and a completed workbook for you to download and follow along.
Download Example Files
Includes sample data, completed Excel workbook & Power Query Mind-map PDF
Power Query Example 1:
Load and Clean-up Employee Data
In the first Power Query example, we will look at data for one thousand staff and clean it up. The data is in an Excel file. We will load it to Power Query and perform below clean-up activities. After data is clean, we will publish the dataset to Excel for analysis.
- Connect to Employee Data file
- Replace missing gender & department values
- Remove employees without salary
- Extract employee’s country and remove address column
- Extract year of join
- Publish data to Excel
Whenever there are new employees, we will simply refresh the Power Query connection and it will load the new data (after applying the clean-up steps) automatically. Just like magic.
The below instructions show how to do this with Excel Power Query. You can apply the same steps in Power BI too.
Step 1: Connect to the data set file from Excel
You need the employee data sample file – M01.xlsx for this example. In the downloaded ZIP file, you will find it in “Datasets” folder.
- Go to data ribbon and click on “From File” and select “Excel”. Point to the employee data set.
Here is a quick re-cap of how to connect to data from Power Query.
Step 2: Apply data cleansing steps in Power Query
Once the data is loaded into Power Query, you can quickly apply all the necessary data cleansing steps in there.
The steps will be:
- Remove top rows: The file contains 2 rows of header information which is not needed. From “Home ribbon” in Power Query editor, using the “Remove rows” button, remove top 2 rows.
- Promote headers: Now that our data has no extra rows on top, let’s use the first remaining row (row number 3 of the original file) as the header. From home ribbon, just tap on the “use first row as headers” button.
- Remove blank columns: The file also loads a few blank columns. Just use “Remove columns” button to nix them.
- Replace missing values – gender & department columns: Select each column with missing data and either right click or use “replace values” button to find & replace nulls with alternative values.
- Remove staff with no salary: This operation is also called “Filtering”. Just use the filter button on salary column and remove any “null” values.
I have illustrated the screen buttons for these 5 data cleansing steps on Power Query UI below. Check it out if you need help.
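Power Query records each of these clicks as an applied step. To make the logic of the five steps concrete, here is a minimal sketch of the same sequence in plain Python. It is not the code Power Query generates; the sample rows, the column names and the replacement values ("Unknown", "Unassigned") are assumptions made up for illustration.

```python
# Hypothetical raw sheet: 2 junk rows, a header row, data rows, one blank column.
raw_rows = [
    ["Employee report", None, None, None, None],
    ["Generated by HR", None, None, None, None],
    ["Name", "Gender", "Department", "Salary", None],
    ["Asha", "Female", "Finance", 52000, None],
    ["Ben", None, "IT", 61000, None],
    ["Chen", "Male", None, None, None],
]


def clean_employees(rows, skip=2, defaults=None):
    """Apply the five clean-up steps in order and return a list of dicts."""
    if defaults is None:
        # Assumed replacement values for missing gender / department.
        defaults = {"Gender": "Unknown", "Department": "Unassigned"}

    body = rows[skip:]                 # 1. remove top rows
    header, data = body[0], body[1:]   # 2. promote first remaining row to headers

    # 3. remove blank columns (here: columns that have no header text)
    records = [
        {name: value for name, value in zip(header, row) if name is not None}
        for row in data
    ]

    # 4. replace missing values in the chosen columns
    for record in records:
        for column, fallback in defaults.items():
            if record.get(column) is None:
                record[column] = fallback

    # 5. filter out staff with no salary
    return [record for record in records if record.get("Salary") is not None]


cleaned = clean_employees(raw_rows)
for record in cleaned:
    print(record)
```

The order matters in the same way it does in the query editor: headers are promoted only after the junk rows are gone, and the filter runs last so it sees the final column names.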
Power Query Tip: Plan first, clean next
Save a lot of time by planning all your data cleansing steps first. Think about ways in which your data could change in future and build your data cleansing around them.
Step 3: Extracting country from address to new column
So far, all our data cleaning steps are in-place. But now, we must add a new column with the country of employee. You can use “Add column” ribbon of Power Query to do such operations.
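For example, to extract “USA” from the address “1 Infinite Loop, Los Angeles, CA, USA”, we can use the text after delimiter option with a comma followed by a space as the delimiter. In the advanced options, scan for the delimiter from the end of the input so that only the text after the last comma is returned.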
To extract the country, select the address column and use Add Column > Extract > Text After Delimiter option.
Note: In the video, I use a more advanced version of this as our addresses are not so straightforward.
You can use a similar approach to add the “year” from the date of join.
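As a rough illustration of what these two new columns compute, here is a small Python sketch. The helper name, the join date and the choice to return an empty string when the delimiter is missing are assumptions of this sketch, not a description of Power Query internals.

```python
from datetime import date


def text_after_last_delimiter(text, delimiter=", "):
    """Return the part of text after the LAST occurrence of delimiter."""
    _before, found, after = text.rpartition(delimiter)
    # This sketch returns an empty string when the delimiter is absent.
    return after if found else ""


address = "1 Infinite Loop, Los Angeles, CA, USA"
country = text_after_last_delimiter(address)

# Hypothetical date of join; the "year" column keeps only the year part.
joined_on = date(2019, 3, 14)
year_of_join = joined_on.year

print(country, year_of_join)
```

Scanning from the end is the important design choice here: addresses have a varying number of commas, but the country is always the last piece.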
Step 4: Publish data to Excel for analysis & reporting
Once your data is clean and ready, click on “Close & load” button in home ribbon. This will load data to Excel as a table. You can use this table for data analysis or reports.
How to refresh:
Whenever there is new data added to the employee data file, just head over to the Excel file with the connection and refresh it (shortcut: Ctrl+Alt+F5 will refresh all connections).
Your file, associated analysis and charts will all be updated.
Power Query Save & Load options – Explained
- In Excel: You can load Power Query data in three ways – as a table on spreadsheet, a table to data model or connection only.
- Power BI Save & Load: Within Power BI, you can either load a Power Query table or leave it in the query editor. If you do not load a table, you can still have it refreshable for calculation purposes.
for more on this example…
Please refer to the Power Query tutorial video above, timestamp 24:10 onwards.
Power Query Example 2:
Scraping Web Data & Reshaping it
In this example, let’s use Power Query to scrape web data from List of Indian States page on Wikipedia.
On that page, there is a historical census data table (depicted below) and we will connect to it from Power Query. Once we have the data, we will unpivot it to tabular format for easy analysis.
Please use the URL – https://en.wikipedia.org/wiki/List_of_states_in_India_by_past_population for this example.
Step 1: Connect to web data source
From Data ribbon, use the “from web” button. Paste the URL https://en.wikipedia.org/wiki/List_of_states_in_India_by_past_population and connect.
Power Query will show all the tables on the web page. Select the census table and click on “transform” button.
Step 2: Removing unnecessary rows
The Wikipedia table has an extra header row and a grand total row. Just remove these with “Remove Rows” button.
Step 3: Unpivoting the data
The census data has the state in one column and each census year’s population in its own column. We can turn this into a standard three-column table (state, year, population) using the unpivot option.
- Right click on state column
- Pick “unpivot other columns” option
- Done, your data is now in tabular format.
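If unpivoting feels abstract, the reshaping can be sketched in a few lines of Python. The state names and numbers below are made-up placeholders, not real census figures, and the function is only an illustration of the idea behind “unpivot other columns”. Like Power Query's unpivot, this sketch leaves out empty cells.

```python
def unpivot_other_columns(rows, id_column, attribute_name="Year", value_name="Population"):
    """Turn one row per state (wide) into one row per state and year (tall)."""
    tall = []
    for row in rows:
        for column, value in row.items():
            if column == id_column or value is None:
                continue  # keep the id column as-is, skip empty cells
            tall.append({
                id_column: row[id_column],
                attribute_name: column,
                value_name: value,
            })
    return tall


# Placeholder data: one column per census year.
wide = [
    {"State": "State A", "1951": 100, "1961": 120},
    {"State": "State B", "1951": 80, "1961": None},
]

tall = unpivot_other_columns(wide, id_column="State")
for record in tall:
    print(record)
```

The tall shape is what makes filtering by year or charting population over time straightforward later on.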
Power Query Web Connection Tips
- Privacy settings: When you make a web connection, PQ will prompt you for access type details. Most web data can be accessed anonymously. But you can also use login access or Windows credentials to authenticate requests.
- Check frequently: If the source website changes the format or presentation of its data, then your Power Query connections will break. It is a good idea to check such connections once in a while to make sure they are working.
- URL parameters: You can use “Advanced” option during connection time to set up URL parameters or variables. This gives you flexibility to access things like page 5 of a result set.
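To show what a parameterized URL amounts to, here is a tiny Python sketch that builds the address for a given page. The base URL and the parameter name "page" are hypothetical; a real site will have its own parameter names.

```python
from urllib.parse import urlencode


def build_url(base_url, **params):
    """Attach query parameters to base_url, e.g. ?page=5."""
    return f"{base_url}?{urlencode(params)}" if params else base_url


# Hypothetical result set with a "page" parameter.
page_5 = build_url("https://example.com/results", page=5)
print(page_5)
```

Changing only the parameter value is what lets one connection definition fetch page 1, page 2 and so on.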
for more on this example…
Please refer to the Power Query tutorial video above – timestamp 47:09 onwards
More Power Query Web Scraping Examples
- FIFA World Cup – scores, tables & details – web scraping with Power Query [Beginner]
- URL parameter example – weather data to Excel [Beginner]
- Using API to access and export video comments from YouTube with Power Query [Advanced]
Power Query Example 3:
Combine data from all files in a folder
You can use “folder” connection option in Power Query to easily consolidate data from all files in a folder.
In the third example, we will take all the project files in the example dataset and combine data to one table. This process is a bit clumsy to explain in text alone. So please watch the video segment (timestamp 58:48) to understand this fully. I am providing a summary of the folder combine technique below.
In the download ZIP file, use the Datasets > CSV folder as data source for this example.
- Start by making a folder connection (Data ribbon > Get Data > From File > From Folder)
- Point to the folder where your files reside
- Click on “Combine” button for a simple and quick data combine.
- Or choose “Combine & Transform” option for customizing the transformation process.
- Power Query will show one of the files and ask you how you want to extract data. Based on your selection, PQ will apply the same logic to all files and combine the data for you.
Here is a quick overview of the folder connection process.
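The idea behind the folder combine (read every file the same way, then stack the results) can be sketched in Python as below. The file names and contents are invented for the demo, and the extra “Source.Name” column mirrors the file-name column Power Query adds when combining files. This is an illustration of the technique, not what Power Query runs internally.

```python
import csv
import tempfile
from pathlib import Path


def combine_csv_folder(folder):
    """Read every CSV in folder with the same logic and stack the rows."""
    combined = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for record in csv.DictReader(handle):
                record["Source.Name"] = path.name  # remember where the row came from
                combined.append(record)
    return combined


# Demo with two made-up project files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "project_a.csv").write_text("Task,Hours\nDesign,5\nBuild,8\n", encoding="utf-8")
    Path(folder, "project_b.csv").write_text("Task,Hours\nTest,3\n", encoding="utf-8")
    all_rows = combine_csv_folder(folder)

for record in all_rows:
    print(record)
```

Dropping a new file with the same layout into the folder and re-running gives the updated combined table, which is the same benefit a refresh gives you in Power Query.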
for more on this example…
Please refer to the Power Query tutorial video above – timestamp 1:09:08 onwards
Power Query Example 4:
Joining and Appending Tables
In the last example of Power Query, we will learn how to merge and append tables. These are similar to SQL operations.
- Power Query Merge = SQL Join
- Append = SQL Union
In the download ZIP file, use the “students & courses.xlsx” file for this example.
Appending two tables with Power Query:
To append one table at the end of another table, both tables should ideally have the same columns. It doesn’t matter if the columns are not in the same order (for ex. table 1 can have student, course and table 2 can have course, student. PQ will append ok), because columns are matched by name.
What if one table has more or fewer columns?
In that case, Power Query will still append, by including all columns. The missing values will be shown as null in the final appended table.
To append two tables:
- Select either table and view in the preview grid.
- Go to Home ribbon > click on “append queries“
- Follow the screen prompts and specify the second table name.
- Power Query will append second table at the end of first table.
You can also use this to append more than two tables.
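Here is a minimal Python sketch of the append behaviour described above: columns are matched by name, order does not matter, and a column missing from one table is filled with an empty value (None here, null in Power Query). The table contents are made up for illustration.

```python
def append_tables(*tables):
    """Stack tables on top of each other, matching columns by name."""
    columns = []
    for table in tables:
        for row in table:
            for column in row:
                if column not in columns:
                    columns.append(column)  # keep first-seen column order
    return [
        {column: row.get(column) for column in columns}
        for table in tables
        for row in table
    ]


table_1 = [{"Student": "Asha", "Course": "Math"}]
table_2 = [{"Course": "Art", "Student": "Ben", "Grade": "A"}]  # different order, extra column

appended = append_tables(table_1, table_2)
for record in appended:
    print(record)
```

Because the function takes any number of tables, the same call also covers appending more than two.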
Merging (Joining) two tables:
In order to merge or join two tables, you need a common column in both tables. In the below example, you can merge:
- Students & Enrollment on “Student ID“
- Course & Enrollment on “Course ID“
Example of common field between tables
Students Table
- Student ID
- Name
- Date of birth
- Class
- Parent 1 name
- Parent 2 name
Course Table
- Course ID
- Course name
- Instructor
- Credits
Enrollment Table
- Course ID
- Student ID
To merge two tables:
- Select the source table
- Go to Home ribbon and click on “Merge queries“
- Follow screen prompts and select common column (field) between both source and target tables.
- Specify the join type. Leave the default “left outer join” if you want to keep every row of the source table and bring in matching rows from the target table when present. Try other kinds of join (merge) if you need.
- Power Query will perform the merge and show corresponding rows of target table as a new column.
- Expand this column and your merge will be complete.
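For readers who think in code, a left outer merge followed by the expand step can be sketched in Python like this. The enrollment and course rows are invented for the example, and the function is a simplified illustration of the join logic rather than Power Query's own implementation.

```python
from collections import defaultdict


def left_outer_merge(source, target, key):
    """Keep every source row; add the target's columns where the key matches."""
    target_columns = [c for c in (target[0] if target else {}) if c != key]

    # Index the target rows by key so each lookup is quick.
    index = defaultdict(list)
    for row in target:
        index[row[key]].append(row)

    merged = []
    for row in source:
        # No match: one placeholder row whose target columns are all None.
        matches = index.get(row[key]) or [dict.fromkeys(target_columns)]
        for match in matches:
            merged.append({**row, **{c: match.get(c) for c in target_columns}})
    return merged


enrollment = [
    {"Student ID": 1, "Course ID": "C1"},
    {"Student ID": 2, "Course ID": "C9"},  # no such course in the target table
]
courses = [{"Course ID": "C1", "Course name": "Math"}]

result = left_outer_merge(enrollment, courses, key="Course ID")
for record in result:
    print(record)
```

The second enrollment row survives with an empty course name, which is exactly what distinguishes a left outer join from an inner join.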
Download Power Query Examples
Please use below button to download all Power Query examples, sample data and mind-map PDF.
Download Example Files
Includes sample data, completed Excel workbook & Power Query Mind-map PDF
Power Query – My Top 5 Tips
Look before you leap:
Before you create any complex queries, just jot down all the steps needed and arrange them in logical order. Apply filters & error handling steps first, then add columns or transformations.
Profile your columns
Power Query offers easy and useful column profiling options. Use them to investigate how good your data is and fix problems with a click.
Formula bar is your friend
Enable formula bar from “View” ribbon. This is a one time step. Once you have it on, you can passively learn the Power Query language – M. You can use formula bar to quickly adjust or type new M code.
Right click before you look elsewhere
Many common and useful Power Query operations can be done by right clicking on the columns. This saves time too.
There is more than one way to do things
and that is ok. Just stick with what works in the beginning. Learn and improve your queries over time.
Got a Power Query tip? Share it with me in the comments section.
Power Query – Recommended Resources
Power Query Books
I recommend the excellent Collect, combine and transform data using Power Query book by Gil Raviv.
Power Query Courses
If you want a comprehensive and hands-on data analysis course with plenty of Power Query examples, check out my Excel School online classes.
Power Query Web Sites
These are my absolute favorite helpful websites that frequently write about Power Query. Please bookmark and enjoy.
- Excelguru blog by Ken Puls – For Power Query tips, practical scenarios and guidance.
- Cross Join blog > Power Query pages by Chris Webb – for intermediate to advanced PQ applications.
- M language primer (multi-part series) for advanced users by Ben Gribaudo
- Power Query best practices [Microsoft Docs]
More PQ examples on Chandoo.org
Check out below examples for more on Power Query.
- Leave analysis with Power Query
- Folder consolidation – advanced example
- Oddly shaped data to table – with Power Query
- More on Power Query
The post Power Query Tutorial – What is it, How to use, Full examples, Tips & Tricks appeared first on Chandoo.org – Learn Excel, Power BI & Charting Online.
Original source: http://feedproxy.google.com/~r/PointyHairedDilbert/~3/TMruqiCG_0Q/