Congratulations to the USA Women’s Hockey team, who won the Olympic gold medal. They beat our Canadian team, in a hard-fought game that went into overtime, and ended with a shootout. The team rosters were available online, so I used those for a hockey player data analysis, in Excel pivot tables. Is there anything in the player data that shows why the USA team won?
Team Canada Fans
Last week on Twitter, Science_Goddess and I were cheering for Team Canada. They had won the previous 4 gold medals, and seemed like a sure bet. Meanwhile, David Napoli was confident that Team USA would win this time.
We made a friendly wager, and the fans of the losing team promised to write a hockey data post, congratulating the winning team. So here I am, saying congratulations to Team USA!
Since I’m not a hockey expert, or a master statistician, I’ll leave the in-depth game analysis to someone else. Instead, here is my rudimentary player data analysis, based on the team rosters. And of course, I used my favourite Excel tool – pivot tables.
Team Rosters
The rosters have player names, height, date of birth, etc., but are set up slightly differently, so there was some cleanup work to do, before starting. If you’d like to do your own analysis, you can find the hockey team rosters here:
Team Canada Site
- Women – Canada
- Men – Canada
Team USA Site
- Women – USA
- Men – USA
Wikipedia
- Women – All Teams
- Men – All Teams
What Matters?
The team rosters don’t show skill rating, or past performance stats, so what could I use for player data analysis?
- What age range is represented in the hockey teams. Are there differences among the teams? Did a team with younger players do better?
- You’d expect the players to be in top physical condition, so what does the height and weight data show? Did the bigger team win, or is there an advantage to being smaller?
- Finally, are the players from a snowy region? Most of Canada gets snow and ice in winter, but what about the USA players? Are they all from the northern states?
Age Calculations
Before doing any player data analysis, a few calculations were needed in the rosters.
First, the rosters had each player’s date of birth, so what formula could you use to get each player’s age?
To calculate the ages, I used this DATEDIF formula, but be careful with that function – it can give incorrect results in some versions.
=DATEDIF([@DOB],TODAY(),”y”)
Player Height in Feet
Next, I had to calculate the height, so it could be used in the data analysis. In the roster, the height was listed as text, with feet and inches, and a foot mark, such as 5’7
How would you convert that to a usable number?
I used the following formula to convert the text measurement to feet – for example, change from 5’7 to 5.58.
=SUM(LEFT([@Height],1), RIGHT([@Height], LEN([@Height])-2)/12)
- The first character at the LEFT is the feet (there were no double-digit heights)
- The 1 or 2 characters at the RIGHT are divided by 12
- SUM those amounts, to get the height in feet
Height and Weight
The weight was entered as a number, so that didn’t need to be changed. I thought it would be interesting to get a height/weight ratio, so I used the BMI calculation for that.
weight in pounds / [height in inches x height in inches] x 703
First, I needed another column, to convert the height to inches, from feet. That was a simple formula – multiple the calculated foot height by 12:
=[@HeightFt]*12
Then, for the BMI calculation, I used this formula – rounding to zero decimals, to limit the number of BMIs. You could round to 1 or 2 decimals, if you needed greater precision.
=ROUND([@Weight]/([@HtIn]*[@HtIn])*703,0)
Summarize the Data
After the calculations were completed, I built pivot tables to show the player data analysis. The first table shows player counts, for each position, plus age, height and weight.
In the pivot table, the age, height and weight are shown using 3 different summary functions – Average, Minimum, and Maximum. After you add the value, right-click on one of the numbers, click Summarize Values by, and choose one of the functions.
In the country subtotal row, conditional formatting highlights the higher amounts in red, and lower amounts in green.
With the Slicer, the Women’s data was selected. On average, the USA team is a couple of years younger, a bit shorter and 4 pounds lighter.
For the Men, the data is similar – the USA team is a little younger, shorter and lighter.
See the Differences
Instead of doing the arithmetic in your head, you can get a pivot table to do custom calculations. I added two more pivot tables, below the first one, connected to the same Slicer.
The first pivot table shows the difference as an amount, and the second pivot table shows the percent difference, based on the amounts for Canada.
I hid the rows for Canada, because they are empty.
Player Age Ranges
Next, I created a pivot table and pivot chart, to show the age ranges for the Mens and Womens players, from both countries. The men have a substantially longer career, from what this shows.
NOTE: I created a new Theme for the workbook, with red and blue as the first 2 accent colours. That way, all the charts had red and blue bars.
Player BMI by Position
Next, I set up a pivot table to show the BMIs, with a count of players in each position – Defence, Forward and Goalie. I changed the values to % of Column total, and created a stacked bar chart to show the results.
To keep the BMI list consistent, I set that field to “Show Items with No Data”. Otherwise, some of the BMIs would not be listed, when you switch been the data for Men and Women.
There’s a noticeable difference between the BMIs for the Women’s teams.
Or, just look at the player counts per BMI on each of the Women’s teams – the patterns are quite different. Was that body type difference a factor in the USA victory?
Home Town Locations
Finally, to see where the player home towns were located, I used the 3D Maps feature in Excel 2016. That command is on the Insert tab, between Charts and Sparklines.
The maps are not in the sample workbook though – I created them in a separate file. If you have the 3D Map tool, you can build your own maps after you download my file.
I was surprised that the map was able to get the locations from the shortened state and province names, e.g. “Ont.”, instead of requiring full names or official postal abbreviations.
Here’s the map of home towns for all the Canadian players. They’re spread from coast to coast, with most cities close to the US border, and some further north, in the the prairie provinces.
And here is the map of USA player home towns. Most are in the north and northeast, with a few scattered across the rest of the country. There were none in Alaska or Hawaii.
Why Did Team USA Win?
From data in the hockey player rosters, I can see that the USA Women team members were slightly younger, shorter and lighter, on average. Maybe that gave them an advantage against our Canadian team.
Or maybe they’re just better under the pressure of a game-deciding shootout!
Get the Player Data Analysis Workbook
Click this link to get my player data analysis workbook. The zipped file is in xlsx format, and does not contain macros.
The file does not contain the 3D map, so you can create one after you download the workbook.
Video – Gold Medal Game Highlights
Here’s a video from CBC Sports, showing some of the Women’s Hockey gold medal game highlights.
________________
The post Hockey Player Data Analysis in Excel appeared first on Contextures Blog.
Original source: http://blog.contextures.com/archives/2018/03/01/hockey-player-data-analysis-in-excel/