Excel 2024: Python for K-Means Clustering of Excel Data
September 27, 2024 - by Bill Jelen
Lately, I've been exercising every day in a Meta Quest virtual reality headset, using a game called Supernatural. There are several communities of Supernatural fans who create quests and challenges while playing Supernatural. A quick shout-out to my friends at ABSN - Adventures by Supernatural Ninjas.
They've launched a quest to attempt to work out in all 146 virtual environments around the world. This seems simple enough - I sorted the latitude and longitude to arrange the data from west to east, starting in Tahiti and going through California, England, Europe, Australia and ending up at Aurora Point New Zealand. During this virtual journey, I noticed one day where the tour went from Utah in the western United States, down to Easter Island in the South Pacific, and then back to Colorado in the western United States.
It struck me that this was not an efficient way to travel the world, even virtually. Here is a scatter plot showing the path.
My first attempt to improve the route was a macro to find the next-closest point that had not been visited.
But then, I ran the data set through Python in Excel using K-Means Clustering. With this tool, you ask Python to find similar groups of customers. In this code, the important variable is in the 4th line, where you specify Clusters=6. I randomly tried different numbers until the clusters visually made sense.
You get to specify how many groups. With the 148 locations, I tried 12 groups, then 10 groups, then 6 groups. Each attempt produced a new chart returned to a cell in Excel.
The data for six clusters looked pretty good. My code added the cluster number as a new column in the Data Frame. I then used a simply bit of python in a new cell that simply says "df" to return the new Data Frame to Excel.
Arranging the data in clusters and then using the macro to find the next closest point resulted in a path that was only 86K miles.
This is just one example of what you can do with Python. It will be interesting to see if Microsoft continues with Python past this 2023 preview period.
Tip: The Excel Labs add-in offers a better Python editor. For details, watch: Episode 2625.
This article is an excerpt from MrExcel 2024 Igniting Excel
Title photo by Gabin Vallet on Unsplash