Dashboard week - day 3

We explored historical London Marathon data, with a twist: the task was to narrow the dataset down to marathon runners whose names began with the first two letters of my name, 'Ri'.

Initially, this stipulation looked like a challenge, as it threatened to limit our dataset significantly. However, I embraced the constraint and decided to delve deeper, comparing my name, Richardson, against its various derivatives like Richards and Rich. The goal? To identify the best surname to race with among the Richard-based surnames in our dataset.

Our journey began with web scraping the necessary dataset using Alteryx. This task became somewhat complex because the data was spread across multiple web pages, but we found a solution by using simple math to determine how many pages required scraping. I then generated one row per page in Alteryx, each holding the URL to download, to ensure that we covered all the data.
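As a rough illustration of that page arithmetic, here is a minimal Python sketch. The real workflow was built in Alteryx; the base URL, the `page` query parameter, and the row counts used here are hypothetical placeholders rather than details of the actual results site.

```python
def page_count(total_rows: int, rows_per_page: int) -> int:
    """Number of pages needed to hold total_rows (ceiling division)."""
    if rows_per_page <= 0:
        raise ValueError("rows_per_page must be positive")
    return -(-total_rows // rows_per_page)


def page_urls(base_url: str, total_rows: int, rows_per_page: int) -> list[str]:
    """Build one URL per results page, numbered from 1.

    The '?page=N' pattern is an assumption made for this example.
    """
    pages = page_count(total_rows, rows_per_page)
    return [f"{base_url}?page={n}" for n in range(1, pages + 1)]


if __name__ == "__main__":
    # Hypothetical figures: 2,501 results shown 100 per page.
    for url in page_urls("https://example.com/results", 2501, 100):
        print(url)
```

Rounding up matters here: a final, partly filled page still needs its own URL, otherwise the last few runners would be missed.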

Once the data was successfully downloaded, the next step involved parsing it using a series of regular expression tools. This was crucial to ensure that we could work with clean and structured data for analysis.
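The parsing itself was done with Alteryx's RegEx tools. The Python sketch below only illustrates the same idea. The row markup and the field layout (position, "Surname, Forename", finish time) are assumptions made for the example, since the real page structure is not reproduced here.

```python
import re

# Assumed row shape: <tr><td>12</td><td>Surname, Forename</td><td>03:05:41</td></tr>
ROW_RE = re.compile(
    r"<tr>\s*<td>(\d+)</td>\s*"                 # finishing position
    r"<td>([^<,]+),\s*([^<]+)</td>\s*"          # "Surname, Forename"
    r"<td>(\d{1,2}:\d{2}:\d{2})</td>\s*</tr>"   # finish time h:mm:ss
)


def parse_rows(html: str) -> list[dict]:
    """Turn every matching table row into a small record."""
    return [
        {
            "position": int(position),
            "surname": surname.strip(),
            "forename": forename.strip(),
            "time": finish_time,
        }
        for position, surname, forename, finish_time in ROW_RE.findall(html)
    ]


def filter_by_prefix(rows: list[dict], prefix: str = "Ri") -> list[dict]:
    """Keep only runners whose surname starts with the given prefix."""
    return [row for row in rows if row["surname"].startswith(prefix)]
```

Calling `filter_by_prefix(parse_rows(page_html))` on each downloaded page would give the 'Ri' subset described earlier, where `page_html` stands in for the downloaded page text.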

With the dataset in hand, I turned to Tableau to create charts and visualizations. I created some drill-down bar charts to analyze the top surname types and also looked at the change in finishing positions for these surnames over time.
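The charts were built in Tableau, but the aggregations behind them are simple enough to sketch in Python. This is an illustrative sketch only: the record fields (`surname`, `year`, `position`) are assumed names, and any rows passed in would need to come from the parsed dataset.

```python
from collections import Counter, defaultdict
from statistics import mean


def surname_counts(runners: list[dict]) -> Counter:
    """Count how many runners carry each surname (bar chart data)."""
    return Counter(runner["surname"] for runner in runners)


def mean_position_by_year(runners: list[dict]) -> dict:
    """Average finishing position per (surname, year) pair (trend data)."""
    grouped = defaultdict(list)
    for runner in runners:
        grouped[(runner["surname"], runner["year"])].append(runner["position"])
    return {key: mean(positions) for key, positions in grouped.items()}
```

`surname_counts(rows).most_common(5)` would list the five most frequent surnames, which is roughly what the top level of a drill-down bar chart shows.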

In the end, my visualizations revealed an interesting but perhaps not highly actionable insight: the most prevalent Richard-based surname among marathon runners in our dataset is "Riches." While this might not lead to immediate practical conclusions, it did provide a definitive answer to my unique query.

Author:
Otto Richardson
Powered by The Information Lab