Dashboard Week Day 5 - Motorsports LEGO Analysis Dashboard

For our final data analytics project, we were given a LEGO database and asked to build a dashboard analysing motorsport-related sets. The project involved working with a relational SQL database, extracting the relevant data, and presenting it through visualisations that were easy to understand.

The dataset covered LEGO sets from 1949 to 2026. At first, I narrowed the data down to only the motorsport-related themes. I soon realised, however, that I also needed the data for all other LEGO sets, both to compare motorsport against the rest of the range and to satisfy that requirement of the brief. From there, the goal was to turn the raw data into a dashboard that highlighted key trends and answered questions about the different themes, brands and set designs.

Building the Database Queries

The first stage of the project was working with the SQL database. I downloaded the zip files from Rebrickable and extracted them, then created a schema in Snowflake and loaded the extracted tables into it. The data was spread across multiple tables, so I had to decide which tables were relevant to my analysis and write SQL queries that joined them together before I could begin.

This involved joining tables containing sets, themes, inventories, inventory parts and colours. 
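As a rough illustration of that kind of join, here is a minimal sketch using Python's built-in sqlite3 module in place of Snowflake. The table and column names follow my understanding of the Rebrickable CSV layout and should be treated as assumptions, and the sample rows are invented purely so the query has something to run against. It is not the query used for the dashboard.

```python
import sqlite3

# In-memory stand-in for the Snowflake schema. All rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE themes (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
CREATE TABLE sets (set_num TEXT PRIMARY KEY, name TEXT, year INTEGER,
                   theme_id INTEGER, num_parts INTEGER);
CREATE TABLE inventories (id INTEGER PRIMARY KEY, version INTEGER, set_num TEXT);
CREATE TABLE colors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE inventory_parts (inventory_id INTEGER, part_num TEXT,
                              color_id INTEGER, quantity INTEGER);

INSERT INTO themes VALUES (1, 'Technic', NULL), (2, 'Speed Champions', NULL);
INSERT INTO sets VALUES ('A-1', 'Example Technic car', 2024, 1, 5),
                        ('B-1', 'Example Speed Champions car', 2025, 2, 3);
INSERT INTO inventories VALUES (10, 1, 'A-1'), (20, 1, 'B-1');
INSERT INTO colors VALUES (0, 'Black'), (4, 'Red');
INSERT INTO inventory_parts VALUES (10, '3001', 0, 2), (10, '3002', 4, 3),
                                   (20, '3001', 4, 3);
""")

# Walk sets -> themes, and sets -> inventories -> inventory_parts -> colors,
# then total the piece quantity per theme, set and colour.
JOIN_QUERY = """
SELECT t.name AS theme, s.set_num, s.year, c.name AS colour,
       SUM(ip.quantity) AS pieces
FROM sets s
JOIN themes t           ON t.id = s.theme_id
JOIN inventories i      ON i.set_num = s.set_num
JOIN inventory_parts ip ON ip.inventory_id = i.id
JOIN colors c           ON c.id = ip.color_id
GROUP BY t.name, s.set_num, s.year, c.name
ORDER BY t.name, s.set_num, c.name
"""

rows = conn.execute(JOIN_QUERY).fetchall()
for row in rows:
    print(row)
```

One row per theme, set and colour is a convenient grain for Tableau, because the overview totals, the theme comparison and the colour treemap can all be aggregated from the same extract.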

I also spent time checking the data for duplicate inventory records to make sure the final numbers were accurate and consistent throughout the dashboard.
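A duplicate check of this kind might look something like the sketch below, again using sqlite3 with invented rows. It assumes that duplicates show up as several inventory versions for the same set, and it keeps only the highest version number per set. Both the column names and the "keep the latest version" rule are assumptions for illustration, not necessarily the approach taken in the project.

```python
import sqlite3

# Made-up data: set 'A-1' has two inventory versions listing the same parts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventories (id INTEGER PRIMARY KEY, version INTEGER, set_num TEXT);
CREATE TABLE inventory_parts (inventory_id INTEGER, part_num TEXT, quantity INTEGER);
INSERT INTO inventories VALUES (10, 1, 'A-1'), (11, 2, 'A-1'), (20, 1, 'B-1');
INSERT INTO inventory_parts VALUES (10, '3001', 4), (11, '3001', 4), (20, '3002', 6);
""")

# Step 1: find sets that have more than one inventory record.
duplicates = conn.execute("""
SELECT set_num, COUNT(*) AS inventory_count
FROM inventories
GROUP BY set_num
HAVING COUNT(*) > 1
""").fetchall()

# Step 2: a naive join counts the pieces of 'A-1' twice.
naive_total = conn.execute("""
SELECT SUM(ip.quantity)
FROM inventories i
JOIN inventory_parts ip ON ip.inventory_id = i.id
""").fetchone()[0]

# Step 3: keep one inventory per set (assumption: the highest version).
deduped_total = conn.execute("""
SELECT SUM(ip.quantity)
FROM inventories i
JOIN (SELECT set_num, MAX(version) AS version
      FROM inventories
      GROUP BY set_num) latest
  ON latest.set_num = i.set_num AND latest.version = i.version
JOIN inventory_parts ip ON ip.inventory_id = i.id
""").fetchone()[0]

print(duplicates, naive_total, deduped_total)
```

Comparing the naive and deduplicated totals is a quick way to see how much the duplicates would have inflated the piece counts shown on the dashboard.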

Finally, I connected to Snowflake from Tableau Desktop.

Designing the Dashboard

Once the data was ready, I imported it into Tableau and focused on building a dashboard that was clear and easy to navigate.

I split it into four main sections:

  • An overview showing the total number of sets and pieces.
  • A comparison of the different LEGO themes and how they have changed over time.
  • A treemap showing the distribution of brick colours.
  • A comparison of the different automotive brands featured across the sets.

Keeping these sections separate made it easier for users to move from the overall summary into the more detailed analysis.

What the Data Showed

One of the most interesting findings was the difference between the Speed Champions and Technic themes.

Speed Champions had the highest number of individual sets, with over 100 included in the dataset. Technic, however, accounted for more than 60,000 of the nearly 98,000 total pieces (over 60%), showing that although it has fewer sets, they are much larger and more detailed.

The timeline also showed a noticeable increase in motorsport set releases from 2024 onwards, with production reaching its highest point around 2025 and 2026.

When comparing manufacturers, McLaren had the largest number of individual sets, while Mercedes had the highest total piece count thanks to several large Technic models. Ferrari and Porsche also featured heavily throughout the dataset.

What I Learned

This project reinforced how important the data preparation stage is. Building the SQL queries and checking the data took far longer than creating the visualisations themselves, but it meant the dashboard was based on accurate and reliable information. In hindsight, though, there were a few fields I left out of my SQL query that I would now include in the analysis. For example, including a parent code would have allowed me to generate motorbike data too.
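To give an idea of how a parent code could be used, the sketch below walks a theme hierarchy with a recursive query. It assumes the parent code corresponds to a `parent_id` column on the themes table, as I understand the Rebrickable layout to have, and the theme names, IDs and the `descendant_themes` helper are all hypothetical.

```python
import sqlite3

# Hypothetical theme hierarchy; names and IDs are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE themes (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
INSERT INTO themes VALUES
  (1, 'Vehicles', NULL),
  (2, 'Cars', 1),
  (3, 'Motorbikes', 1),
  (4, 'Racing Motorbikes', 3),
  (5, 'Castles', NULL);
""")


def descendant_themes(conn, root_name):
    """Return the named theme plus every theme nested beneath it."""
    query = """
    WITH RECURSIVE tree(id, name) AS (
        SELECT id, name FROM themes WHERE name = ?
        UNION ALL
        SELECT t.id, t.name
        FROM themes t
        JOIN tree ON t.parent_id = tree.id
    )
    SELECT name FROM tree ORDER BY id
    """
    return [row[0] for row in conn.execute(query, (root_name,))]


motorbike_themes = descendant_themes(conn, "Motorbikes")
vehicle_themes = descendant_themes(conn, "Vehicles")
print(motorbike_themes)
print(vehicle_themes)
```

With the parent field available, a filter on a single top-level theme would pick up its sub-themes automatically instead of each one having to be listed by hand.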

I also gained more experience working with relational databases, writing SQL queries, and designing dashboards that present information clearly without overwhelming the user.

If I were to continue developing this project, I would also try to find pricing data so users could compare the average cost per piece across different manufacturers and themes.

Final Thoughts

This project brought together many of the skills I've developed throughout training, from SQL querying and data cleaning to dashboard design in Tableau. It was a good opportunity to work through the full analytics process, starting with a relational database and finishing with an interactive dashboard that presents the data in a clear and meaningful way.

This was a great challenge for my last ever day of training!

Author: Kate Loder