Dashboard Week - Day 4 - Candy Survey Data

 For the fourth day of dashboard week, we had a real curveball thrown at us: create a dashboard based on survey data about candy. Simple enough, right? Well, that wasn’t the case at all. The overarching issue was how poorly formatted and inconsistent the data was. We needed to bring separate datasets together into one consolidated dataset of candy survey data, which was difficult because each dataset was structured slightly differently. For example, one dataset was aggregated while the rest were disaggregated. The disaggregated datasets didn’t have the same number of rows or columns, and their naming conventions for columns and values were inconsistent. If you think I’m exaggerating, take a look at the data for yourself! Be sure to download the data for 2017, 2016, 2015 and 2014 (I was only able to consolidate the data for 2014 to 2016). To get a feel for what the datasets convey, refer to the survey form for 2017.

 Without a doubt, this was the most data-cleaning-intensive project of dashboard week. Regardless, it was nice to get more practice with cleaning in Tableau Prep. I also liked that it was necessary to pivot and union the datasets so that the consolidated dataset was “tall” instead of “wide”, which is how the source datasets originally were (with the exception of the aggregated dataset).
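 To make the “wide” to “tall” idea concrete outside of Tableau Prep, here is a minimal Python sketch of a pivot followed by a union. It is not the workflow I built: the column names, the bracketed 2015 headers and the sample answers are all invented for illustration, and the cleanup rules are assumptions about the kind of inconsistency described above.

```python
# Hypothetical wide-format rows: one row per respondent, one column per candy.
# All column names and answers here are invented for illustration.
survey_2014 = [
    {"Skittles": "JOY", "Snickers": "DESPAIR"},
    {"Skittles": "JOY", "Snickers": "JOY"},
]
survey_2015 = [
    {"[Skittles]": "Joy", "[Twix]": "Despair"},
]


def pivot_to_tall(rows, year):
    """Turn every (respondent, candy column) cell into its own tall row."""
    tall = []
    for row in rows:
        for column, answer in row.items():
            if not answer:
                continue  # skip questions the respondent left blank
            tall.append({
                "year": year,
                # Assumed cleanup: drop brackets and normalise capitalisation
                "candy": column.strip("[] ").title(),
                "feeling": answer.strip().upper(),
            })
    return tall


# "Union": stack the tall rows from each year into one consolidated list.
consolidated = pivot_to_tall(survey_2014, 2014) + pivot_to_tall(survey_2015, 2015)

for record in consolidated:
    print(record)
```

 Because every year ends up with the same three fields, the union is just a matter of stacking the rows, which is the main reason the tall shape is easier to consolidate.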

 Here’s a picture of the Tableau Prep workflow that I constructed. Below it is a picture of the consolidated dataset.

 This is the original workflow that I created. Unfortunately, I wasn’t able to use it because I kept running into an error that prevented the output tool from working properly. While I wasn’t able to identify or troubleshoot the root cause, I figured that the issue most likely had something to do with the 2017 data. So, I excluded that data (which means that the top/blue flow was removed). In other words, the workflow that I actually used is the same as the one above, minus that flow.


 This is a sample of the consolidated dataset, which has 4 columns and 281 rows. Given the lack of time and the general data issues, I had to remove a lot of columns. If I find the time to revisit this project, I would bring in columns such as age, gender or even country so that the dashboard could be enhanced with additional demographic information.

 And with that, we can now jump into the dashboard! Unsurprisingly, the dashboard is quite sparse. After all, the dataset that it is built on is quite small. With the additional columns mentioned in the prior paragraph, the dashboard could be improved.


 Besides including more data, such as demographics, it would be nice to show some statistics as KPIs that read as plain statements, something like “55% of people in 2014 felt joy eating Skittles.” This kind of language and insight was missing from my dashboard, and it was a frequently mentioned point of feedback across my cohort. To improve clarity, I could also change the title of the horizontal stacked bar chart (the chart below the pie chart) and change how it’s sorted. Despite these opportunities for enhancement, I was able to incorporate some interactivity by applying a filter action to the pie chart: whenever someone selects a slice of the pie, that selection filters the remaining bar charts.
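 As a rough illustration of how a KPI sentence like that could be computed from a tall dataset, here is a small Python sketch. The field names and the counts are made up so that the arithmetic is easy to follow; they are not figures from the actual survey.

```python
# Hypothetical tall rows with a response count per (year, candy, feeling).
# The counts are invented; they are not real survey results.
rows = [
    {"year": 2014, "candy": "Skittles", "feeling": "JOY", "count": 11},
    {"year": 2014, "candy": "Skittles", "feeling": "DESPAIR", "count": 9},
    {"year": 2015, "candy": "Skittles", "feeling": "JOY", "count": 4},
]


def joy_share(rows, year, candy):
    """Percentage of responses for one candy and year that were JOY."""
    matching = [r for r in rows if r["year"] == year and r["candy"] == candy]
    total = sum(r["count"] for r in matching)
    if total == 0:
        return None  # no responses, so no meaningful percentage
    joy = sum(r["count"] for r in matching if r["feeling"] == "JOY")
    return 100 * joy / total


share = joy_share(rows, 2014, "Skittles")
kpi = f"{share:.0f}% of people in 2014 felt joy eating Skittles."
print(kpi)
```

 With the invented counts above, 11 of the 20 responses for Skittles in 2014 are JOY, so the sentence reads “55%”. In Tableau itself, the same idea would be a calculated field rather than a script.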

 All that said, I enjoyed working on this project because it really showed me just how messy and silly survey data can be. My cohort and I laughed a lot about some of the data that we saw, such as obviously fake candy items like “dental paraphernalia” and “broken glow stick”, or how some survey respondents said that their age was “50 (despair)” or “old enough”. While it was tricky and a bit tedious (mostly because of repetitive clicks), I enjoyed trying to wrangle the data. Overall, I feel a newfound appreciation for survey data. Some of the most popular or insightful infographics I’ve ever seen were ultimately built from survey data. It takes a lot of data organization to ensure that survey data is properly generated, formatted and visualized, so having some direct experience with this work was informative. I can easily imagine the kind of data cleaning work I would need to do if, for example, I ever have the opportunity to work with NGOs, public agencies and other types of organizations that rely heavily on survey data.
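 For the free-text age answers mentioned above, one possible cleaning rule is to keep the first number in the answer and treat everything else as missing. The Python sketch below shows that rule; the function name and the 1 to 120 plausibility range are my own assumptions, not something taken from the survey or from my Tableau Prep flow.

```python
import re


def parse_age(raw):
    """Return the first whole number in a free-text age answer, else None."""
    match = re.search(r"\d+", raw)
    if match is None:
        return None  # e.g. "old enough" contains no number at all
    age = int(match.group())
    # Assumed plausibility range; anything outside it is treated as missing.
    return age if 1 <= age <= 120 else None


for answer in ["50 (despair)", "old enough", "45"]:
    print(repr(answer), "->", parse_age(answer))
```

 A rule like this throws away the jokes, but it would make an age column usable for the demographic breakdowns mentioned earlier.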

Author:
Lyon Abido
Powered by The Information Lab
© 2024 The Information Lab