There is plenty of public data provided by various governing bodies that you can access yourself. One such dataset is Crime reporting data, provided by https://data.police.uk/.
However, this data does not always come in a format that makes for easy analysis with Tableau. This data is given to us in the form of an individual file for each month, so these files must be unioned before any analysis can be done.
Another issue with public data is that it often comes from multiple sources. For example, you may want to analyse Crime rates and see how they compare to population and/or poverty rates but these other statistics come from elsewhere, https://www.nomisweb.co.uk/ and https://www.ons.gov.uk/ respectively. These data sources also have similar issues in that they are not ready for analysis 'out of the box'
Here is a quick look at how these 3 datasets look like
Fortunately, we have Tableau Prep Builder that can do the heavy lifiting for us, we just need to tell it what to do.
Let's tackle each dataset one at a time
UK Crime Data
Let's handle how we deal with multiple files
When you input the file, you can have the option to select multiple files. We want to use a Wildcard union which will 'stack' each file on top of each other. In the image below, you'll see a 2020*. the * indicates that it will look for any file that begins with 2020 in whatever folder you point Tableau Prep to.
That's all we need to do here to collect all the files into one. Very simple and much better than Ctrl C, Ctrl V
We then need to clean the data. Fortunately there isn't much to do here but the main things that we want to do is extract our date from our filepath/filename and extract our MSOA code from LSOA as we'll use this alter to join our data
Tableau Prep will pull in the filepath when you perform a Wildcard Union. We can split the '2020-month' from this and turn it into a date.
We then remove any unwanted fields.
Poverty Rate Data
As we saw above, there are a lot of useless 'header' rows in the data that need to be removed. As the data comes in .xlsx format, tableau prep has a Data Interpreter that can automatically remove these rows.
Once removed the main thing here is to convert the numbers into decimal percentages (i.e. 14% becomes 0.14). We are doing this as it is more useful for calculations and Tableau is able to change the default number property to percentage so that the number displays in a more human friendly way. We do this by making a calculated field
Joining Crime Data and Poverty Data
Previously we extracted our MSOA code from LSOA. We did this so that we have a common joining field in our crime and poverty data.
We perform a left outer join here. This means that it will join all the poverty data where there is an applicable MSOA code. It also means that any crimes without a location are still captured by the join so we don't lose any of our crime data.
We perform a final clean to remove any duplicated fields from the join. We then output to a Tableau hyper file, so that it can be used directly in Tableau
Population Data
The data comes in an individual file for each age group (although there is the option to get a summary file without age groups). We again use a wildcard union to stack the data together.
We can then extract our age groups from the file name
However it does need restructuring so that it can be used in Tableau. Here we perform a 'Column to Row' pivot, so that the years are in rows
This is very straightforward, you just need to drag your years into the the respective containers in the Pivot settings
After this you are free to output your data.
Whilst you can join this data onto our other file using LSOA, this would explode the size of our data as there would be individual rows for each age group, so instead we can use a relationship in Tableau Desktop.
Now that our data is ready, we can go ahead and build some dashboards!
Below is an example of the type of analysis you could perform using our newly cleaned data
You can view this dashboard on https://public.tableau.com/app/profile/jacob.kilroy/viz/CrimePoverty2020/CrimePovertyDashboard?publish=yes