How to Clean Public Data with Tableau Prep Builder: UK Crime, Population & Poverty

There is plenty of public data provided by various governing bodies that you can access yourself. One such dataset is Crime reporting data, provided by https://data.police.uk/.

However, this data does not always come in a format that makes for easy analysis with Tableau. This data is given to us in the form of an individual file for each month, so these files must be unioned before any analysis can be done.

Another issue with public data is that it often comes from multiple sources. For example, you may want to analyse Crime rates and see how they compare to population and/or poverty rates but these other statistics come from elsewhere, https://www.nomisweb.co.uk/ and https://www.ons.gov.uk/ respectively. These data sources also have similar issues in that they are not ready for analysis 'out of the box'

Here is a quick look at how these 3 datasets look like

Crime & Population have multiples files

Fortunately, we have Tableau Prep Builder that can do the heavy lifiting for us, we just need to tell it what to do.

Let's tackle each dataset one at a time

UK Crime Data

Let's handle how we deal with multiple files

Section of the Tableau Prep flow for UK Crime Data

When you input the file, you can have the option to select multiple files. We want to use a Wildcard union which will 'stack' each file on top of each other. In the image below, you'll see a 2020*. the * indicates that it will look for any file that begins with 2020 in whatever folder you point Tableau Prep to.

That's all we need to do here to collect all the files into one. Very simple and much better than Ctrl C, Ctrl V

We then need to clean the data. Fortunately there isn't much to do here but the main things that we want to do is extract our date from our filepath/filename and extract our MSOA code from LSOA as we'll use this alter to join our data

Tableau Prep will pull in the filepath when you perform a Wildcard Union. We can split the '2020-month' from this and turn it into a date.

We then remove any unwanted fields.

Poverty Rate Data

As we saw above, there are a lot of useless 'header' rows in the data that need to be removed. As the data comes in .xlsx format, tableau prep has a Data Interpreter that can automatically remove these rows.

Once removed the main thing here is to convert the numbers into decimal percentages (i.e. 14% becomes 0.14). We are doing this as it is more useful for calculations and Tableau is able to change the default number property to percentage so that the number displays in a more human friendly way. We do this by making a calculated field



Joining Crime Data and Poverty Data

Previously we extracted our MSOA code from LSOA. We did this so that we have a common joining field in our crime and poverty data.


We perform a left outer join here. This means that it will join all the poverty data where there is an applicable MSOA code. It also means that any crimes without a location are still captured by the join so we don't lose any of our crime data.

Join Settings


We perform a final clean to remove any duplicated fields from the join. We then output to a Tableau hyper file, so that it can be used directly in Tableau

Population Data

The data comes in an individual file for each age group (although there is the option to get a summary file without age groups). We again use a wildcard union to stack the data together.

We can then extract our age groups from the file name



However it does need restructuring so that it can be used in Tableau. Here we perform a 'Column to Row' pivot, so that the years are in rows

This is very straightforward, you just need to drag your years into the the respective containers in the Pivot settings

2019 and 2020 have been chosen to pivot from columns to rows

After this you are free to output your data.

Whilst you can join this data onto our other file using LSOA, this would explode the size of our data as there would be individual rows for each age group, so instead we can use a relationship in Tableau Desktop.

Now that our data is ready, we can go ahead and build some dashboards!

Below is an example of the type of analysis you could perform using our newly cleaned data

You can view this dashboard on https://public.tableau.com/app/profile/jacob.kilroy/viz/CrimePoverty2020/CrimePovertyDashboard?publish=yes

Author:
Jacob Kilroy
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2025 The Information Lab