Aggregation in Tableau Prep Builder - Solving Multiple Issues in One Step

Tableau Prep Builder is Tableau's answer to any task related to preparation. Data preparation is covered early on in your Data School journey. This blog post provides a quick exemplary insight into Tableau Prep Builder, a tool that's widely used to clean & prepare data that's about to be explored and visualized. The data used here is the data from challenge 2022/28 of the Preppin' Data Challenges and can be downloaded here. These are the steps you are required to complete to finish the challenge:

  1. Load the data.
  2. Separate out the "Product Name" field to form "Product Type" and "Quantity".
  3. Rename the fields to "Product Type" and "Quantity" respectively.
  4. Split the flow into two paths, one covering all liquid products, the other covering all solid products.
  5. For each path, clean the "Quantity" field to just leave values, i.e. strip it of all letters.
  6. For liquid products, ensure that every value is in mililitres.
  7. Sum up the sales for each combination of "Store", "Region" and "Quantity".
  8. Also, count the number of orders that has the combination of "Store", "Region" and "Quantity". Name this field "Present in N orders".

This blog post only focuses on points 7 and 8. As a little hint, whenever you read the words "for each", the method you're looking for is likely a grouping of some kind - and the fields that follow the words "for each" are probably the fields that you will be grouping the data by.

Full Data Preparation Workflow for the Challenge

This picture shows the finished Tableau Prep Builder workflow which is the actual solution to the challenge. The first red circle depicts the point in the workflow where the data is loaded. The second one shows the mentioned split into the two different product categories. Red circle number 3 is where the aggregation actually happens. So what is done here? Let's revisit steps 7 and 8 of the challenge's instructions. They do sound like a lot of work. The next image shows what is actually happening during the aggregation step in the workflow (red circle #3):

So, for the most important part of this challenge, all that is really needed is one aggregation - and knowing how to translate the instructions into the right combination of grouped fields and aggregated fields. This part should be pretty self-explanatory in the image above. We're looking for two target values and need them for all existing combinations of the "Quantity", "Store Name" and "Region" coumns. At the bottom part of the image you can also see what the result of the aggregation will look like: You'll end up with the three mentioned dimensions used for grouping, along with the two aggregated fields. All that is left now is renaming the "Order ID" field to "Present in N orders" as written in the instructions and you're good to go. Challenge done. Check out the next one at preppindata.com.

Author:
Matthias Albert
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2025 The Information Lab