Failing to prepare (your data) is Preparing to Fail (your clients) - II - Preparing Data in Alteryx

by Morgan A Rennie

Preparing data in Alteryx can be a crucial step in any data analysis. In this blog, we will go over some of the key steps and techniques for preparing data in Alteryx.

First, let's define what we mean by "preparing data". In the context of data analysis and machine learning, preparing data refers to the process of transforming raw data into a form that is suitable for analysis. This may involve cleaning the data, transforming it into a specific format, or even combining multiple datasets. Read more about the Importance of Preparing Data here.

The first step in preparing your data is to import it into Alteryx. To do this, you'll need to use the Input Data tool. This can be found in both the "Favourites Tab" and the teal "In/Out" tab. This tool allows you to import data from a variety of sources, including Excel files, CSV files, and databases.

Frequently used data preparation tools

Once your data is imported, you'll need to clean it up. Cleaning the data is part of what we refer to as Data Preparation, followed by transforming and combining. Data Preparation tools can be found within the Navy-Blue "Preparation" tab. Several frequently used tools can be found within the default "Favourites" tab too.

Data cleaning refers to removing any irrelevant or duplicate data and ensuring that all of the columns are properly formatted. The Data Cleansing tool can be used to remove any null rows and columns, remove unwanted characters such as trailing whitespace, and modify text cases.  You can also use the Select tool to choose which columns you want to keep - limiting the number of fields to those relevant to your analysis will make your visualizations faster and your data source less cluttered. The Select tool also allows the change of field type and provides the option to write a description that will be saved in the file metadata. Read more about metadata here.

Once your data has been formatted to your preference, you can use the Sort tool to present your data in an ascending or descending order, it also allows nested ordering. Finally, you can also use the Filter Tool to extract or split certain parts of your data based on a calculation.

Once the data has been cleaned and formatted, it may need to be transformed into a specific format. For example, you may need to convert dates or numbers, or split a column of data into multiple columns. Alteryx includes a number of tools for transforming data. One of which is the Formula tool, which can be used to apply custom transformations to your data, as well as creating or changing columns. A single Formula tool can be used to create several formulas. 

Formula Tool Input Field

Another common step in preparing data is to combine multiple datasets. This can be useful if you need to merge data from different sources, or if you want to combine multiple datasets in order to create a more comprehensive dataset for analysis. This can be achieved through aggregations, unions, and joins - found within the purple "Join" tab. Click Here for an in-depth summary of Combining Data in Alteryx. To ensure your data is complete, it is good practice to perform data checks before exporting. This can be done by using the Summarize (found within the "Transformation" tab) or Browse (found within the "In/Out" tab) tools, which allow you to explore an overview of your data.

Finally, you'll need to output your data - whether that is into a .hyper file for seamless transitioning into Tableau, or into an Excel document. You can use the Output Data tool, found within the teal "In/Out" tab, to finish off the data preparation.

In conclusion, preparing your data in Alteryx involves importing your data, cleaning it up, and performing basic data manipulation, transformations, and combinations before finally outputting your data.


Click here to read Chapter 1: An Introduction to why we Prepare Data.

Click here to read Chapter 3: An Introduction to Preparing Data in Tableau Prep