Regular Expression (RegEx) in Alteryx - Part 2

Welcome back if you are from Regular Expression (RegEx) in Alteryx - Part 1. In part 1, I introduced the qualifiers, quantifiers, and 4 main output methods (Replace, Tokenize, Parse, Match) in the RegEx tool of Alteryx. To continue part 1, I am going to summarize some RegEx functions in Alteryx and the tool that I usually apply RegEx functions to clean or prepare data.


Regular Expression Functions in Alteryx

There are 3 main RegEx functions in Alteryx which are REGEX_CountMatches, REGEX_Match, and REGEX_Replace. In this section, I will introduce each function and its sample.

1/ REGEX_CountMatches(string, RegEx)

REGEX_CountMatches function helps the user to count if the expression matches the string in the text.

Image 1: Sample Text

For example, I have a sample text column with 3 rows. The first row includes 2 dates. The second row contains 1 date. The last row doesn't have any date. I would like to count how many dates in each row. In this case, I should use REGEX_CountMatches

Image 2: Using REGEX_CountMatches to count if the string matches the RegEx expression

I drag the Formula tool to connect with the Text Input tool. Then, I created a new column Count Matches. The expression would be REGEX_CountMatches([Sample Text], "\d+\/\d+\/\d+"). It means in the Sample Text column, count how many strings that match the expression \d+/\d+\d+. If it matches, the Count Matches column will count 1.

In the result window, 2 strings match the RegEx, so it returns 2 in the first row of the Count Matches column. 1 in the second row and 0 in the last row.

2/ REGEX_Match

REGEX_Match function helps the user to check if the searching text matches the pattern in a string. If the RegEx expression matches the string text, then returns -1 (for True); otherwise, it will return 0 (for False).

Image 3: RegEx Match function in Alteryx

In the example above (image 3), there are 3 rows in the Sample Text column. In Record 1, some information about User A with ID and Password. However, in the end, there is a space. In Record 2, information for user B but no space at the end. In Record 3, there is no pass for user C. I would like to check if there is a password after the PASS word.

In this case, I will use the REGEX_Match() function to check each row. I use the Formula tool to connect with the Text Input tool. I created a new column Pass included?. Then, type REGEX_Match([Sample Text], ".+PASS\d+.*"). It means to check the pattern .+PASS\d+.* exists in the Sample Text in each row and returns True (-1) or False (0) whereas:

.+: one or more characters before the PASS word

\d+: one or more digits after the PASS word

.*: zero or more characters after digits

Run the workflow, in the result window, record 1, and record 2 return True (-1) because they match my pattern. Record 3 returns False (0) because no digit after the PASS word.

3/ REGEX_Replace

The REGEX_Replace function helps the user to replace the text data with a new value or change to a new format that the user wants.

Image 4: Using REGEX_Replace functions to change the format of the text

For example, I have a Sample Text column with 3 rows. On each row, it shows how far I run each day. Now, I would like to show only the day and number of kilometers that I run.

In this case, I use the REGEX_Replace function. I also use the Formula tool to connect with the Input Text tool. I type REGEX_Replace([Sample Text], "On\s(.*),.*(\d)km", "$1: $2 kilometer"). It means in the Sample Text column if the string text matches the pattern "On\s(.*),.*(\d)km", then I would like to change the format to "$1: $2 kilometer", whereas:

On\s(.*),.*(\d)km : The text starts with On then a space and some characters before the comma. After the comma, there are some characters and 1 digit with km behind. Characters behind the space and before the comma will be stored as $1 and I need to use parentheses. Same as the digit, I need to put \d in the parentheses.

$1: $2 kilometer : I use $1 to represent the first string in parentheses that I extracted above. In this case, $1 is the day. $2 is the second string in parentheses which is the digit.

Run the workflow, I will get the result I want.


Tools use REGEX functions in Alteryx

Most tools have an expression editor that can use the REGEX function. I usually use the REGEX functions in the Formula tool, Filter tool, Multi-row tool, Multi-field tool, and Generate Row tool.

Image 5: List of tools that usually use the REGEX function

In this part 2, I present 3 REGEX functions in Alteryx. Those functions help me very much in preparing the data besides using the RegEx tool in the Parse palette. Depending on each specific case, I will choose to use the RegEx tool or apply REGEX functions.

I hope this blog and the previous blog will help you save time in preparing data. If you have any questions, feel free to reach out to me.

See you soon in the next blog!

Author:
Le Luu
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
Subscribe
to our Newsletter
Get the lastest news about The Data School and application tips
Subscribe now
© 2025 The Information Lab