When multiple sheets in an Excel workbook or multiple files need to be brought into Alteryx, there are various ways to do this. One way would be to bring them all in using an Input Data tool for each file and blend the data afterwards. But if the sheets have the same schema, or in other words the same fields, the same number of fields, and the same data types, it is easier to use wildcards to bring in the data.
The two wildcards that we’ll be covering today are the asterisk (*) and the question mark (?). To use the wildcards, we will enter the wildcard as part of the file path, substituting all characters or one character in the file name.
The asterisk (*) stands in for any number of characters. For example, if all the files you want to input have similar names, such as Randomfilename_2010, Randomfilename_2015, and Randomfilename_2020, we could replace the last two characters with an asterisk (Randomfilename_20*). When we run the workflow, Alteryx will pull in all three files.
The alternative is the question mark (?), which represents exactly one character. If we used just one question mark (Randomfilename_20?), none of the files would be read, because each name has two characters after “20”. To read all three files, we need two question marks: Randomfilename_20?? matches every file with this structure.
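To see the two wildcards side by side outside of Alteryx, here is a minimal sketch using Python's standard fnmatch module. This is only an analogy: fnmatch is Python's own shell-style matcher, not the engine Alteryx uses, and the helper name matching is made up for this illustration. For these simple patterns the behavior lines up with the description above.

```python
from fnmatch import fnmatchcase

# File names taken from the example above.
files = ["Randomfilename_2010", "Randomfilename_2015", "Randomfilename_2020"]


def matching(pattern, names):
    """Return the names that match a shell-style wildcard pattern.

    Hypothetical helper for illustration; this is Python's matcher,
    not Alteryx's.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


star = matching("Randomfilename_20*", files)    # asterisk: any number of characters
one_q = matching("Randomfilename_20?", files)   # one question mark: exactly one character
two_q = matching("Randomfilename_20??", files)  # two question marks: exactly two characters

print("20*  ->", star)
print("20?  ->", one_q)
print("20?? ->", two_q)
```

The asterisk and the double question mark both pick up all three names, while the single question mark picks up none.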
Now let’s say there is a folder with multiple files, where each file holds the superstore data for one country. How would we bring this into Alteryx?
Bringing in each file one by one would make the workflow look really intense:
Then these files would need to be combined using one of the blending tools such as a Join, Union, or Append.
Instead of bringing in the files with one Input Data tool each, we could use a wildcard to bring in the fifteen files and keep everything clean and compact. Now the question is, which wildcard should we use? The name of each file is the name of the country, and the names vary in length. A question-mark pattern only matches file names of one exact length: Argentina.csv would become ?????????.csv and United Kingdom.csv would become ??????????????.csv, and neither pattern would pull in all the files we want. Not to mention, that is quite a lot of question marks.
The most efficient way, in this case, would be to use the asterisk (*).
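The length problem can be checked with a small sketch, again using Python's fnmatch as a stand-in for Alteryx's own matching (an assumption made for illustration only). It uses just the three country file names mentioned in this post, not the full set of fifteen.

```python
from fnmatch import fnmatchcase

# Three of the country files mentioned in this post.
names = ["Argentina.csv", "Austria.csv", "United Kingdom.csv"]

# Nine question marks: only matches names with exactly nine characters before ".csv".
nine_pattern = "?" * 9 + ".csv"
nine = [n for n in names if fnmatchcase(n, nine_pattern)]

# One asterisk: matches a name of any length before ".csv".
all_csv = [n for n in names if fnmatchcase(n, "*.csv")]

print(nine_pattern, "->", nine)
print("*.csv ->", all_csv)
```

Only Argentina.csv fits the nine-question-mark pattern, while *.csv covers every name regardless of length.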
Bring in the Input data tool:
Set up a connection to one of the CSV files:
Hit the Run button:
Replace “Austria” with “*”:
Add a Browse tool and hit the Run button again:
Please note, however, that Alteryx Designer sets the number of fields and file types based on the first file read. According to the Alteryx documentation, any file read after the first file that doesn’t match is skipped and a warning message is displayed. At this point in time, it is also not possible to choose which file is read first when using a wildcard, so the wildcard input method should really be reserved for cases where you know the data schema matches for every file.
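For readers who think in code, the "first file sets the schema" behavior can be imitated with a rough Python sketch. Everything here is an assumption made for illustration: the function name read_matching, the column names, the Notes.csv file, and the alphabetical read order are all made up, and the sketch only compares header names, not data types. It is not how Alteryx is implemented.

```python
import csv
import glob
import os
import tempfile
import warnings


def read_matching(pattern):
    """Stack rows from every CSV matching the pattern.

    The header of the first file read becomes the schema; any later file
    with a different header is skipped with a warning. Hypothetical helper
    that only mimics the behavior described above.
    """
    schema = None
    rows, skipped = [], []
    # Alphabetical order is an assumption; Alteryx does not let you pick the first file.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if schema is None:
                schema = header  # the first file read defines the schema
            if header != schema:
                name = os.path.basename(path)
                warnings.warn(f"Skipping {name}: header does not match the first file")
                skipped.append(name)
                continue
            rows.extend(reader)
    return schema, rows, skipped


# Small demo with made-up data written to a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    samples = {
        "Argentina.csv": "Order ID,Sales\nA-1,100\n",
        "Austria.csv": "Order ID,Sales\nB-1,250\n",
        "Notes.csv": "Comment\nnot superstore data\n",  # different schema
    }
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w", newline="") as handle:
            handle.write(text)
    schema, rows, skipped = read_matching(os.path.join(folder, "*.csv"))

print("schema: ", schema)
print("rows:   ", rows)
print("skipped:", skipped)
```

In this demo the stray Notes.csv is matched by *.csv but dropped because its header differs from the first file read, which is the same reason it pays to know your schemas match before reaching for a wildcard.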
Another note is that replacing the whole file name with the asterisk will pull in ALL files that match the pattern (here, every .csv file) within the folder or location that contains the file we first connected to. I recommend creating a folder that holds just the files you want Alteryx to read when using this wildcard method.
With that, I hope the wildcards give you wildly cool ideas for your future workflows! Thank you for reading and let’s analyze away!!