Have you ever visited a website with many files to download and spent ages downloading them one by one? With Alteryx, you can download all of these files in one go, saving lots of time.
1 Create a text file and enter (or paste) the web address that you wish to download files from.
2 Add a download tool. As long as you only have one field with one row of data, no configuration should be needed. This will create a ‘DownloadData’ field which has all the HTML code that makes up the webpage.
3 Use a Text to Columns tool to split the ‘DownloadData’ field into rows on line breaks (\n).
4 Find out how the files you want appear in the HTML code. You can do this by going to the web page and reading the HTML code. In Google Chrome, you can find this by going to ‘more tools’, ‘Developer tools’ (or CTRL-SHIFT-I). With Chrome, you can click the icon in the top left of the developer tools (or CTRL-SHIFT-C), then hover over the link to a file you wish to download, and it will show you the code for that link. For more details, you can see Paul’s blog post.
If you are not using Chrome and can’t find how you can get the code, you can double click the Download data row in the download tool and read it from there.
5 Filter out all the rows which do not contain the files you want. In this example, we want the .csv files, so we can filter on rows that contain .csv.
6 Find a way to parse the file links out of these cells (Regex can be very useful).
7 If these records do not contain the entire HTTP address where the file is located, go to the webpage and right-click one (or more) of the links to find the web path.
Use this information to create a full web path for each file. If the records do contain the entire HTTP address, you can skip this step.
8 Decide what you want to call these files, then create a field with these names. In this example, we are happy to give each file a number for a name, so we can use the record ID.
9 Use this to create a file path, including where you want to save it and the file extension
10 Use a new download tool to download the files. Set the web path as the URL field, tick the ‘Filename from a field’ box and select the file path field in the box under this.
11 Press run and watch your files download.
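For readers who find it easier to follow the logic in code, steps 3 to 9 can be sketched roughly in Python. This is only an illustration of the idea, not how Alteryx does it internally. The sample HTML, the `example.com` address, the `downloads` folder and the function name `plan_downloads` are all made up for the example, and the Regex assumes the links appear as `href="..."` attributes.

```python
import re

# Hypothetical sample standing in for the 'DownloadData' field (not a real site).
SAMPLE_HTML = """<html><body>
<a href="data/sales_2019.csv">Sales 2019</a>
<a href="about.html">About</a>
<a href="data/sales_2020.csv">Sales 2020</a>
</body></html>"""

# Assumes links are written as href="..." and end in .csv.
HREF_CSV = re.compile(r'href="([^"]+\.csv)"', re.IGNORECASE)


def plan_downloads(html, web_root, save_dir):
    """Return (url, file_path) pairs, loosely mirroring steps 3 to 9."""
    rows = html.split("\n")                               # step 3: split on line breaks
    csv_rows = [r for r in rows if ".csv" in r.lower()]   # step 5: keep rows mentioning .csv
    plan = []
    record_id = 0
    for row in csv_rows:
        for link in HREF_CSV.findall(row):                # step 6: parse the link with Regex
            record_id += 1
            # step 7: prepend the web path when the link is not a full address
            url = web_root.rstrip("/") + "/" + link.lstrip("/")
            # steps 8 and 9: name the file by record ID and build the file path
            path = f"{save_dir.rstrip('/')}/{record_id}.csv"
            plan.append((url, path))
    return plan


plan = plan_downloads(SAMPLE_HTML, "https://example.com/files/", "downloads")
for url, path in plan:
    print(url, "->", path)
```

Each pair in `plan` corresponds to the two fields handed to the second download tool in step 10. In plain Python, something like `urllib.request.urlretrieve(url, path)` could fetch each file, although that part is left out here so the sketch runs without a network connection.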
In this case, we only wanted .csv files. If we wanted other file types, we would have to adjust our logic to find all the different file types we wanted and create the file paths in a slightly different way (as they will have different file extensions).
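As a rough idea of what that adjusted logic might look like, here is a small Python sketch. The list of extensions, the function names `find_files` and `name_files`, and the `out` folder are assumptions made for illustration. The sketch also assumes simple links without query strings (for example, `a.csv` rather than `a.csv?v=2`).

```python
import os.path
import re

# Hypothetical set of file types we might want; adjust to suit the site.
WANTED = (".csv", ".xlsx", ".pdf")

# Assumes links are written as href="...".
HREF = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def find_files(html, extensions=WANTED):
    """Return (link, extension) pairs for links ending in a wanted extension."""
    found = []
    for link in HREF.findall(html):
        ext = os.path.splitext(link)[1].lower()  # e.g. ".csv"
        if ext in extensions:
            found.append((link, ext))
    return found


def name_files(found, save_dir):
    """Number the files and keep each file's own extension in its path."""
    return [f"{save_dir}/{i}{ext}" for i, (_, ext) in enumerate(found, start=1)]


html = '<a href="a.csv">a</a> <a href="b.PDF">b</a> <a href="c.html">c</a>'
found = find_files(html)
print(found)
print(name_files(found, "out"))
```

The key difference from the .csv-only case is that the extension is carried along with each link, so the file path can reuse it instead of hard-coding one.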
If it doesn’t work, I’d advise checking step 7 and making sure that all your files come from the same http address. If they don’t, you will need to create some logic to fix this.
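One possible shape for that logic, sketched in Python rather than Alteryx: the standard library's `urljoin` resolves each link against the address of the page it was found on, so relative links, site-root links and full addresses all end up as complete URLs. The page address and the links below are invented for the example.

```python
from urllib.parse import urljoin

# Hypothetical address of the page the links were scraped from.
PAGE_URL = "https://example.com/reports/index.html"

# Hypothetical links, as they might appear in the HTML.
links = [
    "data/a.csv",                     # relative to the page's folder
    "/files/b.csv",                   # relative to the site root
    "https://cdn.example.org/c.csv",  # already a full address
]

# urljoin leaves full addresses alone and completes the others.
resolved = [urljoin(PAGE_URL, link) for link in links]
for url in resolved:
    print(url)
```

In an Alteryx workflow, the equivalent would probably be a formula that checks whether the link already starts with http and only prepends the web path when it does not.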
This can be a real timesaver and allows you to download 50+ files in just a few clicks. Who wants to download 50+ items manually?