SQL & Alteryx

by Elaine Yuan

Last month, there was a Data School New York x Fullstack Academy DATA Meet & Greet. It was a lively meeting, in which current students of the bootcamp program I just graduated from visited my company. I suggested a SQL quiz during the Alteryx demonstration to test their memory of SQL joins and emphasize how easy Alteryx is to use. Instead of writing a SQL query joining tables on related columns, an Alteryx user can drag the Join tool into the canvas and select the related columns.

My intention was not to disparage SQL; I love SQL and my understanding of SQL supports my Alteryx learning. For example, I was able to understand the Join tool in Alteryx due to my knowledge of SQL joins. Therefore, I will compare the SQL order of commands (SFWGHO) to Alteryx tools in this blog post.


SELECT = Select tool

The Select tool selects columns of interest and through this tool, we can rename (or alias) these columns.

FROM = Input tool

The Input tool connects to a file or database to provide data. After inputting data, we can work from the connected file or database.

WHERE = Filter tool

The Filter tool subsets the data into two: one where a given condition is met and the other where it is not.

GROUP BY = Summarize tool

The Summarize tool in Alteryx groups data based on specified column(s) and aggregates based on specified column(s).


This tool is similar to a SQL GROUP BY statement, but it can also perform an aggregation, such as COUNT() or SUM(), in a SQL SELECT statement.

HAVING = Filter tool

Yes, the Filter tool appears twice here.

In SQL, the HAVING clause is used with aggregate values. After aggregation via the Summarize tool, a Filter tool can filter the data after having met a given condition.

ORDER BY command = Sort tool

The Sort tool in Alteryx orders your data based on specified column(s) in ascending or descending order.


To conclude, I would like to show a SQL query and a similar Alteryx workflow. I am starting with the classic superstore dataset and I want to find the sum of sales by subcategories having a sum of sales greater than $50,000 in the East region, sorted by the sum of sales in descending order.

The SQL query is:


On the other hand, the same result can be generated in Alteryx like so:


One thing I like about Alteryx is there are many ways to complete a task. Meanwhile, SQL does not have the same level of flexibility. Though to be fair, Alteryx is software with over 270 tools and SQL is a language to query data. They’re not in the same playing field.

Nevertheless, I look forward to drawing more connections between SQL and Alteryx, as I learn additional Alteryx tools during my training. Will there be a second blog post on this topic? Maybe.

;

Thu 26 Jan 2023

Tue 24 Jan 2023

Mon 23 Jan 2023