In this blog, I’ll be working through the Preppin’ Data challenge called International Bank Account Numbers. You can find it here. However, instead of using Tableau Prep or Alteryx (or Excel), I'll be using SQL. You should definitely try it out. It was such a fun challenge! By the way, feel free to skip out on this article if you haven't completed the challenge yet!
Before I begin, just a note. I completed this challenge using the Data School’s Snowflake environment. So, you won’t be able to connect to the TIL_PLAYGROUND database and query any of its schemas or tables unless you have access to said environment. One other thing is that I decided to use a CTE aliased as data. I did this so that my queries could be easier to write and read.
This article will consist of screenshots of my queries, outputs and descriptions of what the queries are doing.
The primary task for this challenge is to create IBANs (International Bank Account Numbers) for Data Source Bank. The company provided two tables, pictured below.
The figure above corresponds to table pd2023_wk02_transactions from the PREPPIN_DATA_INPUTS schema from the TIL_PLAYGROUND database on Snowflake. For simplicity’s sake, this table has been aliased as trans.
The figure above corresponds to table pd2023_wk02_swift_codes from the PREPPIN_DATA_INPUTS schema from the TIL_PLAYGROUND database on Snowflake. For simplicity’s sake, this table has been aliased as s_code.
In order to create the IBANs, both s_code and trans need to be joined. Fortunately, both tables have a column called bank, so we’re good to go! I can join s_code and trans on the shared bank column. Just a quick note. In this case, it doesn’t matter whether we use an inner join or a left join; we still output a table with 100 rows (the picture below only shows a sample), which is what we’re looking for.
With that, we have all the data that we will need to make the IBANs. As the Preppin’ Data article tells us, the IBANs are created by combining the following columns Country Code, Check Digits, Bank Code, Sort Code and Account Number.
As you can see, we don’t have a Country Code column in our joined table. This means that we’ll need to create this column. There are a couple of ways of creating this column. At first, I was trying to use the ALTER TABLE function but I was running into issues. So, I instead opted to write some conditional logic that creates this column. In this case, I decided to use a CASE statement, but a simple IF statement would work just as well. You will notice that my conditional statement doesn't have any additional conditions. The reason for that is because, as the article tells us, every bank is located in the UK. How convenient!
With that, we have all the columns that we need! But, we do have one last thing to take care of. We need to remove the dashes from the Sort Code column. For this, I decided to use the REPLACE function. However, the SPLIT_PART function could also be used. I used the first function because it’s the most familiar to me (especially whenever I’m working with Alteryx or Regular Expressions).
It’s always good to keep in mind that various versions of SQL will have different naming conventions and capabilities for said functions. In fact, some functions may exist in only one version and not in others!
As you can see, I am replacing the dashes in the Sort Code column with a blank (not a space!) Note: the original Sort Code column is outputted because of how I used the (*) asterisk in my SELECT statement. I did this to avoid having to write out each column, so ignore that. Instead, the Cleaned_Sort_Code column is the cleaned version.
We’re almost there! We just have to combine the relevant columns in the right order and we should be good to go!
And voila! We were able to make the final output! Something that I learned while doing this challenge was how the CONCAT function changes numeric fields to be string fields. This article explores this idea further in the implicit conversion section.
This is really handy as this allows us to avoid having to explicitly convert the data types for numeric fields to be string fields. This simplifies the writing of our query. In our case, we didn't need to explicitly convert the numeric field of Account Number using the CAST function (or the TO_CHAR function).
Thank you for reading this article. I hope it inspires you to tackle the Preppin' Data challenges (or even the Alteryx Weekly Challenges!) using SQL. If you feel up to it, please let me know what SQL functions you use to solve this great challenge!