When preparing data for analysis, we often find ourselves with multiple tables containing related information. Perhaps one file holds customer details, another has their orders, and a third contains payment records.
To unlock insights, we need to combine these sources — and that’s where joins and unions come in.
What is a Join?
A join combines data from two or more tables side by side, based on a common field — often called a key. Think of it like matching puzzle pieces: where two datasets share something in common (like “Customer ID”), we can bring their information together in one view.
What Are the Different Types of Joins?
There are several types of joins — and the one you choose determines which rows appear in your final dataset. Lets take a look at these different types below.
Inner Join:
Keeps only the records that exist in both tables.
Example:

Left Join:
Keeps all records from the left table and adds matching data from the right table. If there’s no match, you’ll still keep the left-hand data — with blanks where no match was found.
Example:

Right Join:
The opposite of a left join — it keeps all records from the right table and adds matches from the left.
Example:

Full Outer Join:
Combines all records from both tables, keeping everything whether or not a match exists.
Example:

Why Do We Use Joins in Data Cleaning?
Joins are powerful for:
- Combining related datasets — for example, linking sale prices to profits for each scent to create one complete view of your products.
- Filling in missing details — adding the profit information from one table into another that only contained sale prices, giving you a richer dataset.
- Spotting mismatches or gaps — such as discovering scents that appear in the profit table but are missing from the sale price list (like Papaya) and vice versa.
What is a Union?
While joins combine data side by side, a union stacks data on top of each other — like adding new rows.
You can think of a union as combining similar tables that share the same columns.
For example, if you receive monthly sales files for January, February, and March, you can union them to create one full dataset.
Example:

When to Use a Union?
Unions are perfect when:
- You’re combining periodic data (monthly, quarterly, or regional datasets).
- Each table contains the same structure (same columns, different rows).
- You want to bring all your data together to see the full picture across different time periods or regions etc.
Join vs Union – The Key Differences?

Final Thoughts...
Whether you’re joining to enrich your data or need to union to expand it, mastering these transformations is key to building clean, analysis-ready datasets.
So next time you’re faced with two (or twenty!) spreadsheets, ask yourself:
“Do I need to join them side by side — or union them top to bottom?”
Either way, you’ll be one step closer to transforming raw data into meaningful insights.