A Complete Guide to Your First Week at The Data School

After searching for some inspiration on what to write my first blog post about, I decided to create a guide on how to get through your first week at The Data School in hopes of bringing some comfort to those who feel overwhelmed and to simplify some of the new lingo you might hear.


Let's start with some general tips!

The main lesson I have learnt is to ask for help! I know it seems obvious, but trust me. Whether you ask someone in your cohort, your coach, or Gemini, asking for help is always better than struggling alone. After all, a problem shared is a problem halved! This lesson is not limited to technical questions; it also applies to anything around the office, like 'Where is the dishwasher?'. Asking resolves an issue much more quickly than struggling on without knowing.

My second piece of advice is to make friends with your cohort! You will be spending a lot of time with your cohort, so make conversation in between sessions, go out and get lunch together, or initiate small talk by the coffee machine.


Okay, onto the technical stuff. Here are some of the terms we were introduced to in our first week, along with their definitions:

  • ETL (Extract, Transform, Load) is the process of taking data from its original source (extracting), preparing it for analysis (transforming), and storing it where its users can access it (loading).
  • Flat files refer to single data sets with a fixed set of data fields (columns) and records (rows). There are no internal folders or relationships to other files. Common forms of flat files include:
    • .xlsx (Excel)
    • .csv (Comma-Separated Values)
    • .txt
  • Databases typically live on a central server and store data in multiple linked tables. Each table looks a bit like a spreadsheet, and the tables are connected to one another using keys (e.g. unique IDs) to ensure everything matches up.
    • Databases use schemas as a map to keep things tidy.
    • Databases use views to create custom shortcuts for looking at the data without having to make copies.
  • Data warehouses bring together data from several databases. The data is cleaned and organized so that it is ready for analysis.
  • Data lakes are massive storage spaces that hold raw data in any format, including unstructured data, and can draw on several sources of information (e.g. emails, images, videos, spreadsheets).
  • Data lakehouses are hybrids between data lakes and data warehouses. They combine the best of both worlds, as data lakes allow for massive, cheap storage and data warehouses provide neat, speedy organization.
  • A record in a data set is essentially a complete set of information that describes one instance in the data and is represented by a row.
  • A data field is a single category of information represented by a column.
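To make a few of these terms concrete, here is a minimal sketch of ETL in Python, using only the standard library. Everything in it is made up for illustration: the CSV text, the `orders` table, and the `customer_totals` view are hypothetical, and this is just one simple way of doing it, not the way we were taught in class. It extracts records from a flat file, transforms the messy fields, loads them into a small database, and then creates a view on top.

```python
import csv
import io
import sqlite3

# Hypothetical flat file: in practice this would be a .csv file on disk.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,7.25
3, alice ,3.00
"""

def extract(text):
    """Extract: read each row of the flat file as a record (field -> value)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Transform: tidy the text fields and convert the numbers."""
    return [
        (int(r["order_id"]), r["customer"].strip().title(), float(r["amount"]))
        for r in records
    ]

def load(rows, conn):
    """Load: write the cleaned rows into a database table."""
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    # A view is a saved query: a new way of looking at the data without copying it.
    conn.execute(
        "CREATE VIEW customer_totals AS "
        "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
    )

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load(transform(extract(RAW_CSV)), conn)
totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)  # [('Alice', 13.5), ('Bob', 7.25)]
```

Each row of `orders` is a record and each column is a data field, which matches the definitions above.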

Over the course of The Data School, I'm sure I'll be dealing with extremely large data sets, which can take a long time to process. This is where fact and dimension tables come in handy! They allow a big data set to be divided into smaller tables, which means faster processing and less storage space.

  • Fact tables hold the records of the original data set, with the categorical fields replaced by short keys that point to the dimension tables.
  • Dimension tables hold details about the categorical fields in the original data set. They provide descriptive context and can hold the longer text values (strings).

Here, we can see that the dimension table adds context to the fact table with regard to the type of bike.
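As a rough illustration of the same idea in code, here is a small Python sketch that splits one data set into a fact table and a dimension table. The rides, bike types, and field names are all invented for this example and are not the data we used in class.

```python
# Hypothetical original data set: every record repeats the long bike description.
rides = [
    {"ride_id": 1, "bike_type": "Electric mountain bike", "minutes": 34},
    {"ride_id": 2, "bike_type": "Classic road bike", "minutes": 12},
    {"ride_id": 3, "bike_type": "Electric mountain bike", "minutes": 51},
]

# Give each distinct bike type a small numeric key (1, 2, ...).
bike_type_ids = {}
for ride in rides:
    bike_type_ids.setdefault(ride["bike_type"], len(bike_type_ids) + 1)

# Dimension table: one row per bike type, looked up by its key.
dim_bike_type = {type_id: name for name, type_id in bike_type_ids.items()}

# Fact table: one row per ride, storing the short key instead of the long string.
fact_rides = [
    {
        "ride_id": r["ride_id"],
        "bike_type_id": bike_type_ids[r["bike_type"]],
        "minutes": r["minutes"],
    }
    for r in rides
]

# Looking each key up in the dimension table recovers the original records.
rebuilt = [
    {
        "ride_id": f["ride_id"],
        "bike_type": dim_bike_type[f["bike_type_id"]],
        "minutes": f["minutes"],
    }
    for f in fact_rides
]

print(dim_bike_type)     # {1: 'Electric mountain bike', 2: 'Classic road bike'}
print(rebuilt == rides)  # True
```

The long text is stored once per bike type rather than once per ride, which is where the storage saving comes from as the data set grows.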

From this, the concept of schemas is born.

  • A schema is a map that shows how different components of a database fit together.

The two types of schema we looked at are star schemas and snowflake schemas:

  • Star schemas have one fact table and multiple dimension tables.
  • Snowflake schemas are essentially more complex versions of star schemas. They also have one fact table and multiple dimension tables, but some of the dimension tables reference other dimension tables.
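The difference between the two can be sketched with a tiny in-memory SQLite database in Python. All of the table names, columns, and values below are hypothetical. The fact table `fact_rides` links directly to `dim_bike` and `dim_station` (the star part), and `dim_bike` links on to `dim_category` (the extra hop that makes it a snowflake).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
-- Snowflake part: a dimension table that another dimension table points to.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
-- Dimension tables referenced directly by the fact table (as in a star schema).
CREATE TABLE dim_bike (
    bike_id INTEGER PRIMARY KEY,
    bike_name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE dim_station (station_id INTEGER PRIMARY KEY, station_name TEXT);
-- Fact table in the middle: one row per ride, holding keys and a measure.
CREATE TABLE fact_rides (
    ride_id INTEGER PRIMARY KEY,
    bike_id INTEGER REFERENCES dim_bike(bike_id),
    station_id INTEGER REFERENCES dim_station(station_id),
    minutes INTEGER
);
INSERT INTO dim_category VALUES (1, 'Electric'), (2, 'Classic');
INSERT INTO dim_bike VALUES (10, 'Mountain e-bike', 1), (11, 'Road bike', 2);
INSERT INTO dim_station VALUES (100, 'Central Station');
INSERT INTO fact_rides VALUES (1, 10, 100, 34), (2, 11, 100, 12), (3, 10, 100, 51);
""")

# Star-style query: the fact table joins straight to a dimension table.
star = conn.execute("""
    SELECT b.bike_name, SUM(f.minutes)
    FROM fact_rides AS f
    JOIN dim_bike AS b ON f.bike_id = b.bike_id
    GROUP BY b.bike_name ORDER BY b.bike_name
""").fetchall()

# Snowflake-style query: one extra hop, from one dimension table to another.
snowflake = conn.execute("""
    SELECT c.category_name, SUM(f.minutes)
    FROM fact_rides AS f
    JOIN dim_bike AS b ON f.bike_id = b.bike_id
    JOIN dim_category AS c ON b.category_id = c.category_id
    GROUP BY c.category_name ORDER BY c.category_name
""").fetchall()

print(star)       # [('Mountain e-bike', 85), ('Road bike', 12)]
print(snowflake)  # [('Classic', 12), ('Electric', 85)]
```

If `dim_category` were folded into `dim_bike` as an extra column, every dimension table would connect only to the fact table and the design would be a plain star schema.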

So, those are the basics of Data Sources and Architecture, which form the foundation of everything we will be doing at The Data School. I really hope you enjoyed reading this blog post as much as I enjoyed writing it. Here's to many more blog posts in the future! 🙌😁📈📊

Author: Amelia Young
Powered by The Information Lab