Normalisation is the process of designing a structured database schema that reduces redundancy and ensures data consistency. Denormalisation restructures data by combining tables to enable faster, more efficient querying. An analogy: normalisation is like a perfectly organised room where everything is in its place; denormalisation is like having your belongings scattered across the floor within easy reach. It looks messier, but you can pick up what you need with minimal effort.
Benefits of normalised data:
- Reduced redundancy
- Better-organised data warehouse
- More consistent data: each fact is stored once, which reduces possible conflicts
- More secure: when data is stored once, it is clearer who has access to it
Drawbacks of normalised data:
- Normalising the data can require significant up-front work
- Data is not duplicated, so tables often need to be joined to produce the required output, which can reduce query efficiency
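The join cost in the last drawback can be illustrated with a small sketch using Python's built-in sqlite3 module (the table and column names here are illustrative assumptions, not from a real system): because customer details are stored once, answering "what has each customer spent?" requires a join.

```python
import sqlite3

# Two normalised tables: customer data lives only in "customers",
# so reporting on orders by name requires a JOIN.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         total REAL);
    INSERT INTO customers VALUES (1, 'Alice');
    INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0);
""")

rows = con.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
""").fetchall()
# rows == [('Alice', 65.0)]
```

The join is extra work at query time, but updating Alice's name would only ever touch one row.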
Normalisation is usually carried out in three stages, known as normal forms.
The first normal form (1NF):
- Each cell must contain a single value; a column may not hold multiple values in one row
- Every table must include a primary key to uniquely identify each record.
- The table should not contain any duplicate rows or columns.
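A minimal sketch of the 1NF rule above, using plain Python structures (the "orders" data is a made-up example): a cell holding several comma-separated items is split so that each row carries exactly one value.

```python
# Hypothetical table violating 1NF: the "items" cell holds several values.
unnormalised = [
    {"order_id": 1, "items": "pen,notebook"},
    {"order_id": 2, "items": "stapler"},
]

# 1NF: one value per cell -- expand each multi-valued cell into its own row.
first_nf = [
    {"order_id": row["order_id"], "item": item}
    for row in unnormalised
    for item in row["items"].split(",")
]
# first_nf == [{'order_id': 1, 'item': 'pen'},
#              {'order_id': 1, 'item': 'notebook'},
#              {'order_id': 2, 'item': 'stapler'}]
```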
The second normal form (2NF):
- The table must already be in 1NF
- All non-key attributes must rely entirely on the whole primary key, with no partial dependencies.
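To illustrate a partial dependency (the data here is a made-up example): with a composite key of (order_id, product_id), a product_name column depends only on product_id, i.e. on part of the key. 2NF moves it into its own table.

```python
# Composite key (order_id, product_id); product_name depends only on
# product_id -- a partial dependency that violates 2NF.
order_items = [
    {"order_id": 1, "product_id": 10, "product_name": "pen", "qty": 2},
    {"order_id": 2, "product_id": 10, "product_name": "pen", "qty": 1},
]

# 2NF: product_name moves to a table keyed by product_id alone.
products = {r["product_id"]: r["product_name"] for r in order_items}
order_items_2nf = [
    {"order_id": r["order_id"], "product_id": r["product_id"], "qty": r["qty"]}
    for r in order_items
]
# products == {10: 'pen'} -- the name is now stored once, not per order row.
```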
The third normal form (3NF):
- The table must already be in 2NF
- All transitive dependencies must be removed: a non-key attribute must not depend on another non-key attribute
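A sketch of a transitive dependency (the employee data is a made-up example): dept_name depends on dept_id, which in turn depends on the key emp_id. 3NF moves the department name into its own table.

```python
# dept_name depends on dept_id, not directly on the key emp_id:
# a transitive dependency that violates 3NF.
employees = [
    {"emp_id": 1, "dept_id": 5, "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": 5, "dept_name": "Sales"},
]

# 3NF: the department name moves to a table keyed by dept_id.
departments = {r["dept_id"]: r["dept_name"] for r in employees}
employees_3nf = [
    {"emp_id": r["emp_id"], "dept_id": r["dept_id"]} for r in employees
]
# Renaming the Sales department now means updating one row, not every employee.
```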
Overall, normalised data should be used when data integrity is the priority, such as in banking systems. Denormalised data should be used when performance and speed are the main priorities; it is well suited to read-heavy applications such as data warehousing and reporting systems.
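For the read-heavy case above, a common denormalisation technique is to precompute the join once and serve reads from the combined result (the customer and order data here are made-up examples):

```python
# Source tables (normalised): customer names stored once.
customers = {1: "Alice", 2: "Bob"}
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 2, "total": 40.0},
]

# Denormalised report table: the join is done once at write time,
# so each read gets the customer name without a lookup.
report = [
    {**o, "customer_name": customers[o["customer_id"]]} for o in orders
]
```

The trade-off is the one described earlier: reads are faster, but a customer rename now has to be propagated into every report row.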