What is Normalization?

Normalization is the process of organizing data in a database. Normalization includes creating tables and establishing relationships between tables.

Normalization is a refining and optimization of the organization of a database.  In terms of the actual process, normalization is a series of defined steps (forms) undertaken to create a database design permitting efficient access and storage of data in a relational database. The goal is to restructure data so as to reduce data redundancy and inconsistency.  A table in a relational database is in “normal form” if it satisfies specific constraints. Although you can take normalization all the way to a fifth normal form, according to Microsoft Support, most applications don’t require higher than third normal form.

The first normal form includes three steps:  eliminating repeating groups in individual tables, creating a separate table for each set of related data and identifying each set of related data with a primary key. One rule of thumb with the first normal form is to not use multiple fields in a single table to store similar data. Instead of adding a field, which requires program and table modification, it is better to add an additional table and link as required.

The second normal form includes two steps: creating separate tables for sets of values applying to multiple records, relating the tables with a foreign key.

The third normal form involves eliminating fields that do not depend on the key.  Values in a record that are not part of the key for that record should not appear. When the contents of a group of fields potentially apply to more than one record in a table, it is generally a good idea to set up those fields in a separate table.