Normal forms

Udgivet 2016-10-09 - Skrevet af

Normal forms

The data depends on the key [1NF], the whole key [2NF] and nothing but the key [3NF] (so help me Codd).

The three normal forms is a way to eliminate all redundant data from the tables in your database, making it more efficient. The forms are structured such that they build upon each other, meaning that you cannot have 2NF (the second normal form) without also having 1NF. Let's start with the first one.


The first normal form state that there can only be one piece of information per cell in the database tables. An example of a conflict can be seen in the table below (the violation is highlighted):


Here we see that Dwight has two telephone numbers in his telephone number cell, which isn't according to 1NF. It is of course something you could do, but it is a flawed way of thinking and very inefficient.


The second normal form has the requirement of all non-key attributes has to be dependent on the table's primary key. Ask yourself, “Does this column serve to describe what the primary key identifies?”.
If the answer is no, the given attribute has to be moved to another table.

An example of this can be seen here below in the table:

1143223Dunder Mifflin, Scranton
2233123Dunder Mifflin, Scranton
3322115Dunder Mifflin, Utica

Here we see an attribute that does not help to describe the primary key. What we are trying to do with the table is to describe who works where, bringing the employee and the branch together. The reason why we simply do not implement a branchId attribute to the employee table is because some people may be working for several branches, introducing the problem we saw in 1NF.
The table branchEmployees is brought to 2NF by removing the branchName attribute, and leaving that for a table detailing the branches (it can be found via the branchId attribute).


The third normal for is about removing the last of the redundant data. Even though 2NF removed a lot, there is still a bit of redundant data that can slip through. Take a look at the table below and see if you can find redundant data:

1432DwightSchrute202-555-0117Honesdale, PA18431
2331JimHalpert202-555-0165Scranton, PA18447
3221KarenFilippelli202-555-0145Utica, NY13532

The redundancy lies in the city attribute. Even though it is describing what the primary key identifies it is still obsolote, as we already have the postalCode attribute. The city and postalCode attributes can be paired together in another table, thus reducing the redundant data.
So the 3NF redundancy comes from information you can gather from the other attributes. Another example I found online was one where the attributes BMI and overweight was in the same table, but the overweight attribute is irrelevant as you already have the BMI.

I hope you found the guide useful.


Der er ingen kommentarer.

Tilføj kommentar