# Normalization

*Normalization* is the process that improves a database design by generating relations that are in higher normal forms.

## First Normal Form (1 NF)

1 NF is a property of a relation in a relational database. A relation is in first normal form iff the domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain.

A relation in 1 NF must not contain repeating groups of information: no column may hold a list or set of values, and the same kind of fact must not be spread across several columns of one row. Each row should also have a primary key that identifies it uniquely.
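As a minimal sketch of removing a repeating group, the snippet below flattens a list-valued column into one row per value. The table, the column names (`student_id`, `name`, `phones`), and the helper `to_first_normal_form` are hypothetical and chosen only for illustration.

```python
# Hypothetical un-normalized rows: "phones" holds a list, so it is not atomic.
unnormalized = [
    {"student_id": 1, "name": "Asha", "phones": ["555-0100", "555-0101"]},
    {"student_id": 2, "name": "Ben", "phones": ["555-0199"]},
]


def to_first_normal_form(rows):
    """Emit one row per (student, phone) so every attribute value is atomic."""
    flat = []
    for row in rows:
        for phone in row["phones"]:
            flat.append(
                {
                    "student_id": row["student_id"],
                    "name": row["name"],
                    "phone": phone,
                }
            )
    return flat


students_1nf = to_first_normal_form(unnormalized)

# In the flattened table, (student_id, phone) can serve as the primary key.
primary_keys = {(r["student_id"], r["phone"]) for r in students_1nf}
```

Note that the flattened table now repeats `name` for every phone number; that redundancy is what the higher normal forms address.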

## Second Normal Form (2 NF)

2 NF is a normal form used in database normalization. A table that is in 1 NF must meet additional criteria to qualify for 2 NF.

2 NF requires that no column have a partial dependency on the primary key. For a table with a concatenated (composite) primary key, every column that is not part of the key must depend on the entire concatenated key. If any such column depends on only one part of the concatenated key, the table fails 2 NF.
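The partial-dependency check can be sketched with an attribute-closure routine. The enrollment table, its functional dependencies, and the helper names `closure` and `partial_dependencies` are assumptions made for this example; the check also simplifies matters by treating the given primary key as the only candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """List (key_part, attribute) pairs where a non-key attribute
    depends on a proper subset of the composite key."""
    non_key = set(attributes) - set(key)
    found = []
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            for attr in sorted(closure(part, fds) & non_key):
                found.append((part, attr))
    return found


# Hypothetical table: Enrollment(student_id, course_id, student_name, grade)
attributes = {"student_id", "course_id", "student_name", "grade"}
key = {"student_id", "course_id"}
fds = [
    ({"student_id"}, {"student_name"}),        # depends on part of the key
    ({"student_id", "course_id"}, {"grade"}),  # depends on the whole key
]

violations = partial_dependencies(key, attributes, fds)
```

Here `student_name` depends on `student_id` alone, so this table would fail 2 NF until `student_name` is moved into a table keyed by `student_id`.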

## Third Normal Form (3 NF)

A relation is in 3 NF if it is in 2 NF and no non-key attribute is transitively dependent on the primary key.

3 NF requires every non-prime attribute of a table to depend directly on the primary key; in other words, no non-prime attribute may be determined by another non-prime attribute. Any such transitive dependency must be removed from the table, and the table must also be in 2 NF.
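To illustrate removing a transitive dependency, the sketch below assumes a hypothetical employee table in which `emp_id -> dept_id` and `dept_id -> dept_name`. The data and the helper `split_transitive` are invented for the example.

```python
# Hypothetical rows: dept_name is determined by dept_id, not directly by emp_id.
employees = [
    {"emp_id": 1, "name": "Asha", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "name": "Ben", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "name": "Chen", "dept_id": 20, "dept_name": "Support"},
]


def split_transitive(rows):
    """Split rows into an employee table and a department lookup table."""
    employee_rows = [
        {"emp_id": r["emp_id"], "name": r["name"], "dept_id": r["dept_id"]}
        for r in rows
    ]
    departments = {}
    for r in rows:
        # Store each department name once; a mismatch means the
        # original data already contained an update anomaly.
        existing = departments.setdefault(r["dept_id"], r["dept_name"])
        if existing != r["dept_name"]:
            raise ValueError(f"conflicting names for dept {r['dept_id']}")
    return employee_rows, departments


employee_rows, departments = split_transitive(employees)

# Joining the two tables on dept_id reconstructs the original rows.
rejoined = [dict(r, dept_name=departments[r["dept_id"]]) for r in employee_rows]
```

After the split, each department name is stored once, and the join on `dept_id` shows that no information was lost.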

## Boyce-Codd Normal Form (BCNF)

BCNF is a stricter version of 3 NF. It deals with a certain type of anomaly that 3 NF does not handle. A 3 NF table that does not have multiple overlapping candidate keys is guaranteed to be in BCNF. For BCNF, the following conditions must be satisfied:

- Relation R must be in 3 NF
- For each non-trivial functional dependency X -> Y, X must be a super key

Consider the relation R(A, B, C, D) with the following functional dependencies:

A -> BCD

BC -> AD

D -> B

The relation above is already in 3 NF. Its candidate keys are A, BC, and also CD (because D -> B gives CD -> BC), so every attribute is prime. In the functional dependency A -> BCD, A is a super key, and in BC -> AD, BC is also a key. In D -> B, however, D is not a super key, so R violates BCNF.

We can therefore break R into two relations: R1(A, D, C) and R2(D, B).
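The reasoning for R(A, B, C, D) can be checked mechanically. The following is a sketch, not a general-purpose normalizer: the helper names (`closure`, `candidate_keys`, `bcnf_violations`, `decompose`) are assumptions, and the key search is brute force, which is only practical for small relations.

```python
from itertools import combinations

R = frozenset("ABCD")
FDS = [
    (frozenset("A"), frozenset("BCD")),
    (frozenset("BC"), frozenset("AD")),
    (frozenset("D"), frozenset("B")),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(relation, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            attrs = frozenset(combo)
            is_minimal = not any(k <= attrs for k in keys)
            if is_minimal and closure(attrs, fds) == relation:
                keys.append(attrs)
    return keys


def bcnf_violations(relation, fds):
    """Non-trivial dependencies whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != relation
    ]


def decompose(relation, violation):
    """Split on a violating X -> Y into (R minus (Y minus X)) and (X union Y)."""
    lhs, rhs = violation
    return relation - (rhs - lhs), lhs | rhs


keys = candidate_keys(R, FDS)
violations = bcnf_violations(R, FDS)
r1, r2 = decompose(R, violations[0])
```

The key search reports A, BC, and CD as candidate keys, the only violating dependency is D -> B, and splitting on it gives the attribute sets {A, C, D} and {B, D}.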

## Fourth Normal Form (4 NF)

4 NF is the level of database normalization in which every non-trivial multivalued dependency is determined by a super key. It builds on the first three normal forms and BCNF.

An entity must first be in BCNF. If an attribute holds a list of values that is independent of the entity's other attributes, that list must be taken out as a separate entity.
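A small sketch of a 4 NF decomposition, assuming a hypothetical relation in which an employee's skills and languages are independent of each other (employee ->> skill and employee ->> language). The data is invented for illustration.

```python
# Hypothetical (employee, skill, language) rows. Because skills and languages
# are independent, every skill is paired with every language per employee.
rows = {
    ("Asha", "SQL", "English"),
    ("Asha", "SQL", "Hindi"),
    ("Asha", "Python", "English"),
    ("Asha", "Python", "Hindi"),
    ("Ben", "Java", "English"),
}

# Project each independent multivalued fact into its own relation.
emp_skill = {(emp, skill) for emp, skill, _ in rows}
emp_lang = {(emp, lang) for emp, _, lang in rows}

# A natural join on employee reconstructs the original relation.
rejoined = {
    (emp, skill, lang)
    for emp, skill in emp_skill
    for emp2, lang in emp_lang
    if emp == emp2
}
```

The two projections together hold six rows of two columns instead of five rows of three, and adding a new language for Asha becomes a single insert rather than one insert per skill.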

## Fifth Normal Form (5 NF)

5 NF, also known as *project-join normal form* (PJNF), is a level of database normalization designed to reduce redundancy in relational databases that record multivalued facts, by isolating semantically related multiple relationships.

An entity must first be in 4 NF. If a combination of attributes repeats values that can be reconstructed by joining smaller entities, that combination must be taken out into separate entities.
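The following sketch illustrates a join dependency of the kind 5 NF is concerned with, using a hypothetical supplier/part/project relation. The data is chosen so that the relation equals the join of its three pairwise projections, while a join of only two projections produces a spurious row.

```python
# Hypothetical (supplier, part, project) rows.
spj = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three pairwise projections.
sp = {(s, p) for s, p, _ in spj}
pj = {(p, j) for _, p, j in spj}
js = {(j, s) for s, _, j in spj}

# Joining only two projections on part is lossy: it invents a row.
two_way = {(s, p, j) for s, p in sp for p2, j in pj if p == p2}
spurious = two_way - spj

# Filtering by the third projection completes the three-way join.
three_way = {(s, p, j) for s, p, j in two_way if (j, s) in js}
```

For this data the two-way join contains one row that was never recorded, and only the three-way join restores the original relation exactly, which is why the relation can be stored as the three smaller ones.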

