Most of the modern day organization consists of multiple data sources which consists of multiple databases with complex data structures. Its is very essential for an orgnization to break their table complexities and maintain a proper relation between them .Normalization is the process of successively reducing relations with anomalies to produce smaller, well-structured relations between the data in the tables.
It makes no assumptions about how data will be used in displays, queries, or reports. It places no constraints on how data can or should be physically stored or,therefore, on processing performance. Normalization is a logical data modeling technique used to ensure that data are well structured from an organization-wide view. It is a process of organizing the data in database to avoid data redundancy, insertion anomaly, update anomaly & deletion anomaly.
Normalizations consists of set of procedures that eliminates the domains that are non-atomic and redundancy of data that prevents data manipulation and loss of data integrity.
Most commonly used normal forms in normalization are given below.
- First normal form(1NF)
- Second normal form(2NF)
- Third normal form(3NF)
- Boyce & Codd normal form (BCNF)
What are the different types of normalization that exists in the database?
There are 9 normalizations that are used inside the database. These are shown in below table:
|First normal form||This table represents a relation/entities with no repeating groups of attributes|
|Second normal form||Non- prime attributes are not functional dependent on subset of any candidate key|
|Third normal form||In a table every non- prime attribute is non-transitively dependent on every candidate key|
|Elementary key normal form||superkey dependency or elementary key dependency effects the functional dependency in a table|
|Boyce codd normal form||every non-trivial functional dependency in the table is dependent on superkey|
|Fourth normal form||Every non-trivial multivalued dependency in the table is a dependent on a superkey|
|Fifth normal form (5NF)||Every non-trivial join dependency in the table is implied by the superkeys of the table|
|Domain/key normal form (DKNF)||Every constraint on the table is a logical consequence of the table’s domain constraints and key constraints|
|Sixth normal form (6NF)||Table features no non-trivial join dependencies at all|
How de-normalization is different from normalization?
- Analytical processing databases are not very normalized. The operations which are used are read most databases.
- It is used to extract the data that are ancient and accumulated over long period of time. For this purpose de-normalization occurs that provide smart business applications.
- Dimensional tables in star schema are good example of de-normalized data.
- The de-normalized form must be controlled while extracting, transforming, loading and processing.
- There should be constraint that user should not be allowed to view the state till it is consistent.
- It is used to increase the performance on many systems without RDBMS platform.
What is the type of de-normalization?
Non-first normal form (NFA)
– It describes the definition of the database design which is different from the first normal form.
- It keeps the values in structured and specialized types with their own domain specific languages.
- The query language used in this is extended to incorporate more support for relational domain values by adding more operators.
It is a type of database which is optimized for data warehouse and online analytical processing(OLAP) applications. This database are normally created from the inputs obtained from existing relational databases.
Relational database can be accessed using a SQL query whereas a multidimensional database allows a user to answer analytical question from those relational database.Multidimensional database is mainly used to summarize business operations and trends.
Advantage of Normalization in Tables
- Size of the database is decreased as duplicate data is eliminated by normalization
- Performance is improved because of the fine-tuned tables with small size
- Decrease in column size will lead to fewer indexes in a table which makes it easier for maintenance tasks such as index rebuilding
- Joins are needed when data is needed across the tables
Disadvantage of Normalization
- As data are spread out throughout the database there are more tables to join
- As tables does not contain duplicate data ,joins are required to get the data across the tables .This increases the complexity of the SQL queries making it more slower to read from database.
Jeffrey A. Hoffer, Ramesh Venkataraman, and Heikki Topi. 2010. Modern Database Management (10th ed.). Prentice Hall Press, Upper Saddle River, NJ, USA.