Top Qs
Timeline
Chat
Perspective
Third normal form
Level of database normalization From Wikipedia, the free encyclopedia
Remove ads
Third normal form (3NF) is a level of database normalization defined by English computer scientist Edgar F. Codd. A relation (or table, in SQL) is in third normal form if it is in second normal form and also lacks non-key dependencies, meaning that no non-prime attribute is functionally dependent on (that is, contains a fact about) any other non-prime attribute. In other words, each non-prime attribute must depend solely and non-transitively on each candidate key.[1] William Kent summarised 3NF with the dictum that "a non-key field must provide a fact about the key, the whole key, and nothing but the key".[2][citation needed]
An example of a violation of 3NF would be a Patient relation with the attributes PatientID, DoctorID and DoctorName, in which DoctorName would depend first and foremost on DoctorID and only transitively on the key, PatientID (via DoctorID's dependency on PatientID). Such a design would cause a doctor's name to be redundantly duplicated across each of their patients. A database compliant with 3NF would store doctors' names in a separate Doctor relation which Patient could reference via a foreign key.
3NF was defined, along with 2NF (which forbids dependencies on proper subsets of composite keys), in Codd's paper "Further Normalization of the Data Base Relational Model" in 1971,[3] which came after 1NF's definition in "A Relational Model of Data for Large Shared Data Banks" in 1970.[citation needed] 3NF was itself followed by the definition of Boyce–Codd normal form in 1974, which seeks to prevent anomalies possible in relations with several overlapping composite keys.
Remove ads
Definition of third normal form
Summarize
Perspective
Codd's definition states that a relation R is in 3NF if and only if it is in second normal form (2NF) and every non-prime attribute of R is non-transitively dependent on each candidate key. A non-prime attribute of R is an attribute that does not belong to any candidate key of R.[4]
Codd defines a transitive dependency of an attribute set Z on an attribute set X as a functional dependency chain X → Y → Z that must be satisfied for some attribute set Y, where it is not the case that Y → X, and all three sets must be disjoint.[5]
A 3NF definition that is equivalent to Codd's, but expressed differently, was given by Carlo Zaniolo in 1982. This definition states that a table is in 3NF if and only if for each of its functional dependencies X → Y, at least one of the following conditions holds:[6][7][need quotation to verify]
- X contains Y (that is, Y is a subset of X, meaning X → Y is a trivial functional dependency),
- X is a superkey,
- every element of Y \ X, the set difference between Y and X, is a prime attribute (i.e., each attribute in Y \ X is contained in some candidate key).
To rephrase Zaniolo's definition more simply, the relation is in 3NF if and only if for every non-trivial functional dependency X → Y, X is a superkey or Y \ X consists of prime attributes. Zaniolo's definition gives a clear sense of the difference between 3NF and the more stringent Boyce–Codd normal form (BCNF). BCNF simply eliminates the third alternative ("Every element of Y \ X, the set difference between Y and X, is a prime attribute.").
The definition offered by Zaniolo can be shown to be equivalent to the Codd definition in the following way: let X → A be a nontrivial functional dependency (i.e., one where X does not contain A) and let A be a non-prime attribute. Also let Y be a candidate key of R. Then Y → X. Therefore, A is not transitively dependent on Y if and only if there is a functional dependency X → Y, i.e., if and only if X is a superkey of R.
Remove ads
Example
Summarize
Perspective
Design which violates 3NF
The following relation, with the composite key {Name, Year}, fails to meet the requirements of 3NF. The non-prime attributes WinnerName and WinnerBirthdate are only transitively dependent on the composite key via their dependence on the non-prime attribute WinnerID. This creates redundancy and the potential for inconsistency in the case that a winner of multiple tournaments is accidentally given different dates of birth in different tuples.
Design which complies with 3NF
To bring the relation into compliance with 3NF, WinnerID, WinnerName and WinnerBirthdate can be transferred to a separate table.
Tournament's WinnerID attribute now acts as a foreign key referencing the primary key of Winner. Unlike before, it is not possible for a winner to be associated with multiple dates of birth.
Remove ads
"Nothing but the key"
Summarize
Perspective
A paraphrase of Codd's definition of 3NF parodying the traditional oath to tell the truth in a court of law was given by William Kent: "a non-key field must provide a fact about the key, the whole key, and nothing but the key".[2] Requiring that non-key attributes be dependent on "the whole key" ensures compliance with 2NF, and further requiring their dependency on "nothing but the key" ensures compliance with 3NF. A common variation supplements the paraphrase with the addendum "so help me Codd".[8]
While the phrase is a useful mnemonic, the mention of only a single key makes fulfilling it necessary but not sufficient to satisfy 2NF and 3NF, both of which are concerned with all candidate keys of a relation and not just any one.[citation needed]
Christopher J. Date notes that, adapted to refer to all fields rather than just non-key fields, the summary can also encompass the slightly stronger Boyce–Codd normal form, in which prime attributes must not be functionally dependent at all.[9] Prime attributes are considered to provide a fact about the key in the sense of providing part or all of the key itself. (This rule applies only to functionally dependent attributes, as applying it to all attributes would implicitly prohibit composite keys, since each part of any such key would violate the "whole key" clause.)
Computation
|  | This section may be confusing or unclear to readers.  (July 2025) | 
A relation can always be decomposed in third normal form, that is, the relation R is rewritten to projections R1, ..., Rn whose join is equal to the original relation. Further, this decomposition does not lose any functional dependency, in the sense that every functional dependency on R can be derived from the functional dependencies that hold on the projections R1, ..., Rn. What is more, such a decomposition can be computed in polynomial time.[10]
To decompose a relation into 3NF from 2NF, break the table into the canonical cover functional dependencies, then create a relation for every candidate key of the original relation which was not already a subset of a relation in the decomposition.[11]
Remove ads
Considerations for use in reporting environments
|  | This section may be confusing or unclear to readers.  (July 2025) | 
While 3NF was ideal for machine processing, the segmented nature of the data model can be difficult to intuitively consume by a human user. Analytics via query, reporting, and dashboards were often facilitated by a different type of data model that provided pre-calculated analysis such as trend lines, period-to-date calculations (month-to-date, quarter-to-date, year-to-date), cumulative calculations, basic statistics (average, standard deviation, moving averages) and previous period comparisons (year ago, month ago, week ago) e.g. dimensional modeling and beyond dimensional modeling, flattening of stars via Hadoop and data science.[12][13] Hadley Wickham's "tidy data" framework is 3NF, with "the constraints framed in statistical language".[14]
Remove ads
See also
- Attribute-value system
- First normal form (1NF)
- Second normal form (2NF)
- Boyce–Codd normal form (BCNF or 3.5NF)
- Fourth normal form (4NF)
- Fifth normal form (5NF)
- Sixth normal form (6NF)
References
Further reading
External links
Wikiwand - on
Seamless Wikipedia browsing. On steroids.
Remove ads