We left our DOG table in the 1NF because I wanted to illustrate how to eliminate transitive dependencies (i.e., bring it to the third normal form). Here is our flawed table for reference:
DOG Table
| Dog | First Name | Last Name | Sex | Kennel | Street | City | State | ZIP | Phone |
|---|---|---|---|---|---|---|---|---|---|
| 3 | Tuck | Yang | M | Ruff Haus | 1200 Frank St. | Chattanooga | TN | 37411 | 423-820-2412 |
| 12 | Nip | Yin | M | K9HQ | 4051 Lewis St. | Chattanooga | TN | 37412 | 423-833-4122 |
| 13 | Shadow | Muttley | U | K9HQ | 4051 Lewis St. | Chattanooga | TN | 37412 | 615-833-4122 |
The problem is with the kennel information. Specifically, if we delete Tuck's record, we lose all the information for the Ruff Haus kennel. In our scenario that's especially bad because we want to send advertising mailouts to all area kennels to increase our business. That certainly includes kennels for which we have no clients! In simple terms, the Ruff Haus information DEPENDS on Tuck's record: it's a "transitive dependency."
We eliminate transitive dependencies the same way we've solved the other problems: split it into another table. The 3NF also solves some 2NF problems as well. First, data consistency is maintained. The phone number for K9HQ, for example, only has to be maintained in one location. This demonstrates the second advantage: maintenance efficiency: all dog records will automatically refer to the correct phone number. In a large database, this means one update instead of thousands or millions. Also, consider the maintenance impact if we added Fax Number for our kennel customers!
Third, we realize a tremendous space savings. Kennel information could be considerably more extensive. One copy isn't a burden to store, but if it were stored with every dog record, its impact is immense on a large database. An unnoralized database can dramatically increase your disk storage costs and it can also have an unbelievable impact on performance! Now let's look at how DOG table is split to conform to the 3NF:
DOG Table| Dog | First Name | Last Name | Sex | Kennel |
|---|---|---|---|---|
| 3 | Tuck | Yang | M | 1 |
| 12 | Nip | Yin | M | 2 |
| 13 | Shadow | Muttley | U | 2 |
| Kennel | Name | Street | City | State | ZIP | Phone |
|---|---|---|---|---|---|---|
| 1 | Ruff Haus | 1200 Frank St. | Chattanooga | TN | 37411 | 423-820-2412 |
| 2 | K9HQ | 4051 Lewis St. | Chattanooga | TN | 37412 | 423-833-4122 |
Drum roll please! Here it is, our scenario database in the third normal form:
Now consider that this is a very simplistic database! An application such as PeopleSoft's HRMS/Payroll system is composed of over 3,000 tables. The ERD for such a system won't even fit on a large poster — it's more like an entire wall! But normalization is worth it. The 3NF gives us amazing query flexibility, an environment that promotes data integrity/consistency, and reduces storage requirements. If some ERD's seem complex, it is because the 3NF can represent extremely complex real-world relationships. Only the 3NF can model such complexity, such as a payroll system, with accuracy and efficiency. But there is a dark side to normalization, and that's our next topic.