Informal Design Guidelines for Relational Schema
Informal Design Guidelines for Relational Schema
Before introducing functional dependencies and normalization, relational database design uses informal guidelines to judge whether a schema is good or bad. These guidelines help detect design problems early, such as redundancy, anomalies, NULLs, and spurious tuples.
There are four main informal design guidelines.
Guideline 1: Impart Clear Semantics to Attributes
Meaning
A relation schema should have clear and unambiguous meaning.
Each relation should represent only one entity type or one relationship type.
Why this matters
-
Users must easily understand what each tuple represents
-
Queries become easier and less error-prone
-
Prevents mixing unrelated real-world concepts
Good Design Example
-
EMPLOYEE(Ename, Ssn, Bdate, Address, Dnumber)-
Each tuple = one employee
-
-
DEPARTMENT(Dname, Dnumber, Dmgr_ssn)-
Each tuple = one department
-
These relations are easy to explain and interpret.
Bad Design Example (Violation)
-
EMP_DEPT(Ename, Ssn, Bdate, Address, Dnumber, Dname, Dmgr_ssn)-
Mixes employee attributes and department attributes
-
A tuple no longer represents a single real-world object clearly
-
Guideline Statement
Do not combine attributes from multiple entity types or relationship types into one relation.
Guideline 2: Reduce Redundant Information and Update Anomalies
Meaning
Avoid storing the same information repeatedly in multiple tuples.
Why this matters
Redundancy leads to update anomalies, which are serious consistency problems.
Types of Update Anomalies
1. Insertion Anomaly
-
Cannot insert one fact without inserting another unrelated fact
-
Example:
-
Cannot insert a new department unless at least one employee exists
-
Must repeat department data for every employee
-
2. Deletion Anomaly
-
Deleting one tuple unintentionally removes important information
-
Example:
-
Deleting the last employee of a department deletes the department itself
-
3. Modification Anomaly
-
Same fact stored multiple times must be updated everywhere
-
Example:
-
Changing a department manager requires updating many employee tuples
-
Missing one update → inconsistent database
-
Good Design Solution
-
Store department data once in
DEPARTMENT -
Reference it using a foreign key in
EMPLOYEE
Guideline Statement
Design base relations to avoid insertion, deletion, and modification anomalies.
Guideline 3: Minimize NULL Values
Meaning
Avoid relations where many attributes frequently contain NULL values.
Why this matters
-
Wastes storage space
-
Complicates query processing
-
Makes JOIN and aggregation results unpredictable
-
NULLs have multiple meanings, such as:
-
Not applicable
-
Unknown
-
Not yet recorded
-
Using a single NULL symbol for all meanings causes confusion.
Bad Design Example
-
EMPLOYEE(..., Office_number)-
Only 15% of employees have offices
-
85% of tuples contain NULL → poor design
-
Good Design Alternative
-
Create a separate relation:
Only employees with offices appear in this table.
Guideline Statement
Avoid placing attributes in a relation when they frequently contain NULL values.
Guideline 4: Avoid Spurious Tuples
Meaning
Relations should be designed so that joining them reconstructs the original information correctly.
Spurious Tuples
-
Incorrect tuples generated during JOIN operations
-
Represent false or meaningless information
Cause of Spurious Tuples
-
Joining relations on attributes that are not primary key–foreign key pairs
-
Example:
-
Joining on
Plocationinstead of(Pnumber, Ssn)
-
Bad Decomposition Example
Splitting:
-
NATURAL JOIN produces extra invalid tuples
-
Original data cannot be recovered
Good Design Principle
-
Joins should be based on:
-
Primary key ↔ Foreign key
-
-
Ensures lossless (nonadditive) join
Guideline Statement
Design relations so that valid joins do not generate spurious tuples.
Summary of Informal Design Guidelines
| Guideline | Goal |
|---|---|
| Guideline 1 | Clear semantics |
| Guideline 2 | Eliminate redundancy & anomalies |
| Guideline 3 | Reduce NULL values |
| Guideline 4 | Prevent spurious tuples |
Why These Guidelines Are Important
These guidelines:
-
Motivate functional dependencies
-
Lead to normal forms (1NF, 2NF, 3NF, BCNF)
-
Provide intuition behind normalization
-
Help designers detect bad schemas before implementation
Conclusion
Informal design guidelines help evaluate the quality of relational schemas by ensuring clear semantics, minimizing redundancy and NULL values, avoiding update anomalies, and preventing spurious tuples. These guidelines form the foundation for the formal theory of functional dependencies and normalization.
Comments
Post a Comment