Binary Relational Operations: JOIN
Binary Relational Operations: JOIN
Binary relational operations operate on two relations at a time. Among these, JOIN is the most important and most frequently used operation in relational algebra because it allows us to combine related data stored in different relations.
1. JOIN Operation
Definition
The JOIN operation combines tuples from two relations into single, longer tuples whenever a specified join condition is satisfied.
Symbolically:
JOIN is essential because real-world databases store related information in multiple tables, and JOIN allows us to reconstruct meaningful information from them.
2. Motivation for JOIN
In a relational database:
-
Information is distributed across relations
-
Relationships are represented using primary key–foreign key pairs
To answer queries that involve relationships (for example, “Who is the manager of each department?”), we must combine related tuples from different relations—this is exactly what JOIN does.
3. Example: Department Managers
Relations Involved
-
DEPARTMENT(Dname, Dnumber, Mgr_ssn, Mgr_start_date)
-
EMPLOYEE(Fname, Minit, Lname, Ssn, …)
Here:
-
Mgr_ssnin DEPARTMENT is a foreign key -
Ssnin EMPLOYEE is a primary key
Relational Algebra
Meaning
-
Combine each department with the employee who manages it
-
Then project only the department name and manager’s name
This produces a single relation showing:
Department name → Manager’s first and last name
4. JOIN vs CARTESIAN PRODUCT
CARTESIAN PRODUCT
-
Combines every tuple of one relation with every tuple of the other
-
Often produces meaningless results
-
Must be followed by SELECT to filter matching tuples
JOIN
-
Combines only those tuples that satisfy the join condition
-
Conceptually equivalent to:
JOIN exists because this pattern is extremely common in database queries.
5. General Form of JOIN
For relations:
Result
-
A relation with n + m attributes
-
One tuple for each pair of tuples from R and S that satisfies the join condition
6. Theta Join (θ-JOIN)
Definition
A Theta Join is a JOIN that uses any comparison operator:
Join Condition Format
Example
Key Points
-
Most general form of JOIN
-
Tuples with NULL join attributes or false conditions are excluded
-
JOIN does not preserve all tuples from the original relations
7. Equijoin
Definition
An Equijoin is a Theta Join that uses only equality (=) as the comparison operator.
Example
Observation
-
Both join attributes appear in the result
-
Their values are identical in every tuple
-
This redundancy leads to the NATURAL JOIN
*8. Natural Join (⨝ or )
Definition
A Natural Join:
-
Is an Equijoin
-
Automatically joins relations on attributes with the same name
-
Removes duplicate join attributes from the result
Symbol:
Example 1: PROJECT and DEPARTMENT
If:
-
PROJECT(Dnum, …) -
DEPARTMENT(Dnumber, …)
First rename:
Result
-
Each project is combined with its controlling department
-
Only one Dnum attribute appears in the result
Example 2: DEPARTMENT and DEPT_LOCATIONS
If both relations already share Dnumber:
-
No renaming required
-
Produces one tuple per department location
9. Properties of JOIN
-
Result size is between:
-
If no join condition → degenerates into CARTESIAN PRODUCT
-
Join selectivity measures how many tuple combinations satisfy the condition
-
JOINs discussed here are inner joins
10. Multiway JOIN
JOINs can be chained to combine more than two relations.
Example: Project–Department–Manager
Result
Each tuple contains:
-
Project details
-
Controlling department
-
Department manager information
11. JOIN in SQL (Conceptual Mapping)
JOIN in relational algebra corresponds to:
-
JOIN conditions in WHERE clause
-
JOIN … ON syntax (SQL2)
-
Nested queries and joined tables
Relational algebra provides the formal foundation, while SQL provides the practical implementation.
Summary
| JOIN Type | Key Idea |
|---|---|
| Theta Join | Any comparison operator |
| Equijoin | Equality condition only |
| Natural Join | Equijoin + remove duplicate attributes |
| Inner Join | Only matching tuples included |


Comments
Post a Comment