Binary Relational Operations: JOIN

 

Binary Relational Operations: JOIN

Binary relational operations operate on two relations at a time. Among these, JOIN is the most important and most frequently used operation in relational algebra because it allows us to combine related data stored in different relations.


1. JOIN Operation

Definition

The JOIN operation combines tuples from two relations into single, longer tuples whenever a specified join condition is satisfied.

Symbolically:

R  <join condition>  SR \;\bowtie_{<join\ condition>}\; S

JOIN is essential because real-world databases store related information in multiple tables, and JOIN allows us to reconstruct meaningful information from them.


2. Motivation for JOIN

In a relational database:

  • Information is distributed across relations

  • Relationships are represented using primary key–foreign key pairs

To answer queries that involve relationships (for example, “Who is the manager of each department?”), we must combine related tuples from different relations—this is exactly what JOIN does.


3. Example: Department Managers

Relations Involved

  • DEPARTMENT(Dname, Dnumber, Mgr_ssn, Mgr_start_date)

  • EMPLOYEE(Fname, Minit, Lname, Ssn, …)

Here:

  • Mgr_ssn in DEPARTMENT is a foreign key

  • Ssn in EMPLOYEE is a primary key

Relational Algebra

DEPT_MGR ← DEPARTMENT ⋈Mgr_ssn = Ssn EMPLOYEE RESULT ← πDname, Lname, Fname(DEPT_MGR)

Meaning

  • Combine each department with the employee who manages it

  • Then project only the department name and manager’s name

This produces a single relation showing:

Department name → Manager’s first and last name



4. JOIN vs CARTESIAN PRODUCT

CARTESIAN PRODUCT

  • Combines every tuple of one relation with every tuple of the other

  • Often produces meaningless results

  • Must be followed by SELECT to filter matching tuples

JOIN

  • Combines only those tuples that satisfy the join condition

  • Conceptually equivalent to:

RconditionSσcondition(R×S)R \bowtie_{condition} S \equiv \sigma_{condition}(R \times S)

JOIN exists because this pattern is extremely common in database queries.


5. General Form of JOIN

For relations:

  • R(A1,A2,,An)R(A_1, A_2, …, A_n)

  • S(B1,B2,,Bm)S(B_1, B_2, …, B_m)

R<join condition>SR \bowtie_{<join\ condition>} S

Result

  • A relation with n + m attributes

  • One tuple for each pair of tuples from R and S that satisfies the join condition


6. Theta Join (θ-JOIN)

Definition

A Theta Join is a JOIN that uses any comparison operator:

θ{=, <, , >, , }\theta \in \{=,\ <,\ \le,\ >,\ \ge,\ \neq\}

Join Condition Format

Ai  θ  BjA_i \;\theta\; B_j

Example

EMPLOYEE ⋈Salary > Budget DEPARTMENT

Key Points

  • Most general form of JOIN

  • Tuples with NULL join attributes or false conditions are excluded

  • JOIN does not preserve all tuples from the original relations


7. Equijoin

Definition

An Equijoin is a Theta Join that uses only equality (=) as the comparison operator.

RA=BSR \bowtie_{A = B} S

Example

DEPARTMENT ⋈Mgr_ssn = Ssn EMPLOYEE

Observation

  • Both join attributes appear in the result

  • Their values are identical in every tuple

  • This redundancy leads to the NATURAL JOIN


*8. Natural Join (⨝ or )

Definition

A Natural Join:

  • Is an Equijoin

  • Automatically joins relations on attributes with the same name

  • Removes duplicate join attributes from the result

Symbol:

RSR * S

Example 1: PROJECT and DEPARTMENT

If:

  • PROJECT(Dnum, …)

  • DEPARTMENT(Dnumber, …)

First rename:

DEPT ← ρ(Dname, Dnum, Mgr_ssn, Mgr_start_date)(DEPARTMENT) PROJ_DEPT ← PROJECT * DEPT

Result

  • Each project is combined with its controlling department

  • Only one Dnum attribute appears in the result


Example 2: DEPARTMENT and DEPT_LOCATIONS

If both relations already share Dnumber:

DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
  • No renaming required

  • Produces one tuple per department location






9. Properties of JOIN

  • Result size is between:

    0RSR×S0 \le |R \bowtie S| \le |R| \times |S|
  • If no join condition → degenerates into CARTESIAN PRODUCT

  • Join selectivity measures how many tuple combinations satisfy the condition

  • JOINs discussed here are inner joins


10. Multiway JOIN

JOINs can be chained to combine more than two relations.

Example: Project–Department–Manager

((PROJECT ⋈Dnum=Dnumber DEPARTMENT) ⋈Mgr_ssn=Ssn EMPLOYEE))

Result

Each tuple contains:

  • Project details

  • Controlling department

  • Department manager information


11. JOIN in SQL (Conceptual Mapping)

JOIN in relational algebra corresponds to:

  • JOIN conditions in WHERE clause

  • JOIN … ON syntax (SQL2)

  • Nested queries and joined tables

Relational algebra provides the formal foundation, while SQL provides the practical implementation.


Summary 

JOIN Type        Key Idea
Theta Join                    Any comparison operator
Equijoin                    Equality condition only
Natural Join                    Equijoin + remove duplicate attributes
Inner Join                    Only matching tuples included

Comments

Popular posts from this blog

Database Management Systems DBMS PCCST402 Semester 4 KTU CS 2024 Scheme

Data Models, Schemas and Instances

Introduction to Database Management System -DBMS