Introduction to Concurrency Control in DBMS

In a Database Management System (DBMS), multiple users may access and update the database simultaneously. When several transactions execute at the same time, their operations may interfere with each other, potentially causing incorrect or inconsistent results.

Concurrency control is the mechanism used by a DBMS to manage simultaneous execution of transactions so that the database remains consistent and correct.

Its main goal is to ensure the isolation property of transactions and to guarantee serializability of schedules.


1. What is Concurrency Control?

Concurrency control refers to:

The techniques used by a DBMS to coordinate the execution of multiple transactions so that they do not interfere with each other and the database remains consistent.

When transactions run concurrently, their operations may be interleaved, forming different schedules. Concurrency control ensures that these schedules are equivalent to some serial schedule, meaning they produce correct results.


2. Why Concurrency Control is Needed

Without concurrency control, several problems may occur.

1. Lost Update Problem

Occurs when two transactions update the same data item and one update overwrites the other.

Example:

T1: Read X = 100
T2: Read X = 100
T1: X = X + 10 → Write X = 110
T2: X = X + 20 → Write X = 120

The final value becomes 120, but the correct value should be 130, since both updates ought to apply.
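The interleaving above can be traced in a short sketch (plain Python with illustrative variable names, not real DBMS code):

```python
# Minimal sketch of the lost update problem: both transactions read the
# same initial value of X, then each writes back its own result, so the
# second write silently discards the first transaction's update.
def lost_update_demo():
    X = 100
    t1_read = X           # T1: Read X = 100
    t2_read = X           # T2: Read X = 100
    X = t1_read + 10      # T1: Write X = 110
    X = t2_read + 20      # T2: Write X = 120 (T1's update is lost)
    return X

print(lost_update_demo())  # → 120, not the correct 130
```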


2. Dirty Read Problem

Occurs when a transaction reads data written by another transaction that has not yet committed.

Example:

T1: Write X = 200
T2: Read X = 200
T1: Abort

T2 used invalid data.
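The same schedule can be traced step by step in a small sketch (illustrative variables only; a real DBMS would restore the old value from its log):

```python
# Sketch of a dirty read: T2 reads T1's uncommitted write, then T1
# aborts and the old value is restored, leaving T2 holding invalid data.
X = 100
undo = X           # old value saved so the abort can restore it
X = 200            # T1: Write X = 200 (uncommitted)
t2_value = X       # T2: Read X = 200  (dirty read)
X = undo           # T1: Abort → X restored to 100
print(t2_value, X) # T2 acted on 200, but the database holds 100
```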


3. Unrepeatable Read

Occurs when a transaction reads the same data item twice and gets different values because another transaction modified it.


4. Incorrect Summary Problem

Occurs when a transaction calculates an aggregate (like SUM) while another transaction updates the data simultaneously.


3. Objectives of Concurrency Control

Concurrency control mechanisms aim to achieve the following goals:

1. Isolation

Each transaction should execute as if it were the only transaction running.


2. Serializability

Concurrent execution must produce results equivalent to a serial execution.


3. Database Consistency

The database must move from one consistent state to another.


4. Maximum Concurrency

The system should allow many transactions to execute simultaneously to improve performance.


4. Concurrency Control Protocols

A protocol is a set of rules that transactions must follow to ensure correct execution.

Most concurrency control protocols guarantee serializability of schedules.

The main categories of concurrency control techniques are:

  1. Lock-Based Protocols

  2. Timestamp-Based Protocols

  3. Multiversion Concurrency Control

  4. Validation (Optimistic) Protocols

  5. Snapshot Isolation


5. Lock-Based Concurrency Control

One of the most common techniques used in DBMS is locking.

A lock is applied to a data item to control access by transactions.

Before accessing a data item, a transaction must obtain a lock on that item.


Types of Locks

Lock Type           Purpose
Shared Lock (S)     Used for reading data
Exclusive Lock (X)  Used for writing data
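The rule behind these two lock types is a compatibility check: a requested lock is granted only if it is compatible with every lock already held on the item. A minimal sketch (illustrative names, not a real lock manager):

```python
# Shared/exclusive lock compatibility: only S+S can coexist.
COMPATIBLE = {
    ("S", "S"): True,   # two readers may share a data item
    ("S", "X"): False,  # a reader blocks a writer
    ("X", "S"): False,  # a writer blocks a reader
    ("X", "X"): False,  # two writers always conflict
}

def can_grant(requested, held_locks):
    # Grant only if the request is compatible with every held lock.
    return all(COMPATIBLE[(held, requested)] for held in held_locks)

print(can_grant("S", ["S", "S"]))  # True: shared locks coexist
print(can_grant("X", ["S"]))       # False: exclusive conflicts with shared
```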

Two-Phase Locking (2PL)

A widely used locking protocol is Two-Phase Locking.

It has two phases:

  1. Growing Phase – transaction acquires locks

  2. Shrinking Phase – transaction releases locks

This protocol guarantees conflict serializability.

However, locking protocols may cause high overhead and deadlocks.
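The two-phase discipline can be sketched as a rule enforced inside one transaction (a hypothetical `TwoPhaseTransaction` class for illustration, not a real DBMS API):

```python
# Sketch of two-phase locking: once a transaction releases any lock
# (the shrinking phase begins), it may not acquire new locks.
class TwoPhaseTransaction:
    def __init__(self):
        self.held = set()
        self.shrinking = False  # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot lock after unlocking")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True   # entering the shrinking phase
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("X"); t.lock("Y")   # growing phase
t.unlock("X")              # shrinking phase begins
try:
    t.lock("Z")            # illegal under 2PL
except RuntimeError as e:
    print(e)
```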


6. Timestamp-Based Concurrency Control

Another method uses timestamps.

A timestamp is a unique identifier assigned to each transaction when it starts.

Properties:

  • Generated in the order of transaction start times

  • Used to determine execution order

The system orders conflicting operations by transaction timestamp; an operation that would violate this order causes its transaction to roll back and restart.

Advantages:

  • Avoids deadlocks

  • Simple concept

Disadvantages:

  • May cause frequent rollbacks
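The basic timestamp-ordering checks can be sketched as follows. Each data item records the largest timestamps that have read and written it, and an operation arriving "too late" forces a rollback (illustrative `check_read`/`check_write` helpers, not a real DBMS interface):

```python
# Basic timestamp ordering: reject operations that arrive out of
# timestamp order relative to the item's recorded read/write history.
def check_read(ts, item):
    if ts < item["write_ts"]:
        return "rollback"            # T would read an overwritten value
    item["read_ts"] = max(item["read_ts"], ts)
    return "ok"

def check_write(ts, item):
    if ts < item["read_ts"] or ts < item["write_ts"]:
        return "rollback"            # T's write arrives too late
    item["write_ts"] = ts
    return "ok"

X = {"read_ts": 0, "write_ts": 0}
print(check_write(5, X))   # ok: T5 writes X
print(check_read(3, X))    # rollback: T3 started before T5's write
```

The rollback on the late read is exactly the "frequent rollbacks" drawback noted above: older transactions are aborted rather than allowed to see out-of-order values.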


7. Multiversion Concurrency Control (MVCC)

This technique maintains multiple versions of a data item.

Instead of overwriting data, the system creates a new version.

Transactions can read older versions while updates create new versions.

Advantages:

  • High concurrency

  • Readers do not block writers

Example protocols include:

  • Multiversion Timestamp Ordering

  • Multiversion Two-Phase Locking

Many modern DBMSs use MVCC.
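The core idea can be sketched in a few lines: writes append a new (timestamp, value) version instead of overwriting, and a reader sees the newest version written at or before its own timestamp (illustrative `mv_read`/`mv_write` helpers):

```python
# Sketch of multiversion reads over an append-only version list.
versions = []  # list of (write_timestamp, value), kept in timestamp order

def mv_write(ts, value):
    versions.append((ts, value))   # new version, old ones are preserved

def mv_read(ts):
    # Return the newest version visible to a reader at this timestamp.
    visible = [v for (wts, v) in versions if wts <= ts]
    return visible[-1] if visible else None

mv_write(1, 100)
mv_write(5, 200)
print(mv_read(3))   # 100: the reader at ts=3 cannot see the ts=5 version
print(mv_read(7))   # 200
```

This is why readers do not block writers: the write at ts=5 proceeds without waiting for the reader at ts=3, which simply keeps seeing the older version.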


8. Optimistic Concurrency Control (Validation-Based)

This method assumes that conflicts are rare.

Transactions execute without locking resources initially.

At the end of execution, the system performs validation to check if conflicts occurred.

Phases:

  1. Read Phase

  2. Validation Phase

  3. Write Phase

If validation fails, the transaction is rolled back.

This approach is called optimistic because it assumes conflicts will not occur frequently.
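One simplified form of the validation phase (backward validation) can be sketched as follows: a committing transaction is rejected if any transaction that committed after it started wrote an item that it read (illustrative `validate` helper and data shapes):

```python
# Sketch of backward validation in optimistic concurrency control.
def validate(read_set, start_ts, committed):
    # committed: list of (commit_timestamp, write_set) of finished transactions
    for commit_ts, write_set in committed:
        if commit_ts > start_ts and write_set & read_set:
            return False  # conflict: roll back and retry
    return True

committed = [(10, {"X"}), (20, {"Y"})]
print(validate({"X"}, 5, committed))   # False: X was written after ts=5
print(validate({"Z"}, 5, committed))   # True: no overlap with Z
```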


9. Snapshot Isolation

Snapshot isolation allows each transaction to work with a snapshot (copy) of the database at a specific time.

Characteristics:

  • Transactions see consistent database snapshots

  • Updates create new versions

Advantages:

  • High concurrency

  • Lower overhead compared to locking

Many widely used DBMSs (such as PostgreSQL and Oracle) implement snapshot-based techniques.
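The read behavior can be sketched very simply: a transaction copies the committed state at its start and keeps reading from that copy, so a concurrent update is not visible inside the transaction (a toy in-memory sketch; real systems use versioning, not a full copy):

```python
# Sketch of snapshot reads: the transaction's view is fixed at start time.
database = {"X": 100}

def begin():
    return dict(database)   # the transaction's private snapshot

snap = begin()
database["X"] = 999         # another transaction commits an update
print(snap["X"])            # 100: the snapshot is unaffected
```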


10. Granularity of Data Items

Concurrency control also depends on granularity, which refers to the size of the data item being locked or controlled.

Granularity levels may include:

Level    Example
Attribute level        Individual field
Record level        One tuple
Block level        Disk block
File level        Entire table
Database level        Whole database

Smaller granularity → higher concurrency but more overhead.

Larger granularity → lower overhead but reduced concurrency.


11. Concurrency Control with Indexes

Indexes introduce additional challenges because:

  • Multiple transactions may access index structures

  • Updates to indexes must remain consistent with the database

Special concurrency control techniques are used to maintain index consistency.


12. Importance of Concurrency Control

Concurrency control is essential because it:

  • Maintains data consistency

  • Prevents transaction interference

  • Improves system performance

  • Ensures correct concurrent execution

  • Supports ACID properties of transactions


Summary

Concurrency control is a critical component of transaction management in DBMS. It ensures that multiple transactions can execute concurrently without violating database consistency.

Main techniques include:

Technique              Main Idea
Lock-Based Protocols   Use locks to control access to data
Timestamp Ordering     Execute transactions based on timestamps
Multiversion Control   Maintain multiple versions of data
Validation Protocols   Validate transactions before committing
Snapshot Isolation     Transactions read consistent snapshots

These techniques help maintain serializability, isolation, and correct database behavior during concurrent transaction execution.
