UNIT 2- TRANSACTION CONCEPTS AND CONCURRENCY CONCEPTS (1).pdf
1. Unit 2
Transaction Concepts and Concurrency Control
By Kavita Shinde
Asst. Professor
Computer Science
MIT ACSC, Alandi Pune.
2. Transaction:
A transaction is a collection of operations that performs a single logical function in a
database application.
E.g. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
3. The Four Properties of Transactions:
Atomicity
Consistency
Isolation
Durability
4. The Four Properties of Transactions:
Atomicity:
Either all the operations of a transaction are executed or none are. The database must
never be left in a state where a transaction is only partially completed.
Consistency:
If the database was in a consistent state before the execution of a transaction, it
must be in a consistent state after the execution of the transaction as well.
Durability:
Once a transaction has committed, the changes it has made are permanent, even if
the system fails afterwards.
Isolation:
Each transaction is executed as if it were the only transaction in the system. No
transaction may see or be affected by the intermediate results of another concurrent
transaction.
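As an illustration of atomicity, the transfer from slide 2 can be sketched with Python's built-in sqlite3 module. The table name accounts, the helper names transfer and balances, and the starting balances are assumptions made for this sketch; they are not part of the original slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                # Failing here undoes the debit above as well.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a dict, e.g. {'A': 100, 'B': 20}."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical starting state: A holds 100, B holds 20.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()
```

With these balances, transfer(conn, "A", "B", 500) returns False and leaves both rows unchanged, because the debit is rolled back together with the failed transaction.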
6. Active state:
-First state of every transaction.
-In this state, the transaction is being executed.
Partially committed:
-The transaction has executed its final operation, but its changes are not yet
permanently saved to the DB.
Committed:
-The transaction has executed all its operations successfully.
-All its effects are now permanently saved in the database system.
Failed:
-If any of the checks made by the database recovery system fails, then the
transaction is said to be in the failed state.
Aborted:
-Once a transaction has reached the failed state, the database recovery system
aborts it, i.e. rolls it back.
-The rollback undoes the transaction's changes, so the database returns to the
consistent state it was in before the transaction started.
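The states above can be viewed as a small state machine. The following is a minimal sketch; the dictionary TRANSITIONS and the function is_valid_path are names invented for this illustration.

```python
# Allowed moves between transaction states, as described on this slide.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state
}

def is_valid_path(states):
    """Check that a sequence of states starts at 'active' and only uses allowed moves."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))
```

For example, active, partially committed, committed is a valid path, while active, committed is not, because a transaction must pass through the partially committed state first.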
7. 2.2 Executing transactions concurrently and associated problems in concurrent
execution:
-When many transactions execute concurrently in an uncontrolled or unrestricted
manner, several problems can arise:
(i)Temporary Update Problem
(ii)Lost Update Problem
(iii)Unrepeatable Read Problem
(iv)Phantom Read Problem
(v)Incorrect Summary Problem
8. Temporary Update Problem (Dirty Read Problem):
-Reading data written by an uncommitted transaction is called a dirty read.
-There is always a chance that the uncommitted transaction might roll back later.
-In that case, the other transactions have read a value that was never committed to
the database.
9. Unrepeatable Read Problem:
-This problem occurs when a transaction reads different values of the same variable
in different read operations, even though it has not updated the variable itself
(another transaction has changed it in between).
10. Lost Update Problem: (W-W Conflict)
This problem occurs when multiple transactions execute concurrently and updates
from one or more transactions get lost.
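A lost update can be reproduced with a tiny simulation. This is only a sketch: the function run, the step format (transaction, action, argument) and the amounts are assumptions chosen for illustration.

```python
def run(schedule, db):
    """Execute (txn, action, arg) steps; each transaction has one private local value."""
    db = dict(db)      # work on a copy so the caller's data is untouched
    local = {}         # txn -> value last read or computed by that transaction
    for txn, action, arg in schedule:
        if action == "read":
            local[txn] = db[arg]
        elif action == "add":
            local[txn] += arg
        elif action == "write":
            db[arg] = local[txn]
    return db

# T1 withdraws 50 from A, T2 deposits 100 into A.
serial = [
    ("T1", "read", "A"), ("T1", "add", -50), ("T1", "write", "A"),
    ("T2", "read", "A"), ("T2", "add", 100), ("T2", "write", "A"),
]
# Both transactions read A before either one writes it back.
interleaved = [
    ("T1", "read", "A"), ("T2", "read", "A"),
    ("T1", "add", -50), ("T1", "write", "A"),
    ("T2", "add", 100), ("T2", "write", "A"),
]
```

Starting from A = 200, the serial schedule ends with A = 250, while the interleaved one ends with A = 300: T2 overwrites A using the stale value it read, so T1's withdrawal is lost.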
11. Phantom Read Problem:
This problem occurs when a transaction reads some variable from the buffer and,
when it reads the same variable again later, finds that the variable no longer exists
because another transaction has deleted it in the meantime.
12. Incorrect Summary Problem:
Consider a situation, where one transaction is applying the aggregate function on
some records while another transaction is updating these records. The aggregate
function may calculate some values before the values have been updated and
others after they are updated.
13. 2.3 Schedules, types of schedules, concept of Serializability, Precedence graph
for Serializability:
Schedules :
A sequence of instructions that specifies the chronological order in which instructions
of concurrent transactions are executed.
OR
A schedule in DBMS is the order in which the operations of multiple transactions
appear for execution.
-It must consist of all instructions of those transactions.
-It must preserve the order in which the instructions appear in each individual
transaction.
15. Types of schedules:
(A) Serial Schedule:
-All the transactions execute serially one after the other.
-When one transaction executes, no other transaction is allowed to execute.
Ex:
All the operations of T1 are executed first, followed by all the operations of T2.
16. (B)Non-Serial Schedules:
-Multiple transactions execute concurrently.
-Operations of all the transactions are interleaved or mixed with each other.
-There are two transactions T1 and T2 executing concurrently.
-The operations of T1 and T2 are interleaved.
17. Concept of Serializability:
The Non-Serial Schedule can be divided further into
(A)Serializable
(B)Non-Serializable
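(A)Serializable:
-A non-serial schedule is serializable if its effect is equivalent to that of some
serial schedule of the same transactions.
-Serializability is used to maintain the consistency of the database.
-It is mainly used with non-serial schedules, to verify whether a schedule will
lead to any inconsistency or not.
-Serializable schedules are of two types:
(1)Conflict Serializable
(2)View Serializable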
18. (1)Conflict Serializable:
-If a given non-serial schedule can be converted into a serial schedule by swapping
its non-conflicting operations, then it is called a conflict serializable schedule.
-Two operations are said to be conflicting if all of the following conditions hold:
(1)They belong to different transactions
(2)They operate on the same data item
(3)At least one of them is a write operation
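The three conditions translate directly into a predicate. The Op tuple and the function name conflicts below are assumptions made for this sketch.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T1"
    kind: str   # "R" for read, "W" for write
    item: str   # data item, e.g. "A"

def conflicts(a, b):
    """True if the two operations conflict according to the three conditions above."""
    return (
        a.txn != b.txn                 # (1) different transactions
        and a.item == b.item           # (2) same data item
        and "W" in (a.kind, b.kind)    # (3) at least one is a write
    )
```

Two reads of the same item never conflict, which is why adjacent reads from different transactions can always be swapped.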
19. Conflict serializability Example:
T1      T2
R(A)
R(B)
        R(A)
        R(B)
        W(B)
W(A)
-To convert the above schedule into a serial schedule, we would have to swap the R(A)
operation of transaction T2 with the W(A) operation of transaction T1.
-However, we cannot swap these two operations because they are conflicting
operations. Thus the given schedule is not conflict serializable.
20. Conflict serializability Example:
T1 T2
R(A)
R(A)
R(B)
W(B)
R(B)
W(A)
The above schedule is conflict serializable. We can convert the above non-serial
schedule into a serial one.
21. Precedence Graph or Serialization Graph:
- It is commonly used to test the conflict serializability of a schedule.
- It is a directed graph (V, E) consisting of a set of nodes V = {T1, T2, T3, ..., Tn}
and a set of directed edges E = {e1, e2, e3, ..., em}.
- The set of vertices contains all the transactions participating in the
schedule.
- The set of edges contains all edges Ti → Tj for which one of the following
three conditions holds:
Create an edge Ti → Tj if Ti executes write(Q) before Tj executes read(Q).
Create an edge Ti → Tj if Ti executes read(Q) before Tj executes write(Q).
Create an edge Ti → Tj if Ti executes write(Q) before Tj executes write(Q).
Note: The schedule S is conflict serializable if there is no cycle in the precedence graph.
22. Precedence Graph or Serialization Graph Example:
S : r1(x) r1(y) w2(x) w1(x) r2(y)
(1)Make two nodes corresponding to transactions T1 and T2.
(2)For the conflicting pair r1(x) w2(x), where r1(x) happens before w2(x), draw an
edge from T1 to T2.
(3)For the conflicting pair w2(x) w1(x), where w2(x) happens before w1(x), draw
an edge from T2 to T1.
Since the graph contains the cycle T1 → T2 → T1, the schedule is not conflict serializable.
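The precedence-graph test can be sketched in a few lines. The function names parse, precedence_edges and has_cycle are invented for this illustration, and the parser assumes the compact notation used in S (for example r1(x) and w2(x)).

```python
import re

def parse(schedule):
    """Turn 'r1(x) w2(x)' into [('r', '1', 'x'), ('w', '2', 'x')]."""
    return re.findall(r"([rw])(\d+)\((\w+)\)", schedule)

def precedence_edges(ops):
    """Add an edge Ti -> Tj for every conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (kind1, txn1, item1) in enumerate(ops):
        for kind2, txn2, item2 in ops[i + 1:]:
            if txn1 != txn2 and item1 == item2 and "w" in (kind1, kind2):
                edges.add((txn1, txn2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by `edges`."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True           # reached a node on the current path again
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)

edges = precedence_edges(parse("r1(x) r1(y) w2(x) w1(x) r2(y)"))
```

For the schedule S, edges contains both ('1', '2') and ('2', '1'), so has_cycle(edges) is True and S is not conflict serializable.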
23. (2)View serializable:
-A schedule is called view serializable if it is view equivalent to a serial schedule
(one with no overlapping transactions).
-Two view equivalent schedules S1 and S2 must satisfy the following conditions:
1. Initial Read
-The initial read of each data item must be the same in both schedules.
-Suppose there are two schedules S1 and S2. If transaction T1 reads the initial value
of data item A in S1, then T1 must also read the initial value of A in S2.
-For example, two schedules satisfy this condition if the initial read of A is done
by T1 in S1 and also by T1 in S2.
24. 2. Updated Read
-If Ti reads a value of A written by Tj in schedule S1, then Ti must also read the
value of A written by Tj in S2.
-For example, two schedules are not view equivalent if T3 reads A written by T2 in
S1, but T3 reads A written by T1 in S2.
25. 3. Final Write:
-The final write on each data item must be the same in both schedules.
-If transaction T1 performs the last write on A in schedule S1, then T1 must also
perform the last write on A in S2.
-For example, two schedules satisfy this condition if the final write is done by
T3 in S1 and also by T3 in S2.
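The three conditions can be checked together by recording, for each schedule, which write every read sees and which transaction writes each item last. This is a simplified sketch: the names view_profile and view_equivalent, the tuple format (transaction, kind, item) and the marker "initial" are assumptions, and both schedules are assumed to contain the same operations.

```python
from collections import Counter

def view_profile(ops):
    """Return (reads_from, final_writer) for a list of (txn, kind, item) operations."""
    last_writer = {}
    reads_from = Counter()
    for txn, kind, item in ops:
        if kind == "R":
            # "initial" marks a read of the value that existed before the schedule.
            reads_from[(txn, item, last_writer.get(item, "initial"))] += 1
        elif kind == "W":
            last_writer[item] = txn
    return reads_from, last_writer

def view_equivalent(s1, s2):
    """Same initial reads, same updated reads and same final writes."""
    return view_profile(s1) == view_profile(s2)

# A non-serial schedule and the serial schedule T1, T2, T3.
s = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A"), ("T3", "W", "A")]
```

Here view_equivalent(s, serial) is True: in both schedules T1 reads the initial value of A and T3 performs the final write, so s is view serializable.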
26. (B)Non-Serializable:
The non-serializable schedule is divided into two types:
(1)Recoverable
(2)Non-recoverable Schedule.
(1) Recoverable Schedule:
-Schedules in which transactions commit only after all transactions whose changes
they read commit are called recoverable schedules.
-If some transaction T2 is reading value updated or written by some other
transaction T1, then the commit of T1 must occur before the commit of T2.
-There are three types of recoverable schedule:
1.Cascading Schedule
2.Cascadeless Schedule
3.Strict Schedule
27. Cascading Schedule:
-If, in a schedule, the failure of one transaction causes several other dependent
transactions to roll back or abort, then such a schedule is called a Cascading
Schedule (also known as Cascading Rollback or Cascading Abort).
28. Cascadeless Schedule:
-If a transaction is not allowed to read a data item until the last transaction that has
written it has committed or aborted, then such a schedule is called a Cascadeless
Schedule.
In other words,
-A cascadeless schedule allows only committed read operations.
-However, it allows uncommitted write operations.
29. Strict Schedule:
-If a transaction is allowed neither to read nor to write a data item until the last
transaction that has written it has committed or aborted, then such a schedule is called
a Strict Schedule.
-In other words,
a strict schedule allows only committed read and write operations.
30. (2)Non-recoverable Schedule.
-T2 read the value of A written by T1, and committed.
-T1 later aborted, therefore the value read by T2 is wrong, but since T2 committed,
this schedule is non-recoverable.
T1 T2
R(A)
W(A)
W(A)
R(A)
commit
abort
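The recoverability condition (a transaction commits only after every transaction it read from has committed) can be checked mechanically. The sketch below uses a hypothetical schedule, not the table above; the function name is_recoverable and the action codes R, W, C (commit) and A (abort) are assumptions. For simplicity it does not restore the previous writer when a transaction aborts.

```python
def is_recoverable(ops):
    """ops: list of (txn, action, item) with action in 'R', 'W', 'C', 'A'."""
    last_writer = {}
    reads_from = set()   # pairs (reader, writer)
    commit_pos = {}      # txn -> position of its commit
    for pos, (txn, action, item) in enumerate(ops):
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))
        elif action == "C":
            commit_pos[txn] = pos
    for reader, writer in reads_from:
        if reader in commit_pos:
            # The writer must have committed, and must have done so first.
            if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
                return False
    return True

# T2 reads T1's write and commits, then T1 aborts.
bad = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "A", None)]
# T1 commits before T2 does.
good = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
```

is_recoverable(bad) is False because T2 commits a value that T1 later takes back, while is_recoverable(good) is True.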
31. 2.4 Ensuring Serializability by locks, different lock modes, 2PL and its variations.
Lock-Based Protocol:
-A transaction cannot read or write a data item until it acquires an appropriate lock on it.
-There are two types of lock:
1. Shared lock: (S) (Read Only Lock)
-The data item can only be read by the transaction.
-It can be shared between transactions, because a transaction that holds a shared lock
cannot update the data item.
2. Exclusive lock: (X) (Read and Write Lock)
-The data item can be both read and written by the transaction.
-This lock is exclusive: only one transaction can hold it on a data item at a time, so
multiple transactions cannot modify the same data simultaneously.
Lock compatibility matrix (Y = compatible, N = not compatible):
     S    X
S    Y    N
X    N    N
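The compatibility matrix can be used directly to decide whether a lock request is granted. The names COMPATIBLE and can_grant are assumptions made for this sketch.

```python
# (held mode, requested mode) -> may both be held at the same time?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_modes):
    """Grant the request only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)
```

For example, can_grant("S", ["S", "S"]) is True, while can_grant("X", ["S"]) is False; a request on an unlocked item (an empty list) is always granted.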
32. The Two-Phase Locking Protocol (2PL):
-Ensures conflict serializability.
-A transaction is said to follow the Two-Phase Locking protocol if all its locking
and unlocking is done in two phases:
Growing Phase:
New locks on data items may be acquired but none can be released.
Shrinking Phase:
Existing locks may be released but no new locks can be acquired.
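Whether a single transaction obeys the two phases can be checked by scanning its lock and unlock actions in order. The function name follows_2pl and the action format are assumptions made for this sketch.

```python
def follows_2pl(actions):
    """actions: list of ('lock' | 'unlock', item) issued by one transaction, in order."""
    shrinking = False
    for action, _item in actions:
        if action == "unlock":
            shrinking = True      # the first unlock starts the shrinking phase
        elif action == "lock" and shrinking:
            return False          # acquiring a lock after releasing one violates 2PL
    return True
```

For example, lock A, lock B, unlock A, unlock B follows 2PL, while lock A, unlock A, lock B does not.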
34. Variations of 2 PL:
(1) Strict Two-phase locking:
-In the first phase, after acquiring all the locks, the transaction continues to execute
normally.
-The difference between 2PL and Strict-2PL is that Strict-2PL does not release an
exclusive lock as soon as it has finished using it.
-Strict-2PL holds all exclusive locks until the transaction commits or aborts, and
only then releases them.
-As a result, no other transaction can read or overwrite uncommitted data.
35. (2) Rigorous Two-Phase Locking:
-The Rigorous Two-Phase Locking protocol avoids cascading rollbacks.
-This protocol requires that all shared and exclusive locks be held until the
transaction commits or aborts.
(3) Conservative Two-Phase Locking Protocol (Static Two-Phase Locking Protocol):
- It requires a transaction to lock all the data items it will access before the transaction starts.
Two-Phase Locking with lock conversion:
- Basic two-phase locking is modified so that lock conversions are allowed.
Growing Phase: Upgrading of a lock (from S(a) to X(a)) is allowed.
Shrinking Phase: Downgrading of a lock (from X(a) to S(a)) may only be done here.
36. Timestamp Ordering Protocol:
- This protocol decides the ordering of transactions in advance to determine the
serializability order.
- When each transaction Ti enters the system, it is assigned a fixed value called its
timestamp, denoted TS(Ti).
- If a new transaction Tj enters the system after Ti, then TS(Ti) < TS(Tj).
- There are two methods of assigning timestamps:
(1) Use the value of the system clock as the timestamp:
A transaction's timestamp is equal to the value of the system clock when the transaction
enters the system.
(2) Use a logical counter:
The counter is incremented after each new timestamp has been assigned.
-The timestamp-ordering protocol ensures that any conflicting read and write
operations are executed in timestamp order.
-The timestamp of transaction Ti is denoted as TS(Ti).
-The read timestamp of data item X is denoted by R-timestamp(X).
-The write timestamp of data item X is denoted by W-timestamp(X).
37. W-timestamp(X) is the largest timestamp of any transaction that executed write(X) successfully.
R-timestamp(X) is the largest timestamp of any transaction that executed read(X) successfully.
The timestamp ordering protocol works as follows:
(A)If a transaction Ti issues a read(X) operation:
-If TS(Ti) < W-timestamp(X)
Operation rejected and Ti rolled back.
-If TS(Ti) >= W-timestamp(X)
Operation executed.
-R-timestamp(X) is set to the maximum of R-timestamp(X) and TS(Ti).
(B)If a transaction Ti issues a write(X) operation:
-If TS(Ti) < R-timestamp(X)
Operation rejected and Ti rolled back.
-If TS(Ti) < W-timestamp(X)
Operation rejected and Ti rolled back.
-Otherwise, operation executed and W-timestamp(X) is set to TS(Ti).
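The read and write rules can be sketched as a small class. The names TimestampOrdering and Rollback are invented for this illustration, timestamps are assumed to be positive integers, and an item that has never been read or written is treated as having timestamp 0.

```python
class Rollback(Exception):
    """Raised when an operation is rejected and the transaction must roll back."""

class TimestampOrdering:
    def __init__(self):
        self.data = {}   # item -> current value
        self.r_ts = {}   # item -> R-timestamp(X)
        self.w_ts = {}   # item -> W-timestamp(X)

    def read(self, ts, x):
        # A younger transaction has already overwritten X: the read comes too late.
        if ts < self.w_ts.get(x, 0):
            raise Rollback(f"read({x}) by TS={ts} rejected")
        self.r_ts[x] = max(self.r_ts.get(x, 0), ts)
        return self.data.get(x)

    def write(self, ts, x, value):
        # A younger transaction has already read or written X: the write comes too late.
        if ts < self.r_ts.get(x, 0) or ts < self.w_ts.get(x, 0):
            raise Rollback(f"write({x}) by TS={ts} rejected")
        self.data[x] = value
        self.w_ts[x] = ts
```

For example, after a transaction with timestamp 10 has read X, a write of X by a transaction with timestamp 7 raises Rollback, because 7 < R-timestamp(X) = 10.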
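38. Thomas' Write Rule:
-It modifies the timestamp ordering protocol so that an obsolete write is ignored
instead of causing a rollback.
(A)If a transaction Ti issues a read(X) operation:
-If TS(Ti) < W-timestamp(X)
Operation rejected and Ti rolled back.
-If TS(Ti) >= W-timestamp(X)
Operation executed.
-R-timestamp(X) is set to the maximum of R-timestamp(X) and TS(Ti).
(B)If a transaction Ti issues a write(X) operation:
-If TS(Ti) < R-timestamp(X)
Operation rejected and Ti rolled back.
-If TS(Ti) < W-timestamp(X)
The write operation is ignored and Ti continues without being rolled back.
-Otherwise, operation executed and W-timestamp(X) is set to TS(Ti).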
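The write rule can be sketched as a single function. The name thomas_write, the returned strings and the dictionaries passed in are assumptions for this illustration; items never read or written are treated as having timestamp 0.

```python
def thomas_write(ts, x, value, data, r_ts, w_ts):
    """Apply Thomas' write rule to write(x) by a transaction with timestamp ts."""
    if ts < r_ts.get(x, 0):
        return "rollback"   # a younger transaction already read X
    if ts < w_ts.get(x, 0):
        return "ignored"    # obsolete write: a younger write is already in place
    data[x] = value
    w_ts[x] = ts
    return "written"
```

For example, if X was last written by a transaction with timestamp 10, a write by a transaction with timestamp 5 returns "ignored" and leaves the value of X untouched.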
39. 2.6 Locks with multiple granularity, dynamic database concurrency (Phantom
Problem):
Granularity: It is the size of the data item that can be locked.
Multiple Granularity:
-It can be defined as hierarchically breaking up the database into blocks which can
be locked.
-It enhances concurrency and reduces lock overhead.
-It keeps track of what to lock and how to lock.
41. Intention Mode Locks:
In addition to the S and X lock modes, there are three additional lock modes with
multiple granularity:
Intention-Shared (IS):
Indicates explicit locking at a lower level of the tree, but only with shared locks.
Intention-Exclusive (IX):
Indicates explicit locking at a lower level with exclusive or shared locks.
Shared & Intention-Exclusive (SIX):
The sub-tree rooted at that node is locked explicitly in shared mode, and
explicit locking is being done at a lower level with exclusive-mode locks.
Compatibility Matrix with Intention Lock Modes:
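The matrix itself is not reproduced in this text, so the sketch below encodes the standard textbook compatibility matrix for the five modes. The names MODES, ROWS and compatible are assumptions made for this illustration.

```python
MODES = ["IS", "IX", "S", "SIX", "X"]

# One row per held mode; columns follow the order of MODES (Y = compatible).
ROWS = {
    "IS":  "YYYYN",
    "IX":  "YYNNN",
    "S":   "YNYNN",
    "SIX": "YNNNN",
    "X":   "NNNNN",
}

def compatible(held, requested):
    """True if `requested` can be granted on a node where another transaction holds `held`."""
    return ROWS[held][MODES.index(requested)] == "Y"
```

For example, compatible("IX", "IX") is True, so two transactions can both update different records under the same node, while compatible("S", "IX") is False.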
43. 2.8 Deadlock and deadlock handling - Deadlock Avoidance (wait-die, wound-wait),
Deadlock Detection and Recovery (Wait-for graph).
Deadlock:
A deadlock is an unwanted situation in which two or more transactions are waiting
indefinitely for one another to give up locks.
Deadlock Conditions:
-Mutual exclusion: A resource can be held by at most one process at a time.
-Hold and wait: Processes that already hold resources can wait for additional resources.
-No preemption: A resource, once granted, cannot be forcibly taken away.
-Circular wait: Two or more processes form a cycle in which each one waits for a
resource held by the next process in the cycle.
44. Deadlock Handling :
There are three classical approaches for deadlock handling:
(1)Deadlock prevention.
(2)Deadlock avoidance.
(3)Deadlock detection and removal.
45. Deadlock Avoidance:
-The deadlock avoidance approach handles deadlocks before they occur.
-There are two algorithms :
wait-die and wound-wait.
Wait-Die Scheme:
-It is a non-preemptive technique for deadlock prevention.
-When transaction T1 requests a data item currently held by T2,
T1 is allowed to wait only if it has a timestamp smaller than that of T2
(that is, T1 is older than T2); otherwise T1 is rolled back (dies).
Ex:
Suppose that transactions T22, T23 and T24 have timestamps 5, 10 and 15 respectively. If
T22 requests a data item held by T23, then T22 will wait. If T24 requests a data item
held by T23, then T24 will be rolled back.
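The wait-die decision depends only on the two timestamps. In this sketch the function name wait_die and the returned strings are assumptions.

```python
def wait_die(requester_ts, holder_ts):
    """Decide what happens to the requesting transaction under wait-die."""
    # A smaller timestamp means an older transaction: older waits, younger dies.
    return "wait" if requester_ts < holder_ts else "die"
```

With the timestamps from the example, wait_die(5, 10) gives "wait" (T22 waits for T23) and wait_die(15, 10) gives "die" (T24 is rolled back).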
46. Wound-Wait Scheme:
-It is a preemptive technique for deadlock prevention.
-It is a counterpart to the wait-die scheme.
-When transaction T1 requests a data item currently held by T2, T1 is allowed to
wait only if it has a timestamp larger than that of T2 (that is, T1 is younger than T2);
otherwise T2 is rolled back (T2 is wounded by T1).
For example:
Suppose that transactions T22, T23 and T24 have timestamps 5, 10 and 15 respectively.
-If T24 requests a data item held by T23, then T24 will wait.
-If T22 requests a data item held by T23, then the data item will be preempted from
T23 and T23 will be rolled back.
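The wound-wait decision mirrors wait-die. In this sketch the function name wound_wait and the returned strings are assumptions.

```python
def wound_wait(requester_ts, holder_ts):
    """Decide what happens when a transaction requests an item under wound-wait."""
    # An older requester wounds (rolls back) the holder; a younger requester waits.
    return "wound" if requester_ts < holder_ts else "wait"
```

With the timestamps from the example, wound_wait(15, 10) gives "wait" (T24 waits for T23) and wound_wait(5, 10) gives "wound" (T23 is rolled back in favour of T22).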
47. Deadlock Detection and Recovery:
-Aborting a transaction in advance, before any deadlock has actually occurred, is not
always a practical approach.
-Instead, the system can allow deadlocks to occur, detect them, and then recover by
rolling back one of the transactions involved.
Wait-for Graph:
-A graph is created based on the transactions and their locks: there is an edge from
Ti to Tj if Ti is waiting for a data item locked by Tj.
-If the created graph has a cycle or closed loop, then there is a deadlock.
-The wait-for graph is maintained by the system for every transaction which
is waiting for some data held by the others.
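Deadlock detection then amounts to finding a cycle in the wait-for graph. The sketch below assumes the graph is given as a dictionary from each transaction to the set of transactions it waits for; the function name find_deadlock is invented for this illustration.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle, or None if there is no deadlock."""
    path = []      # transactions on the current depth-first search path
    done = set()   # transactions already known not to lead to a cycle

    def visit(txn):
        if txn in path:
            return path[path.index(txn):]   # the cycle starts where txn first appeared
        if txn in done:
            return None
        path.append(txn)
        for other in wait_for.get(txn, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        path.pop()
        done.add(txn)
        return None

    for txn in wait_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None
```

For example, if T1 waits for T2, T2 waits for T3 and T3 waits for T1, the function returns these three transactions; recovery would then roll back one of them to break the cycle.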