Distributed Systems


Presentation Transcript


  1. Distributed Systems Session 8: Concurrency Control Christos Kloukinas Dept. of Computing City University London

  2. Last session • 1 Location Transparency • Not a good idea to hard-code location information in components --> makes migration difficult • 2 Naming • Associating external names with references • 3 Trading • Looking up servers by the services they offer

  3. 0.1 Naming • 1 Naming Service Examples: e.g. NFS, X.500, DNS • 2 Common Characteristics • External names, hierarchies, contexts, persistence of bindings, resolve and bind operations • 3 CORBA Naming Service: interface NamingContext • 4 Limitations: it is not always the case that we know the names

  4. 0.2 Java Example: Client Finding Objects
      ORB orb = ORB.init(args, null);
      // 1. Transparently get the naming service
      org.omg.CORBA.Object objRef = orb.resolve_initial_references("NameService");
      CosNaming.NamingContext root = CosNaming.NamingContextHelper.narrow(objRef); // "casting" (narrow)
      // 2. Build the compound name
      CosNaming.NameComponent name[] = {
          new NameComponent("UEFA", "ORG"),
          new NameComponent("England", "Country"),
          new NameComponent("Premier", "League"),
          new NameComponent("Arsenal", "Club") };
      // 3. Resolve the name and narrow the result
      Team t = TeamHelper.narrow(root.resolve(name));
      // 4. Use the object
      t.print();

  5. 0.3 Trading • Characteristics • Need a trader (mediator), quality of service, and a language to express quality of service • Quality of service can be expressed statically (e.g. privacy, precision) or dynamically (e.g. performance) • Service matching and service shopping • Example: Video on Demand • OMG/CORBA Trading Service

  6. Session 8 - Outline 1 Motivation 2 Concurrency Control Techniques 3 CORBA Concurrency Control Service 4 Summary

  7. 1 Motivation • How can multiple components in a distributed system use a shared component concurrently without violating the integrity of that component? • This question is of fundamental importance, as there are very few distributed systems in which every component is only ever used by a single other component at a time.

  8. 1 Motivation (ctd.) • Resources that are accessed concurrently may be hardware components (e.g. a printer), operating system resources (e.g. files or sockets), databases (e.g. the bank accounts kept by different banks) or CORBA objects. • For some types of access, resources may have to be accessed in mutual exclusion: • It does not make sense to have the print jobs of different users printed in an interleaved way; • Only one user should be editing a file at a time, otherwise the changes made by other users would be overwritten when the last user saves his or her file; • The integrity of databases or CORBA objects may be lost through concurrent updates. • Hence the need arises to restrict the concurrent access of multiple components to a shared resource in a sensible way.

  9. 1 Motivation (ctd.) • Concurrent access and updates of resources that maintain state information may lead to: • lost updates • inconsistent analysis • Motivating example for lost updates: • Cash withdrawal from an ATM and a concurrent • Credit of a cheque • Motivating example for inconsistent analysis: • Funds transfer between accounts of one customer • Sum of account balances (report for Inland Revenue)

  10. 1 Motivating Examples
      class Account {
          protected float balance;
          public float get_balance() { return balance; }
          public void debit(float amount) {
              float newBal = balance - amount;   // ('new' is a reserved word in Java, so newBal)
              balance = newBal;
          }
          public void credit(float amount) {
              float newBal = balance + amount;
              balance = newBal;
          }
      }
      The object stores the balance in the instance variable balance. The object can return the current balance through the operation get_balance(). The debit() operation subtracts the amount passed as a parameter from the balance, and the credit() operation adds the amount passed as a parameter.
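
  The race of the next slide can be reproduced directly with this class. The following is a minimal sketch (the demo class and the thread setup are illustrative additions, not part of the lecture material, and assume the Account class above is on the classpath): it runs a debit and a credit concurrently on one account and may end with a wrong final balance, which is exactly the lost-update scenario.

      // Sketch: two clients share one Account; without concurrency control the debit
      // and the credit may both read the same old balance, so one update can be lost.
      public class LostUpdateDemo {
          public static void main(String[] args) throws InterruptedException {
              Account anAcc = new Account();
              anAcc.credit(75);                                    // balance at t0 is 75
              Thread atm     = new Thread(() -> anAcc.debit(50));  // Customer@ATM
              Thread counter = new Thread(() -> anAcc.credit(50)); // Clerk@Counter
              atm.start(); counter.start();
              atm.join();  counter.join();
              // 75 only if no interleaving occurred; 25 or 125 if an update was lost
              System.out.println("final balance = " + anAcc.get_balance());
          }
      }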

  11. 1 Lost Updates • Balance of account anAcc at t0 is 75 • Customer@ATM (WRITER): anAcc.debit(50) reads balance 75, computes newBal=25, then writes balance=25 • Clerk@Counter (WRITER): anAcc.credit(50) also reads balance 75 (before the debit's write takes effect), computes newBal=125, and later writes balance=125 • Over the time steps t0..t6 the two writes interleave, so the final balance is 125: the debit of 50 is lost

  12. 1 Inconsistent Analysis • Balances at t0: Acc1=7500, Acc2=0 • Funds transfer (WRITER): Acc1.debit(7500): Acc1.newBal=0; Acc1.balance=0; then Acc2.credit(7500): Acc2.newBal=7500; Acc2.balance=7500 • Inland Revenue report (READER), running concurrently: float sum=0; sum+=Acc2.get_bal() — reads 0 (before the credit); sum+=Acc1.get_bal() — reads 0 (after the debit); so sum=0 • Over the time steps t0..t7 the report sees neither account holding the money, although the total was 7500 throughout

  13. 2 Concurrency Control Techniques 1 Assessment Criteria 2 Pessimistic Concurrency Control • e.g. Two Phase Locking (2PL) 3 Optimistic Concurrency Control 4 Comparison

  14. Concurrency Control Techniques • Ensure the integrity of a shared resource amidst concurrent access • e.g. in a database, prevent two users from editing the same record at the same time • Concerned with serialising transactions, ensuring safe execution • Resolving conflicts and deadlocks • Ensuring fairness among concurrent processes • Restoring component integrity

  15. 2.1 Assessment Criteria • Serialisability: concurrent threads are serialisable if they can be executed one after another and have the same effect on shared resources. It can be proven that serialisable threads do not lead to lost updates or inconsistent analysis. • Deadlock freedom: concurrency control techniques that use locking may force threads to wait for other threads to release a lock before they can access a resource. This may lead to situations where the wait-for relationship is cyclic and the threads are deadlocked. • Fairness: whether all threads have the same chance of getting access to resources. • Complexity: computing precisely those and only those schedules that are serialisable may be very complex; we are interested in the complexity of a concurrency control scheme in order to estimate its performance overhead. • Concurrency!!!: we are also interested in the degree of concurrency that a control scheme allows. It is obviously undesirable to restrict schedules that do not cause serialisability problems.

  16. Concurrency Control Techniques: Families • Pessimistic • Assumes that collisions are likely to occur; locks are used • + Changes are consistent and safe • − Is not scalable • Optimistic • The idea is to accept that collisions occur infrequently; instead of trying to prevent them, you simply detect them and then resolve each collision when it does occur • Uses timestamps, and actions can be rolled back

  17. 2.2 Two-Phase Locking (2PL) • The most popular concurrency control technique. Used in: • RDBMSs (Oracle, Ingres, Sybase, DB/2, etc.) • ODBMSs (O2, ObjectStore, Versant, etc.) • Transaction monitors (CICS, etc.) • The principal component that implements 2PL is a lock manager, from which concurrent processes or threads acquire locks on every shared resource they access. • The lock manager examines each request and compares it with the locks that have already been granted on the resource. • If the requested lock does not conflict with an already granted lock, the lock manager grants the lock and notes that the requester is now using the resource (a minimal sketch of this request/grant cycle follows below).
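
  As a concrete illustration of that request/grant cycle, here is a minimal lock-manager sketch in Java (class and method names are illustrative, not part of any CORBA or database API): it grants a requested read or write lock only if it is compatible with all locks already granted on the resource.

      import java.util.*;

      // Minimal pessimistic lock manager: one lockset (list of granted locks) per resource.
      class LockManager {
          enum Mode { READ, WRITE }
          record Lock(Object owner, Mode mode) {}

          private final Map<Object, List<Lock>> locksets = new HashMap<>();

          // Minimal matrix: READ is compatible with READ; WRITE conflicts with everything.
          private static boolean compatible(Mode requested, Mode granted) {
              return requested == Mode.READ && granted == Mode.READ;
          }

          // Non-blocking request: returns true iff the lock was granted.
          public synchronized boolean tryLock(Object resource, Object owner, Mode mode) {
              List<Lock> granted = locksets.computeIfAbsent(resource, r -> new ArrayList<>());
              for (Lock l : granted)
                  if (!l.owner().equals(owner) && !compatible(mode, l.mode()))
                      return false;                       // locking conflict
              granted.add(new Lock(owner, mode));         // note that owner now uses the resource
              return true;
          }

          // Blocking request: wait until the lock can be granted.
          public synchronized void lock(Object resource, Object owner, Mode mode) throws InterruptedException {
              while (!tryLock(resource, owner, mode)) wait();
          }

          public synchronized void unlock(Object resource, Object owner) {
              List<Lock> granted = locksets.get(resource);
              if (granted != null) granted.removeIf(l -> l.owner().equals(owner));
              notifyAll();                                // wake up waiting requesters
          }
      }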

  18. Terminology • Locks and Locksets • Locking • Lock Compatibility • Locking Conflict • Deadlocks • Waiting graph • Locking granularity • Hierarchical Locking • Locking transparency

  19. 2.2 Locks • A lock is a token that indicates that a process accesses a resource in a particular mode. • Minimal lock modes: read and write. • Locks are used to indicate to concurrent processes or threads the way in which a resource is used. • The lock manager therefore maintains a set of locks for each resource, i.e. it associates a lockset with every shared object.

  20. 2.2 Locking • Processes acquire locks before they access shared resources and release locks afterwards. • 2PL: processes do not acquire any lock once they have released a lock. • [Figure: typical 2PL locking profile of a process — the number of locks held grows during the acquisition phase and shrinks during the release phase over time]

  21. 2.2 Locking • 2PL is based on the assumption that processes or threads always acquire locks before they access a shared resource and that they release a lock if they do not need the resource anymore. • In 2PL, processes do not acquire locks once they have released a lock. • This means that threads operate in cycles where there is a lock acquisition phase and a lock release phase in each cycle. • 2PL has its name due to these two phases.

  22. 2.2 Lock Compatibility • The lock manager grants locks to requesting processes or threads on the basis of the already granted locks and their compatibility with the requested lock. • The very core of any pessimistic concurrency control technique based on locking is the definition of a lock compatibility matrix. It defines the different lock modes and the compatibility between them. • Minimal lock compatibility matrix:
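
  (The matrix itself is an image on the original slide; the standard minimal read/write matrix it refers to is reproduced here.)

                          granted: read      granted: write
      requested read      + (compatible)     − (conflict)
      requested write     − (conflict)       − (conflict)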

  23. 2.2 Locking Conflicts • Locking conflict: when access cannot be granted due to an incompatibility between the requested lock and a previously granted lock. • On the occasion of a locking conflict, the requester cannot use the resource until the conflicting lock has been released. • There are two approaches to handling locking conflicts: • The requesting process can be forced to wait until the conflicting locks are released. This may, however, be too restrictive, since the process or thread may well be able to do other computations in between. • Alert the process or thread that the lock cannot be granted. It can then continue with other processing until a point in time when it definitely needs access to the resource. • Several 2PL implementations provide two locking operations, a blocking and a non-blocking one, so that the requester can decide (both styles are illustrated below).
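
  Java's standard library exposes exactly this pair of styles on its lock objects, so a small sketch can illustrate the choice (the shared resource is left abstract; java.util.concurrent.locks.ReentrantReadWriteLock is used here as a stand-in for the lecture's lock manager):

      import java.util.concurrent.locks.ReentrantReadWriteLock;

      class ConflictHandlingDemo {
          static final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

          static void blockingStyle() {
              rw.writeLock().lock();          // blocks until any conflicting lock is released
              try { /* update the shared resource */ }
              finally { rw.writeLock().unlock(); }
          }

          static void nonBlockingStyle() {
              if (rw.writeLock().tryLock()) { // returns false immediately on a locking conflict
                  try { /* update the shared resource */ }
                  finally { rw.writeLock().unlock(); }
              } else {
                  /* do other useful work and request the lock again later */
              }
          }
      }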

  24. 2.2 Example (Avoiding Lost Updates) • Balance of account anAcc at t0 is 75 • Customer@ATM: anAcc.debit(50): anAcc.lock(write); newBal=75−50=25; balance=25; anAcc.unlock(write) • Clerk@Counter: anAcc.credit(50): anAcc.lock(write) — blocked until the ATM's write lock is released — then newBal=25+50=75; balance=75; anAcc.unlock(write) • Over the time steps t0..t6 the two updates are now serialised, and the final balance is the correct 75

  25. 2.2 Example (Avoiding Lost Updates) • Before the account object is changed, the debit and credit operations request a write lock on the account object from the lock manager. • The lock manager then detects a write/write locking conflict and forces the second process to wait until the first process has released its lock. The second process then reads the up-to-date value of the balance and modifies it without losing the update of the first process (see the sketch below).
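
  One way to realise this inside the Account class itself — a sketch, assuming the abstract lock(write)/unlock(write) operations of the slide are mapped onto a Java ReentrantReadWriteLock — is:

      import java.util.concurrent.locks.ReentrantReadWriteLock;

      // Account whose operations acquire a write lock before modifying the balance,
      // so a concurrent debit and credit are serialised and no update is lost.
      class LockedAccount {
          private float balance;
          private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

          public float get_balance() {
              lock.readLock().lock();                 // lock(read)
              try { return balance; }
              finally { lock.readLock().unlock(); }   // unlock(read)
          }
          public void debit(float amount) {
              lock.writeLock().lock();                // lock(write)
              try { balance = balance - amount; }
              finally { lock.writeLock().unlock(); }  // unlock(write)
          }
          public void credit(float amount) {
              lock.writeLock().lock();
              try { balance = balance + amount; }
              finally { lock.writeLock().unlock(); }
          }
      }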

  26. 2.2 Deadlocks • Recall that the lock manager may force processes or threads to wait for other processes to release locks. • This solves the problems of lost updates and inconsistent analysis. • Processes may, however, request locks on more than one object. • Situations may then arise where two or more processes or threads are mutually waiting for each other to release their locks. • These situations are called deadlocks, and they are very undesirable as they block threads and prevent them from finishing their jobs. • Hence 2PL is NOT deadlock-free.

  27. Waiting Graph • [Figure: waiting graph over processes p1–p9] • In this process waiting graph, the four processes P1, P2, P3, P7 are in a deadlock: their wait-for edges form a cycle.

  28. 2.2.1 Deadlock Detection and Resolution • Deadlocks are resolved by the lock manager. • The manager maintains an up-to-date representation of the waiting graph: • it records every locking conflict by inserting a graph edge, and • when a conflict is resolved by releasing the conflicting lock, the respective edge is deleted. • The manager uses the waiting graph to detect deadlocks (a cycle-detection sketch follows below). • Resolution: break cycles, i.e. select one process or thread that participates in such a cycle and abort it. • Select a node that has the maximum number of incoming or outgoing edges, to reduce the chance of further deadlocks. • Aborting a process requires undoing all actions that the process has performed and releasing all locks the process has held!!!
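
  Detecting a deadlock then amounts to finding a cycle in the waiting graph. The following sketch (names are illustrative) records wait-for edges and checks, with a depth-first search, whether a blocked process can reach itself:

      import java.util.*;

      // Waiting graph: an edge p -> q means "process p waits for a lock held by q".
      class WaitingGraph {
          private final Map<String, Set<String>> waitsFor = new HashMap<>();

          void addConflict(String waiter, String holder) {        // insert edge on a locking conflict
              waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
          }
          void removeConflict(String waiter, String holder) {     // delete edge when the lock is released
              Set<String> out = waitsFor.get(waiter);
              if (out != null) out.remove(holder);
          }
          // A deadlock exists iff some process can reach itself along wait-for edges.
          boolean deadlocked(String process) {
              return reachable(process, process, new HashSet<>());
          }
          private boolean reachable(String from, String target, Set<String> visited) {
              for (String next : waitsFor.getOrDefault(from, Set.of())) {
                  if (next.equals(target)) return true;
                  if (visited.add(next) && reachable(next, target, visited)) return true;
              }
              return false;
          }
      }

  With the graph of slide 27, inserting the edges of the P1–P2–P3–P7 cycle would make deadlocked("P1") return true, and the manager would then abort one of the four processes.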

  29. 2.2 Locking Granularity • Observation: objects that are accessed concurrently are often contained in more coarse-grained composite objects, e.g. • Directories can contain other directories, files are contained in directories, and files have records; • Relational databases contain a set of tables, which contain a set of tuples, which contain attributes; or • Distributed composite objects may act as containers for component objects, which may again contain other objects. • A normal access pattern is to visit all, or a large subset, of the contained objects. • The concurrency control manager can save effort by exploiting these containment hierarchies.

  30. 2.2.1 Locking Granularity • Two-phase locking is applicable to resources of any granularity. • It works for CORBA objects as well as for files and directories, or even complete databases. • However, the degree of concurrency that is achieved with 2PL depends on the granularity used for locking. • A high degree of concurrency is achieved with small locking granules. • The disadvantage of choosing a small locking granularity is that a huge number of locks has to be acquired whenever bigger granules have to be locked. • Trade-off: degree of concurrency vs locking overhead. • If we decrease the granularity, we can serve more processes concurrently, but we have to be prepared to pay a higher cost for the management of locks. • The dilemma can be resolved using an optimisation: hierarchical locking.

  31. 2.2.2 Containment Hierarchy • [Figure: containment hierarchy of account objects — the Bank contains groups of branches G1..Gn, each group contains branches B1..Bn, and each branch contains accounts]

  32. 2.3 Hierarchical Locking • Allows locking of all objects contained in a composite object (container). • BUT it also allows a process to indicate, at container level, the sub-resources that it intends to use in a particular mode. • Hierarchical locking schemes therefore introduce intention locks, such as intention read and intention write locks. • I.e. intention locks are acquired on a composite object before a process requests a real lock on an object contained in that composite object. • Intention locks signal to processes that wish to lock the entire composite object that some other process currently holds locks on objects contained in it.

  33. 2.3.1 Hierarchical Locking • Intention read: indicates that some process has, or is about to acquire, a read lock on an object inside the composite object. • Intention write: indicates that some process has, or is about to acquire, a write lock on an object inside the composite object. • Processes that want to lock a certain resource first acquire intention locks on the container of that resource and on all of its containers, as sketched below. • The lock compatibility matrix is defined in such a way that a locking conflict arises if a container object is already locked in read or write mode.
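
  A minimal sketch of that protocol (illustrative names; the lock-manager interface is kept abstract and hypothetical): intention locks are taken top-down along the containment path before the real lock is requested on the resource itself.

      import java.util.*;

      // Hierarchical locking sketch: before taking a real lock on a resource, take
      // intention locks on every container on the path from the root downwards.
      class HierarchicalLocking {
          enum Mode { IR, IW, R, U, W }   // intention read/write, read, upgrade, write

          // Hypothetical lock-manager interface, kept abstract for the sketch.
          interface LockService {
              void lock(Object resource, Object owner, Mode mode) throws InterruptedException;
          }

          // e.g. containersTopDown = [bank, groupG1, branchB2] and resource = account
          static void lockHierarchically(LockService lm, List<Object> containersTopDown,
                                         Object resource, Object owner, Mode realMode)
                  throws InterruptedException {
              Mode intention = (realMode == Mode.R) ? Mode.IR : Mode.IW;
              for (Object container : containersTopDown)
                  lm.lock(container, owner, intention);   // IR/IW on Bank, Group, Branch ...
              lm.lock(resource, owner, realMode);         // ... then the real lock on the account
          }
      }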

  34. 2.3.2 Hierarchical Locking • NB: intention read and intention write are compatible with each other, because they do not actually correspond to any real locks. • Other modes: • An IR lock is compatible with an R lock, because accessing an object for reading does not change any values. • An IR lock is incompatible with a W lock, because it is not possible to modify every element of the composite object while some other process is reading the state of an object inside that composite. • etc. etc. • Hence the advantage of hierarchical locking is that it enables different lock granularities to be used at the same time. • The overhead is that, for every individual object, intention locks have to be acquired on every composite object in which that object is contained (and it may be contained in more than one container).

  35. 2.4 Transparency of Locking • The last question we have to discuss is WHO acquires the locks, i.e. who invokes the lock operation for a resource. The options are: • the concurrency control infrastructure, such as the concurrency control manager of a database management system; • the implementation of the components; or • the clients of the components. • The first option is highly desirable, as concurrency control would then be transparent to the application programmers of both the component and its clients. • Unfortunately, this is only possible on limited occasions (e.g. in a database system), because the concurrency control manager would have to manage all resources and would have to be informed about every single resource access. • The last option is very undesirable, and it is in fact always avoidable. Hence distributed components should be designed in such a way that concurrency control is hidden within their implementation, is not exposed at their interface, and is transparent to the designers of CLIENTS.

  36. 2.3 Optimistic Concurrency Control • In general, the complexity of two-phase locking is linear in the number of accessed resources. With hierarchical locking it is even slightly higher, as the containers of resources also have to be locked in intention mode. • This overhead, however, is unreasonable if the probability of a locking conflict is very low. • Given the motivating examples we discussed earlier, it is quite unlikely that you withdraw cash from an ATM in the very millisecond in which a clerk credits a cheque. • This is where optimistic concurrency control comes in. • It follows a laissez-faire approach and works as a watchdog that detects conflicts only when they really happen.

  37. 2.3 Optimistic Concurrency Control (ctd.) • Every thread or process works on its private logical copy of the set of shared resources. • While a process or thread accesses resources, the concurrency control manager keeps a log of them. • Timestamps are required • At a certain point in time, the access patterns are validated against conflicts with concurrent processes or threads. • If no conflicts occurred the changes done can be made known to the global set of resources. • If conflicts occurred the process has to discard its logical copy and start over again on an up-to-date copy of the resources.

  38. Phases • 1. Read: • The process/transaction executes, reading values and writing to a private copy. • 2. Validation: • When the process completes, the manager checks whether the process could possibly have conflicted with any other concurrent process. If there is such a possibility, the process aborts and restarts. • 3. Write: • If there is no possibility of conflict, the transaction commits. • If there are few conflicts, validation can be done efficiently and leads to better performance than other concurrency control methods. Unfortunately, if there are many conflicts, the cost of repeatedly restarting operations hurts performance significantly.

  39. 2.3 Validation Prerequisites • As a prerequisite for optimistic concurrency control, the overall sequence of operations a process performs must be separated into distinguishable units. A validation of the access pattern of each unit is then performed during a validation phase at the end of that unit. • For each unit the following information has to be gathered: • Starting time of the unit st(U). • Time stamp for the start of validation TS(U). • Ending time of the unit E(U). • Read and write sets RS(U) and WS(U) (the sets of resources U has accessed in read and write mode). • Needs precise time information!!! • Requires synchronisation of the local clocks!!! (of the resources / CORBA objects)

  40. 2.3 Validation Set • The validation of a unit has to be done against all concurrent units that have already been successfully validated. We therefore denote the set of those units as the validation set VU(u). • VU(u) is formally defined as: VU(u) := { x | st(u) < E(x) and x has been validated } • I.e. VU(u) contains the units x that were active concurrently with u but have been validated before it.

  41. 2.3 Conflict Detection • During the validation phase, the concurrency control manager has to look for two types of conflicts: read/write and write/write conflicts. • A read/write conflict occurred during the course of a unit u iff: ∃ u′ ∈ VU(u) : WS(u) ∩ RS(u′) ≠ ∅ ∨ RS(u) ∩ WS(u′) ≠ ∅ (u has written a resource that this other unit u′ has read, or vice versa). • A write/write conflict occurred during the course of a unit u iff: ∃ u′ ∈ VU(u) : WS(u) ∩ WS(u′) ≠ ∅ (u has modified a resource that this other unit u′ has modified as well). • In both cases the unit cannot be completed but has to be undone.

  42. Optimistic Conc. Control – Example (1/3) • Assume that you have the following optimistic units:
      Unit# | start time | end time | read set | write set
        1   |     1      |    5     |  1,3,5   |   2,4
        2   |     3      |    7     |  2,3,5   |   6,4
        3   |     5      |    9     |  2,3,5   |   7,8
        4   |    10      |   15     |  7,3,5   |   7,8
      • What is the validation set (VU) of each one of them? • Which ones have a conflict (read/write or write/write) and where exactly does the conflict appear? • Which of the transactions in the table above will get validated?

  43. Optimistic Conc. Control – Example (2/3) • VU(1) = {} Why? Because when unit 1 finishes, no other unit has finished yet. So unit 1 gets validated immediately. • VU(2) = {1} Why? Because the end time of unit 1 (5) is greater than the starting time of unit 2 (3) and unit 1 has been validated. • Unit 2 has a read/write conflict with unit 1 (in resource 2) and a write/write conflict with unit 1 (in resource 4), so unit 2 cannot be validated and has to be undone.

  44. Optimistic Conc. Control – Example (3/3) VU(3) = {} Why? Because only unit 2 has an end time greater than the starting time of unit 3 but unit 2 has not been validated (so it’s ignored). Therefore, unit 3 gets validated immediately. VU(4) = {} Why? Because no unit has an end time greater than the starting time of unit 4. Thus, unit 4 will be validated as well.
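
  To make the validation rule concrete, here is a small sketch (illustrative names, not part of any standard API) that encodes the four units of the example and reports, for each unit in end-time order, its validation set and whether it passes the read/write and write/write checks:

      import java.util.*;

      // Optimistic validation sketch for the units of slides 42-44.
      class OptimisticValidationDemo {
          record Unit(int id, int start, int end, Set<Integer> readSet, Set<Integer> writeSet) {}

          static boolean intersects(Set<Integer> a, Set<Integer> b) {
              for (Integer x : a) if (b.contains(x)) return true;
              return false;
          }

          public static void main(String[] args) {
              List<Unit> unitsByEndTime = List.of(
                  new Unit(1, 1, 5,   Set.of(1, 3, 5), Set.of(2, 4)),
                  new Unit(2, 3, 7,   Set.of(2, 3, 5), Set.of(6, 4)),
                  new Unit(3, 5, 9,   Set.of(2, 3, 5), Set.of(7, 8)),
                  new Unit(4, 10, 15, Set.of(7, 3, 5), Set.of(7, 8)));

              List<Unit> validated = new ArrayList<>();
              for (Unit u : unitsByEndTime) {
                  // VU(u): already validated units x that were still running when u started (st(u) < E(x)).
                  List<Unit> vu = validated.stream().filter(x -> u.start() < x.end()).toList();
                  boolean conflict = vu.stream().anyMatch(x ->
                      intersects(u.writeSet(), x.readSet()) ||   // read/write conflict
                      intersects(u.readSet(), x.writeSet()) ||
                      intersects(u.writeSet(), x.writeSet()));   // write/write conflict
                  System.out.println("unit " + u.id() + ": VU=" + vu.stream().map(Unit::id).toList()
                                     + (conflict ? " -> aborted" : " -> validated"));
                  if (!conflict) validated.add(u);
              }
          }
      }

  Running it reproduces the results of slides 43–44: units 1, 3 and 4 are validated, while unit 2 is aborted.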

  45. 2.4 Comparison • Both pessimistic and optimistic techniques • guarantee serialisability of processes, and • impose a serious complexity in that they need the ability to undo the effects of processes and threads. • Pessimistic techniques cause a • considerable concurrency control overhead through locking, and • they are not deadlock-free. • However, they are sufficiently efficient when conflicts are likely. • Serious advantages of optimistic techniques are • a negligible overhead when conflicts are unlikely, and • furthermore, they are deadlock-free. • However, the computation of conflict sets is very difficult and complex in a distributed setting. Moreover, optimistic techniques assume the existence of synchronised clocks, which are generally not available in a distributed setting.

  46. 2.4 Comparison (ctd.) • In summary, the disadvantages of optimistic concurrency control outweigh its advantages, and in most distributed systems concurrency is controlled using pessimistic techniques.

  47. 3 CORBA Concurrency Control Service • [Figure: OMA overview — Application Objects, CORBAfacilities, Object Request Broker, CORBAservices; the Concurrency Control service is highlighted among the CORBAservices]

  48. 3 Lock Compatibility • The Concurrency Control service supports hierarchical locking, as many CORBA objects take the role of container objects. • As a further optimisation, the service defines a lock mode for upgrade locks. • Upgrade locks are read locks that are not compatible with themselves. They are used on occasions when the requester knows that it only needs a read lock to start with, but will later have to acquire a write lock on the same resource as well. • If two processes are in this situation, they would run into a deadlock if they used only read locks. With upgrade locks the deadlock is prevented, as the second process trying to acquire the upgrade lock is delayed already.

  49. 3 Lock Compatibility (ctd.) • Compatibility matrix:
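
  (The matrix is an image on the original slide. The form below is reconstructed from the usual presentation of the OMG Concurrency Control service's lock modes — IR = intention read, R = read, U = upgrade, IW = intention write, W = write; '+' = compatible, '−' = conflict — and should be checked against the specification; it is consistent with the rules stated on slides 34 and 48.)

      requested \ granted |  IR |  R  |  U  |  IW |  W
      IR                  |  +  |  +  |  +  |  +  |  −
      R                   |  +  |  +  |  +  |  −  |  −
      U                   |  +  |  +  |  −  |  −  |  −
      IW                  |  +  |  −  |  −  |  +  |  −
      W                   |  −  |  −  |  −  |  −  |  −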

  50. 3 Locksets • The central object type defined by the Concurrency Control service is the lockset. A lockset is associated with a resource. • With the Concurrency Control service, concurrency control has to be managed by the implementation of the shared resource. Hence the implementation of a resource would usually have a hidden lockset attribute. • The operation implementations of that resource acquire locks before they access or modify the resource.
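
  As a sketch of that design (the LockSet interface below is a simplified stand-in for the service's real IDL-generated interface, so treat its names as assumptions), an account servant hides its lockset and wraps every operation in lock/unlock calls, keeping concurrency control transparent to clients (cf. slide 35):

      // Simplified stand-in for the lockset the Concurrency Control service would provide.
      interface LockSet {
          enum Mode { READ, WRITE, UPGRADE, INTENTION_READ, INTENTION_WRITE }
          void lock(Mode mode);     // blocks on a locking conflict
          void unlock(Mode mode);
      }

      // Resource implementation with a hidden lockset attribute.
      class AccountServant {
          private float balance;
          private final LockSet lockset;

          AccountServant(LockSet lockset) { this.lockset = lockset; }

          public void credit(float amount) {
              lockset.lock(LockSet.Mode.WRITE);
              try { balance = balance + amount; }
              finally { lockset.unlock(LockSet.Mode.WRITE); }
          }
          public float get_balance() {
              lockset.lock(LockSet.Mode.READ);
              try { return balance; }
              finally { lockset.unlock(LockSet.Mode.READ); }
          }
      }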
