Operating Systems / Chapter 5: Parallel Processes /

Deadlocks

The lock is an important synchronization tool, but they can cause big problems if we are not careful. For example, if a thread acquires a lock but never releases it, then other threads that are waiting for the lock won't get to run. This can happen in a few different ways.

Case 1: lock.release() is never called. The programmer simply forgot to call lock.release().

Case 2: lock.release() is called only if certain conditions are met. Here is an example:

lock.acquire()
if a == 5:
    lock.release()

In this example, If a is not 5, then lock.release() is not called.

Case 3: the thread never exits the critical section. Here is an example:

lock.acquire()
while True:
    a = a + 1
lock.release()

In this case, the critical section is an infinite loop. Thus, the thread will never reach lock.release(), causing waiting threads to wait forever.

The 3 cases above are relatively easy to detect and resolve. Let's now turn our attention to a more interesting scenario that involves 2 locks and 2 different threads.

Suppose that we are building an online bank called OB. In OB, customers can perform the following actions:

  • Deposit money.
  • Withdraw money.
  • Transfer money from one account to another within the same bank.
  • Check account balance.

We begin by creating an Account class and the withdraw, deposit, transfer, and check_balance functions as follows:

class Account:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance

def withdraw(account, amount):
    account.balance = account.balance - amount

def deposit(account, amount):
    account.balance = account.balance + amount

def transfer(from_account, to_account, amount):
    from_account.balance = from_account.balance - amount
    to_account.balance = to_account.balance + amount

def check_balance(account):
    return account.balance    

accounts = [ Account("Ace", 300), Account("Bob", 400) ]

Account objects contain the account holder's name and the account balance in US dollars. To withdraw or deposit money, OB adjusts the balance attribute of the Account object. As a test, we created two accounts, one for Ace and another for Bob.

We then add a loop that serves customer requests as follows:

while True:
    request = receive_http_request()
    if request['path'] == '/withdraw':
        args = request['body']
        index = args['account']
        amount = args['amount']
        thread = Thread(withdraw, [ accounts[index], amount ] )
        thread.start()
    elif request['path'] == '/deposit':
        args = request['body']
        index = args['account']
        amount = args['amount']
        thread = Thread(deposit, [ accounts[index], amount ] )
        thread.start()
    elif request['path'] == '/transfer':
        args = request['body']
        from_index = args['from_account']
        to_index = args['to_account']
        amount = args['amount']
        thread = Thread(transfer, [ accounts[from_index], accounts[to_index], amount ] )
        thread.start()
    else request['path'] == '/check_balance':
        args = int(request['body'])
        index = args['account']
        thread = Thread(check_balance, [ accounts[index] ] )
        thread.start()

This server processes HTTP requests. receive_http_request() blocks the Python thread until a request is received. When the request arrives, the thread parses the message into a dictionary and stores it in request. The request path indicates the action that the client wants to perform. For example, if the path is /withdraw, then the client wants to withdraw money. The next line in the request message, called the body, contains the account number and the amount the client wishes to withdraw. For example, if the request body is:

account=0&amount=100

Then the server should withdraw 100 dollars from the accounts[0]. Thankfully, Python parses the body into a dictionary and stores it in request['body']. To process the withdrawal, the server spawns a thread which executes the withdraw function. The withdraw function decreases the balance attribute of the Account object by request['body']['amount'].

Now, suppose that server receives two withdraw requests at around the same time. Both requests wish to withdraw 20 dollars from Ace's account. Since Ace's account initially contains $300, after 2 withdrawals, the final balance should be $260. However, after both requests are processed, we find that the balance is $280. How did this happen?

The withdraw function executes the following statement:

account.balance = account.balance - amount

We can break this statement into 3 steps:

  1. Retrieve account.balance and amount.
  2. Subtract amount from account.balance and push the result to the Values Stack.
  3. Pop the Values Stack and store the popped value in account.balance.

Suppose that the first thread performs steps 1 and 2. After step 2, the Values Stack contains 280. But before the first thread can perform step 3, a timer interrupt pauses the first thread and starts the second thread. The second thread then performs all 3 steps. Note that each thread has its own Values Stack, so the threads are not overwriting each other's Values Stacks.

At this point, account.balance contains 280. The second thread is complete, and now the first thread resumes execution. The first thread then pops its Values Stack, which contains 280. The first thread then stores 280 in account.balance and terminates.

The deposit function has a similar problem: when two threads deposit to an account at the same time, only one of the deposits may actually go through.

In both cases, problems arise due to multiple threads stepping over each other. Let's solve this problem by adding locks to accounts as follows:

class Account:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = Lock()

def withdraw(account, amount):
    account.lock.acquire()
    account.balance = account.balance - amount
    account.lock.release()

def deposit(account, amount):
    account.lock.acquire()
    account.balance = account.balance + amount
    accoutn.lock.release()

def transfer(from_account, to_account, amount):
    from_account.balance = from_account.balance - amount
    to_account.balance = to_account.balance + amount

def check_balance(account):
    return account.balance    

Now only one thread can execute read and update the account balance at a time.

Next, let's take a look at the transfer function. A money transfer consists of withdrawing money from one account and depositing the same amount of money in the other account. We want the two statements to execute atomically. That is, we want to ensure that nothing can happen in between the withdrawal and deposit.

Let's see if we can achieve atomicity by surrounding both operations with locks as before. Here is an example:

def transfer(from_account, to_account, amount):
    from_account.lock.acquire()
    from_account.balance = from_account.balance - amount
    from_account.lock.release()

    to_account.lock.acquire()
    to_account.balance = to_account.balance + amount
    to_account.lock.release()

The function now ensures that only 1 thread can withdraw at a time and only 1 thread can deposit at a time. However, the transfer operation as a whole is not atomic, because while one thread is in between the withdrawal and the deposit operations, the OS can pause the thread and let another thread execute the full transfer function.

Here is a concrete example. Suppose that Ace's account currently has $300 and Bob's account currently has $0. Suddenly, 3 clients decide to transfer 100 dollars from Ace's account to Bob's account at the same time. Let's walk through the program to see what could go wrong.

The program spawns 3 threads, TA, TB, and TC, to handle these requests. Suppose that TA decreases Ace's balance by 100 and release Ace's account lock. But before TA can add 100 to Bob's balance, the OS switches to TB. Since Ace's account lock has been released, TB immediately acquires Ace's account lock and decreases Ace's balance by another 100. Then, before TA or TB can increase Bob's balance, OS switches to TC. TC also decreases Ace's balance by $100. At this point, accounts[0].balance is 0 and accounts[1].balance is also 0. If the bank happened to be auditing user accounts, alarms would go off because $300 just disappeared. Now, Bob's account will eventually be deposited, but we don't want the intermediate states to be visible to other threads. In a big bank, thousands of transactions are happening every second. In such settings, the probability that at least one account is in the intermediate state at any given time is very high. If the intermediate states are visible to the auditor, the bank may never be able to audit customer accounts accurately.

Alternatively, we can acquire both account locks before the withdrawal and release both locks after the deposit as shown below:

def transfer(from_account, to_account, amount):
    from_account.lock.acquire()
    to_account.lock.acquire()
    from_account.balance = from_account.balance - amount
    to_account.balance = to_account.balance + amount
    to_account.lock.release()
    from_account.lock.release()

Threads now must acquire both account locks before updating account balances. Thus, while one thread is transferring money from from_account to to_account, other threads cannot withdraw or deposit money from either account. But threads can still check the account balances in between the withdrawal and the deposit statements. We can fix that by acquiring the account lock before we check the balance. check_balance is now:

def check_balance(account):
    account.lock.acquire()
    balance = account.balance
    accoutn.lock.release()
    return balance  

Can we declare victory yet? Not quite. In the process of blocking concurrent access to the transfer function, we created a new problem. For example, suppose that one thread, TA, calls transfer(accounts[0], accounts[1], 100), and another thread, TB, calls transfer(accounts[1], accounts[0], 100) at around the same time. Here is one possible sequence of events:

  1. TA acquires accounts[0].lock in line 1.
  2. The timer interrupt causes the OS to switch to TB.
  3. TB acquires accounts[1].lock in line 1.
  4. TB tries to acquire accounts[0].lock in line 2, but fails, since TA already owns accounts[0].lock. TB blocks.
  5. The OS switches to TA.
  6. TA tries to acquire accounts[1].lock in line 2, but fails, since TB already owns accounts[1].lock. TA blocks.

At this point, both threads are blocked. Neither lock can resume until the other thread releases its lock. The threads are deadlocked.

How did this happen? TA tried to acquire Ace's lock first and then Bob's lock, while TB tried to acquire Bob's lock first and then Ace's lock. The acquisition order of TA is the reverse of TB. On an unlucky day, the threads both acquire one lock each and then tries to acquire each other's locks.

Once threads are deadlocked, they have no way out; the threads have to be terminated. Thus we need to prevent deadlocks from happening at all. The deadlock occurred when the acquisition order of one thread is the reverse of the other. Thus, we need to ensure that both threads acquire the locks in the same order. For example, both threads can try to acquire accounts[0].lock first and accounts[1].lock second. Then the second thread has to wait until the first thread acquires both locks, transfers the money, and releases the locks. One way to enforce the order of acquisitions is to sort the locks by the account name.

Here is an example:

def transfer(from_account, to_account, amount):
    if from_account.name > to_account.name:
        lock1 = from_account.lock
        lock2 = to_account.lock
    else:
        lock1 = to_account.lock
        lock2 = from_account.lock

    lock1.acquire()
    lock2.acquire()
    from_account.balance = from_account.balance - amount
    to_account.balance = to_account.balance + amount
    lock2.release()
    lock1.release()
Quiz: Check All That Apply (1 point)

Please select all the choices where the lock may not be released.

   
   
   
Become a subscriber to save your progress, see the correct answer, and more!
Quiz: Check All That Apply (1 point)

Consider the following program:

def append_data(name, address):
    name_lock.acquire()
    names.append(name)
    name_lock.release()

    address_lock.acquire()
    addresses.append(address)
    address_lock.release()

names = []
addresses = []
name_lock = Lock()
address_lock = Lock()
t1 = Thread(append_data, ['Adrian Day', 'Adelaide'])
t2 = Thread(append_data, ['Brian Summer', 'Bristol'])
t1.start()
t2.start()

Please check all the possible values of the two lists after this program is executed.

   
   
   
   
   
Become a subscriber to save your progress, see the correct answer, and more!
Quiz: Check All That Apply (1 point)

Consider the following program:

def append_data_1(name, address):
    name_lock.acquire()
    address_lock.acquire()
    names.append(name)
    addresses.append(address)
    address_lock.release()
    name_lock.release()

def append_data_2(name, address):
    address_lock.acquire()
    name_lock.acquire()
    names.append(name)
    addresses.append(address)
    name_lock.release()
    address_lock.release()

names = []
addresses = []
name_lock = Lock()
address_lock = Lock()
t1 = Thread(append_data_1, ['Adrian Day', 'Adelaide'])
t2 = Thread(append_data_2, ['Brian Summer', 'Bristol'])
t1.start()
t2.start()

Please check all the possible values of the two lists after this program is executed.

   
   
   
   
   
Become a subscriber to save your progress, see the correct answer, and more!
Quiz: Check All That Apply (1 point)

Please select all true statements.

   
   
   
   
Become a subscriber to save your progress, see the correct answer, and more!
Previous Lesson Next Lesson

Comments

Please log in to add comments