butcher-2014

Anthony Graca

Citation #

@book{butcher2014,
  title={Seven Concurrency Models in Seven Weeks: When Threads Unravel},
  author={Butcher, Paul},
  year={2014},
  publisher={The Pragmatic Bookshelf}
}

Summary. What are the statements being made? #

1. Introduction #

Concurrent or Parallel? #

  • Concurrent and parallel refer to two related but different things
  • “A concurrent program has multiple logical threads of control. These threads may or may not run in parallel (Butcher 2014, 1).”
    • Concurrency is an aspect of the problem domain, meaning your algorithm needs to handle simultaneous events
    • “Concurrency is about dealing with lots of things at once.” (Rob Pike)
  • “A parallel program potentially runs more quickly than a sequential program by executing different parts of the computation simultaneously in parallel. It may or may not have more than one logical thread of control”
    • Parallelism is an aspect of the solution domain, meaning you want to make your program faster
    • “Parallelism is about doing lots of things at once.” (Rob Pike)
  • Traditional threads and locks don’t provide any direct support for parallelism.
    • In order to exploit multiple cores with threads and locks, you need to create a concurrent program and then run it on parallel hardware.
    • This is problematic because concurrent programs are inherently nondeterministic, while a purely parallel program can remain deterministic
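  • A minimal sketch (mine, not the book’s) of how the two concerns come apart: the program below is concurrent either way, because it has ten logical tasks, but whether those tasks actually run in parallel depends only on the pool size and the hardware
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrentNotParallel {
  public static void main(String[] args) {
    // Ten logical tasks make the program concurrent regardless of pool size.
    // With one worker thread they run sequentially; with four they may run
    // in parallel on multicore hardware. The program structure is unchanged.
    ExecutorService pool = Executors.newFixedThreadPool(1); // try 4
    for (int i = 0; i < 10; ++i) {
      final int id = i;
      pool.execute(() -> System.out.println(
          "task " + id + " on " + Thread.currentThread().getName()));
    }
    pool.shutdown(); // no new tasks; workers exit when the queue drains
  }
}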

Parallel Architecture #

  • There are multiple levels of parallelism
    • Moving from 8-bit to 32-bit architectures is a form of Bit-Level Parallelism: adding two 32-bit numbers on an 8-bit architecture takes multiple steps, while a 32-bit system does it in a single step
    • CPU architectures use pipelining, out-of-order execution, and speculative execution to obtain instruction-level parallelism
    • Data parallelism is achieved by applying the same operation to a large amount of data in parallel. Imagine increasing the brightness of an image, where each pixel can be processed independently, a task easily handled by a GPU (see the sketch at the end of this list)
  • What we are interested in is Task-Level Parallelism
    • There are two models of multiprocessor architectures:
      1. Shared memory :: each processor can access any memory location, and interprocess communication is done through memory
      2. Distributed memory :: each processor has its own local memory, and IPC is done over the network
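  • As a data-parallelism sketch (mine; the book’s own chapter targets GPUs with OpenCL), Java’s parallel streams apply the same operation to every element across cores
import java.util.Arrays;

public class Brightness {
  public static void main(String[] args) {
    // A grayscale "image" as a flat array of pixel intensities (0-255).
    int[] pixels = new int[1920 * 1080];
    Arrays.fill(pixels, 100);
    // Data parallelism: the same +20 is applied to every pixel, and each
    // pixel is independent, so the runtime can split the array across cores.
    int[] brighter = Arrays.stream(pixels)
                           .parallel()
                           .map(p -> Math.min(p + 20, 255))
                           .toArray();
    System.out.println(brighter[0]); // 120
  }
}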

Concurrency: Beyond Multiple Cores #

  • Concurrency is key to software being responsive, fault tolerant, efficient, and simple

Responsive #

  • the world is concurrent, so software should be concurrent to interact with it properly
  • Examples:
    1. a mobile phone can play music, talk to the network, and pay attention to touch gestures all at the same time
    2. an IDE checks for syntax in the background while code is being typed
    3. A flight system simultaneously monitors sensors, displays information to the pilot, obeys commands, and moves control surfaces
  • Concurrency is key to responsive systems.
    • Doing things in the background avoids forcing users to wait and stare at a loading screen.
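  • A minimal sketch of the idea in Java (my example, not the book’s): run the slow work on another thread so the main thread stays free to respond
import java.util.concurrent.CompletableFuture;

public class Responsive {
  public static void main(String[] args) throws Exception {
    // Kick the slow work off to a background thread instead of blocking.
    CompletableFuture<String> download = CompletableFuture.supplyAsync(() -> {
      try { Thread.sleep(2000); } catch (InterruptedException e) { }
      return "done";
    });
    // Meanwhile the main thread can keep responding to the user.
    System.out.println("still handling input...");
    System.out.println(download.get()); // block only when the result is needed
  }
}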

Fault Tolerant #

  • Distributed systems are naturally fault tolerant: the shutdown of one data center doesn’t halt the entire system
  • Concurrency also enables fault detection.
    • A task that fails can notify a separate task to perform remedial action.
    • Sequential software can never be as resilient as concurrent software.
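  • A minimal sketch of fault detection in Java (my illustration; the book develops this idea properly with actors): a separate handler is notified when a task dies and can take remedial action
public class Watchdog {
  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> {
      throw new RuntimeException("disk full"); // the task fails
    });
    // A failing task notifies separate code, which can log, restart the
    // task, or fail over to a replica.
    worker.setUncaughtExceptionHandler((t, e) ->
        System.out.println("remedial action for " + t.getName() + ": " + e.getMessage()));
    worker.start();
    worker.join();
  }
}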

Simple #

  • Threading bugs are difficult to diagnose.
  • A concurrent solution can be simpler and clearer than its sequential equivalent
    • Forcing an inherently concurrent real-world problem into a sequential solution obscures detail and requires extra work

The Seven Models #

  1. Threads and locks
  2. Functional programming
    • eliminates mutable state so functional programs are intrinsically thread-safe
  3. The Clojure Way - Separating identity and state
  4. Actors
    • Concurrent programming model with strong support for fault tolerance and resilience
  5. Communicating Sequential Processes
    • Emphasizes channels for communication
  6. Data Parallelism - Using GPUs
  7. The Lambda Architecture
    • Big data with MapReduce and stream processing to handle terabytes of data

2. Threads and Locks #

  • “Threads-and-locks programming is like a Ford Model T. It will get you from point A to point B, but it is primitive, difficult to drive, and both unreliable and dangerous compared to newer technology (Butcher 2014, 9).”

The Simplest Thing that Could Possibly Work #

  • Threads and locks are little more than a formalization of what the underlying hardware actually does.
    • similar idea to pointers and goto statements

Day 1: Mutual Exclusion and Memory Models #

Mutual exclusion
Using locks to ensure that only one thread can access data at a time.
  • used carefully, locks avoid race conditions; used carelessly, they introduce deadlocks (see Issue 3 below)

Creating a thread #

public class HelloWorld {
  public static void main(String[] args) throws InterruptedException {
    Thread myThread = new Thread() {
        public void run() {
          System.out.println("Hello from new thread");
        }
      };
    myThread.start();
    Thread.yield(); // scheduler hint: give myThread, which takes time to start, a chance to print first
    System.out.println("Hello from main thread");
    myThread.join(); // wait for myThread to finish
  }
}

Our First Lock #

  • We can create two threads that each count to 10,000, but the code below encounters issues
    • Instead of the expected output of 20,000, we see two behaviors
      1. The result is almost always lower than expected
      2. The result differs from run to run
    • This is caused by a race condition when two threads call increment() simultaneously
      • ++count is really a read-modify-write; if both threads read the same value of count before either writes back, one increment is lost
public class Counting {
  public static void main(String[] args) throws InterruptedException {
    class Counter {
      private int count = 0;
      public void increment() { ++count; }
      public int getCount() { return count; }
    }
    final Counter counter = new Counter();
    class CountingThread extends Thread {
      public void run() {
        for (int x = 0; x < 10000; ++x) {
           counter.increment();
        }
      }
    }

    CountingThread t1 = new CountingThread();
    CountingThread t2 = new CountingThread();
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(counter.getCount());
  }
}
  • The solution is to synchronize access to count using Java’s intrinsic lock
    • Declaring increment() synchronized means each call claims the Counter object’s lock; another thread calling increment() blocks until the lock is free
    • getCount() is safe here only because it is called after join(), which guarantees the main thread sees the final value
    • this leads to the correct output of 20,000
public class Counting {
  public static void main(String[] args) throws InterruptedException {
    class Counter {
      private int count = 0;
      public synchronized void increment() { ++count; }
      public int getCount() { return count; }
    }
    final Counter counter = new Counter();
    class CountingThread extends Thread {
      public void run() {
        for (int x = 0; x < 10000; ++x) {
           counter.increment();
        }
      }
    }

    CountingThread t1 = new CountingThread();
    CountingThread t2 = new CountingThread();
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(counter.getCount());
  }
}

Issue 1: Race Conditions #

  • The unsynchronized Counting example above demonstrates this: the result depends on how the two threads’ reads and writes interleave

Issue 2: Memory Visibility #
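
  • Without synchronization there is no guarantee that a write made by one thread will ever become visible to another; a minimal sketch of the hazard (my version of the standard example):
public class Visibility {
  // Neither field is volatile, and no locks are used, so there is no
  // happens-before edge between the writer and the reader.
  private static boolean answerReady = false;
  private static int answer = 0;

  public static void main(String[] args) throws InterruptedException {
    Thread writer = new Thread(() -> {
      answer = 42;
      answerReady = true;
    });
    Thread reader = new Thread(() -> {
      // Legal outcomes include seeing answerReady == true with answer == 0
      // (reordering), or never observing the write to answerReady at all.
      if (answerReady)
        System.out.println("the answer is " + answer);
      else
        System.out.println("answer not ready");
    });
    writer.start(); reader.start();
    writer.join(); reader.join();
  }
}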

Issue 3: Deadlock #
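
  • Deadlock arises when threads acquire the same locks in different orders; a minimal two-lock sketch (my illustration; the book illustrates this with the dining philosophers):
public class DeadlockDemo {
  private static final Object lockA = new Object();
  private static final Object lockB = new Object();

  public static void main(String[] args) {
    // t1 takes lockA then lockB; t2 takes lockB then lockA. If each grabs
    // its first lock before the other's second acquisition, both block
    // forever. The standard fix: always acquire locks in a fixed global order.
    new Thread(() -> {
      synchronized (lockA) {
        pause();
        synchronized (lockB) { System.out.println("t1 got both"); }
      }
    }).start();
    new Thread(() -> {
      synchronized (lockB) {
        pause();
        synchronized (lockA) { System.out.println("t2 got both"); }
      }
    }).start();
  }

  private static void pause() {
    try { Thread.sleep(100); } catch (InterruptedException e) { }
  }
}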

Day 2: Beyond Intrinsic Locks #

Day 3: On the Shoulders of Giants #

3. Functional Programming #

  • p 49

If it Hurts, Stop Doing It #

Day 1: Programming Without Mutable State #

Day 2: Functional Parallelism #

Day 3: Functional Concurrency #

4. The Clojure Way - Separating Identity from State #

  • p 85

The Best of Both Worlds #

Day 1: Atoms and Persistent Data Structures #

Day 2: Agents and Software Transactional Memory #

Day 3: In Depth #

5. Actors #

  • p 115

More Object-Oriented than Objects #

Day 1: Messages and Mailboxes #

Day 2: Error Handling and Resilience #

Day 3: Distribution #

6. Communicating Sequential Processes #

  • p 153

Communication Is Everything #

Day 1: Channels and Go Blocks #

Day 2: Multiple Channels and IO #

Day 3: Client-Side CSP #

7. Data Parallelism #

  • p 189

The Supercomputer Hidden in Your Laptop #

Day 1: GPGPU Programming #

Day 2: Multiple Dimensions and Work-Groups #

Day 3: OpenCL and OpenGL - Keeping it on the GPU #

8. The Lambda Architecture #

  • p 223

Parallelism Enables Big Data #

Day 1: MapReduce #

Day 2: The Batch Layer #

Day 3: The Speed Layer #

9. Wrapping Up #

  • p 263

Next #