butcher-2014
— Anthony Graca
Citation #
@book{butcher2014,
title={Seven Concurrency Models in Seven Weeks: When Threads Unravel},
author={Butcher, Paul},
year={2014},
publisher={The Pragmatic Bookshelf}
}
Summary. What are the statements being made? #
1. Introduction #
Concurrent or Parallel? #
- Concurrent and parallel refer to two related but different things
- “A concurrent program has multiple logical threads of control. These threads may or may not run in parallel (Butcher 2014, 1).”
- Concurrency is an aspect of the problem domain, meaning your algorithm needs to handle simultaneous events
- “Concurrency is about dealing with lots of things at once.” - Rob Pike
- “A parallel program potentially runs more quickly than a sequential program by executing different parts of the computation simultaneously in parallel. It may or may not have more than one logical thread of control”
- Parallelism is an aspect of the solution domain, meaning you want to make your program faster
- “Parallelism is about doing lots of things at once.” - Rob Pike
- Traditional threads and locks don’t provide any direct support for parallelism.
- In order to exploit multiple cores with threads and locks, you need to create a concurrent program and then run it on parallel hardware.
- This is problematic because concurrent programs are inherently nondeterministic, whereas parallel programs need not be
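A minimal Java sketch of that nondeterminism (my illustration, not the book's code): the scheduler decides the interleaving, so the output order can differ from run to run.

public class Nondeterminism {
    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> System.out.println("A"));
        Thread b = new Thread(() -> System.out.println("B"));
        a.start(); b.start();   // the scheduler picks which runs first
        a.join(); b.join();     // "A" then "B" on one run, "B" then "A" on another
    }
}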
Parallel Architecture #
- There are multiple levels of parallelism
- Moving from 8 bits to 32 bits is a form of bit-level parallelism: adding two 32-bit numbers on an 8-bit architecture takes multiple steps, while a 32-bit system does it in a single step
- CPU architectures use pipelining, out-of-order execution, and speculative execution to obtain instruction-level parallelism
- Data parallelism is achieved by applying the same operation to a large amount of data in parallel. Imagine increasing the brightness of an image: each pixel can be processed independently, a workload easily handled by a GPU (see the sketch after this list)
- What we are interested in is Task-Level Parallelism
- There are two models of multiprocessor architectures:
- Shared-memory :: where each processor can access any memory location and interprocess communication is done through memory
- Distributed memory :: where each processor has its own local memory and IPC is done via the network.
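A hedged sketch of the brightness example (my own, using Java parallel streams on the CPU rather than a GPU, which the book reaches with OpenCL in chapter 7): the same operation applies to every pixel independently, so the runtime can split the work across cores.

import java.util.stream.IntStream;

public class Brighten {
    // Brighten every pixel by the same amount; each element is independent,
    // so the loop body parallelizes trivially.
    static void brighten(int[] pixels, int delta) {
        IntStream.range(0, pixels.length)
                 .parallel()   // data parallelism across cores
                 .forEach(i -> pixels[i] = Math.min(255, pixels[i] + delta));
    }

    public static void main(String[] args) {
        int[] pixels = {10, 200, 250};
        brighten(pixels, 20);
        System.out.println(java.util.Arrays.toString(pixels));   // [30, 220, 255]
    }
}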
Concurrency: Beyond Multiple Cores #
- Concurrency is key to software being responsive, fault tolerant, efficient, and simple
Responsive #
- The world is concurrent, so software must be concurrent to interact with it properly
- Examples:
- a mobile phone can play music, talk to the network, and pay attention to touch gestures all at the same time
- an IDE checks for syntax in the background while code is being typed
- A flight system simultaneously monitors sensors, displays information to the pilot, obeys commands, and moves control surfaces
- I think of douglass-2003
- Concurrency is key to responsive systems.
- Doing things in the background avoids forcing users to wait and stare at a loading screen.
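A minimal sketch of the background-work idea in plain Java (my illustration, not the book's): slow work runs on a background thread so the main thread stays free to respond.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Responsive {
    public static void main(String[] args) {
        ExecutorService background = Executors.newSingleThreadExecutor();

        // Kick the slow task off the main thread so the user never stares at a loading screen.
        background.submit(() -> {
            try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
            System.out.println("background: syntax check finished");
        });

        System.out.println("main: still responsive, handling input");
        background.shutdown();   // let queued work finish, then exit
    }
}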
Fault Tolerant #
- Distributed systems are fault tolerant: a shutdown of one data center doesn't halt the entire system
- Concurrency also enables fault detection.
- A task that fails can notify a separate task to perform remedial action.
- Sequential software can never be as resilient as concurrent software.
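A toy sketch of that notify-and-remediate idea (my illustration, using Java's Thread.UncaughtExceptionHandler; the book develops fault tolerance properly in the actors chapter):

public class Supervisor {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            throw new RuntimeException("sensor failure");
        });

        // A separate party is notified when the worker dies and can take remedial action.
        worker.setUncaughtExceptionHandler((t, e) ->
                System.out.println("supervisor: " + t.getName() + " failed: " + e.getMessage()));

        worker.start();
        worker.join();
    }
}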
Simple #
- Threading bugs are difficult to diagnose.
- Concurrent solutions can be simpler and clearer than their sequential equivalents
- Forcing a concurrent real-world problem into a sequential solution obscures detail and requires extra work
The Seven Models #
- Threads and locks
- Functional programming
- eliminates mutable state so functional programs are intrinsically thread-safe
- The Clojure Way - Separating identity and state
- Actors
- Concurrent programming model with strong support for fault tolerance and resilience
- Communicating Sequential Processes
- Emphasizes channels for communication
- Data Parallelism - Using GPUs
- The Lambda Architecture
- Big data with MapReduce and stream processing to handle terabytes of data
2. Threads and Locks #
- “Threads-and-locks programming is like a Ford Model T. It will get you from point A to point B, but it is primitive, difficult to drive, and both unreliable and dangerous compared to newer technology (Butcher 2014, 9).”
The Simplest Thing that Could Possibly Work #
- Threads and locks are little more than a formalization of what the underlying hardware actually does.
- similar idea to pointers and goto statements
Day 1: Mutual Exclusion and Memory Models #
- Mutual exclusion
- Using locks to ensure that only one thread can access data at a time.
- correct usage avoids race conditions; careless usage can instead introduce deadlock
Creating a thread #
public class HelloWorld {
    public static void main(String[] args) throws InterruptedException {
        Thread myThread = new Thread() {
            public void run() {
                System.out.println("Hello from new thread");
            }
        };
        myThread.start();
        Thread.yield();   // hint to the scheduler: give myThread a chance to run first
        System.out.println("Hello from main thread");
        myThread.join();  // block until myThread finishes
    }
}
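Note that Thread.yield() is only a hint to the scheduler, so the two messages can still print in either order; join() is what actually blocks the main thread until myThread has finished.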
Our First Lock #
- We can try to create 2 threads that each count to 10,000, but the code below encounters issues
- Instead of the expected output of 20,000, we see two behaviors:
  - The result is always some number below what we expect
  - The result is different on every run
- This is caused by a race condition when two threads call increment() simultaneously: both read count at the same time, both increment the same value, and both write the same result back, so one increment is lost
  - For example, both threads read count as 42, both compute 43, and both write 43, when the count should be 44
public class Counting {
    public static void main(String[] args) throws InterruptedException {
        class Counter {
            private int count = 0;
            public void increment() { ++count; }   // not atomic: read, add one, write back
            public int getCount() { return count; }
        }
        final Counter counter = new Counter();

        class CountingThread extends Thread {
            public void run() {
                for (int x = 0; x < 10000; ++x) {
                    counter.increment();
                }
            }
        }

        CountingThread t1 = new CountingThread();
        CountingThread t2 = new CountingThread();
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.getCount());    // usually less than 20000
    }
}
- The solution is to synchronize access to count by using Java's intrinsic lock
  - Whenever increment() is called, it claims the Counter object's lock, so any other thread that calls it is blocked until the lock is free
  - This leads to the correct output of 20,000
public class Counting {
    public static void main(String[] args) throws InterruptedException {
        class Counter {
            private int count = 0;
            public synchronized void increment() { ++count; }   // claims the Counter's intrinsic lock
            public int getCount() { return count; }
        }
        final Counter counter = new Counter();

        class CountingThread extends Thread {
            public void run() {
                for (int x = 0; x < 10000; ++x) {
                    counter.increment();
                }
            }
        }

        CountingThread t1 = new CountingThread();
        CountingThread t2 = new CountingThread();
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.getCount());    // now reliably 20000
    }
}
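A lock-free alternative worth noting here (my sketch, previewing the java.util.concurrent tools the later days cover; not the book's code at this point): AtomicInteger makes the increment itself an atomic operation, so no lock is needed.

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounting {
    public static void main(String[] args) throws InterruptedException {
        final AtomicInteger count = new AtomicInteger();

        Runnable work = () -> {
            for (int x = 0; x < 10000; ++x) {
                count.incrementAndGet();   // atomic read-modify-write
            }
        };

        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(count.get());   // reliably 20000
    }
}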
Issue 1: Race Conditions #
Issue 2: Memory Visibility #
Issue 3: Deadlock #
Day 2: Beyond Intrinsic Locks #
Day 3: On the Shoulders of Giants #
3. Functional Programming #
- p 49
If it Hurts, Stop Doing It #
Day 1: Programming Without Mutable State #
Day 2: Functional Parallelism #
Day 3: Functional Concurrency #
4. The Clojure Way - Separating Identity from State #
- p 85
The Best of Both Worlds #
Day 1: Atoms and Persistent Data Structures #
Day 2: Agents and Software Transactional Memory #
Day 3: In Depth #
5. Actors #
- p 115
More Object-Oriented than Objects #
Day 1: Messages and Mailboxes #
Day 2: Error Handling and Resilience #
Day 3: Distribution #
6. Communicating Sequential Processes #
- p 153
Communication Is Everything #
Day 1: Channels and Go Blocks #
Day 2: Multiple Channels and IO #
Day 3: Client-Side CSP #
7. Data Parallelism #
- p 189
The Supercomputer Hidden in Your Laptop #
Day 1: GPGPU Programming #
Day 2: Multiple Dimensions and Work-Groups #
Day 3: OpenCL and OpenGL - Keeping It on the GPU #
8. The Lambda Architecture #
- p 223
Parallelism Enables Big Data #
Day 1: MapReduce #
Day 2: The Batch Layer #
Day 3: The Speed Layer #
9. Wrapping Up #
- p 263
Next #
hoare-1978 - CSP
A Theory of Communicating Sequential Processes - 1984
Carl Hewitt Actor Model - Every object is a process
- Hewitt 1973 - A Universal Modular Actor Formalism for Artificial Intelligence
- sicp
Communicating and Mobile Systems: The π-Calculus by Robin Milner
- milner 1989 - every function is a process - task oriented