Notes on Parallel and Concurrent Programming
To refresh my knowledge about concurrency and parallel programming, I took “Parallel and Concurrent Programming with Java 1–2" course in LinkedIn and I thought of sharing my notes, hoping it will help you, too.
Flynn’s Taxonomy distinguishes 4 classes of computer architecture based on 2 factors:
1) The number of concurrent instruction or control streams
2) The number of data streams
- Single Instruction Stream Single Data Stream
- Single Instruction Stream Multiple Data Stream
- Multiple Instruction Stream Single Data Stream
- Multiple Instruction Stream Multiple Data Stream
MISD
- operates on same single stream of data (not common)
SIMD
- same instruction is executed at a given time
- operating on different data
- suitable for image processing
MIMD
- units executing different series of instructions at the same time
- even the processor can be operating on a different set of data
- Single Program, Multiple Data (SPMD)
- Multiple Program, Multiple Data (MPMD)
SPMD
- multiple processing units executing a copy of the same single program simultaneously.
- each can use a different data.
- different from SIMD because they don’t have to be executing the same instruction at the same time (async running)
- conditional logic in the program that allows different tasks within the program to only execute specific parts of the overall program.
- more common than MPMD
MPMD
- processors can be executing different independent programs at the same time while also be operating on different data.
- one processing node will be selected as the host/manager which runs one program that forms out data to the other nodes running a second program. others return processed data back to manager.
Shared vs Distributed Memory
Shared Memory
- all processors across the same memory with global address space.
- if one changes a memory location, all others will see the change.
based on how processors are connected to memory:
- Uniform Memory Access (UMA)
- Non-uniform Memory Access (NUMA)
UMA
- equally fast
- asymmetric multiprocessing system (SMP)
- cache memory -> cache coherency — handled by hardware in the processors
NUMA
- physically connecting multiple SMP systems together
- access -> non-uniform
Distributed Memory
- each processor has its own local memory with its own address space (no global address space)
- up to the programmer to define how&when data is communicated between nodes.
Process
- includes code, data, state information
- independent instance of a running program
- separate address space
Threads (in processes)
- independent path of execution
- subset of a process
- OS scheduler threads for execution
- shares process’s address space, same memory resource
Inter-Process Communication (IPC)
- sockets, pipes
- shared memory (inter-process shared)
- remote procedure calls
Multiple Processes or Multiple Threads?
- depends on what you’re doing & environment it’s running in (OS, programming language)
- mainly stick to multiple threads
Threads vs. Processes
- Threads are lightweight -> require less overhead to create/terminate
- OS can switch between threads (of same process) faster than processes.
JVM
- each Java app executes within its own instance of JVM.
- each JVM instance is a separate, independent process.
For more information about JVM, you can have a look at my previous post:
About PC Performance
- Having 12 cores and 24 logical processors: 12 separate complete physical processing cores. each of those cores support hyper-threading -> enables them to run 2 independent apps at the same time.
hyper-threading: takes advantage of unused parts of the processor, if one thread is paused or not using a certain resource (other thread may be able to use it)
Concurrency
- ability of a program to be broken into parts that can run independent of each other.
- Concurrent Execution: 1 single processor -> swapping places (illusion of concurrency, not equal to Parallel Execution. Parallel Execution requires hardware.)
- dependent on program structure
- dealing with multiple things at once
- I/O operations, e.g.
Parallelism
- simultaneous execution
- doing multiple things at once
- matrix multiplication, e.g.
Context Switch (requires keeping last state)
- the process of storing the state of a process or thread, so that it can be restored and resume execution at a later point.
Scheduling Algorithms
- first come, first served
- shortest job next
- priority
- shortest remaining time
- round-robin
- multiple-level queues
Preemptive
- low priority is paused when higher comes in.
Scheduling Goals
- maximize throughput
- maximize fairness
- minimize wait time
- minimize latency
mini reminder:
- cache coherency -> handled by processor hardware
- modern multi-core systems -> Flynn's Taxonomy MIMD
- parallel computing increases:
* speed
* # of tasks executed
* scaling
Main Thread
- created when a program is started.
- can spawn new threads — child threads
- child threads can have children too)
- children notify their parents when they’re finished
- some programming languages require you to start the thread after it’s created (like Java).
new -> start -> runnable => so that OS can schedule it to execute
- if a thread waits for an event to occur (input, time)
runnable -> waiting -> blocked
blocked -> resume -> runnable
- if all is finished but waiting for a single thread to finish, can call “join()” (means “wait until another thread completes its execution). finished ones go to a blocked state, the other runs to finish.
runnable -> finished -> terminated (normal or abnormal)
Lifecycle of a Thread
- Java has 6 states:
- NEW
- RUNNABLE
- BLOCKED (waiting for a monitor lock)
- TERMINATED
- WAITING (waiting indefinitely for other threads)
- TIMED_WAITING (waiting for other threads up to a specific waiting time; thread.sleep for seconds, e.g.)
Thread State Methods
- thread.getState()
- currentThread() -> returns Thread
- thread ID : positive, long, unique =>getId()
- thread name: can have the same name (default Thread-N where N is a positive int) => set/get
- thread priority: 1–10 => set/get
Threads don’t maintain a reference to their parent thread for garbage collection purposes.
Java Thread Class
- “extends Thread” and implement your own run method, use a Thread constructor, pass a Runnable object as its target.
- “implements Runnable”
- prefer interface over implementation (OOP Principle)
Daemon Thread (Background)
- spawned garbage collector as a child thread
- it won’t be able to exit until gc have terminated but…
- threads (that are performing background tasks) can be detached from the main program = daemon thread
- when the main thread terminates (no non-daemon threads left running), daemon thread will terminate with it.
- an abrupt termination of gc won’t be a problem since memory will be cleaned up with process termination.
- setDaemon() in Java
- threads inherit daemon status from their parent.
Mutual Exclusion
- Data Race: 2 or more threads // same memory location // at least 1 is modifying
- compiler flags
- Critical Section (critical region): 1 thread/process at a time
- Mutex (Lock): a mechanism for mutual exclusion
- Atomic Operation: (uninterruptible) operation that requires the lock
- if lock is already taken, block/wait for it to be available.
- “java.util.concurrent.locks” package, Lock interface, ReentrantLock class, lock/unlock method. AtomicInteger (incrementAndGet()), AtomicLong instead of Integer, Long classes for thread-safety.
Intrinsic Locks
- every Java object has an intrinsic lock associated with it.
- synchronized method, synchronized statement
- if it is a static method -> it will be class-wide
- synchronized statement: specify the object to provide the intrinsic lock
- you should avoid using Java’s synchronized statement on an immutable object such as Integer because if you change that variable’s value, you will be synchronized to a different object.
synchronized(object) {
// protected code
}
Locks
Reentrant Lock
- Deadlock: All processes and threads are unable to continue executing.
- there may be times where a thread heads to lock a mutex multiple times before unlocking it.
- reentrant lock can be locked multiple times by the same thread (how many times locked info is kept by the owner thread).
- must be unlocked as many times as it was locked.
- Reentrant Mutex
- Reentrant Lock
- Recursive Mutex
- Recursive Lock
There is no non-reentrant lock in Java (getHoldCount()):
- ReentrantLock
- ReentrantReadWriteLock.ReadLock
- ReentrantReadWriteLock.WriteLock
Try Lock (Try Enter)
- non-blocking lock/acquire method for mutex
- if mutex is available->returns true; if not->returns false
- tryLock() in Java
Read_Write Lock (Shared Mutex)
- SHARED READ : multiple threads at once
- EXCLUSIVE WRITE: only one thread at a time
- if there is a read, no write is allowed.
- if there is a write, no read is allowed.
- suitable for -> # threads reading > # threads writing
- new ReentrantReadWriteLock()
https://www.baeldung.com/java-concurrent-locks :
Read Lock — if no thread acquired the write lock or requested for it then multiple threads can acquire the read lock
Write Lock — if no threads are reading or writing then only one thread can acquire the write lock
ReadWriteLock Interface
- readLock() & writeLock() => returns a new Lock object where you can perform lock&unlock.
ReadWriteLock rwLock = new ReentrantReadWriteLock();
Lock readLock = rwLock.readLock();
Lock writeLock = rwLock.writeLock();
...
rwLock.getReadLockCount();
Liveliness
- Deadlock does not affect CPU.
- Dining Philosophers: multiple threads need to acquire multiple locks. each member is waiting for another member to take action. take turns. prioritize locks.
Lock Ordering
- make sure locks are taken in the same order by any thread
Lock Timeout
- back up and free all locks taken, wait, try again
Abandoned Lock
- abnormal end: lock was not released. throws an error. there will be unlocked locks left.
- you should put “critical section” in try and perform unlock in finally.
try {
...
...
} finally {
secondLock.unlock();
firstLock.unlock();
}
Starvation
- unable to access to a necessary source
- a process or thread is perpetually denied the resources it needs.
- can occur in “threads given different priorities” case.
Livelock
- unlike deadlock, wears out CPU
- overly polite; tries to solve the problem (unlike starvation)
- multiple threads or processes are actively responding to each other to resolve conflict but that prevents them from making process.
- priority or random selection.
Happy Coding!