Reference: The Little Book of Semaphores by Allen B. Downey Link1
Synchronization
Synchronization (synch) is about making 2 things happen at same time, however it is a little more general in programming. It refers to relationship among events (any number and any kind).
Synch constraints:
- Serialization:
EventA
must happen beforeEventB
- Mutual Exclusion:
EventA
andEventB
must not happen at the same time
So what are the complications in developing concurrent software?
- Multiple Threads
- Multiple Processors
Due to OS scheduling and pre-empting threads, programmer has no control over how threads are run and how instructions are evaluated. Thus, both the cases are equivalent from introducing synch coordination
Serialization with messages
Alice Bob
1 Eat BreakFast 1 Eat BreakFast
2 Work 2 Wait for Call from ThreadA
3 Eat Lunch 3 Eat Lunch
4 Call B
order of events:
A < B
impliesA
happens beforeB
Here order of events is:
A1 < A2 < A3 < A4
B1 < B2 < B3
What we want is: that Alice should EatLunch before Bob, B2 < A4 < B3
. To make this happens some sort of coordination/synchronization is needed. The constraint is for a set of activities, other activities can happen concurrently
and no relationship exists between them, ie: A1 < A2
and B1 < B2
can happen in whenever and are not related.
Two events are concurrent, if we cannot tell by looking at the program which will happen first. Well this contributes to the non-deterministic execution of program, at least the concurrent execution
Shared variables
Most of the times variables are thread-local/thread-specific, however some of those might be shared between threads. In such cases threads communicate/interact with each other thru them. In c/c++ language- global or heap allocated memory/variables are shared between threads (as threads are running in same process- these things make life fun in shared-memory parallel programming…)
Concurrent Updates
Writes
x = 0
ThreadA ThreadB
1 x = 5 1 x = 7
2 print x
possible executions:
1 2 3
x = 5 x = 5 x = 7
print x x = 7 x = 5
x = 7 print x print x
Here:{A1 < A2} and B1
are the event orderings.
Assuming, there is sequential consistency such that
A1 < A2
and that updates tox
are atomic and not values are not read from caches
Read, Update, Modify:
x = 0
ThreadA ThreadB
x = x + 1 x = x + 1
|
|
V
1 tempA = x 1 tempB = x
2 x = tempA + 1 2 x = tempB + 1
possible executions:
1 2 3 4
(A1 < B1 < A2 < B2) (A1 < B1 < A2 < B2) (A1 < B1 < A2 < B2) (A1 < B1 < A2 < B2)
-> 1 -> 2 -> 2 -> 1
Correct answer to this is 2
and will be evaluated to it only when operations are serlized:(A1 < A2) < (B1 < B2)
or (B1 < B2) < (A1 < A2)
. Basically increments need to be atomic or mutually exclusive
How would it behave for following code, when it is executed by 100 threads
concurrently:
for i in range (0..100):
temp = count
count = temp + 1
There are a ton of race-conditions here. Question Dr. Downey posed is what is the max
and min
value this can have. I think max
value will be observed when threads run in convoy, one after another, with no contention and race and the value will be 10000.
The min
value is a bit tricky and multiple threads could run and same time could be pre-empted and temp
being a thread-local it will stomp on count
value as it moves forward. I think the min
value has to be 100
.
Smallest value:
for i in range (0..100):
# Thread0 Starts executing..
temp = count # when count was 0 to being with
# temp = 0
# Thread0 -> preempts
# Thread1-Thread99 are scheduled and finish executing
# Thread0 <- schedule again
count = temp + 1