Concurrency and multithreading are all about running multiple pieces of code at the same time. If you have the hardware for it, in the form of a nice shiny multi-core CPU or a multi-processor system, then this code can run truly in parallel; otherwise the operating system interleaves it: a bit of one task, then a bit of another. This is all very well, but somehow you have to specify what code to run on all these threads.
Let’s get started with std::thread
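    #include <thread>
    #include <iostream>

    // executed on the spawned thread
    void thread_function()
    {
        std::cout << "From thread 1" << std::endl;
    }

    int main()
    {
        std::thread t(thread_function);  // launches the thread
        t.join();                        // wait for it to finish
        std::cout << "From main thread" << std::endl;
        std::cin.ignore();
        return 0;
    }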
The first two lines include the thread and iostream headers. Then we define a function thread_function that will be executed by the thread; for now it just prints something to the shell. In main we first create a thread object t and pass it the function to execute, in this case thread_function. Constructing the thread object launches a thread that executes the function, while the main thread continues with the next statement, irrespective of the progress of the thread it spawned. If main returned while t was still running and had not been joined, the std::thread destructor would call std::terminate and the whole application would be killed, spawned thread included. That's where the statement t.join() comes into the picture: it ensures that the main thread does not progress further until thread t has finished. The statement std::cin.ignore(); makes the program wait for keyboard input before it exits, so the shell window stays open.
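Here is how the application progresses: main starts and constructs t, which begins running thread_function; main then blocks in t.join() until thread_function has printed its message and returned; finally main prints its own message, waits for input, and exits.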
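The blocking behaviour of join() can be observed with a small sketch. The names JoinDemo and run_and_join are hypothetical, introduced only for this illustration, and the worker is written as a lambda purely for brevity. The sketch records the order of events and checks joinable() on either side of the join.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical result type, introduced only for this illustration.
struct JoinDemo
{
    std::vector<std::string> events;  // order in which things happened
    bool joinable_before;             // t.joinable() before join()
    bool joinable_after;              // t.joinable() after join()
};

JoinDemo run_and_join()
{
    std::vector<std::string> events;

    // The worker appends to events. The main thread does not touch
    // events again until join() has returned, so no lock is needed.
    std::thread t([&events] { events.push_back("worker"); });

    const bool before = t.joinable();  // a running thread is joinable
    t.join();                          // blocks until the worker has returned
    const bool after = t.joinable();   // a joined thread is no longer joinable

    events.push_back("main");
    return JoinDemo{events, before, after};
}
```

Calling run_and_join() from main should always yield the worker's entry before main's entry, because the second push_back cannot run until join() has returned.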
In the previous example the thread was created with the function to be executed as a parameter. Actually, what std::thread really wants is a callable entity: a plain function, a function object (any type that defines operator()), or a lambda.
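A minimal sketch of three threads, each given a different kind of callable, might look like the following. The counter calls and the function run_three_callables are assumptions made for this illustration; they stand in for printed output so that the effect of each thread is observable without interleaved console text.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared counter; atomic so three threads can bump it safely.
std::atomic<int> calls{0};

// 1. a plain function
void thread_function()
{
    ++calls;
}

// 2. a functor: any type with operator()
struct Functor
{
    void operator()() const { ++calls; }
};

int run_three_callables()
{
    calls = 0;

    std::thread t1(thread_function);  // function
    Functor functor;
    std::thread t2(functor);          // functor (copied into the thread)
    std::thread t3([] { ++calls; });  // lambda

    t1.join();
    t2.join();
    t3.join();
    return calls;  // each callable ran exactly once
}
```

One thing to watch for: std::thread t2(Functor()); is parsed as a function declaration, not as a thread object. Passing a named object, as above, or using brace initialization avoids that.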
Here in this code snippet we are creating three threads:
Thread t1 is passed a function to execute
Thread t2 is passed a functor to execute
Thread t3 is passed a lambda to execute
All of these work fine, as they are all callable entities. The examples so far have been simple because the functions being executed are not passed any data. Here is another example, where some parameters are passed to the thread's callable:
    #include <thread>
    #include <iostream>

    struct Functor
    {
        void operator()(double x, double y) const
        {
            std::cout << "From functor -- sum of x & y is:" << (x + y) << std::endl;
        }
    };

    int main()
    {
        Functor functor;
        std::thread t(functor, 10, 12);
        t.join();
        std::cin.ignore();
        return 0;
    }
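The arguments given to the std::thread constructor are copied (or moved) into the new thread before the callable is invoked. If the callable needs to write a result back through a reference, the argument has to be wrapped in std::ref. The sketch below illustrates this; add and add_on_thread are hypothetical names chosen for the example.

```cpp
#include <functional>  // std::ref
#include <thread>

// Hypothetical worker that returns its result through a reference parameter.
void add(double x, double y, double& sum)
{
    sum = x + y;
}

double add_on_thread(double x, double y)
{
    double sum = 0;

    // x and y are copied into the new thread. sum must be wrapped in
    // std::ref; passing it directly is rejected by a conforming compiler,
    // because the thread would hand a temporary copy to a double& parameter.
    std::thread t(add, x, y, std::ref(sum));

    t.join();    // once join() returns it is safe to read sum
    return sum;
}
```

With the same inputs as the functor example, add_on_thread(10, 12) yields 22. The referenced variable has to outlive the thread, which is why the join happens before sum goes out of scope.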
All of this has been just demonstration code. Let's write something a bit more useful, like computing a dot product:
Update: The Visual Studio threading problem mentioned after the code below has been resolved in the latest version of Visual Studio (VS 2011 Beta at the time of writing).
    #include <thread>
    #include <iostream>

    struct DotProduct
    {
        double* dp;
        double* a;
        double* b;
        size_t numElems;

        DotProduct(double* result, double* a, double* b, size_t elems)
            : dp(result), a(a), b(b), numElems(elems)
        {
        }

        void operator()() const
        {
            for (decltype(numElems) idx = 0; idx < numElems; ++idx)
            {
                *dp += a[idx] * b[idx];
            }
        }
    };

    int main()
    {
        static const size_t NumElems = 100000;
        double* a = new double[NumElems];
        double* b = new double[NumElems];
        for (size_t idx = 0; idx < NumElems; ++idx)
        {
            a[idx] = idx;
            b[idx] = NumElems - idx;
        }

        // for now we are going to have 4 threads
        size_t increment = NumElems / 4;

        // we ensure that each DotProduct object holds onto separate range
        // so these can be executed in parallel
        // as the computation in each thread is fairly predictable
        // we can go with equal distribution
        double dp1_sum = 0;
        DotProduct dp1(&dp1_sum, a + 0 * increment, b + 0 * increment, increment);
        double dp2_sum = 0;
        DotProduct dp2(&dp2_sum, a + 1 * increment, b + 1 * increment, increment);
        double dp3_sum = 0;
        DotProduct dp3(&dp3_sum, a + 2 * increment, b + 2 * increment, increment);
        double dp4_sum = 0;
        DotProduct dp4(&dp4_sum, a + 3 * increment, b + 3 * increment, increment);

        // create four threads and assign each dotproduct
        // evaluation job to each
        std::thread t1(dp1);
        std::thread t2(dp2);
        std::thread t3(dp3);
        //std::thread t4(dp4);
        dp4();

        // ensure that main thread does not proceed further
        // till all the threads have completed execution
        t1.join();
        t2.join();
        t3.join();
        //t4.join();

        // at the end just add-up all the dot-products computed by
        // each thread
        double dotprod = dp1_sum + dp2_sum + dp3_sum + dp4_sum;
        std::cout << "Dotproduct is " << dotprod << std::endl;

        delete[] a;
        delete[] b;
        std::cin.ignore();
        return 0;
    }
Well, Microsoft's thread library is not entirely bug-free: with the dot-product code above I ran into various threading issues with mutex lock and unlock. Here is the bug report, in case you are interested.
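The four hard-coded slices in the dot-product example can be generalized to any number of threads. The following is only a sketch under stated assumptions: parallel_dot is a hypothetical function, it takes std::vector instead of raw arrays, it assumes both vectors have the same length, and it uses lambdas where the example above uses the DotProduct functor.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: splits the index range into num_threads slices,
// computes one partial dot product per thread, then adds them up.
// Assumes a.size() == b.size().
double parallel_dot(const std::vector<double>& a,
                    const std::vector<double>& b,
                    std::size_t num_threads)
{
    if (num_threads == 0) num_threads = 1;

    const std::size_t n = a.size();
    const std::size_t chunk = n / num_threads;

    std::vector<double> partial(num_threads, 0.0);  // one slot per thread
    std::vector<std::thread> workers;
    workers.reserve(num_threads);

    for (std::size_t t = 0; t < num_threads; ++t)
    {
        const std::size_t begin = t * chunk;
        // the last thread also picks up any leftover elements
        const std::size_t end = (t + 1 == num_threads) ? n : begin + chunk;

        workers.emplace_back([&a, &b, &partial, t, begin, end] {
            double sum = 0.0;  // accumulate locally, store once at the end
            for (std::size_t i = begin; i < end; ++i)
                sum += a[i] * b[i];
            partial[t] = sum;  // each thread writes only its own slot
        });
    }

    // wait for every worker before reading the partial sums
    for (auto& w : workers) w.join();

    double total = 0.0;
    for (double p : partial) total += p;
    return total;
}
```

A reasonable thread count could come from std::thread::hardware_concurrency(), keeping in mind that it is only a hint and may return 0. Each worker accumulates into a local variable and stores to its own slot once, so no locking is needed and the threads do not repeatedly write to neighbouring memory.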
I intend to explore some more thread enhancements with Visual Studio 2011 Developer Preview. Stay tuned…