Part III: Sample Program
NHPCC (Hefei) · USTC · CHINA · glchen@ustc.edu.cn

π Computation

+ Integration formula of π:

\[
\pi = \int_0^1 \frac{4}{1+x^2}\,dx \approx \sum_{i=0}^{N-1} \frac{4}{1+\left(\frac{i+0.5}{N}\right)^2} \times \frac{1}{N}
\]

+ A sequential C code to compute π:

    #include <stdio.h>

    #define N 1000000

    int main(void) {
        double local, pi = 0.0, w;
        long i;

        w = 1.0/N;                          /* width of each strip */
        for (i = 0; i < N; i++) {
            local = (i + 0.5)*w;            /* midpoint of strip i */
            pi = pi + 4.0/(1.0 + local*local);
        }
        printf("pi is %f\n", pi*w);
        return 0;
    }
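The integration formula can be checked in two short steps. The first evaluates the integral exactly; the second is the midpoint rule on N strips of width w = 1/N, written with the same names as the sequential code (the symbol x_i is introduced here for illustration and corresponds to the variable local):

\[
\int_0^1 \frac{4}{1+x^2}\,dx = 4\arctan(x)\Big|_0^1 = 4\cdot\frac{\pi}{4} = \pi
\]

\[
x_i = (i+0.5)\,w, \qquad w = \frac{1}{N}, \qquad \pi \approx w \sum_{i=0}^{N-1} \frac{4}{1+x_i^2}
\]

The factor w is constant, which is why the code multiplies by w once, after the loop, instead of inside it.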
Part III: Shared-Memory Programming Standards

ANSI X3H5

+ Parallel construct: an X3H5 program specifies parallelism with the parallel construct. Inside a parallel construct there can be parallel blocks, parallel loops, or single-process sections.

    program main           ! The program begins in sequential mode
      A                    ! A is executed by only the base thread
      parallel             ! Switch to parallel mode
        B                  ! B is replicated by every team member
        psections          ! Starts a parallel block
        section
          C                ! One team member executes C
        section
          D                ! Another team member executes D
        end psections      ! Wait till both C and D are completed
        psingle            ! Temporarily switch to sequential mode
          E                ! E is executed by one team member
        end psingle        ! Switch back to parallel mode
        pdo i=1,6          ! Starts a pdo construct
          F                ! The team members share the 6 iterations of F
        end pdo no wait    ! No implicit barrier
        G                  ! More replicated code
      end parallel         ! Switch back to sequential mode
      H                    ! H is executed by only the initial process
      ...                  ! There could be more parallel constructs
    end

+ Implicit barrier (fence operation): an implicit barrier is located at parallel, end parallel, end psections, end pdo, and end psingle. It forces all memory accesses up to that point to become consistent.

+ Thread interaction and synchronization: X3H5 provides four types of synchronization variables: Latch, Lock, Event, and Ordinal.
POSIX Threads (Pthreads)

The Pthreads standard was established by an IEEE standards committee and is similar to Solaris Threads.

+ Thread Management Primitives:

    Function Prototype                                   Meaning
    ---------------------------------------------------  ------------------------------
    int pthread_create(pthread_t *thread_id,             Create a thread
        const pthread_attr_t *attr,
        void *(*myroutine)(void *), void *arg)
    void pthread_exit(void *status)                      A thread exits
    int pthread_join(pthread_t thread, void **status)    Join a thread
    pthread_t pthread_self(void)                         Returns the calling thread's ID

+ Thread Synchronization Primitives:

    Function                       Meaning
    -----------------------------  ------------------------------------------------
    pthread_mutex_init(...)        Create a new mutex variable
    pthread_mutex_destroy(...)     Destroy a mutex variable
    pthread_mutex_lock(...)        Lock (acquire) a mutex variable
    pthread_mutex_trylock(...)     Try to acquire a mutex variable
    pthread_mutex_unlock(...)      Unlock (release) a mutex variable
    pthread_cond_init(...)         Create a new condition variable
    pthread_cond_destroy(...)      Destroy a condition variable
    pthread_cond_wait(...)         Wait (block) on a condition variable
    pthread_cond_timedwait(...)    Wait on a condition variable up to a time limit
    pthread_cond_signal(...)       Post an event, unblock one waiting thread
    pthread_cond_broadcast(...)    Post an event, unblock all waiting threads
Shared-Variable Parallel Code to Compute π

The following code is written in a C-like notation:

    #define N 1000000
    main() {
        double local, pi = 0.0, w;
        long i;
    A:  w = 1.0/N;
    B:  #pragma parallel
        #pragma shared(pi, w)
        #pragma local(i, local)
        {
            #pragma pfor iterate(i=0; N; 1)
            for (i = 0; i < N; i++) {
                local = (i + 0.5)*w;
                local = 4.0/(1.0 + local*local);
                #pragma critical
                pi = pi + local;    /* one thread at a time updates the shared pi */
            }
        }
        printf("pi is %f\n", pi*w);
    } /* main() */