1.2.7 Implementation Terminology

supporting n active levels of parallelism
  Implies allowing an active parallel region to be enclosed by n-1 active parallel regions.

supporting the OpenMP API
  Supporting at least one active level of parallelism.

supporting nested parallelism
  Supporting more than one active level of parallelism.

internal control variable
  A conceptual variable that specifies runtime behavior of a set of threads or tasks in an OpenMP program.
  COMMENT: The acronym ICV is used interchangeably with the term internal control variable in the remainder of this specification.

compliant implementation
  An implementation of the OpenMP specification that compiles and executes any conforming program as defined by the specification.
  COMMENT: A compliant implementation may exhibit unspecified behavior when compiling or executing a non-conforming program.

unspecified behavior
  A behavior or result that is not specified by the OpenMP specification or not known prior to the compilation or execution of an OpenMP program.
  Such unspecified behavior may result from:
  • Issues documented by the OpenMP specification as having unspecified behavior.
  • A non-conforming program.
  • A conforming program exhibiting an implementation-defined behavior.

implementation defined
  Behavior that must be documented by the implementation, and is allowed to vary among different compliant implementations. An implementation is allowed to define this behavior as unspecified.
  COMMENT: All features that have implementation-defined behavior are documented in Appendix A.

deprecated
  For a construct, clause, or other feature, the property that it is normative in the current specification but is considered obsolescent and will be removed in the future.

1.2.8 Tool Terminology

tool
  Executable code, distinct from application or runtime code, that can observe and/or modify the execution of an application.
first-party tool
  A tool that executes in the address space of the program that it is monitoring.

third-party tool
  A tool that executes as a separate process from the process that it is monitoring and potentially controlling.

activated tool
  A first-party tool that successfully completed its initialization.

event
  A point of interest in the execution of a thread.

native thread
  A thread defined by an underlying thread implementation.

tool callback
  A function that a tool provides to an OpenMP implementation to invoke when an associated event occurs.

registering a callback
  Providing a tool callback to an OpenMP implementation.

dispatching a callback at an event
  Processing a callback when an associated event occurs in a manner consistent with the return code provided when a first-party tool registered the callback.

thread state
  An enumeration type that describes the current OpenMP activity of a thread. A thread can be in only one state at any time.

wait identifier
  A unique opaque handle associated with each data object (for example, a lock) used by the OpenMP runtime to enforce mutual exclusion that may cause a thread to wait actively or passively.

frame
  A storage area on a thread's stack associated with a procedure invocation. A frame includes space for one or more saved registers and often also includes space for saved arguments, local variables, and padding for alignment.

canonical frame address
  An address associated with a procedure frame on a call stack that was the value of the stack pointer immediately prior to calling the procedure for which the invocation is represented by the frame.

runtime entry point
  A function interface provided by an OpenMP runtime for use by a tool. A runtime entry point is typically not associated with a global function symbol.

trace record
  A data structure in which to store information associated with an occurrence of an event.

native trace record
  A trace record for an OpenMP device that is in a device-specific format.

signal
  A software interrupt delivered to a thread.

signal handler
  A function called asynchronously when a signal is delivered to a thread.

async signal safe
  The guarantee that interruption by signal delivery will not interfere with a set of operations. An async signal safe runtime entry point is safe to call from a signal handler.

OpenMP API – Version 5.0 November 2018
code block
  A contiguous region of memory that contains code of an OpenMP program to be executed on a device.

OMPT
  An interface that helps a first-party tool monitor the execution of an OpenMP program.

OMPT interface state
  A state that indicates the permitted interactions between a first-party tool and the OpenMP implementation.

OMPT active
  An OMPT interface state in which the OpenMP implementation is prepared to accept runtime calls from a first party tool and it dispatches any registered callbacks and in which a first-party tool can invoke runtime entry points if not otherwise restricted.

OMPT pending
  An OMPT interface state in which the OpenMP implementation can only call functions to initialize a first party tool and in which a first-party tool cannot invoke runtime entry points.

OMPT inactive
  An OMPT interface state in which the OpenMP implementation will not make any callbacks and in which a first-party tool cannot invoke runtime entry points.

OMPD
  An interface that helps a third-party tool inspect the OpenMP state of a program that has begun execution.

OMPD library
  A dynamically loadable library that implements the OMPD interface.

image file
  An executable or shared library.

address space
  A collection of logical, virtual, or physical memory address ranges that contain code, stack, and/or data. Address ranges within an address space need not be contiguous. An address space consists of one or more segments.

segment
  A portion of an address space associated with a set of address ranges.

OpenMP architecture
  The architecture on which an OpenMP region executes.

tool architecture
  The architecture on which an OMPD tool executes.

OpenMP process
  A collection of one or more threads and address spaces. A process may contain threads and address spaces for multiple OpenMP architectures. At least one thread in an OpenMP process is an OpenMP thread. A process may be live or a core file.

address space handle
  A handle that refers to an address space within an OpenMP process.

thread handle
  A handle that refers to an OpenMP thread.

parallel handle
  A handle that refers to an OpenMP parallel region.

task handle
  A handle that refers to an OpenMP task region.

descendent handle
  An output handle that is returned from the OMPD library in a function that accepts an input handle: the output handle is a descendent of the input handle.
ancestor handle
  An input handle that is passed to the OMPD library in a function that returns an output handle: the input handle is an ancestor of the output handle. For a given handle, the ancestors of the handle are also the ancestors of the handle's descendent.
  COMMENT: A handle cannot be used by the tool in an OMPD call if any ancestor of the handle has been released, except for OMPD calls that release the handle.

tool context
  An opaque reference provided by a tool to an OMPD library. A tool context uniquely identifies an abstraction.

address space context
  A tool context that refers to an address space within a process.

thread context
  A tool context that refers to a native thread.

native thread identifier
  An identifier for a native thread defined by a thread implementation.

1.3 Execution Model

The OpenMP API uses the fork-join model of parallel execution. Multiple threads of execution perform tasks defined implicitly or explicitly by OpenMP directives. The OpenMP API is intended to support programs that will execute correctly both as parallel programs (multiple threads of execution and a full OpenMP support library) and as sequential programs (directives ignored and a simple OpenMP stubs library). However, it is possible and permitted to develop a program that executes correctly as a parallel program but not as a sequential program, or that produces different results when executed as a parallel program compared to when it is executed as a sequential program. Furthermore, using different numbers of threads may result in different numeric results because of changes in the association of numeric operations. For example, a serial addition reduction may have a different pattern of addition associations than a parallel reduction. These different associations may change the results of floating-point addition.

An OpenMP program begins as a single thread of execution, called an initial thread. An initial thread executes sequentially, as if the code encountered is part of an implicit task region, called an initial task region, that is generated by the implicit parallel region surrounding the whole program.

The thread that executes the implicit parallel region that surrounds the whole program executes on the host device. An implementation may support other target devices. If supported, one or more devices are available to the host device for offloading code and data. Each device has its own threads that are distinct from threads that execute on another device. Threads cannot migrate from one device to another device. The execution model is host-centric such that the host device offloads target regions to target devices.
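The remark above that a serial addition reduction and a parallel reduction may associate additions differently can be illustrated without any OpenMP runtime. The following is a minimal sketch in plain C and is not part of the specification: the function names serial_sum and chunked_sum are hypothetical, and the chunking scheme (contiguous chunks whose partial sums are combined in chunk order) is only an assumed model of how per-thread partial sums might be combined. An actual implementation is free to choose a different association.

```c
#include <assert.h>
#include <stddef.h>

/* Serial reduction: ((a[0] + a[1]) + a[2]) + ... in index order. */
double serial_sum(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical model of a parallel reduction: the array is split into
 * nchunks contiguous chunks (one per imagined thread), each chunk is
 * summed on its own, and the partial sums are then added in chunk order. */
double chunked_sum(const double *a, size_t n, size_t nchunks)
{
    assert(nchunks > 0);
    double total = 0.0;
    for (size_t c = 0; c < nchunks; c++) {
        size_t begin = c * n / nchunks;
        size_t end = (c + 1) * n / nchunks;
        double partial = 0.0;
        for (size_t i = begin; i < end; i++)
            partial += a[i];
        total += partial;
    }
    return total;
}
```

For the input {1e100, 1.0, -1e100, 1.0}, serial_sum returns 1.0, whereas chunked_sum with two chunks returns 0.0, because each 1.0 is absorbed when it is added to a value of magnitude 1e100. With a single chunk the two functions perform the same additions in the same order.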
When a target construct is encountered, a new target task is generated. The target task region encloses the target region. The target task is complete after the execution of the target region is complete.

When a target task executes, the enclosed target region is executed by an initial thread. The initial thread may execute on a target device. The initial thread executes sequentially, as if the target region is part of an initial task region that is generated by an implicit parallel region. If the target device does not exist or the implementation does not support the target device, all target regions associated with that device execute on the host device.

The implementation must ensure that the target region executes as if it were executed in the data environment of the target device unless an if clause is present and the if clause expression evaluates to false.

The teams construct creates a league of teams, where each team is an initial team that comprises an initial thread that executes the teams region. Each initial thread executes sequentially, as if the code encountered is part of an initial task region that is generated by an implicit parallel region associated with each team.

If a construct creates a data environment, the data environment is created at the time the construct is encountered. The description of a construct defines whether it creates a data environment.

When any thread encounters a parallel construct, the thread creates a team of itself and zero or more additional threads and becomes the master of the new team. A set of implicit tasks, one per thread, is generated. The code for each task is defined by the code inside the parallel construct. Each task is assigned to a different thread in the team and becomes tied; that is, it is always executed by the thread to which it is initially assigned. The task region of the task being executed by the encountering thread is suspended, and each member of the new team executes its implicit task. There is an implicit barrier at the end of the parallel construct. Only the master thread resumes execution beyond the end of the parallel construct, resuming the task region that was suspended upon encountering the parallel construct. Any number of parallel constructs can be specified in a single program.

parallel regions may be arbitrarily nested inside each other. If nested parallelism is disabled, or is not supported by the OpenMP implementation, then the new team that is created by a thread encountering a parallel construct inside a parallel region will consist only of the encountering thread. However, if nested parallelism is supported and enabled, then the new team can consist of more than one thread. A parallel construct may include a proc_bind clause to specify the places to use for the threads in the team within the parallel region.

When any team encounters a worksharing construct, the work inside the construct is divided among the members of the team, and executed cooperatively instead of being executed by every thread. There is a default barrier at the end of each worksharing construct unless the nowait clause is present. Redundant execution of code by every thread in the team resumes after the end of the worksharing construct.
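The interplay of the parallel construct, worksharing constructs, and the nowait clause described above can be sketched in C as follows. This is an illustrative example under stated assumptions, not text from the specification: the function name fill_powers and its parameters are hypothetical. The code is written so that it also compiles and runs sequentially when the directives are ignored, in which case the reported team size is 1.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: stores i*i in sq[i] and i*i*i in cu[i] for
 * 0 <= i < n and returns the team size seen inside the parallel region
 * (1 if the program is compiled without OpenMP support). */
int fill_powers(int *sq, int *cu, int n)
{
    int team_size = 1;
    assert(n >= 0);

    /* The encountering thread creates a team and becomes its master. */
    #pragma omp parallel
    {
        /* single is a worksharing construct: exactly one thread runs
         * the block, and the others wait at its implicit barrier. */
        #pragma omp single
        {
#ifdef _OPENMP
            team_size = omp_get_num_threads();
#endif
        }

        /* Iterations are divided among the team. nowait removes the
         * barrier at the end of this loop, which is safe here because
         * the next loop writes a different array. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            sq[i] = i * i;

        /* The default barrier at the end of this loop is kept. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            cu[i] = i * i * i;
    }
    /* Implicit barrier at the end of the parallel construct; only the
     * master thread continues from here. */
    return team_size;
}
```

Calling fill_powers(sq, cu, 8) on two eight-element arrays fills them with squares and cubes regardless of how many threads the implementation places in the team. With GCC or Clang, OpenMP support is typically enabled with the -fopenmp flag.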