64 F.Xu et al. (AErpr) e ::nlxl sel&ele.id ele]l... (AppStmts)d ::e=elf(e)d;d|while (e)dl if (e)d else dreturn el... (App Code)A={fhwd,.…,fndn (LPrim) ::switch z|encrt|excrt eoi iext... (LStmts) s ::dts;s while (e)s... (ItrpCode)0 ::[so,...,sN-1] (ProgUnit)n::={f1s1,...,fnsn} (LOSCode)O ::(na,ni,0) (LProg) P:=(A,O) (BitVal) b,ie∈{0,1} (ISRReg)isr ::[bo,...,bN-1] (CrtStk) cs ::nil ie::cs (ItrpStk)is ::nil k::is (ItrpTaskSt)::=(ie,is,cs) (ItpS0π={t161,·,tnw6n} Fig.2.The language for applications and kernel implementation Note that the correctness of OS kernels are independent of the implemen- tation language of A.Here we pick the C language for A to simplify the for- malization because the applications and the kernel are now implemented in the same language and we do not have to consider the interaction between different languages when defining the whole system (AO])behaviors. Low-Level Language for OS Kernels.The middle of Fig.2 shows the low-level language for the concrete implementation of OS kernels.Usually the kernels are implemented in C with inline assembly.However,giving semantics directly to C with inline assembly requires us to expose stacks and registers,which make the semantics overly complex.To avoid this problem,we extend the C statements with assembly primitives t to encapsulate the assembly code.Semantics of these primitives will be given below. switch x switches to the target task z.encrt enters a critical region by disabling interrupts.It also saves the old IF onto the stack to allow nested critical regions.Note we use ie to model the IF flag and abstract away other bits in the hardware EFLAGS register.excrt exits the current critical region by popping the stack to recover ie.Since we hide stacks in our state model,we use an abstract stack cs to save the historical ie bits (see Fig.2,which is explained below).eoi k clears the k-th bit in isr,indicating that the k-th interrupt is no longer in service.iext enables interrupts and returns to the interrupted program. The kernel implementation O consists of the system API implementation na, the internal functions ni and the interrupt handlers 0.The internal functions are called only by code in na or 0.6 is a sequence of N interrupt handlers,where N is the maximum number of interrupts we support.The handler with the lower identifier has the higher priority.Then a complete low-level program P is defined as a pair of the application code A and the kernel code O. Operational Semantics.The language is concurrent,with multiple continuations (i.e.,control stacks)in the state,each corresponding to a task.All tasks share
64 F. Xu et al. (AExpr) e ::= n | x | ∗ e | &e | e.id | e[e] | ... (AppStmts) d ::= e=e | f(¯e)| d; d | while (e) d | if (e) d else d | return e | ... (AppCode) A ::= {f1 d1,...,fn dn} (LPrim) ι ::= switch x | encrt | excrt | eoi k | iext | ... (LStmts) s ::= d |ι| s; s | while (e) s | ... (ItrpCode) θ ::= [s0,...,sN−1] (ProgUnit) η ::= {f1 s1,...,fn sn} (LOSCode) O ::= (ηa, ηi, θ) (LProg) P ::= (A, O) (BitVal) b, ie ∈ {0, 1} (ISRReg) isr ::= [b0,...,bN−1] (CrtStk) cs ::= nil | ie::cs (ItrpStk) is ::= nil | k ::is (ItrpTaskSt) δ ::= (ie, is, cs) (ItrpSt) π ::= {t1 δ1,...,tn δn} Fig. 2. The language for applications and kernel implementation Note that the correctness of OS kernels are independent of the implementation language of A. Here we pick the C language for A to simplify the formalization because the applications and the kernel are now implemented in the same language and we do not have to consider the interaction between different languages when defining the whole system (A[O]) behaviors. Low-Level Language for OS Kernels. The middle of Fig. 2 shows the low-level language for the concrete implementation of OS kernels. Usually the kernels are implemented in C with inline assembly. However, giving semantics directly to C with inline assembly requires us to expose stacks and registers, which make the semantics overly complex. To avoid this problem, we extend the C statements with assembly primitives ι to encapsulate the assembly code. Semantics of these primitives will be given below. switch x switches to the target task x. encrt enters a critical region by disabling interrupts. It also saves the old IF onto the stack to allow nested critical regions. Note we use ie to model the IF flag and abstract away other bits in the hardware EFLAGS register. excrt exits the current critical region by popping the stack to recover ie. Since we hide stacks in our state model, we use an abstract stack cs to save the historical ie bits (see Fig. 2, which is explained below). eoi k clears the k-th bit in isr, indicating that the k-th interrupt is no longer in service. iext enables interrupts and returns to the interrupted program. The kernel implementation O consists of the system API implementation ηa, the internal functions ηi and the interrupt handlers θ. The internal functions are called only by code in ηa or θ. θ is a sequence of N interrupt handlers, where N is the maximum number of interrupts we support. The handler with the lower identifier has the higher priority. Then a complete low-level program P is defined as a pair of the application code A and the kernel code O. Operational Semantics. The language is concurrent, with multiple continuations (i.e., control stacks) in the state, each corresponding to a task. All tasks share
A Practical Verification Framework for Preemptive OS Kernels 65 memory,but each has its own local variables and local interrupt states(see 6 in Fig.2,which is explained below).We also separate the program state (including memory and variables)into two disjoint parts,one for the application code A and the other for the kernel code O.The only way for A to access kernel states is to call system APIs in O,and O cannot access application states. We give small-step operational semantics to the language.For each step,the processor picks the continuation of the current task and executes its current command or expression.To model concurrency and interrupts,both commands and expressions could be executed in multiple steps,where each step corresponds to the granularity of a single machine instruction (as in CompCertTSO [22],but we use the sequential consistent model instead of the x86-TSO memory model). The assembly implementation of the context switch routine is abstracted into the primitive switch r.It switches the execution from the current task to the target task z,where x stores the task identifier. The other assembly primitives t are all related to interrupts management and handling.To model their semantics,we introduce interrupt states in the state model,as shown at the bottom of Fig.2.The global register isr is shared by all tasks.It models the isr register in the 8259 A interrupt controller,as explained in Sect.2.1.In addition,there are local interrupt states 6 for each task.It contains a local copy ie of the IF flag in the EFLAGS register (see Sect.2.1)recording whether interrupts are enabled,a stack cs consisting of the historical values of ie to support nested critical regions,and another stack is recording the sequence of interrupts that interrupt the execution of the task.The stack isis auxiliary data introduced mainly for verification purposes.records the 6 of each task. encrt enters a critical region by disabling interrupts (ie.,clearing the ie bit using cli).It also saves the old ie onto the cs stack.excrt exits the critical region by popping off the top value on cs and using it to restore ie (executing sti if the value is 1). Interrupt requests may arrive non-deterministically after each step if ie=1. A level-k request is served only if there is no request at higher or the same level being served (i.e.,Vk'.k'<k-isr()=0).Then the processor clears ie,sets isr(k)to 1,pushes the number k onto the logical stack is,saves the execution context and the local variables onto the abstract control stack (i.e., the continuation),and finally jumps to the interrupt handler 0(k). eoi k clears the k-th bit in isr,indicating that the k-th interrupt is no longer in service.iext is an abstraction of the iret instruction.It resets the ie bit to 1 to enable interrupts,pops out the topmost interrupt number on the is stack, and returns to the interrupted program. 3.2 The High-Level Specification Language Viewing from the aspect of application programmers,we model the OS kernel as an extended C language with multi-tasking and system calls.As explained above,the C language is used to implement user applications A,and the system calls invoke an abstract version of system routines in O,which are implemented using a simple specification language.Correspondingly,the low-level concrete
A Practical Verification Framework for Preemptive OS Kernels 65 memory, but each has its own local variables and local interrupt states (see δ in Fig. 2, which is explained below). We also separate the program state (including memory and variables) into two disjoint parts, one for the application code A and the other for the kernel code O. The only way for A to access kernel states is to call system APIs in O, and O cannot access application states. We give small-step operational semantics to the language. For each step, the processor picks the continuation of the current task and executes its current command or expression. To model concurrency and interrupts, both commands and expressions could be executed in multiple steps, where each step corresponds to the granularity of a single machine instruction (as in CompCertTSO [22], but we use the sequential consistent model instead of the x86-TSO memory model). The assembly implementation of the context switch routine is abstracted into the primitive switch x. It switches the execution from the current task to the target task x, where x stores the task identifier. The other assembly primitives ι are all related to interrupts management and handling. To model their semantics, we introduce interrupt states in the state model, as shown at the bottom of Fig. 2. The global register isr is shared by all tasks. It models the isr register in the 8259 A interrupt controller, as explained in Sect. 2.1. In addition, there are local interrupt states δ for each task. It contains a local copy ie of the IF flag in the EFLAGS register (see Sect. 2.1) recording whether interrupts are enabled, a stack cs consisting of the historical values of ie to support nested critical regions, and another stack is recording the sequence of interrupts that interrupt the execution of the task. The stack is is auxiliary data introduced mainly for verification purposes. π records the δ of each task. encrt enters a critical region by disabling interrupts (i.e., clearing the ie bit using cli). It also saves the old ie onto the cs stack. excrt exits the critical region by popping off the top value on cs and using it to restore ie (executing sti if the value is 1). Interrupt requests may arrive non-deterministically after each step if ie = 1. A level-k request is served only if there is no request at higher or the same level being served (i.e., ∀k .k ≤ k → isr(k ) = 0). Then the processor clears ie, sets isr(k) to 1, pushes the number k onto the logical stack is, saves the execution context and the local variables onto the abstract control stack (i.e., the continuation), and finally jumps to the interrupt handler θ(k). eoi k clears the k-th bit in isr, indicating that the k-th interrupt is no longer in service. iext is an abstraction of the iret instruction. It resets the ie bit to 1 to enable interrupts, pops out the topmost interrupt number on the is stack, and returns to the interrupted program. 3.2 The High-Level Specification Language Viewing from the aspect of application programmers, we model the OS kernel as an extended C language with multi-tasking and system calls. As explained above, the C language is used to implement user applications A, and the system calls invoke an abstract version of system routines in O, which are implemented using a simple specification language. Correspondingly, the low-level concrete