1
Co-verification Experience Juncao Li System Verification Lab Computer Science, PSU 01/05/2010
2
juncao@cs.pdx.edu
Agenda
Overview
Sealevel PIO-24 Digital I/O Card
Intel 100M PCI Ethernet Adapter
System Verification Lab @CS, Portland State University
3
PIO-24 Digital I/O Card
Driver code and device manual links
–http://www.osronline.com/article.cfm?article=403
–https://www.sealevel.com/uploads/manuals/8018S.pdf
Driver size: 1724 lines of C code
Device model size: 1232 lines of C code
Issues found in the specification (device manual): 2
Bugs found in the driver: 4
Properties proved in the driver: 2
4
Intel 100M PCI Ethernet adapter
Driver code and device manual links
–http://msdn.microsoft.com/en-us/library/dd163298.aspx
The source code is available in WDK releases. E.g., the SLAM SD one is in “WDK\src_6001\kmdf\pcidrv”
–http://download.intel.com/design/network/manuals/8255X_Open SDM.pdf
Driver size: 14406 lines of C code
Device model size: 3518 lines of C code
Issues found in the specification (device manual): 6
Bugs found in the driver: 3
Properties proved in the driver: 5
5
Agenda
Overview
Sealevel PIO-24 Digital I/O Card
Intel 100M PCI Ethernet adapter
6
Specification Issues
Hardware/Software (HW/SW) interface specifications (manuals) usually have problems:
–Incompleteness: something that should be specified is missing
–Inconsistency: multiple places in the spec are not consistent with each other
Spec issues are found when
–We model hardware devices according to the specs
–We verify drivers with the hardware models
7
Specification Issues
Issue 1 (Incompleteness):
–Location: Page 11, Section: Interrupt Read.
–Problem: The default value of the register ISRQST1 is not specified. ISRQST1 indicates the interrupt pending status and “should” be 0 when the device powers on.
–This problem is related to the bug “ProperISR2”
Issue 2 (Inconsistency):
–Location: Page 11, Section: I/O Control Code.
–Problem: The CW1D1 bit of the Control Word register is never used, but the bit is not defined as “Not Used” in the section Register Description on page 10.
8
Verification Statistics
9
Bug1: InvalidRead
The driver should not read any invalid data
About the bug:
–The driver returns an invalid value to user applications without reading any hardware data register
10
Initialize the device model Test harness entry
11
Predicates, i.e. HW/SW states
HW state variable: no interrupt pending
HW state variable: interrupt enabled
SW state variables: the inconsistency between the two variables directly causes the bug
Enter the callback function: EvtIoDeviceControl
12
In the callback function Process the I/O control code
13
I/O control code Driver understands “else” as: interrupt is enabled in hardware
14
HW runs after this statement, where CurrentRequest and AwaitingInt become inconsistent
15
The HW Transaction Function In Hardware Simulate the environment or the DIO device behaviors
16
In RunInterrupt Fires an interrupt
17
Call the ISR
18
In the ISR In Software
19
If it is this driver’s interrupt
Because the driver routine EvtIoDeviceControl was interrupted, AwaitingInt is not “TRUE” yet
20
“data” is not read from the device
Schedule the DPC. This immediately violates the rule “CvIsrCallDpc” (see later slides)
21
ISR returns, EvtIoDeviceControl continues
22
EvtIoDeviceControl returns
23
DPC runs
24
CurrentRequest is not NULL, so prepare the data for the user application!
25
The I/O request is completed with STATUS_SUCCESS, but the data was never read from the hardware register!
26
Bug2: CvIsrCallDpc
Before the ISR schedules a DPC, the ISR should read the hardware volatile registers first
About the bug:
–The driver does not read the hardware data register but still requests the DPC
This bug is already illustrated in the error trace of the “InvalidRead” bug
–The actual error trace of the CvIsrCallDpc bug is different because we used another life cycle harness and this bug can happen at various places.
27
However …
InvalidRead and CvIsrCallDpc are different rules
–Switching the following two lines (in DioEvtDeviceControl) fixes the “InvalidRead” bug: devContext->CurrentRequest = Request; devContext->AwaitingInt = TRUE;
–To fix the “CvIsrCallDpc” bug, we need to move the WdfInterruptQueueDpcForIsr(...) call into the “if(devContext->AwaitingInt) {... }” block in the ISR.
–These fixes have been proved correct by SDV (CoVer).
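The interleaving behind both fixes can be sketched as a small C model. This is a minimal sketch under stated assumptions, not the real DIO driver: the fields `DataRead`, `DpcQueued`, and `CompletedInvalid` and the functions `Isr`, `Dpc`, and `InvalidReadOccurs` are hypothetical, and the interrupt is assumed to arrive exactly between the two assignments in DioEvtDeviceControl. Following the error trace, the model takes "CurrentRequest first, AwaitingInt second" as the buggy order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified driver state; only the fields named in the
   slides (CurrentRequest, AwaitingInt) come from the original driver. */
typedef struct {
    void *CurrentRequest;   /* pending I/O request, NULL when none        */
    bool  AwaitingInt;      /* driver expects an interrupt                */
    bool  DataRead;         /* model: ISR read the hardware data register */
    bool  DpcQueued;        /* model: a DPC has been scheduled            */
    bool  CompletedInvalid; /* model: request completed without a read    */
} DevContext;

/* ISR model. With dpcInsideIf == false the DPC is queued unconditionally
   (the CvIsrCallDpc violation); with true it is queued only after a read. */
static void Isr(DevContext *c, bool dpcInsideIf) {
    assert(c != NULL);
    if (c->AwaitingInt) {
        c->DataRead = true;          /* read the volatile data register */
        c->AwaitingInt = false;
        if (dpcInsideIf) c->DpcQueued = true;
    }
    if (!dpcInsideIf) c->DpcQueued = true;
}

/* DPC model: completes the request whenever CurrentRequest is not NULL. */
static void Dpc(DevContext *c) {
    if (c->CurrentRequest != NULL) {
        if (!c->DataRead) c->CompletedInvalid = true;
        c->CurrentRequest = NULL;
    }
}

/* Runs the two assignments with an interrupt in between and reports
   whether the request was completed without any hardware read. */
bool InvalidReadOccurs(bool awaitingIntFirst, bool dpcInsideIf) {
    DevContext c = {0};
    int request = 0;                 /* stand-in for a real request object */

    if (awaitingIntFirst) c.AwaitingInt = true;
    else                  c.CurrentRequest = &request;

    Isr(&c, dpcInsideIf);            /* interrupt preempts the callback here */

    if (awaitingIntFirst) c.CurrentRequest = &request;
    else                  c.AwaitingInt = true;

    if (c.DpcQueued) Dpc(&c);
    return c.CompletedInvalid;
}
```

Under these assumptions, `InvalidReadOccurs(false, false)` reproduces the trace: the request completes although no data was read. Swapping the assignment order, or moving the DPC request inside the `if`, makes the function return false for this particular interleaving.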
28
Bug3: ProperISR1
If the ISR returns TRUE (i.e. acknowledges the interrupt to the OS), the device interrupt status should not be active; otherwise, it may cause an interrupt storm.
There are two scenarios that can cause this bug:
–When the interrupt firing condition is configured as “high level” (resp. “low level”), the interrupt will be repeatedly fired if the input to Port A (least significant bit) stays high (resp. low)
–When the condition is “rising edge” (resp. “falling edge”), depending on the input frequency to Port A, the interrupt will be repeatedly fired
Both scenarios can cause an interrupt storm
29
In ISR HW state: interrupt fired
30
Read the interrupt status and clear the register at the same time
31
In hardware, interface event function Clear the interrupt status register on read
32
HW transaction function runs after the interrupt status register has been cleared
33
Low level fires the interrupt
34
The SLIC rule
35
About This Bug
We learned a solution from the Ethernet adapter driver:
–If the device may fire interrupts frequently
–Disable the interrupt first in the ISR and enable it again later in the DPC, after the current request has been serviced
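The effect of this pattern can be illustrated with a toy model of a level-triggered device. It is a sketch only: the `Device` struct, its fields, and the functions below are hypothetical and do not correspond to real PIO-24 registers or WDF calls. The model assumes the input level stays high and that the DPC does not get to run until several hardware steps have passed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: an interrupt is pending whenever interrupts
   are enabled and the input level is high. */
typedef struct {
    bool IntEnabled;
    bool InputHigh;
    bool DpcQueued;
} Device;

static bool InterruptPending(const Device *d) {
    return d->IntEnabled && d->InputHigh;
}

/* ISR model: optionally masks the device interrupt before queuing the DPC. */
static void Isr(Device *d, bool maskInIsr) {
    if (maskInIsr) d->IntEnabled = false;
    d->DpcQueued = true;
}

/* DPC model: services the request, then re-enables the interrupt. */
static void Dpc(Device *d) {
    d->DpcQueued = false;
    d->IntEnabled = true;
}

/* Counts how often the ISR runs during `steps` hardware steps that all
   happen before the DPC is dispatched. */
int CountInterruptsBeforeDpc(bool maskInIsr, int steps) {
    Device d = { .IntEnabled = true, .InputHigh = true, .DpcQueued = false };
    int fired = 0;
    for (int i = 0; i < steps; i++) {
        if (InterruptPending(&d)) {
            Isr(&d, maskInIsr);
            fired++;
        }
    }
    Dpc(&d);
    assert(d.IntEnabled);   /* interrupts are enabled again after the DPC */
    return fired;
}
```

In this model the unmasked ISR runs once per step while the level stays high, whereas the masked variant runs once and stays quiet until the DPC re-enables the interrupt.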
36
Bug4: ProperISR2
The ISR should not return TRUE if the interrupt of the device is not active, i.e. it is not this driver's interrupt.
About the bug:
–The ISR acknowledges an interrupt even when the interrupt of the device is disabled
–Related to spec issue 1: ISRQST1 doesn’t have a default value
–A scenario for this bug:
ISRQST1 is initialized to 1 during the device power-on
The OS registers the DIO driver’s ISR in the PCI bus’ interrupt vector table
Another PCI device fires an interrupt
The DIO driver may have an “opportunity” to acknowledge this interrupt
37
Bug4: ProperISR2
The causes of this bug:
–The default value of ISRQST1 is not clearly stated in the spec
–The driver doesn’t reset the device during the device power-on
–The ISR only checks the ISRQST1 register to decide if there is an interrupt pending (this bug can be avoided if the interrupt-enabled register is also checked)
I have to admit, the chance of this bug occurring seems quite low
–However, when it happens, no one will notice!
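The suggested check can be sketched as follows. This is an illustration under assumptions, not the driver's code: the `DioRegs` struct, the `IntEnable` field, and the use of bit 0 in both registers are hypothetical; only the register name ISRQST1 comes from the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register snapshot. Bit 0 is assumed to carry the flag. */
typedef struct {
    unsigned char IntEnable; /* assumed interrupt-enable register       */
    unsigned char ISRQST1;   /* pending status; power-on value unknown  */
} DioRegs;

/* Decision made by an ISR that looks at the pending status only. */
bool IsrClaimsStatusOnly(const DioRegs *r) {
    assert(r != NULL);
    return (r->ISRQST1 & 1u) != 0;
}

/* Decision made by an ISR that also requires the interrupt to be enabled. */
bool IsrClaimsStatusAndEnable(const DioRegs *r) {
    assert(r != NULL);
    return (r->IntEnable & 1u) != 0 && (r->ISRQST1 & 1u) != 0;
}
```

With a snapshot such as `{ .IntEnable = 0, .ISRQST1 = 1 }`, which models ISRQST1 powering on as 1 while the interrupt is still disabled, the first function claims an interrupt that belongs to another device and the second does not.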
38
Agenda
Overview
Sealevel PIO-24 Digital I/O Card
Intel 100M PCI Ethernet adapter
39
Spec Issues – Some Examples
Issue 1 (Inconsistency):
–Location: Page 38 - 39
–Problem: Table 15 is inconsistent with Table 14 about the types of CU commands.
Issue 2 (Inconsistency):
–Location: Page 136, Section: 8.2 Transmit Processing
–Problem: The word “previous” is missing from the text in this section; without it, the logic is wrong
40
Verification Statistics
41
Bug1: DoubleCUC
The driver should not issue a CU command while the Command Unit (CU) is busy (not zero)
–This is clearly stated in the device manual (specification)
–Spec Page 37, Section: 6.3.2.2 SCB Command Word
About the bug:
–The driver issues a command to the CU regardless of the result of the previous device operation (software reset)
42
In EvtDeviceD0Entry callback function HW states are non-deterministically initialized
43
Issue a software reset
44
Write to the “PORT” register to issue the command Inside the function HwSoftwareReset
45
In hardware model Start the software reset process
46
In software: wait for the port reset to complete.
Note: the device manual does not say how long a port reset takes to complete, so a wait alone is NOT enough!
47
Issue a command while the reset process has not finished
Return from the software reset.
48
Issue a command without waiting for the CU to become free
49
The SLIC rule
50
About this bug
I was not sure at first
So I checked the Linux driver of the same device
–http://sourceforge.net/projects/e1000/
–The driver is also found in (for example): linux-2.6.32/drivers/net/e100.c
51
Let’s see how the Linux driver handles “software reset”. It makes sure that the device is in correct state before issuing any command.
52
The Linux driver always waits before issuing a new command
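The "wait before issuing" idea can be sketched with a bounded poll. The code below is a hypothetical model, not the Linux or Windows driver: `NicModel`, `ReadScbCommand`, `WaitScbClear`, and `IssueCuCommand` are invented names, and the device is assumed to clear its command byte after a number of polls. It follows the rule above that a nonzero command byte means the CU is still busy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: ScbCommand stays nonzero until the device has
   accepted the previous command, which takes BusyPolls more reads. */
typedef struct {
    unsigned char ScbCommand;
    int BusyPolls;
} NicModel;

/* Models one register read; the device clears the byte when its work is done. */
static unsigned char ReadScbCommand(NicModel *n) {
    if (n->BusyPolls > 0 && --n->BusyPolls == 0) {
        n->ScbCommand = 0;
    }
    return n->ScbCommand;
}

/* Polls until the command byte reads zero or the poll budget is exhausted. */
static bool WaitScbClear(NicModel *n, int maxPolls) {
    for (int i = 0; i < maxPolls; i++) {
        if (ReadScbCommand(n) == 0) return true;
    }
    return false;
}

/* Issues a CU command only after the CU has become free; reports failure
   instead of overwriting a command that is still in progress. */
bool IssueCuCommand(NicModel *n, unsigned char cmd, int maxPolls) {
    assert(cmd != 0);   /* zero is reserved in this model for "CU free" */
    if (!WaitScbClear(n, maxPolls)) return false;
    n->ScbCommand = cmd;
    return true;
}
```

The bounded poll is a design choice of this sketch: it turns an unresponsive device into an error the caller has to handle, rather than a fixed delay followed by an unconditional write.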
53
Bug2: DevD0Entry
The callback function EvtDeviceD0Entry should return the value (TRUE or FALSE) that correctly represents the hardware state (initialized or failed)
–This rule is clearly stated in MSDN
About the bug:
–The driver returns STATUS_SUCCESS even if the operations on the device have failed
–The error trace illustrates that the driver continues its attempts to initialize the device even after the previous operations have failed
54
In EvtDeviceD0Entry callback function
55
In NICInitializeAdapter
56
The command times out. Failure starts here
57
Return the failure status
58
Do some work and return the failure status that is returned from NICInitializeAdapter()
59
The return value of MPSetPowerD0Private() is ignored. The initialization process goes on as if nothing happened
60
EvtDeviceD0Entry returns STATUS_SUCCESS while the device state is a mess …
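The difference between ignoring and propagating the failure status in the trace above can be sketched in a few lines. This is a simplified stand-in, not the pcidrv source: the `Status` type, the `ST_OK` and `ST_DEVICE_FAILED` codes, and all function names are hypothetical and only mirror the call chain described in the trace (D0 entry, then power-up, then adapter initialization).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status codes standing in for the real NTSTATUS values. */
typedef int Status;
#define ST_OK             0
#define ST_DEVICE_FAILED (-1)

/* Models adapter initialization: fails when a command times out. */
static Status InitializeAdapterModel(bool commandTimesOut) {
    return commandTimesOut ? ST_DEVICE_FAILED : ST_OK;
}

/* Models the power-up helper: passes the initialization status upward. */
static Status SetPowerD0Model(bool commandTimesOut) {
    Status s = InitializeAdapterModel(commandTimesOut);
    assert(s == ST_OK || s == ST_DEVICE_FAILED);
    return s;
}

/* Buggy shape: the helper's return value is dropped. */
Status D0EntryIgnoringStatus(bool commandTimesOut) {
    (void)SetPowerD0Model(commandTimesOut);
    return ST_OK;
}

/* Fixed shape: stop and report as soon as a step fails. */
Status D0EntryPropagatingStatus(bool commandTimesOut) {
    Status s = SetPowerD0Model(commandTimesOut);
    if (s != ST_OK) return s;
    /* further initialization steps would go here */
    return ST_OK;
}
```

When the modeled command times out, the first variant still reports success while the second returns the failure code, which is the behavior the rule asks for.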
61
Once again, how about the Linux driver?
62
Corresponding to EvtDeviceD0Entry
64
Bug3: DevD0Exit
The callback EvtDeviceD0Exit should return the value (TRUE or FALSE) that correctly represents the hardware state (i.e. whether the hardware has been properly stopped).
–This rule is clearly stated in MSDN
About the bug:
–The driver returns STATUS_SUCCESS even if the operations on the device have failed.
65
Reference
Juncao Li, Fei Xie, Thomas Ball, Vladimir Levin, and Con McGarvey. An Automata-Theoretic Approach to Hardware/Software Co-verification. To appear in Proc. of International Conference on Fundamental Approaches to Software Engineering (FASE)
–Link: http://web.cecs.pdx.edu/~juncao/links/mypapers/cover2010.pdf
66
Questions? juncao@cs.pdx.edu