Experiments of An Analytical and Experimental Comparison of CSP Extensions and Tools


                                           

This webpage is used to give more details of the experiments related to paper An Analytical and Experimental Comparison of CSP Extensions and Tools, whose technical report is available here. The experiments are comparing three model checkers FDR, ProB, and the CSP module of PAT model checker. All the source code of the experiment (CSPM and CSP# models) in this paper is available here.

We evaluate the efficiency of FDR, ProB and PAT by verifying nine benchmark systems. All the experiments are run on a PC with Intel® CPU E6550@2.33GHz and 4GB RAM, on a 32-bit Linux OS for FDR (version 2.91) and ProB (version 1.3.5-beta1, build 10762), a 32-bit Windows OS for PAT (version 3.4.3, build 20102). They are shown as follows:

 

¡¤         Experiment on refinement checking

¡¤         Experiment on solving puzzles

¡¤         Experiment on shared variables

¡¤         Experiment on LTL model checking

¡¤         Experiment on battery monitor case study

This page is maintained by Ling Shi.

Please contact us if you have any question about this webpage or these experiments. We will reply your enquiries as soon as possible.

 


Experiment on refinement checking

The verified models in this experiment include readers/writers problem, dinning philosopher problem (DP), and Milner¡¯s cyclic scheduler (MCS) [1].

Explanations of models:

¡¤         Readers/writers

This problem deals with situations in which many threads must access the same shared memory at one time, some reading and some writing, with the natural constraint that no process may access the share for reading or writing while another process is in the act of writing to it. In particular, it is allowed for two readers to access the share at the same time. In this model, a controller is used to guarantee the correct coordination among multiple readers/writers. The verified property is trace refinement (P [T= S).

¡¤         Dinning philosopher

N philosophers sit around a table. Each philosopher can eat the meal if and only if he/she picks forks on his/her right and left sides, and both folks will be released after the philosopher finishes. This model is deadlock free and the property is failure refinement (P [F= S).

¡¤         Milner¡¯s cyclic scheduler

This scheduling algorithm is described by Milner in 1989. N processes are activated in a cyclic manner: process i activates process i + 1 for i < N - 1 and process N - 1 activates process 0. Moreover, a process may never be re-activated before it has terminated. The property is failures/divergence refinement (P [FD= S), namely, the implemented scheduler follows a cyclic way.

Experiment results:

The table below shows the verification result. Column State shows the number of visited states, and column Time(s) records running time for the verification in seconds. Value ¡°-¡± in a cell denotes that the experiment is aborted due to either memory overflow or execution time exceeding two hours. Value ¡°N.A.¡± means no models for the tool.

Model

N

Property

FDR

ProB

PAT

State

Time(s)

State

Time(s)

State

Time(s)

RW

6

P [T= S

8

0.024

61365

125.94

9

0.04

RW

200

P [T= S

202

1.434

-

-

203

0.11

RW

500

P [T= S

502

19.651

-

-

503

0.057

RW

1000

P [T= S

1002

156.162

-

-

1003

0.108

DP

6

P [F= S

1

0.006

14510

82.42

1762

0.174

DP

8

P [F= S

1

0.071

-

-

22362

2.995

DP

12

P [F= S

1

0.104

-

-

-

-

MCS

20

P [FD= S

40

0.043

-

-

60

0.114

MCS

50

P [FD= S

100

0.086

-

-

150

0.143

MCS

100

P [FD= S

200

0.246

-

-

300

0.53

 

Discussion:

For readers/writers (R/W) models, although FDR applies some dedicated compression techniques, PAT has better performance. For dining philosopher (DP) models, FDR performs extremely well because of the strategy discussed in [2]. However, other experiments show that this strategy may not be as efficient for other models. For MCS, PAT is comparable to FDR in terms of the number of states per second. FDR processes the LTS by applying its compression methods, whereas PAT applies a simple reduction method, i.e., using the keyword atomic to give higher priority to local events which are not synchronized, not updating any variable and not mentioned in the property.


Experiment on solving puzzles

The experiment compares the performance of three model checkers on solving puzzles, inspired by work in [3]. The CSPM and CSP# models for these puzzles make the best use of their modeling power: CSP# specifies the puzzles using shared variables, which are solved by PAT through reachability analysis, whereas CSPM models the puzzles using multi-way synchronization, which are solved by FDR and ProB through trace refinement. In addition, FDR simulates a bounded DFS algorithm by searching the divergence of a new system, in order to find a smaller counterexample. The new system P¡¯, like a watchdog, can only perform up to N events of the target implementation process P, and then performs an infinite number of events [3]. This approach can be used provided that the target process P is loop-free. Column FDR-Div records the results of states and time using this algorithm; value N.A. means there is no model with divergence checking to solve the puzzle.

Explanations of models:

¡¤         Peg solitaire game

Peg solitaire game is a board game for one player involving movement of pegs on a board with holes. The rule is to jump pegs by clicking and dragging an adjoining peg to the open hole on the other side. Each time a peg is jumped, it is automatically removed. You can only remove pegs by jumping them. The best solution is that this peg is in the center.

¡¤         Knight¡¯s tour

The Knight's tour is a mathematical problem involving a knight on a chessboard. The knight is placed on the empty board and, moving according to the rules of chess, must visit each square exactly once. A knight's tour is called a closed tour if the knight ends on a square attacking the square from which it began (so that it may tour the board again immediately with the same path). Otherwise the tour is open.

¡¤         The tower of Hanoi puzzle

This puzzle consists of three rods and a number of disks of different sizes which can slide onto any rod. The puzzle starts with the disks neatly stacked in order of size on one rod, the smallest at the top, thus making a conical shape. The puzzle is solved when disks are moved to the target rod where disks are stacked in order of size. Information of disk order on each rod is denoted by a process parameter whose value is a sequence in CSPM and by a shared array in CSP#.

Experiment results:

Model

N

FDR

FDR-Div

ProB

PAT

State

Time(s)

State

Time(s)

State

Time(s)

State

Time(s)

Solitaire

26

4048216

46.303

1

0.169

-

-

11950

5.356

Solitaire

29

28249254

387.737

1

0.217

-

-

104395

54.681

Solitaire

32

-

-

1

5.318

-

-

10955

5.301

Solitaire

35

-

-

1

377.297

-

-

443230

279.454

Knight

5

508450

3.522

1

0.037

-

-

4256

0.29

Knight

6

-

-

1

15.399

-

-

129269

9.143

Knight

7

-

-

1

94.713

-

-

77238

6.754

Hanoi

6

729

0.052

N.A.

N.A.

1667

57.84

5775

0.416

Hanoi

7

2187

0.086

N.A.

N.A.

4969

196.5

92680

6.837

Hanoi

8

6561

0.181

N.A.

N.A.

14853

660.59

150918

11.524

 

Discussion:

From the above table, we can observe that the divergence checking approach can be used in the solitaire and chess knight tour models. However, this approach cannot always significantly improve the performance, because it depends on the searching order. Moreover, it is costly to check if a system is loop-free or not, which is the premise for applying this approach. PAT solves the two puzzles in a reasonable time, and it is faster in the knight example than FDR and FDR-Div. For the Hanoi puzzle, FDR has a better performance because the compression techniques it uses can effectively reduce the state space.


Experiment on shared variables

The experiment explores the performance of FDR and PAT on verifying two models which involve shared variables, namely, concurrent stack and the Peterson algorithm.

Explanations of models:

¡¤         Concurrent stack

A concurrent stack allows multiple readers to access the shared stack at the same time, but only one writer to update the value; readers cannot access the shared variable in the latter case.

¡¤         Peterson¡¯s algorithm

This concurrent programming algorithm is designed for mutual exclusion that allows two processes to share a single-use resource without conflicts with only shared memory for communication. The algorithm can be generalized for more than two processes, as modeled in our experiment. The property is mutual exclusion, which is specified by refinement checking in CSPM model and reachability analysis in CSP# model. Note that we obtain the CSPM model for the Peterson algorithm from the shared variable analyzer (SVA) [4], a FDR front-end which can convert programs (like C programs) with shared variables into CSPM models. To be fair, the CSP# model is specified at the same level of granularity as the CSPM model

Experiment results:

In the following table, ConcurrentStack*2 in the Model column means that the stack size is 2.

Model

N

Property

FDR

PAT

State

Time(s)

State

Time(s)

Concurrent Stack*2

3

P [=T S

453465

3.833

10860

1.023

Concurrent Stack*2

4

P [=T S

-

-

189920

75.915

Concurrent Stack*2

5

P [=T S

-

-

693828

293.382

Peterson

3

mutual exclusion

1011

1.192

3257

0.105

Peterson

4

mutual exclusion

105493

20.067

104686

3.776

Peterson

5

mutual exclusion

14810779

387.645

5722863

294.005

 

Discussion:

The results of concurrent stack example show that PAT performs better than FDR for checking trace refinement (P[T=S), and this is because PAT uses DFS with anti-chain algorithm in the trace refinement. This algorithm is effective when the specification is non-deterministic. The results of the Peterson algorithm show that PAT performs better. This is because local events associated as atomic statements in CSP# reduce the states significantly, whereas CSPM model defines additional events to represent reading/writing operations of shared variables. Although these additional events can be hidden as internal events to apply existing compression techniques in FDR, the effect is minor because the type range of reading/writing channels and operations over different variables can easily lead to state space explosion


Experiment on LTL model checking

This experiment explores the performance on verifying LTL properties. We adopt the approach proposed by Lowe [5] to construct a CSPM process for the LTL formula and use FDR to perform the refusal refinement checking. As this approach cannot deal with operator eventually (), we ignore the checking of property ⃞⃟eat.0 in the dining philosopher example using FDR.

Experiment results:

 

Model

N

Property

Result

FDR

ProB

PAT

State

Time(s)

State

Time(s)

State

Time(s)

RW

6

!error

true

8

0.023

122722

104.8

15

0.059

RW

200

!error

true

202

1.455

-

-

403

0.086

RW

500

!error

true

502

19.901

-

-

1003

0.071

RW

1000

!error

true

1002

154.33

-

-

2003

0.148

DP

6

⃞⃟eat.0

false

N.A.

N.A.

2420

1.11

166

0.019

DP

8

⃞⃟eat.0

false

N.A.

N.A.

13312

1.75

256

0.024

DP

10

⃞⃟eat.0

false

N.A.

N.A.

-

-

460

0.049

 

Discussion:

The table indicates that PAT performs better than FDR and ProB. Notice that property ⃞⃟eat.0 can be verified to be true using PAT under the strong or global fairness assumption.


Experiment on battery monitor case study

We conduct a case study on translating CSPM model to CSP# through a real world problem, namely, the battery monitor component of the elevator control system described in [6]: the CSPM specification is translated from the battery monitor Simulink diagram, and the CSP# model is translated from the CSPM model. Then we analyze the Simulink diagram, using the Simulink simulator, to determine which output the battery monitor should produce for every given input. Thus, to assess if both models describe the same behaviour, we compose each one with parallel observers. We noticed that both CSPM and CSP# models provide the same output value for all relevant scenarios. Besides this analysis, we also checked basic properties like deadlock freeness and divergence freeness.

Explanations of models:

This model specifies a battery monitor module in power monitor control systems designed as a Simulink diagram. The Simulink system specification is a collection of 7 blocks. Each block is free to perform its exclusive events, and must synchronize with common events. In each period, when a sampleStep event is communicated, the block starts to read its inputs in any order. When all inputs are read, it outputs a value based on its function until another sampleStep event is communicated. When a sampleStep event does not happen, then it outputs  a value considering the last read inputs.

Experiment results:

The comparison is performed as a controlled experiment and we run each assertion 30 times. We apply common statistics testing methods (particularly, Shapiro Wilk and Mann-Whitney U) to experiment data.

Model

Result

FDR

PAT

State

Time(s)

State

Time(s)

Deadlock-free

true

2700

0.276

2700

0.748

Livelock-free

true

2700

0.286

2700

0.466

 

Discussion:

From the table, we can observe that the visited states in FDR and PAT are the same, and performance of these two tools is similar using the same verification algorithm. Note that the time for deadlock-freeness property in PAT consists of time for deadlock and nonterminating checking. We have not included ProB in this comparison as it is unable to recognize the CSPM syntax for this example.


References

[1]   R. Milner. Communication and concurrency. Prentice-Hall, Inc., 1989.

[2]    A. W. Roscoe, P. H. B. Gardiner, M. Goldsmith, J. R. Hulance, D. M. Jackson, and J. B. Scattergood. Hierarchical compression for model-checking csp or how to check 1020 dining philosophers for deadlock. In TACAS, pages 133¨C152, 1995.

[3]   H. Palikareva, J. Ouaknine, and A. W. Roscoe. Faster fdr counterexample generation using sat-solving. Electronic Communications of the EASST, 23, September 2009.

[4]   A. W. Roscoe. Understanding Concurrent Systems. Springer-Verlag New York, Inc., 2010.

[5]   G. Lowe. Specification of communicating processes: temporal logic versus refusals-based refinement. Form. Asp. Comput., 20(3):277¨C294, May 2008.

[6]   J. Jesus, A. Mota, A. Sampaio, and L. Grijo. Architectural verification of control systems using csp. In ICFEM, pages 323¨C339, 2011.