Experiments of An Analytical and Experimental
Comparison of CSP Extensions and Tools
This webpage
is used to give more details of the experiments related to paper An Analytical and Experimental Comparison of
CSP Extensions and Tools, whose technical
report is available here. The
experiments are comparing three model checkers FDR,
ProB,
and the CSP module of PAT model checker.
All the source code of the
experiment (CSPM and CSP# models) in this paper is available here.
We evaluate the efficiency of FDR, ProB and PAT by verifying nine
benchmark systems. All the experiments are run on a PC with Intel® CPU
E6550@2.33GHz and 4GB RAM, on a 32-bit Linux OS for FDR (version 2.91) and ProB
(version 1.3.5-beta1, build 10762), a 32-bit Windows OS for PAT (version 3.4.3,
build 20102). They are shown as follows:
¡¤
Experiment
on refinement checking
¡¤
Experiment
on solving puzzles
¡¤
Experiment on shared variables
¡¤
Experiment on LTL model checking
¡¤
Experiment on battery monitor
case study
This page is maintained by Ling Shi.
Please contact us if you have any
question about this webpage or these experiments. We will reply your enquiries
as soon as possible.
The verified models
in this experiment include readers/writers problem, dinning philosopher problem
(DP), and Milner¡¯s cyclic scheduler (MCS) [1].
Explanations of
models:
¡¤
Readers/writers
This
problem deals with situations in which many threads must access the same shared
memory at one time, some reading and some writing, with the natural constraint
that no process may access the share for reading or writing while another
process is in the act of writing to it. In particular, it is allowed for two
readers to access the share at the same time. In this model, a controller is
used to guarantee the correct coordination among multiple readers/writers. The
verified property is trace refinement (P [T= S).
¡¤
Dinning
philosopher
N philosophers sit around a table. Each philosopher
can eat the meal if and only if he/she picks forks on his/her right and left
sides, and both folks will be released after the philosopher finishes. This
model is deadlock free and the property is failure refinement (P [F= S).
¡¤
Milner¡¯s
cyclic scheduler
This scheduling algorithm is described by
Milner in 1989. N processes are
activated in a cyclic manner: process i
activates process i + 1 for i < N - 1 and process N - 1 activates process 0. Moreover, a
process may never be re-activated before it has terminated. The property is
failures/divergence refinement (P [FD= S), namely, the
implemented scheduler follows a cyclic way.
Experiment results:
The table below
shows the verification result. Column State
shows the number of visited states, and column Time(s) records running time for the verification in seconds. Value
¡°-¡± in a cell denotes that the experiment is aborted due to either memory overflow
or execution time exceeding two hours. Value ¡°N.A.¡± means no models for the
tool.
Model |
N |
Property |
FDR |
ProB |
PAT |
|||
State |
Time(s) |
State |
Time(s) |
State |
Time(s) |
|||
RW |
6 |
P [T= S |
8 |
0.024 |
61365 |
125.94 |
9 |
0.04 |
RW |
200 |
P [T= S |
202 |
1.434 |
- |
- |
203 |
0.11 |
RW |
500 |
P [T= S |
502 |
19.651 |
- |
- |
503 |
0.057 |
RW |
1000 |
P [T= S |
1002 |
156.162 |
- |
- |
1003 |
0.108 |
DP |
6 |
P [F= S |
1 |
0.006 |
14510 |
82.42 |
1762 |
0.174 |
DP |
8 |
P [F= S |
1 |
0.071 |
- |
- |
22362 |
2.995 |
DP |
12 |
P [F= S |
1 |
0.104 |
- |
- |
- |
- |
MCS |
20 |
P [FD= S |
40 |
0.043 |
- |
- |
60 |
0.114 |
MCS |
50 |
P [FD= S |
100 |
0.086 |
- |
- |
150 |
0.143 |
MCS |
100 |
P [FD= S |
200 |
0.246 |
- |
- |
300 |
0.53 |
Discussion:
For readers/writers (R/W) models, although FDR applies some dedicated
compression techniques, PAT has better performance. For dining philosopher (DP)
models, FDR performs extremely well because of the strategy discussed in [2]. However, other experiments show that this
strategy may not be as efficient for other models. For MCS, PAT is comparable
to FDR in terms of the number of states per second. FDR processes the LTS by
applying its compression methods, whereas PAT applies a simple reduction
method, i.e., using the keyword atomic
to give higher priority to local events which are not synchronized, not
updating any variable and not mentioned in the property.
The experiment compares the
performance of three model checkers on solving puzzles, inspired by work in [3].
The CSPM and CSP# models
for these puzzles make the best use of their modeling power: CSP# specifies the
puzzles using shared variables, which are solved by PAT through reachability analysis,
whereas CSPM models the
puzzles using multi-way synchronization, which are solved by FDR and ProB
through trace refinement. In addition, FDR simulates a bounded DFS algorithm by
searching the divergence of a new system, in order to find a smaller
counterexample. The new system P¡¯, like a watchdog, can only perform up to N
events of the target implementation process P,
and then performs an infinite number of events [3].
This approach can be used provided that the target process P is loop-free. Column FDR-Div
records the results of states and time using this algorithm; value N.A. means there is no model with
divergence checking to solve the puzzle.
Explanations of
models:
¡¤
Peg
solitaire game
Peg solitaire game is a board game for one player
involving movement of pegs on a board with holes. The rule is to jump pegs by
clicking and dragging an adjoining peg to the open hole on the other side. Each
time a peg is jumped, it is automatically removed. You can only remove pegs by
jumping them. The best solution is that this peg is in the center.
¡¤
Knight¡¯s
tour
The Knight's tour is a
mathematical problem involving a knight on a chessboard. The knight is placed
on the empty board and, moving according to the rules of chess, must visit each
square exactly once. A knight's tour is called a closed tour if the knight ends
on a square attacking the square from which it began (so that it may tour the
board again immediately with the same path). Otherwise the tour is open.
¡¤
The tower
of Hanoi puzzle
This puzzle consists of three rods and a
number of disks of different sizes
which can slide onto any rod. The puzzle starts with the disks neatly stacked in order of size on one
rod, the smallest at the top, thus making a
conical shape. The puzzle is solved when disks are moved to the target rod
where disks are stacked in order of
size. Information of disk order on each rod is denoted by a process parameter whose value is a sequence in CSPM
and by a shared array in CSP#.
Experiment
results:
Model |
N |
FDR |
FDR-Div |
ProB |
PAT |
||||
State |
Time(s) |
State |
Time(s) |
State |
Time(s) |
State |
Time(s) |
||
Solitaire |
26 |
4048216 |
46.303 |
1 |
0.169 |
- |
- |
11950 |
5.356 |
Solitaire |
29 |
28249254 |
387.737 |
1 |
0.217 |
- |
- |
104395 |
54.681 |
Solitaire |
32 |
- |
- |
1 |
5.318 |
- |
- |
10955 |
5.301 |
Solitaire |
35 |
- |
- |
1 |
377.297 |
- |
- |
443230 |
279.454 |
Knight |
5 |
508450 |
3.522 |
1 |
0.037 |
- |
- |
4256 |
0.29 |
Knight |
6 |
- |
- |
1 |
15.399 |
- |
- |
129269 |
9.143 |
Knight |
7 |
- |
- |
1 |
94.713 |
- |
- |
77238 |
6.754 |
Hanoi |
6 |
729 |
0.052 |
N.A. |
N.A. |
1667 |
57.84 |
5775 |
0.416 |
Hanoi |
7 |
2187 |
0.086 |
N.A. |
N.A. |
4969 |
196.5 |
92680 |
6.837 |
Hanoi |
8 |
6561 |
0.181 |
N.A. |
N.A. |
14853 |
660.59 |
150918 |
11.524 |
Discussion:
From the above table, we can observe that the divergence checking
approach can be used in the solitaire and chess knight tour models. However,
this approach cannot always significantly improve the performance, because it
depends on the searching order. Moreover, it is costly to check if a system is
loop-free or not, which is the premise for applying this approach. PAT solves
the two puzzles in a reasonable time, and it is faster in the knight example
than FDR and FDR-Div. For the Hanoi puzzle, FDR has a better performance
because the compression techniques it uses can effectively reduce the state
space.
The experiment
explores the performance of FDR and PAT on verifying two models which involve
shared variables, namely, concurrent stack and the Peterson algorithm.
Explanations of
models:
¡¤
Concurrent
stack
A concurrent
stack allows multiple readers to access the shared stack at the same time, but
only one writer to update the value; readers cannot access the shared variable
in the latter case.
¡¤
Peterson¡¯s
algorithm
This concurrent programming algorithm is
designed for mutual exclusion that allows two processes to share a single-use
resource without conflicts with only shared memory for communication. The
algorithm can be generalized for more than two processes, as modeled in our
experiment. The property is mutual exclusion, which is specified by refinement
checking in CSPM model and
reachability analysis in CSP# model. Note that we obtain the CSPM model for the Peterson
algorithm from the shared variable analyzer (SVA) [4],
a FDR front-end which can convert programs (like C programs) with shared
variables into CSPM
models. To be fair, the CSP# model
is specified at the same level of granularity as the CSPM model
Experiment
results:
In the following
table, ConcurrentStack*2
in the Model column means that the
stack size is 2.
Model |
N |
Property |
FDR |
PAT |
||
State |
Time(s) |
State |
Time(s) |
|||
Concurrent
Stack*2 |
3 |
P [=T S |
453465 |
3.833 |
10860 |
1.023 |
Concurrent
Stack*2 |
4 |
P [=T S |
- |
- |
189920 |
75.915 |
Concurrent
Stack*2 |
5 |
P [=T S |
- |
- |
693828 |
293.382 |
Peterson |
3 |
mutual exclusion |
1011 |
1.192 |
3257 |
0.105 |
Peterson |
4 |
mutual exclusion |
105493 |
20.067 |
104686 |
3.776 |
Peterson |
5 |
mutual exclusion |
14810779 |
387.645 |
5722863 |
294.005 |
Discussion:
The results of concurrent stack example show that PAT performs better
than FDR for checking trace refinement (P[T=S), and this is because PAT uses
DFS with anti-chain algorithm in the trace refinement. This algorithm is
effective when the specification is non-deterministic. The results of the
Peterson algorithm show that PAT performs better. This is because local events
associated as atomic statements in CSP# reduce the states significantly,
whereas CSPM model defines additional events to represent reading/writing
operations of shared variables. Although these additional events can be hidden
as internal events to apply existing compression techniques in FDR, the effect
is minor because the type range of reading/writing channels and operations over
different variables can easily lead to state space explosion
This experiment
explores the performance on verifying LTL properties. We adopt the approach
proposed by Lowe [5] to construct a CSPM process for the LTL formula
and use FDR to perform the refusal refinement checking. As this approach cannot
deal with operator eventually (⃟), we ignore the checking of property
⃞⃟eat.0 in the dining philosopher example using FDR.
Experiment
results:
Model |
N |
Property |
Result |
FDR |
ProB |
PAT |
|||
State |
Time(s) |
State |
Time(s) |
State |
Time(s) |
||||
RW |
6 |
⃞!error |
true |
8 |
0.023 |
122722 |
104.8 |
15 |
0.059 |
RW |
200 |
⃞!error |
true |
202 |
1.455 |
- |
- |
403 |
0.086 |
RW |
500 |
⃞!error |
true |
502 |
19.901 |
- |
- |
1003 |
0.071 |
RW |
1000 |
⃞!error |
true |
1002 |
154.33 |
- |
- |
2003 |
0.148 |
DP |
6 |
⃞⃟eat.0 |
false |
N.A. |
N.A. |
2420 |
1.11 |
166 |
0.019 |
DP |
8 |
⃞⃟eat.0 |
false |
N.A. |
N.A. |
13312 |
1.75 |
256 |
0.024 |
DP |
10 |
⃞⃟eat.0 |
false |
N.A. |
N.A. |
- |
- |
460 |
0.049 |
Discussion:
The table indicates that PAT performs better than FDR and ProB. Notice that property ⃞⃟eat.0 can be verified to be true using PAT under the strong or global fairness assumption.
We conduct a case
study on translating CSPM model to CSP# through a real world problem, namely,
the battery monitor component of the elevator control system described in [6]: the CSPM
specification is translated from the battery monitor Simulink diagram, and the
CSP# model is translated from the CSPM model. Then we analyze the Simulink
diagram, using the Simulink simulator, to determine which output the battery
monitor should produce for every given input. Thus, to assess if both models
describe the same behaviour, we compose each one with parallel observers. We
noticed that both CSPM and
CSP# models provide the same output value for all relevant scenarios. Besides
this analysis, we also checked basic properties like deadlock freeness and
divergence freeness.
Explanations of
models:
This model specifies
a battery monitor module in power monitor control systems designed as a
Simulink diagram. The Simulink system specification is a collection of 7
blocks. Each block is free to perform its exclusive events, and must
synchronize with common events. In each period, when a sampleStep event is communicated, the block starts to read its
inputs in any order. When all inputs are read, it outputs a value based on its
function until another sampleStep
event is communicated. When a sampleStep
event does not happen, then it outputs
a value considering the last read inputs.
Experiment
results:
The comparison is performed as a controlled experiment
and we run each assertion 30 times. We apply common statistics testing methods
(particularly, Shapiro Wilk and Mann-Whitney U) to experiment data.
Model |
Result |
FDR |
PAT |
||
State |
Time(s) |
State |
Time(s) |
||
Deadlock-free |
true |
2700 |
0.276 |
2700 |
0.748 |
Livelock-free |
true |
2700 |
0.286 |
2700 |
0.466 |
Discussion:
From the table, we can observe that the visited states in FDR and PAT
are the same, and performance of these two tools is similar using the same verification
algorithm. Note that the time for deadlock-freeness property in PAT consists of
time for deadlock and nonterminating checking. We have not included ProB in
this comparison as it is unable to recognize the CSPM syntax for this example.
[1]
R.
Milner. Communication and concurrency.
Prentice-Hall, Inc., 1989.
[2] A. W. Roscoe, P. H. B. Gardiner, M. Goldsmith, J. R. Hulance, D. M. Jackson, and J. B. Scattergood. Hierarchical compression for model-checking csp or how to check 1020 dining philosophers for deadlock. In TACAS, pages 133¨C152, 1995.
[3]
H.
Palikareva, J. Ouaknine, and A. W. Roscoe. Faster fdr counterexample generation
using sat-solving. Electronic
Communications of the EASST, 23, September 2009.
[4]
A.
W. Roscoe. Understanding Concurrent
Systems. Springer-Verlag New York, Inc., 2010.
[5]
G. Lowe. Specification of
communicating processes: temporal logic versus refusals-based refinement. Form. Asp. Comput., 20(3):277¨C294, May
2008.
[6]
J. Jesus, A. Mota, A. Sampaio, and L.
Grijo. Architectural verification of control systems using csp. In ICFEM, pages 323¨C339, 2011.