This sub-project deals with
the problem of integrating XDuce into ML.
o
A
source-to-source translation
o
Type-safe
embedding of XDuce into ML
o
Semantics
of the XDuce program is preserved by the translation
o
An
easy extension of XDuce with ML function call and ML pattern matching.
The XDuce to ML embedding prototype programs are
written in Haskell via manual translation. The reason of using Haskell is that
we find that laziness may help in improving runtime performance.
o
The
addressbook program:
o
All programs
are written based on the XDuce example "addrbook". Source codes are
downloadable here.
o
We are
measuring the transformation time. The results show that we are pretty close to
the CDuce implementation.
o
The Benchmarks
are performed on a Linux workstation (with 2.4Ghz P-IV and 1GB Ram).
o
Time is
measured in milliseconds. To exclude the parsing time, we need to set two measuring
points for each program.
o
XDuce's
transformation time is obtained by calculating the difference between two
programs. The first program parses the input and transforms it. The second one
just parses the input.
o
CDuce's and
our prototype's transformation times are obtained by measuring the difference
between the CPU time at the two program locations. The first location is placed
at the point right after the parsing is over. The second location is placed at
the point right after the transformation is over.
o
The benchmark
for CDuce is obtained via the compilation with Ocaml interface support, as
suggested by the authors of CDuce. The benchmarks of CDuce and our prototype
are subject to runtime parameters, i.e. OCAMLRUNPARAM for ocaml and +RTS for
Haskell. For simplicity, we use DEFAULT runtime settings.
o
To enforce
strictness in Haskell code, we employ the DeepSeq
module.
|
|
XDuce to ML embedding |
||
|
Tiny Size Input |
|
|
|
|
100 entries |
60 |
1.8 |
1.0 |
|
500 entries |
2560 |
2.8 |
2.0 |
|
900 entries |
9120 |
4.0 |
4.0 |
|
Medium Size Input |
|
|
|
|
1k entries |
11770 |
4.2 |
2.4 |
|
2k entries |
51510 |
6.8 |
5.4 |
|
3k entries |
* |
10.4 |
7.4 |
|
4k entries |
* |
13.0 |
9.7 |
|
5k entries |
* |
16.4 |
12.7 |
|
6k entries |
* |
20.9 |
14.9 |
|
7k entries |
* |
23.3 |
18.3 |
|
8k entries |
* |
28.1 |
21.0 |
|
9k entries |
* |
32.7 |
23.5 |
|
Huge Size Input |
|
|
|
|
10k entries |
* |
35.1 |
25.5 |
|
20k entries |
* |
70.4 |
54.7 |
|
30k entries |
* |
106.6 |
90.7 |
|
40k entries |
* |
135.9 |
133.4 |
|
50k entries |
* |
170.6 |
173.2 |
|
60k entries |
* |
202.1 |
217.7 |
|
70k entries |
* |
270.5 |
281.7 |
|
80k entries |
* |
301.9 |
332.9 |
|
90k entries |
* |
351.6 |
380.8 |
* denotes no respond after 60 seconds.
o
The token
example:
o
The objective
of this benchmark is to test whether laziness can employed as an optimization
option.
o
We consider a program
as follows
regtype T =
Token[]
regtype D =
Delim[]
token ::
(T,(D,T)*) -> Bool
token (x as T) =
False
token _ =
True
o
The problem
lies in the application of (token e) where e is evaluated to some value of type
((T,D)*,T). Though we know that ((T,D)*,T) = (T,(D,T)*), the value is in
structured representation, and it will be shuffled (restructured) to match with
the function input type. It seems unnecessary because we just want to check
whether there is more than one token in the sequence.
o
In this
benchmark, we employ laziness in the function application, and we test the
program with inputs ranging between 10k and 100k entries under the same
environment as the above benchmark.
o
The results
capture the responding time of the application in milliseconds. It shows that
the processing time is constant. It might be more meaningful if we run this
benchmark on a slower machine.
|
|
XDuce to ML embedding (benchmark2.tar.gz) |
|
10k ~ 100k entries |
0.0 ~ 0.04 |
o
Upcoming benchmarks
o
See Optimizations
section below.
Here are some optimization
options we can pursue. We plan to apply them in for future experiments.
o
Optimization
scheme #1: We can save some coercion, by some additional check. After inference
takes place, if we find that pattern structure type (R_gamma_i) is a subtype of
pattern type (R_I), we combine the downcast testing and the upcast value
distribution. This is common in
most of the practical cases. For instance,
f :: (A|B)* -> ...
f ((x as A*), (y as B*)) = ...
...
o
Note that the
type of values which will match first pattern is (A*,B*) (R_1)
which is in same shape as the
pattern structural type (A*,B*)
(R_gamma_1). In this case, we combine d_1 :: [[(A|B)*]] -> Maybe [[R_1]] and u_1 :: [[R_1]] -> [[R_gamma_1]] into du_1 :: [[(A|B)*]]
-> Maybe [[R_gamma_1]] . We need an additional check at compile time whether
R_gamma_1 is a subtype of the function input type.
o
Optimization
scheme #2: We can compute a most general type based on the use of the pattern
variable. For instance,
f :: (A|B)* -> ...
f (x as A*) = g x
...
g :: (A|B)* -> ...
o
Since the only
use of variable x in the body is to apply it to g whose argument type is
(A|B)*, we consider just perform a pattern match checking, and we need not
perform any value distribution. This saves two up-casting coercions as compared
to the original translation scheme.
In this case, we can translate f as follows,
f :: [Or A_U B_U] -> ...
f v = case (d1 v) of
Just x ->
g v
Nothing
-> ...
g :: [Or A_U B_U] -> ...
d1 :: [Or A_U B_U] -> Maybe [A_U]
o
Optimization
scheme #3: Another possible direction we can pursue is that when the patterns
share a common prefix/suffix, we can merge the prefix/suffix. For example,
f :: (A|B)* -> ...
f v = case v of
(x as (A|C)*, y as
A) -> ...
(x as (A|D)*, y as
B) -> ...
...
o
From
the results of pattern inference, we have,
f v = case v of
(x as A*,y as A)
-> ...
(x as A*,y as B)
-> ...
...
o
Now
we can merge the patterns since they share a common prefix after inference taking
place.
f v = case v of
(x as A*,v' as
(A|B)) -> case v' of
y as A -> ...
y as B -> ...
...
o
We
get rid of one redundant checking for prefix A* out of value v.
o
Other
optimization possibilities: fusion and deforestation, changing the underlying
representation for sequence, etc.