• Idea: capture syntactic and semantic flow rather than token identity (for source code)
• Replace variable names with IDs correlated with the symbol table and data type
• Decompose each p into regions, then recurse on these:
  ◦ sequential statements
  ◦ conditionals
  ◦ looping blocks
• Calculate similarity from the root node downwards
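The name-replacement step might look roughly like the sketch below. It uses Python's standard `ast` module (Python 3.9+ for `ast.unparse`) as a stand-in for a real compiler front end. The class `NameNormalizer`, the helper `normalize`, and the ID scheme `<type>_<symbol-table index>` are illustrative assumptions, not something the slide specifies. The type comes only from annotations, with `any` as a fallback.

```python
import ast


class NameNormalizer(ast.NodeTransformer):
    """Rename variables to IDs built from declared type and symbol-table position."""

    def __init__(self):
        self.symbols = {}  # original name -> normalized ID (a toy symbol table)

    def _intern(self, name, type_name="any"):
        # The first sighting fixes the ID; later sightings reuse it.
        if name not in self.symbols:
            self.symbols[name] = f"{type_name}_{len(self.symbols)}"
        return self.symbols[name]

    def visit_AnnAssign(self, node):
        # An annotated assignment supplies the data type for the ID.
        if isinstance(node.target, ast.Name):
            self._intern(node.target.id, ast.unparse(node.annotation))
        return self.generic_visit(node)

    def visit_arg(self, node):
        # Function parameters enter the symbol table with their annotation, if any.
        type_name = ast.unparse(node.annotation) if node.annotation else "any"
        node.arg = self._intern(node.arg, type_name)
        return node

    def visit_Name(self, node):
        # Rename names that are assigned here or already known; names that are
        # only read and never defined (builtins, free variables) are left alone.
        if isinstance(node.ctx, ast.Store) or node.id in self.symbols:
            node.id = self._intern(node.id)
        return node


def normalize(source):
    """Return the source with variable names replaced by normalized IDs."""
    tree = NameNormalizer().visit(ast.parse(source))
    return ast.unparse(tree)


original = "total: int = 0\nfor x in items:\n    total = total + x\n"
renamed = "acc: int = 0\nfor e in items:\n    acc = acc + e\n"
print(normalize(original) == normalize(renamed))
```

Under this scheme, two fragments that differ only in variable naming normalize to the same text, while changing a declared type changes the ID and so remains visible to the comparison.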
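The decomposition and top-down comparison could be sketched as follows, again using Python's `ast` module as the parser. The slide does not say how regions are aligned or scored, so everything here is an assumption made for illustration: the `Region` type, pairing children in order, scoring sequential runs by their length ratio, and scoring mismatched region kinds as 0. Function and class definitions are treated as plain statements to keep the sketch short.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class Region:
    kind: str                 # "root", "seq", "cond", or "loop"
    size: int = 0             # number of statements (used by "seq" regions only)
    children: list = field(default_factory=list)


def decompose(stmts):
    """Split a statement list into sequential runs, conditionals, and loops."""
    regions, run = [], 0
    for stmt in stmts:
        if isinstance(stmt, (ast.If, ast.For, ast.While)):
            if run:  # close the sequential run that preceded this block
                regions.append(Region("seq", size=run))
                run = 0
            kind = "cond" if isinstance(stmt, ast.If) else "loop"
            # Recurse on the nested blocks (body and else branch).
            children = decompose(stmt.body) + decompose(stmt.orelse)
            regions.append(Region(kind, children=children))
        else:
            run += 1
    if run:
        regions.append(Region("seq", size=run))
    return regions


def region_tree(source):
    """Build the region tree for a piece of source code."""
    return Region("root", children=decompose(ast.parse(source).body))


def similarity(a, b):
    """Compare two region trees from the root downwards; result lies in [0, 1]."""
    if a.kind != b.kind:
        return 0.0
    if a.kind == "seq":
        return min(a.size, b.size) / max(a.size, b.size)
    if not a.children and not b.children:
        return 1.0
    # Pair children in order; unmatched children count as zero similarity.
    paired = sum(similarity(x, y) for x, y in zip(a.children, b.children))
    return paired / max(len(a.children), len(b.children))


p1 = "a = 0\nfor i in xs:\n    if i > 0:\n        a += i\nprint(a)\n"
p2 = "s = 0\nfor k in ys:\n    if k > 1:\n        s += k\nprint(s)\n"
print(similarity(region_tree(p1), region_tree(p2)))
```

Because the comparison looks only at region kinds and nesting, renaming variables or changing constants leaves the score unchanged. A more robust variant would likely replace the in-order pairing with a proper tree alignment.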