Wei Lu and Min-Yen Kan
AIRS 2005 (Jeju Island, Korea)
9/22
Syntax Analysis
function play() {
  s = window.location;
  window.location = "media/arbrav2.wav";
}
JavaScript source code
FUNCTION [FUNNAME:play PARAMS:]
  BLOCK
    STMT
      SETNAME
        BINDNAME::s
        GETPROP
          NAME::window
          STRING::location
    STMT
      SETPROP
        NAME::window
        STRING::location
        STRING::"media/arbrav2.wav"
    RETURN
Syntax (structure) features are extracted from the parse tree
SETNAME[BINDNAME→GETPROP]   (syntax feature, level = 2)
STMT[SETNAME→[BINDNAME→GETPROP]]   (syntax feature, level = 3)
Next, we investigate whether syntax features improve categorization performance.
We parse each JavaScript source file into a tree and extract syntax features from that tree.
In this example, the JavaScript function play() is parsed into the tree shown above, and the syntax (structure) features are extracted from that parse tree.
For example, one syntax feature is the level-2 subtree SETNAME[BINDNAME→GETPROP].
Another is the level-3 subtree STMT[SETNAME→[BINDNAME→GETPROP]].
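The subtree extraction described here can be sketched roughly as follows. This is only an illustration: the tree is built by hand (the slide does not name the parser), the helper names node, serializeChild, subtreeFeature and extractFeatures are hypothetical, and the bracket-and-arrow notation is inferred from the two examples on the slide, not from a stated definition.

```javascript
// Hand-built tree for the first statement of play(). Node values such as
// "::s" are dropped; only node types are kept (an assumption based on the
// two feature strings shown on the slide).
const node = (type, children = []) => ({ type, children });

const stmt = node("STMT", [
  node("SETNAME", [
    node("BINDNAME"),
    node("GETPROP", [node("NAME"), node("STRING")]),
  ]),
]);

// Serialize a non-root node, keeping `levels` levels starting at this node.
function serializeChild(n, levels) {
  if (levels <= 1 || n.children.length === 0) return n.type;
  const kids = n.children.map((c) => serializeChild(c, levels - 1)).join("→");
  return `${n.type}→[${kids}]`;
}

// Serialize the subtree rooted at `root`, keeping `levels` levels in total.
function subtreeFeature(root, levels) {
  const kids = root.children.map((c) => serializeChild(c, levels - 1)).join("→");
  return `${root.type}[${kids}]`;
}

// Collect one feature string per internal node, in pre-order.
function extractFeatures(root, levels) {
  const out = [];
  (function walk(n) {
    if (n.children.length > 0) out.push(subtreeFeature(n, levels));
    n.children.forEach(walk);
  })(root);
  return out;
}

const level2 = subtreeFeature(stmt.children[0], 2);
const level3 = subtreeFeature(stmt, 3);
const allLevel2 = extractFeatures(stmt, 2);
```

With this tree, level2 and level3 reproduce the two feature strings on the slide. Whether shallow subtrees (such as the one rooted at STMT when only two levels are kept) should also count as features is not stated in the talk; the sketch simply emits one string per internal node.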
These syntax features are then serialized as text tokens and passed to the classifier.
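One way to picture the serialization step is sketched below. The talk does not specify the classifier or its input format, so the "SYN_" prefix, the helper names toToken and countTokens, and the bag-of-tokens counting are all assumptions made for illustration. The third entry in the list is a deliberate repeat, added only to show counting.

```javascript
// Feature strings copied from the slide; the last one is repeated on purpose.
const features = [
  "SETNAME[BINDNAME→GETPROP]",
  "STMT[SETNAME→[BINDNAME→GETPROP]]",
  "SETNAME[BINDNAME→GETPROP]",
];

// Turn a feature into a single whitespace-free token. The hypothetical
// "SYN_" prefix keeps syntax tokens apart from ordinary word tokens.
const toToken = (feature) => "SYN_" + feature.replace(/\s+/g, "");

// Count how often each token occurs (a simple bag-of-tokens).
function countTokens(tokens) {
  const counts = new Map();
  for (const t of tokens) counts.set(t, (counts.get(t) || 0) + 1);
  return counts;
}

const tokens = features.map(toToken);
const tokenLine = tokens.join(" "); // one space-separated line per source file
const counts = countTokens(tokens);
```

Because each feature becomes one opaque token, a text classifier that already works on word tokens could consume tokenLine without any change to its input handling.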