26 Oct 2004
CS 5244: DL Extended Services
Discriminant analysis for text genres
• Karlgren and Cutting (94)
  • Same text genre categories as Biber
  • Simple count and average metrics
  • Discriminant analysis (in SPSS)
  • 64% precision over four categories
Some count features
• Adverb
• Character
• Long word (> 6 chars)
• Preposition
• 2nd person pronoun
• “Therefore”
• 1st person pronoun
• “Me”
• “I”
• Sentence
Other features
• Words per sentence
• Characters per word
• Characters per sentence
• Type / Token Ratio
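
Several of the features listed above can be computed directly from raw text. The following is a minimal sketch, not the authors' implementation: the function name `genre_features`, the regex-based sentence splitter and tokenizer, and the two pronoun word lists are all assumptions made for illustration. Adverb and preposition counts are left out because they would need a part-of-speech tagger.

```python
import re

# Hypothetical pronoun lists, chosen only for illustration.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PERSON = {"you", "your", "yours"}


def genre_features(text):
    """Return a dict of simple count and average features for one text."""
    # Assumption: a sentence ends at a run of . ! or ? followed by
    # whitespace or the end of the text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # Assumption: a word is a run of letters, optionally with one apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    lowered = [w.lower() for w in words]

    n_sent = len(sentences)
    n_words = len(words)
    # Assumption: only characters inside words are counted.
    n_chars = sum(len(w) for w in words)

    def ratio(numerator, denominator):
        # Guard against empty input.
        return numerator / denominator if denominator else 0.0

    return {
        # Count features
        "character_count": n_chars,
        "long_word_count": sum(1 for w in words if len(w) > 6),
        "second_person_count": sum(1 for w in lowered if w in SECOND_PERSON),
        "first_person_count": sum(1 for w in lowered if w in FIRST_PERSON),
        "therefore_count": lowered.count("therefore"),
        "me_count": lowered.count("me"),
        "i_count": lowered.count("i"),
        "sentence_count": n_sent,
        # Average and ratio features
        "words_per_sentence": ratio(n_words, n_sent),
        "chars_per_word": ratio(n_chars, n_words),
        "chars_per_sentence": ratio(n_chars, n_sent),
        "type_token_ratio": ratio(len(set(lowered)), n_words),
    }


features = genre_features("I think, therefore I am. Tell me something interesting!")
```

Each text then becomes one feature vector. A set of such vectors with genre labels is the kind of input a discriminant analysis takes.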