Spatial and Temporal Digital Libraries

[Table: "After Generalization". Columns: Word index; Condition (Word, Lemma, LexCat, Case, SemCat); Associated information; Action (Tag). Surviving cell values: Digit, Timeid, <stime>.]


	First, users create a set of training texts for a domain and mark positive examples of the relevant named entities. The rest of the corpus is treated as a pool of negative examples.
	The algorithm then goes through a training stage on this corpus.
	Each tagging rule is induced for only one boundary of a named entity, either the left or the right. For every positive example, the algorithm performs the following steps:
	1. Build an initial rule.
	2. Generalize the rule.
	3. Keep the k best generalizations of the initial rule.
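	The three steps above can be illustrated with a minimal sketch. It is not the original implementation: the token features (word, lexcat, semcat), the window size, the scoring by error rate and coverage, and all function names are assumptions chosen for illustration. The sketch handles a single boundary (the point where an opening tag would be inserted) and treats every unmarked position in the corpus as a negative example.

```python
from itertools import product

FEATURES = ("word", "lexcat", "semcat")  # assumed feature set

def tok(word, lexcat, semcat=None):
    """Hypothetical token representation: a dict of features."""
    return {"word": word, "lexcat": lexcat, "semcat": semcat}

def window(tokens, pos, size):
    """Tokens around insertion point pos (the tag goes before tokens[pos])."""
    return [tokens[i] if 0 <= i < len(tokens) else None
            for i in range(pos - size, pos + size)]

def build_initial_rule(tokens, pos, size=2):
    """Step 1: constrain every window position on its exact word."""
    return tuple(("word", t["word"]) if t else None
                 for t in window(tokens, pos, size))

def generalizations(tokens, pos, size=2):
    """Step 2: relax each condition to another feature or a wildcard (None)."""
    options = []
    for t in window(tokens, pos, size):
        feats = [(f, t[f]) for f in FEATURES if t and t[f]]
        options.append([None] + feats)
    return set(product(*options))

def matches(rule, tokens, pos, size=2):
    """True if every non-wildcard condition holds around pos."""
    for cond, t in zip(rule, window(tokens, pos, size)):
        if cond is None:
            continue
        if t is None or t[cond[0]] != cond[1]:
            return False
    return True

def score(rule, tokens, positives, size=2):
    """Return (error_rate, -correct); smaller is better."""
    hits = [p for p in range(len(tokens) + 1)
            if matches(rule, tokens, p, size)]
    if not hits:
        return (1.0, 0)
    correct = sum(1 for p in hits if p in positives)
    return (1 - correct / len(hits), -correct)

def induce(tokens, positives, k=3, size=2):
    """Step 3: for each positive example keep the k best generalizations."""
    best = {}
    for pos in sorted(positives):
        def key(rule):
            specificity = sum(c is not None for c in rule)
            # repr(rule) only makes tie-breaking deterministic.
            return score(rule, tokens, positives, size) + (specificity, repr(rule))
        best[pos] = sorted(generalizations(tokens, pos, size), key=key)[:k]
    return best

# Toy corpus: "seminar at 4 pm and lunch at 12 pm with 4 guests"
tokens = [
    tok("seminar", "noun"), tok("at", "prep"), tok("4", "digit"),
    tok("pm", "noun", "timeid"), tok("and", "conj"), tok("lunch", "noun"),
    tok("at", "prep"), tok("12", "digit"), tok("pm", "noun", "timeid"),
    tok("with", "prep"), tok("4", "digit"), tok("guests", "noun"),
]
positives = {2, 7}  # a start-of-time tag belongs before "4 pm" and "12 pm"
best = induce(tokens, positives, k=3)
for rule in best[2]:
    print(rule, score(rule, tokens, positives))
```

	In this toy corpus the second "4" (before "guests") acts as a negative example, so a rule that only requires a digit at the boundary is penalized, while rules that also use the surrounding context cover both positive examples without error. Enumerating every combination of relaxations is exponential in the window size; a practical learner would search this space more selectively.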