| First, users create a set
of training texts for a domain and mark positive examples of the relevant named
entities. The rest of the corpus is treated as a pool of negative examples. |
|
| The algorithm then goes through
a training stage using this corpus. |
|
| Tagging rules are induced
only for the left or right boundary of each named entity. For every positive
example, the algorithm performs several steps: |
|
| 1. build initial rule |
|
| 2. generalize rule |
|
| 3. keep the k best
generalizations of the initial rule |
|
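The three steps above can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that the text does not state: a rule is modelled as a set of hypothetical `(offset, word)` constraints around a boundary position, generalization simply drops constraints, and candidates are ranked by fewest wrong matches, then most correct matches, then fewest constraints. The names `build_initial_rule`, `generalize`, `score`, and `induce` are made up for this example.

```python
from itertools import combinations


def build_initial_rule(tokens, pos, window=2):
    """Step 1: most specific rule, one (offset, word) constraint per window token.

    A boundary at `pos` sits just before tokens[pos]; offsets are relative to it.
    """
    lo, hi = max(0, pos - window), min(len(tokens), pos + window)
    return frozenset((i - pos, tokens[i]) for i in range(lo, hi))


def matches(rule, tokens, pos):
    """True if every constraint of `rule` holds around position `pos`."""
    return all(
        0 <= pos + off < len(tokens) and tokens[pos + off] == word
        for off, word in rule
    )


def generalize(rule):
    """Step 2 (assumed strategy): relax the rule by dropping constraints."""
    items = sorted(rule)
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def score(rule, tokens, positives):
    """Return (wrong, correct): matches at unmarked vs. marked positions."""
    hits = [p for p in range(len(tokens) + 1) if matches(rule, tokens, p)]
    correct = sum(1 for p in hits if p in positives)
    return len(hits) - correct, correct


def induce(tokens, positives, pos, k=2, window=2):
    """Step 3: keep the k best generalizations of the initial rule."""
    initial = build_initial_rule(tokens, pos, window)

    def rank(rule):
        wrong, correct = score(rule, tokens, positives)
        # Fewer errors first, then wider coverage, then shorter rules;
        # the sorted constraint list makes ties deterministic.
        return (wrong, -correct, len(rule), sorted(rule))

    return sorted(generalize(initial), key=rank)[:k]


# Toy corpus: left boundaries of time expressions are marked at 4 and 11.
tokens = "the seminar starts at 3 pm and the lunch starts at 12 pm".split()
positives = {4, 11}
best_rules = induce(tokens, positives, pos=4, k=2)
print(best_rules)
```

Every position that is not in `positives` plays the role of a negative example here, mirroring the idea that the unmarked rest of the corpus is the pool of negatives. A real system would likely generalize over richer token features than the surface word, which this sketch does not attempt.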