    Sequence Models for Automatic Highlighting and Surface Information Extraction

    21st Annual BCS-IRSG Colloquium on IR

    Glasgow. 19th - 20th April 1999


    M-R. Amini, H. Zaragoza & P. Gallinari


    With the increase in textual information available electronically, we are witnessing a great diversification of the demands placed on Information Retrieval (IR) and Information Extraction (IE) systems.

    In this paper, we apply machine learning techniques for sequence analysis to the tasks of highlighting and labeling text with respect to an information extraction task. Specifically, we use dynamic probability models.

    Like IR systems, these models use little semantics, are fully trainable, and do not require any knowledge representation of the domain.

    Unlike IR approaches, they treat documents as dynamic sequences of words. Furthermore, additional word information is naturally included in the representation.

    The models are evaluated on a sub-task of the MUC6 Scenario Template corpus. When morpho-syntactic word information is introduced into the representation, an increase in performance is observed.
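
    To make the idea of labeling a document as a sequence of words concrete, the following is a minimal sketch only. It assumes a two-state hidden Markov model decoded with the Viterbi algorithm, which is one common kind of dynamic probability model; the abstract does not say which model the authors use. The state names, the probability tables, the part-of-speech tags and the `UNSEEN` floor are all hypothetical, hand-set values chosen for illustration, not parameters or results from the paper. Each token is a (word, tag) pair, so the tag stands in for the "additional word information" mentioned above.

```python
import math

STATES = ("relevant", "irrelevant")

# Hypothetical, hand-set parameters for illustration only (not from the paper).
START = {"relevant": 0.2, "irrelevant": 0.8}
TRANS = {
    "relevant": {"relevant": 0.7, "irrelevant": 0.3},
    "irrelevant": {"relevant": 0.2, "irrelevant": 0.8},
}
WORD_EMIT = {
    "relevant": {"appointed": 0.3, "president": 0.3, "ceo": 0.3},
    "irrelevant": {"the": 0.4, "weather": 0.3, "was": 0.2},
}
TAG_EMIT = {
    "relevant": {"NOUN": 0.5, "VERB": 0.4, "DET": 0.1},
    "irrelevant": {"NOUN": 0.3, "VERB": 0.3, "DET": 0.4},
}
UNSEEN = 1e-3  # assumed floor probability for unseen words or tags


def emit_logp(state, word, tag):
    """Log-probability of a (word, tag) pair, assuming the two are independent given the state."""
    pw = WORD_EMIT[state].get(word.lower(), UNSEEN)
    pt = TAG_EMIT[state].get(tag, UNSEEN)
    return math.log(pw) + math.log(pt)


def viterbi(tokens):
    """Return the most probable state label for each (word, tag) token."""
    if not tokens:
        return []
    word, tag = tokens[0]
    score = {s: math.log(START[s]) + emit_logp(s, word, tag) for s in STATES}
    backpointers = []
    for word, tag in tokens[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for reaching s at this position.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = score[prev] + math.log(TRANS[prev][s]) + emit_logp(s, word, tag)
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


sentence = [("The", "DET"), ("president", "NOUN"), ("appointed", "VERB")]
labels = viterbi(sentence)
# Words labeled "relevant" are the ones a highlighter would mark.
print([word for (word, _), label in zip(sentence, labels) if label == "relevant"])
```

    In a trainable system, the tables above would be estimated from labeled text rather than written by hand, and dropping `TAG_EMIT` from `emit_logp` gives a words-only variant to compare against.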

