Named Entities

Recognition, classification and use

Edited by Satoshi Sekine and Elisabete Ranchhod
New York University / University of Lisbon
Named entities provide critical information for many NLP applications. Named Entity recognition and classification (NERC) in text is recognized as one of the important sub-tasks of Information Extraction (IE). The seven papers in this volume cover a range of aspects of NERC research. Nadeau & Sekine provide an extensive survey of past NERC technologies, which should be a very useful resource for researchers new to the field. Smith & Osborne describe a machine learning model that aims to address the over-fitting problem. Mazur & Dale tackle a common problem involving NEs and conjunctions: because conjunctions often form part of NEs or appear close to them, this is an important practical problem. A further three papers describe analyses and implementations of NERC for different languages: Spanish (Galicia-Haro & Gelbukh), Bengali (Ekbal, Naskar & Bandyopadhyay), and Serbian (Vitas, Krstev & Maurel). Finally, Steinberger & Pouliquen report on a real Web application in which multilingual NERC technology is used to identify occurrences of people, locations and organizations in newspapers in different languages.

The contributions to this volume were previously published in Lingvisticae Investigationes 30:1 (2007).

[Benjamins Current Topics, 19]  2009.  v, 168 pp.
Publishing status: Available
Hardbound: Available
ISBN 9789027222497 | EUR 90.00 | USD 120.00

e-Book: Sold by e-book platforms
ISBN 9789027289223 | EUR 90.00 | USD 120.00

Table of Contents

Foreword
1–2
Articles
A survey of named entity recognition and classification
David Nadeau and Satoshi Sekine
3–28
Diversity in logarithmic opinion pools
Andrew D.M. Smith and Miles Osborne
29–50
Handling conjunctions in named entities
Pawel Mazur and Robert Dale
51–70
Complex named entities in Spanish texts: Structures and properties
Sofía N. Galicia-Haro and Alexander Gelbukh
71–96
Named Entity Recognition and transliteration in Bengali
Asif Ekbal, Sudip Kumar Naskar and Sivaji Bandyopadhyay
97–116
A note on the semantic and morphological properties of proper names in the Prolex project
Duško Vitas, Cvetana Krstev and Denis Maurel
117–136
Cross-lingual Named Entity Recognition
Ralf Steinberger and Bruno Pouliquen
137–164
Index
165–168

Subjects

Benjamins Subject classification

BIC Subject

CFG: Semantics, Pragmatics, Discourse Analysis

BISAC Subject

LAN009000: LANGUAGE ARTS & DISCIPLINES / Linguistics
U.S. Library of Congress Control Number:  2009017541