Named Entities
Recognition, classification and use
New York University / University of Lisbon
Named entities provide critical information for many NLP applications. Named entity recognition and classification (NERC) in text is recognized as one of the important sub-tasks of Information Extraction (IE). The seven papers in this volume cover a range of aspects of NERC research. Nadeau & Sekine provide an extensive survey of past NERC technologies, which should be a useful resource for researchers new to the field. Smith & Osborne describe a machine learning model designed to address the over-fitting problem. Mazur & Dale tackle the common problem of conjunctions in NEs; because conjunctions are often part of NEs or appear close to them, this is an important practical problem. A further three papers describe analyses and implementations of NERC for different languages: Spanish (Galicia-Haro & Gelbukh), Bengali (Ekbal, Naskar & Bandyopadhyay), and Serbian (Vitas, Krstev & Maurel). Finally, Steinberger & Pouliquen report on a real web application in which multilingual NERC technology is used to identify occurrences of people, locations and organizations in newspapers in different languages.
The contributions to this volume were previously published in Lingvisticae Investigationes 30:1 (2007).
[Benjamins Current Topics, 19] 2009. v, 168 pp.
Publishing status: Available
Hardbound – Available
ISBN 9789027222497 | EUR 90.00 | USD 120.00
e-Book – Sold by e-book platforms
ISBN 9789027289223 | EUR 90.00 | USD 120.00
Table of Contents
Foreword | 1–2
Articles
A survey of named entity recognition and classification | 3–28
Diversity in logarithmic opinion pools | 29–50
Handling conjunctions in named entities | 51–70
Complex named entities in Spanish texts: Structures and properties | 71–96
Named Entity Recognition and transliteration in Bengali | 97–116
A note on the semantic and morphological properties of proper names in the Prolex project | 117–136
Cross-lingual Named Entity Recognition | 137–164
Index | 165–168
Subjects
Benjamins Subject classification
BIC Subject
CFG: Semantics, Pragmatics, Discourse Analysis
BISAC Subject
LAN009000: LANGUAGE ARTS & DISCIPLINES / Linguistics
U.S. Library of Congress Control Number: 2009017541