| EPSRC Reference: |
GR/L08243/01 |
| Title: |
APRIL: A KNOWLEDGE-RICH TOOL FOR THE ANALYSIS AND PREDICTION OF INNOVATION IN THE LEXICON |
| Principal Investigator: |
Professor A Renouf |
| Other Investigators: |
|
| Researcher Co-investigators: |
|
| Project Partners: |
|
| Department: |
English |
| Organisation: |
University of Liverpool |
| Scheme: |
Standard Research |
| Starts: |
01 July 1997 |
Ends: |
30 November 2000 |
Value (£): |
245,677
|
| EPSRC Research Topic Classifications: |
| Information and communication technologies: Human Communication in ICT |
|
|
| EPSRC Industrial Sector Classifications: |
|
| Related Grants: |
|
| Panel History: |
|
|
Summary |
The AVIATOR Project showed that almost all neologisms are very low-frequency words. These words are generally ignored in NLP/AI applications. Individually, they are insignificant by any statistical measure; they cannot be readily identified or prioritised in static text corpora; and as-yet unseen words cannot be taken into account in the development of applications for future use. Nevertheless, without access to information about them, NLP systems lack the capacity for fine-tuning and updating which would improve performance.
We now see a way to investigate this area by automatic means. We intend to develop a fine-grained system of software filters for monitoring morphological and lexical change. This tool will identify and classify the new words at stages in a flow of electronic news data in terms of form and function, allowing trends and generalisations to emerge, and sifting true neologisms from ephemera and typographical errors. It will then extrapolate from existing patterns to predict the likely future structure of the lexicon.
|
| Final Report Summary |
The AVIATOR Project showed that almost all neologisms are very low-frequency words. These words are generally ignored in NLP/AI applications. Individually, they are insignificant by any statistical measure; they cannot be readily identified or prioritised in static text corpora; and as-yet unseen words cannot be taken into account in the development of applications for future use. Nevertheless, without access to information about them, NLP systems lack the capacity for fine-tuning and updating which would improve performance.
We now see a way to investigate this area by automatic means. We intend to develop a fine-grained system of software filters for monitoring morphological and lexical change. This tool will identify and classify the new words at stages in a flow of electronic news data in terms of form and status, allowing trends and generalisations to emerge, and sifting true neologisms from ephemera and typographical errors. It will provide the data abstractions and representations required to support extrapolation from existing patterns in order to predict aspects of the future structure of the lexicon.
|
| Further Information: |
|
| Organisation Website: |
http://www.liv.ac.uk |