Explorations in Automatic Thesaurus Discovery

Author: Gregory Grefenstette
Publisher: Springer Science & Business Media
Total Pages: 313
Release: 2012-12-06
Genre: Computers
ISBN: 1461527104

Explorations in Automatic Thesaurus Discovery presents an automated method for creating a first-draft thesaurus from raw text. It describes the natural language processing steps of tokenization, surface syntactic analysis, and syntactic attribute extraction. From these attributes, word and term similarity is calculated and a thesaurus is created showing important common terms and their relations to each other, common verb-noun pairings, common expressions, and word family members. The techniques are tested on twenty different corpora, ranging from baseball newsgroups, assassination archives, medical X-ray reports, and abstracts on AIDS to encyclopedia articles on animals, and even the text of the book itself. The corpora range from 40,000 to 6 million characters of text, and results are presented for each in the Appendix. The methods described in the book have undergone extensive evaluation. Their time and space complexity are shown to be modest. The results are shown to converge to a stable state as the corpus grows. The similarities calculated are compared to those produced by psychological testing. A method of evaluation using Artificial Synonyms is tested. Gold Standards evaluations show that the techniques significantly outperform non-linguistic-based techniques for the most important words in corpora. Explorations in Automatic Thesaurus Discovery includes applications to the fields of information retrieval using established testbeds, enrichment of existing thesauri, and semantic analysis. Also included are applications showing how to create, implement, and test a first-draft thesaurus.
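
The idea of calculating word similarity from extracted syntactic attributes can be illustrated with a minimal sketch. It is not the book's implementation: the attribute labels (such as "subj-of:examine"), the toy word list, and the use of plain, unweighted Jaccard overlap are all assumptions chosen for illustration, and the blurb above does not say which similarity measure the book uses.

```python
from collections import defaultdict


def build_attribute_sets(pairs):
    """Group hypothetical (word, attribute) pairs into one attribute set per word."""
    attrs = defaultdict(set)
    for word, attribute in pairs:
        attrs[word].add(attribute)
    return dict(attrs)


def jaccard(a, b):
    """Unweighted Jaccard overlap between two attribute sets (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbours(attrs, word, top_n=3):
    """Rank every other word by attribute overlap with `word`, highest first."""
    scores = [
        (other, jaccard(attrs[word], attrs[other]))
        for other in attrs
        if other != word
    ]
    # Sort by descending score, then alphabetically so ties are deterministic.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top_n]


# Toy attribute pairs, standing in for the output of syntactic analysis.
pairs = [
    ("doctor", "subj-of:examine"),
    ("doctor", "subj-of:treat"),
    ("doctor", "adj:young"),
    ("physician", "subj-of:examine"),
    ("physician", "subj-of:treat"),
    ("patient", "obj-of:examine"),
    ("patient", "obj-of:treat"),
    ("patient", "adj:young"),
]

attrs = build_attribute_sets(pairs)
neighbours = nearest_neighbours(attrs, "doctor")
print(neighbours)
```

In this toy data, "physician" ranks above "patient" as a neighbour of "doctor" because it shares more syntactic contexts with it. A first-draft thesaurus entry could then be read off such a ranked list.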

Author:
Publisher: IOS Press
Total Pages: 4947
Release:
Genre:
ISBN:

Research and Development in Intelligent Systems XXIII

Author: Frans Coenen
Publisher: Springer Science & Business Media
Total Pages: 421
Release: 2010-05-30
Genre: Computers
ISBN: 1846286638

The papers in this volume are the refereed technical papers presented at AI-2006, the Twenty-sixth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, held in Cambridge in December 2006. They present new and innovative developments in the field. For the first time the volume also includes the text of short papers presented as posters at the conference.

Foundations of Statistical Inference

Author: Yoel Haitovsky
Publisher: Springer Science & Business Media
Total Pages: 227
Release: 2012-12-06
Genre: Mathematics
ISBN: 3642574106

This volume is a collection of papers presented at a conference held at the Shoresh Holiday Resort near Jerusalem, Israel, in December 2000 and organized by the Israeli Ministry of Science, Culture and Sport. The theme of the conference was "Foundation of Statistical Inference: Applications in the Medical and Social Sciences and in Industry and the Interface of Computer Sciences". The following is a quotation from the Program and Abstract booklet of the conference. "Over the past several decades, the field of statistics has seen tremendous growth and development in theory and methodology. At the same time, the advent of computers has facilitated the use of modern statistics in all branches of science, making statistics even more interdisciplinary than in the past; statistics, thus, has become strongly rooted in all empirical research in the medical, social, and engineering sciences. The abundance of computer programs and the variety of methods available to users brought to light the critical issues of choosing models and, given a data set, the methods most suitable for its analysis. Mathematical statisticians have devoted a great deal of effort to studying the appropriateness of models for various types of data, and defining the conditions under which a particular method works." In 1985 an international conference with a similar title was held in Israel. It provided a platform for a formal debate between the two main schools of thought in Statistics, the Bayesians and the Frequentists.

Machine Learning and Data Mining in Pattern Recognition

Author: Petra Perner
Publisher: Springer
Total Pages: 709
Release: 2005-08-25
Genre: Computers
ISBN: 3540318917

We met again in front of the statue of Gottfried Wilhelm von Leibniz in the city of Leipzig. Leibniz, a famous son of Leipzig, planned automatic logical inference using symbolic computation and aimed to collate all human knowledge. Today, artificial intelligence deals with large amounts of data and knowledge and finds new information using machine learning and data mining. Machine learning and data mining are irreplaceable subjects and tools for the theory of pattern recognition and in applications of pattern recognition such as bioinformatics and data retrieval. This was the fourth edition of MLDM in Pattern Recognition, which is the main event of Technical Committee 17 of the International Association for Pattern Recognition; it started out as a workshop and continued as a conference in 2003. Today, there are many international meetings titled “machine learning” and “data mining”, whose topics are text mining, knowledge discovery, and applications. From the start, this meeting focused on aspects of machine learning and data mining in pattern recognition problems. We planned to reorganize classical and well-established pattern recognition paradigms from the viewpoints of machine learning and data mining. Though it was a challenging program in the late 1990s, the idea has inspired new starting points in pattern recognition and has had effects in other areas such as cognitive computer vision.

Advances in Natural Language Processing

Author: José Luis Vicedo
Publisher: Springer Science & Business Media
Total Pages: 498
Release: 2004-10-12
Genre: Computers
ISBN: 3540234985

This book constitutes the refereed proceedings of the 4th International Conference, EsTAL 2004, held in Alicante, Spain in October 2004. The 42 revised full papers presented were carefully reviewed and selected from 72 submissions. The papers address current issues in computational linguistics and monolingual and multilingual intelligent language processing and applications, in particular written language analysis and generation; pragmatics, discourse, semantics, syntax, and morphology; lexical resources; word sense disambiguation; linguistic, mathematical, and psychological models of language; knowledge acquisition and representation; corpus-based and statistical language modeling; machine translation and translation tools; computational lexicography; information retrieval, extraction, and question answering; automatic summarization; document categorization; natural language interfaces; and dialogue systems and evaluation of systems.

Knowledge Management and Organizational Memories

Author: Rose Dieng-Kuntz
Publisher: Springer Science & Business Media
Total Pages: 231
Release: 2012-12-06
Genre: Computers
ISBN: 1461509475

Knowledge Management and Organizational Memories presents models, methods, and techniques for building, managing and using corporate memories. These models incorporate knowledge bases, ontologies, documents, FAQs, workflow systems, case-based reasoning systems, multi-agent systems, and CSCW. The book is divided into five parts: methods; knowledge-based approaches; ontologies and documents; case-based reasoning approaches; and distributed and collaborative approaches.

Understanding New Media

Author: Kim H. Veltman
Publisher: University of Calgary Press
Total Pages: 714
Release: 2006
Genre: Computers
ISBN: 1552381544

This book outlines the development currently underway in the technology of new media and looks further to examine the unforeseen effects of this phenomenon on our culture, our philosophies, and our spiritual outlook.

Building and Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer Science & Business Media
Total Pages: 333
Release: 2013-12-13
Genre: Computers
ISBN: 3642201288

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than are translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.