From Complex Sentences to a Formal Semantic Representation using Syntactic Text Simplification and Open Information Extraction

Author: Christina Niklaus
Publisher: Springer Nature
Total Pages: 340
Release: 2022-08-29
Genre: Language Arts & Disciplines
ISBN: 3658386975

This work presents a discourse-aware Text Simplification approach that splits and rephrases complex English sentences within the semantic context in which they occur. Based on a linguistically grounded transformation stage, complex sentences are transformed into shorter utterances with a simple canonical structure that can be easily analyzed by downstream applications. To avoid breaking down the input into a disjointed sequence of statements that is difficult to interpret, the author incorporates the semantic context between the split propositions in the form of hierarchical structures and semantic relationships, thus generating a novel representation of complex assertions that puts a semantic layer on top of the simplified sentences. In a second step, she leverages the semantic hierarchy of minimal propositions to improve the performance of Open IE frameworks. She shows that such systems benefit in two dimensions. First, the canonical structure of the simplified sentences facilitates the extraction of relational tuples, leading to improved precision and recall of the extracted relations. Second, the semantic hierarchy can be leveraged to enrich the output of existing Open IE approaches with additional meta-information, resulting in a novel lightweight semantic representation for complex text data in the form of normalized and context-preserving relational tuples.
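
As a rough illustration of what "normalized and context-preserving relational tuples" might look like as a data structure, the following Python sketch models one complex sentence as two minimal propositions joined by a semantic relation. The class names, field names, example sentence, and the relation label are all hypothetical; none of them is taken from the book or from any Open IE system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContextLink:
    """A labeled link from one proposition to another (hypothetical schema)."""
    relation: str   # e.g. "CAUSE"; this label set is an assumption
    target_id: int  # id of the linked proposition


@dataclass
class Proposition:
    """A minimal subject-predicate-object tuple plus its semantic context."""
    id: int
    subject: str
    predicate: str
    obj: str
    is_core: bool = True  # core statement vs. subordinate context
    links: List[ContextLink] = field(default_factory=list)


def render(props: List[Proposition]) -> List[str]:
    """Format each proposition and its context links as readable lines."""
    lines = []
    for p in props:
        kind = "core" if p.is_core else "context"
        lines.append(f"#{p.id} [{kind}] ({p.subject}; {p.predicate}; {p.obj})")
        for link in p.links:
            lines.append(f"    {link.relation} -> #{link.target_id}")
    return lines


# Hand-written decomposition of the made-up sentence
# "The match was cancelled because the pitch was flooded."
props = [
    Proposition(1, "The match", "was", "cancelled",
                links=[ContextLink("CAUSE", 2)]),
    Proposition(2, "The pitch", "was", "flooded", is_core=False),
]

for line in render(props):
    print(line)
```

The decomposition here is written by hand; the sketch only shows how a flat tuple can carry a pointer to its context, so that the second statement is not read in isolation.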

From Complex Sentences to a Formal Semantic Representation Using Syntactic Text Simplification and Open Information Extraction

Author: Christina Niklaus
Release: 2022
ISBN: 9783658386986

This work presents a discourse-aware Text Simplification approach that splits and rephrases complex English sentences within the semantic context in which they occur. Based on a linguistically grounded transformation stage, complex sentences are transformed into shorter utterances with a simple canonical structure that can be easily analyzed by downstream applications. To avoid breaking down the input into a disjointed sequence of statements that is difficult to interpret, the author incorporates the semantic context between the split propositions in the form of hierarchical structures and semantic relationships, thus generating a novel representation of complex assertions that puts a semantic layer on top of the simplified sentences. In a second step, she leverages the semantic hierarchy of minimal propositions to improve the performance of Open IE frameworks. She shows that such systems benefit in two dimensions. First, the canonical structure of the simplified sentences facilitates the extraction of relational tuples, leading to improved precision and recall of the extracted relations. Second, the semantic hierarchy can be leveraged to enrich the output of existing Open IE approaches with additional meta-information, resulting in a novel lightweight semantic representation for complex text data in the form of normalized and context-preserving relational tuples.

About the author: Christina Niklaus is an Assistant Professor in Computer Science at the University of St.Gallen with a focus on Data Science and NLP.

The Oxford Handbook of Computational Linguistics

Author: Ruslan Mitkov
Publisher: Oxford University Press
Total Pages: 808
Release: 2004
Genre: Computers
ISBN: 019927634X

This handbook of computational linguistics, written for academics, graduate students and researchers, provides a state-of-the-art reference to one of the most active and productive fields in linguistics.

Automatic Text Simplification

Author: Horacio Saggion
Publisher: Springer Nature
Total Pages: 121
Release: 2022-05-31
Genre: Computers
ISBN: 3031021665

Thanks to the availability of texts on the Web in recent years, increased knowledge and information have been made available to broader audiences. However, the way in which a text is written (its vocabulary, its syntax) can make it difficult to read and understand for many people, especially those with poor literacy, cognitive or linguistic impairment, or those with limited knowledge of the language of the text. Texts containing uncommon words or long and complicated sentences can be difficult for people to read and understand as well as difficult for machines to analyze. Automatic text simplification is the process of transforming a text into another text which, ideally conveying the same message, will be easier to read and understand by a broader audience. The process usually involves the replacement of difficult or unknown phrases with simpler equivalents and the transformation of long and syntactically complex sentences into shorter and less complex ones. Automatic text simplification, a research topic which started 20 years ago, has now taken on a central role in natural language processing research not only because of the interesting challenges it poses but also because of its social implications. This book presents past and current research in text simplification, exploring key issues including automatic readability assessment, lexical simplification, and syntactic simplification. It also provides a detailed account of machine learning techniques currently used in simplification, describes full systems designed for specific languages and target audiences, and offers available resources for research and development together with text simplification evaluation techniques.
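
To make the idea of replacing difficult words with simpler equivalents concrete, here is a toy Python sketch of dictionary-based lexical simplification. The substitution table and the function name are hypothetical and chosen only for illustration; the book's own methods are not reproduced here, and practical systems would need to select and rank candidates by context rather than use a fixed table.

```python
import re

# Hypothetical substitution table, for illustration only.
SIMPLER = {
    "utilize": "use",
    "commence": "begin",
    "purchase": "buy",
}


def simplify_words(text: str) -> str:
    """Replace each listed word with its simpler equivalent.

    An initial capital on the original word is carried over to the
    replacement; words not in the table are left unchanged.
    """
    def swap(match):
        word = match.group(0)
        simple = SIMPLER.get(word.lower())
        if simple is None:
            return word
        return simple.capitalize() if word[0].isupper() else simple

    return re.sub(r"[A-Za-z]+", swap, text)


print(simplify_words("Commence the test and utilize the new tool."))
# prints: Begin the test and use the new tool.
```

A fixed table like this ignores word sense and inflection ("utilized" would pass through untouched), which is one reason the task is harder than it first appears.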

Natural Language Processing with Python

Author: Steven Bird
Publisher: O'Reilly Media, Inc.
Total Pages: 506
Release: 2009-06-12
Genre: Computers
ISBN: 0596555717

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication. Packed with examples and exercises, Natural Language Processing with Python will help you: extract information from unstructured text, either to guess the topic or identify "named entities"; analyze linguistic structure in text, including parsing and semantic analysis; access popular linguistic databases, including WordNet and treebanks; and integrate techniques drawn from fields as diverse as linguistics and artificial intelligence. This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.

Representation Learning for Natural Language Processing

Author: Zhiyuan Liu
Publisher: Springer Nature
Total Pages: 319
Release: 2020-07-03
Genre: Computers
ISBN: 9811555737

This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.

Linguistic Databases

Author: John A. Nerbonne
Publisher: Center for the Study of Language and Information Publications
Total Pages: 255
Release: 1998-01-28
Genre: Language Arts & Disciplines
ISBN: 9781575860930

Linguistic Databases explores the increasing use of databases in linguistics. The enormous potential in linguistic data - billions of utterances and messages daily - has been difficult to exploit. Many linguists have had to concentrate on introspective data with its inevitable blinders toward frequency, variation, and naturalness. Applications of linguistics have been handicapped. This volume explores the potential advantages of database applications to linguistics. Included in this volume are reports on database activities in phonetics, phonology, lexicography and syntax, comparative grammar, second-language acquisition, linguistic fieldwork, and language pathology. The book presents the specialized problems of multi-media (especially audio) and multi-lingual texts, including those in exotic writing systems. Implemented solutions are also discussed. The opportunities to use existing, minimally structured text repositories are presented.

The Cambridge Handbook of Psycholinguistics

Author: Michael Spivey
Publisher: Cambridge University Press
Total Pages: 1297
Release: 2012-08-20
Genre: Psychology
ISBN: 1139536141

Our ability to speak, write, understand speech and read is critical to our ability to function in today's society. As such, psycholinguistics, or the study of how humans learn and use language, is a central topic in cognitive science. This comprehensive handbook is a collection of chapters written not by practitioners in the field, who can summarize the work going on around them, but by trailblazers from a wide array of subfields, who have been shaping the field of psycholinguistics over the last decade. Some topics discussed include how children learn language, how average adults understand and produce language, how language is represented in the brain, how brain-damaged individuals perform in terms of their language abilities and computer-based models of language and meaning. This is required reading for advanced researchers, graduate students and upper-level undergraduates who are interested in the recent developments and the future of psycholinguistics.