Computational Methods for Corpus Annotation and Analysis

Author: Xiaofei Lu
Publisher: Springer
Total Pages: 192
Release: 2014-07-08
Genre: Language Arts & Disciplines
ISBN: 9401786453

In the past few decades the use of increasingly large text corpora has grown rapidly in language and linguistics research. This growth has been enabled by remarkable strides in natural language processing (NLP) technology, which allows computers to automatically and efficiently process, annotate and analyze large amounts of spoken and written text in linguistically and/or pragmatically meaningful ways. It has become more desirable than ever before for language and linguistics researchers who use corpora in their research to gain an adequate understanding of the relevant NLP technology in order to take full advantage of its capabilities. This volume provides language and linguistics researchers with an accessible introduction to the state-of-the-art NLP technology that facilitates automatic annotation and analysis of large text corpora at both shallow and deep linguistic levels. The book covers a wide range of computational tools for lexical, syntactic, semantic, pragmatic and discourse analysis, together with detailed instructions on how to obtain, install and use each tool in different operating systems and platforms. The book illustrates how NLP technology has been applied in recent corpus-based language studies and suggests effective ways to better integrate such technology in future corpus linguistics research. This book provides language and linguistics researchers with a valuable reference for corpus annotation and analysis.

Computational and Corpus Approaches to Chinese Language Learning

Author: Xiaofei Lu
Publisher: Springer
Total Pages: 268
Release: 2019-02-06
Genre: Education
ISBN: 9811335702

This book presents a collection of original research articles that showcase the state of the art of research in corpus and computational linguistic approaches to Chinese language teaching, learning and assessment. It offers a comprehensive set of corpus resources and natural language processing tools that are useful for teaching, learning and assessing Chinese as a second or foreign language; methods for implementing such resources and techniques in Chinese pedagogy and assessment; as well as research findings on the effectiveness of using such resources and techniques in various aspects of Chinese pedagogy and assessment.

Corpus Annotation

Author: R. G. Garside
Publisher: Routledge
Total Pages: 0
Release: 2016-07-10
Genre: Computational linguistics
ISBN: 9781138148581

Corpus Annotation gives an up-to-date picture of this fascinating new area of research, and will provide essential reading for newcomers to the field as well as those already involved in corpus annotation. Early chapters introduce the different levels and techniques of corpus annotation. Later chapters deal with software developments, applications, and the development of standards for the evaluation of corpus annotation. While the book takes detailed account of research world-wide, its focus is particularly on the work of the UCREL (University Centre for Computer Corpus Research on Language) team at Lancaster University, which has been at the forefront of developments in the field of corpus annotation since its beginnings in the 1970s.

Language Corpora Annotation and Processing

Author: Niladri Sekhar Dash
Publisher: Springer Nature
Total Pages:
Release: 2021
Genre: Computational linguistics
ISBN: 9811629609

This book addresses the research, analysis, and description of the methods and processes that are used in the annotation and processing of language corpora in advanced, semi-advanced, and non-advanced languages. It provides the background information and empirical data needed to understand the nature and depth of problems related to corpus annotation and text processing and shows readers how the linguistic elements found in texts are analyzed and applied to develop language technology systems and devices. As such, it offers valuable insights for researchers, educators, and students of linguistics and language technology.

The Routledge Handbook of Corpus Linguistics

Author: Anne O'Keeffe
Publisher: Routledge
Total Pages: 684
Release: 2022-02-08
Genre: Language Arts & Disciplines
ISBN: 0429632649

The Routledge Handbook of Corpus Linguistics 2e provides an updated overview of a dynamic and rapidly growing area with a widely applied methodology. Over a decade on from the first edition of the Handbook, this collection of 47 chapters from experts in key areas offers a comprehensive introduction to both the development and use of corpora as well as their ever-evolving applications to other areas, such as digital humanities, sociolinguistics, stylistics, translation studies, materials design, language teaching and teacher development, media discourse, discourse analysis, forensic linguistics, second language acquisition and testing. The new edition updates all core chapters and includes new chapters on corpus linguistics and statistics, digital humanities, translation, phonetics and phonology, second language acquisition, social media and theoretical perspectives. Chapters provide annotated further reading lists and step-by-step guides as well as detailed overviews across a wide range of themes. The Handbook also includes a wealth of case studies that draw on some of the many new corpora and corpus tools that have emerged in the last decade. Organised across four themes, moving from the basic start-up topics such as corpus building and design to analysis, application and reflection, this second edition remains a crucial point of reference for advanced undergraduates, postgraduates and scholars in applied linguistics.

Advanced Computational and Communication Paradigms

Author: Samarjeet Borah
Publisher: Springer Nature
Total Pages: 536
Release: 2023-09-20
Genre: Technology & Engineering
ISBN: 9819942845

This book presents high-quality, peer-reviewed papers from the Fourth International Conference on Advanced Computational and Communication Paradigms (ICACCP 2023), organized by the Department of Computer Science and Engineering (CSE), Sikkim Manipal Institute of Technology (SMIT), Sikkim, India, during February 16–18, 2023. ICACCP 2023 covers advanced computational paradigms and communication techniques that provide failsafe and robust solutions to the emerging problems faced by mankind. Technologists, scientists, industry professionals, and research scholars from regional, national, and international levels are invited to present their original unpublished work in this conference.

Corpus Linguistics and Second Language Acquisition

Author: Xiaofei Lu
Publisher: Taylor & Francis
Total Pages: 173
Release: 2022-10-24
Genre: Language Arts & Disciplines
ISBN: 1000648494

In Corpus Linguistics and Second Language Acquisition, Xiaofei Lu comprehensively reviews empirical studies that employ corpus linguistic methods to investigate issues in second language variation, processing, production, and development. These methods enable advanced students and researchers to: examine learner and task variables that condition variation in second language use; understand the effects of various input factors on second language processing and production; track group longitudinal trajectories of second language development and the input, learner, and task factors that affect such trajectories; and profile inter- and intra-learner variability and individual variation in second language longitudinal development. This book will serve as an excellent resource for students and researchers with interests in corpus linguistics and second language acquisition.

Developing Linguistic Corpora

Author: Martin Wynne
Publisher: Oxbow Books Limited
Total Pages: 100
Release: 2005
Genre: Language Arts & Disciplines
ISBN:

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

Complexity, Accuracy and Fluency in Learner Corpus Research

Author: Agnieszka Leńko-Szymańska
Publisher: John Benjamins Publishing Company
Total Pages: 335
Release: 2022-12-15
Genre: Language Arts & Disciplines
ISBN: 9027257337

This volume illustrates the high potential of learner corpus investigations for research into the CAF triad by presenting eleven original learner corpus-based studies which are set within solid theoretical frameworks, examine learner corpora with state-of-the-art analytical techniques and yield highly interesting findings. The volume’s major strength lies in the range of issues it undertakes and in its interdisciplinary thematic novelty. The chapters collectively address all three dimensions of L2 performance related to different linguistic subsystems (i.e. lexical, phraseological and grammatical complexity and accuracy, along with fluency) as well as the interactions among these constructs. The studies are based on data drawn from carefully compiled learner corpora which are analysed with the help of diverse corpus-based methods. The theoretical discussions and the empirical results shall contribute to the advancement of the fields of SLA and writing and speech research and shall inspire further investigations in the area of the CAF triad.