Model-Free Prediction and Regression

Author: Dimitris N. Politis
Publisher: Springer
Total Pages: 256
Release: 2015-11-13
Genre: Mathematics
ISBN: 3319213474

The Model-Free Prediction Principle expounded upon in this monograph is based on the simple notion of transforming a complex dataset to one that is easier to work with, e.g., i.i.d. or Gaussian. As such, it restores the emphasis on observable quantities, i.e., current and future data, as opposed to unobservable model parameters and estimates thereof, and yields optimal predictors in diverse settings such as regression and time series. Furthermore, the Model-Free Bootstrap takes us beyond point prediction in order to construct frequentist prediction intervals without resort to unrealistic assumptions such as normality. Prediction has been traditionally approached via a model-based paradigm, i.e., (a) fit a model to the data at hand, and (b) use the fitted model to extrapolate/predict future data. Due to both mathematical and computational constraints, 20th century statistical practice focused mostly on parametric models. Fortunately, with the advent of widely accessible powerful computing in the late 1970s, computer-intensive methods such as the bootstrap and cross-validation freed practitioners from the limitations of parametric models, and paved the way towards the 'big data' era of the 21st century. Nonetheless, there is a further step one may take, i.e., going beyond even nonparametric models; this is where the Model-Free Prediction Principle is useful. Interestingly, being able to predict a response variable Y associated with a regressor variable X taking on any possible value seems to inadvertently also achieve the main goal of modeling, i.e., trying to describe how Y depends on X. Hence, as prediction can be treated as a by-product of model-fitting, key estimation problems can be addressed as a by-product of being able to perform prediction. In other words, a practitioner can use Model-Free Prediction ideas in order to additionally obtain point estimates and confidence intervals for relevant parameters leading to an alternative, transformation-based approach to statistical inference.

Clinical Prediction Models

Author: Ewout W. Steyerberg
Publisher: Springer
Total Pages: 574
Release: 2019-07-22
Genre: Medical
ISBN: 3030163997

The second edition of this volume provides insight and practical illustrations on how modern statistical concepts and regression methods can be applied in medical prediction problems, including diagnostic and prognostic outcomes. Many advances have been made in statistical approaches towards outcome prediction, but a sensible strategy is needed for model development, validation, and updating, such that prediction models can better support medical practice. There is an increasing need for personalized evidence-based medicine that uses an individualized approach to medical decision-making. In this Big Data era, there is expanded access to large volumes of routinely collected data and an increased number of applications for prediction models, such as targeted early detection of disease and individualized approaches to diagnostic testing and treatment. Clinical Prediction Models presents a practical checklist that needs to be considered for development of a valid prediction model. Steps include preliminary considerations such as dealing with missing values; coding of predictors; selection of main effects and interactions for a multivariable model; estimation of model parameters with shrinkage methods and incorporation of external data; evaluation of performance and usefulness; internal validation; and presentation formatting. The text also addresses common issues that make prediction models suboptimal, such as small sample sizes, exaggerated claims, and poor generalizability. The text is primarily intended for clinical epidemiologists and biostatisticians. Including many case studies and publicly available R code and data sets, the book is also appropriate as a textbook for a graduate course on predictive modeling in diagnosis and prognosis. While practical in nature, the book also provides a philosophical perspective on data analysis in medicine that goes beyond predictive modeling. 
Updates to this new and expanded edition include:
• A discussion of Big Data and its implications for the design of prediction models
• Machine learning issues
• More simulations with missing ‘y’ values
• Extended discussion on between-cohort heterogeneity
• Description of ShinyApp
• Updated LASSO illustration
• New case studies

Practical Statistics for Data Scientists

Author: Peter Bruce
Publisher: "O'Reilly Media, Inc."
Total Pages: 322
Release: 2017-05-10
Genre: Computers
ISBN: 1491952911

Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training. Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you’ll learn:
• Why exploratory data analysis is a key preliminary step in data science
• How random sampling can reduce bias and yield a higher quality dataset, even with big data
• How the principles of experimental design yield definitive answers to questions
• How to use regression to estimate outcomes and detect anomalies
• Key classification techniques for predicting which categories a record belongs to
• Statistical machine learning methods that “learn” from data
• Unsupervised learning methods for extracting meaning from unlabeled data

Regression Analysis and Linear Models

Author: Richard B. Darlington
Publisher: Guilford Publications
Total Pages: 689
Release: 2016-08-22
Genre: Social Science
ISBN: 1462527981

Emphasizing conceptual understanding over mathematics, this user-friendly text introduces linear regression analysis to students and researchers across the social, behavioral, consumer, and health sciences. Coverage includes model construction and estimation, quantification and measurement of multivariate and partial associations, statistical control, group comparisons, moderation analysis, mediation and path analysis, and regression diagnostics, among other important topics. Engaging worked-through examples demonstrate each technique, accompanied by helpful advice and cautions. The use of SPSS, SAS, and STATA is emphasized, with an appendix on regression analysis using R. The companion website (www.afhayes.com) provides datasets for the book's examples as well as the RLM macro for SPSS and SAS.
Pedagogical Features:
* Chapters include SPSS, SAS, or STATA code pertinent to the analyses described, with each distinctively formatted for easy identification.
* An appendix documents the RLM macro, which facilitates computations for estimating and probing interactions, dominance analysis, heteroscedasticity-consistent standard errors, and linear spline regression, among other analyses.
* Students are guided to practice what they learn in each chapter using datasets provided online.
* Addresses topics not usually covered, such as ways to measure a variable’s importance, coding systems for representing categorical variables, causation, and myths about testing interaction.

Regression and Other Stories

Author: Andrew Gelman
Publisher: Cambridge University Press
Total Pages: 551
Release: 2020-07-23
Genre: Business & Economics
ISBN: 110702398X

A practical approach to using regression and computation to solve real-world problems of estimation, prediction, and causal inference.

Fundamentals of Clinical Data Science

Author: Pieter Kubben
Publisher: Springer
Total Pages: 219
Release: 2018-12-21
Genre: Medical
ISBN: 3319997130

This open access book comprehensively covers the fundamentals of clinical data science, focusing on data collection, modelling and clinical applications. Topics covered in the first section on data collection include: data sources, data at scale (big data), data stewardship (FAIR data) and related privacy concerns. Aspects of predictive modelling using techniques such as classification, regression or clustering, and prediction model validation will be covered in the second section. The third section covers aspects of (mobile) clinical decision support systems, operational excellence and value-based healthcare. Fundamentals of Clinical Data Science is an essential resource for healthcare professionals and IT consultants intending to develop and refine their skills in personalized medicine, using solutions based on large datasets from electronic health records or telemonitoring programmes. The book promises “no math, no code” and explains the topics in a style that is optimized for a healthcare audience.

Advances in Parallel Computing Algorithms, Tools and Paradigms

Author: D.J. Hemanth
Publisher: IOS Press
Total Pages: 670
Release: 2022-11-23
Genre: Computers
ISBN: 1643683152

Recent developments in parallel computing for various fields of application are providing improved solutions for handling data. These newer, innovative ideas offer the technical support necessary to enhance intellectual decisions, while also dealing more efficiently with the huge volumes of data currently involved. This book presents the proceedings of ICAPTA 2022, the International Conference on Advances in Parallel Computing Technologies and Applications, hosted as a virtual conference from Bangalore, India, on 27 and 28 January 2022. The aim of the conference was to provide a forum for the sharing of knowledge about various aspects of parallel computing in communications systems and networking, including cloud and virtualization solutions, management technologies and vertical application areas. The conference also provided a premier platform for scientists, researchers, practitioners and academicians to present and discuss their most recent innovations, trends and concerns, as well as the practical challenges encountered in this field. More than 300 submissions were received for the conference, from which the 91 full-length papers presented here were accepted after review by a panel of subject experts. Topics covered include parallel computing in communication, machine learning intelligence for parallel computing and parallel computing for software services in theoretical and practical aspects. Providing an overview of recent developments in the field, the book will be of interest to all those whose work involves the use of parallel computing technologies.

Statistical Regression and Classification

Author: Norman Matloff
Publisher: CRC Press
Total Pages: 439
Release: 2017-09-19
Genre: Business & Economics
ISBN: 1351645897

Statistical Regression and Classification: From Linear Models to Machine Learning takes an innovative look at the traditional statistical regression course, presenting a contemporary treatment in line with today's applications and users. The text takes a modern look at regression:
* A thorough treatment of classical linear and generalized linear models, supplemented with introductory material on machine learning methods.
* Since classification is the focus of many contemporary applications, the book covers this topic in detail, especially the multiclass case.
* In view of the voluminous nature of many modern datasets, there is a chapter on Big Data.
* Has special Mathematical and Computational Complements sections at ends of chapters, and exercises are partitioned into Data, Math and Complements problems.
* Instructors can tailor coverage for specific audiences such as majors in Statistics, Computer Science, or Economics.
* More than 75 examples using real data.
The book treats classical regression methods in an innovative, contemporary manner. Though some statistical learning methods are introduced, the primary methodology used is linear and generalized linear parametric models, covering both the Description and Prediction goals of regression methods. The author is just as interested in Description applications of regression, such as measuring the gender wage gap in Silicon Valley, as in forecasting tomorrow's demand for bike rentals. An entire chapter is devoted to measuring such effects, including discussion of Simpson's Paradox, multiple inference, and causation issues. Similarly, there is an entire chapter on parametric model fit, making use of both residual analysis and assessment via nonparametric analysis.
Norman Matloff is a professor of computer science at the University of California, Davis, and was a founder of the Statistics Department at that institution. His current research focus is on recommender systems, and applications of regression methods to small area estimation and bias reduction in observational studies. He is on the editorial boards of the Journal of Statistical Computation and the R Journal. An award-winning teacher, he is the author of The Art of R Programming and Parallel Computation in Data Science: With Examples in R, C++ and CUDA.

Contributions in infinite-dimensional statistics and related topics

Author: Enea G. Bongiorno
Publisher: Società Editrice Esculapio
Total Pages: 300
Release: 2014-05-21
Genre: Mathematics
ISBN: 8874887639

Interest in Functional and Operatorial Statistics and, more generally, in infinite-dimensional statistics has increased dramatically in the statistical community and in many other applied scientific areas where people face functional data. This volume collects the works selected and presented at the Third Edition of the International Workshop on Functional and Operatorial Statistics held in Stresa, Italy, from the 19th to the 21st of June 2014 (IWFOS’2014). The meeting represents an opportunity to bring together leading researchers active on these topics, covering both theoretical aspects and a wide range of applications in various fields. To promote collaboration with other important, closely related areas of infinite-dimensional Statistics, such as High Dimensional Statistics and Model Selection Procedures, this book also hosts works on those research subjects.