Multilevel and Longitudinal Modeling Using Stata, Volumes I and II, 4th Edition
- Length: 1047 pages
- Edition: 4
- Language: English
- Publisher: Stata Press
- Publication Date: 2021-08-19
- ASIN: B09CW4JB3W
- ISBN-13: 9781597181372
This book is a complete resource for learning to model data in which observations are grouped—whether nested data such as children nested in schools or repeated observations on the same individuals. Rabe-Hesketh and Skrondal introduce a variety of multilevel models for continuous, binary, count, and other outcomes. They also explain when each model is useful, how to fit and evaluate the model using Stata, and how to interpret the results. With this comprehensive coverage, researchers who need to apply multilevel models will find this book to be the perfect companion. It is also an excellent text for courses in multilevel modeling because it provides examples from a variety of disciplines as well as end-of-chapter exercises that allow students to practice newly learned material.
Table of contents

Displays
Preface
Acknowledgments

I Preliminaries
  1 Review of linear regression
    1.1 Introduction
    1.2 Is there gender discrimination in faculty salaries?
    1.3 Independent-samples t test
    1.4 One-way analysis of variance
    1.5 Simple linear regression
    1.6 Dummy variables
    1.7 Multiple linear regression
    1.8 Interactions
    1.9 Dummy variables for more than two groups
    1.10 Other types of interactions
      1.10.1 Interaction between dummy variables
      1.10.2 Interaction between continuous covariates
    1.11 Nonlinear effects
    1.12 Residual diagnostics
    1.13 ❖ Causal and noncausal interpretations of regression coefficients
      1.13.1 Regression as conditional expectation
      1.13.2 Regression as structural model
    1.14 Summary and further reading
    1.15 Exercises

II Two-level models
  2 Variance-components models
    2.1 Introduction
    2.2 How reliable are peak-expiratory-flow measurements?
    2.3 Inspecting within-subject dependence
    2.4 The variance-components model
      2.4.1 Model specification
      2.4.2 Path diagram
      2.4.3 Between-subject heterogeneity
      2.4.4 Within-subject dependence
        Intraclass correlation
        Intraclass correlation versus Pearson correlation
    2.5 Estimation using Stata
      2.5.1 Data preparation: Reshaping from wide form to long form
      2.5.2 Using xtreg
      2.5.3 Using mixed
    2.6 Hypothesis tests and confidence intervals
      2.6.1 Hypothesis test and confidence interval for the population mean
      2.6.2 Hypothesis test and confidence interval for the between-cluster variance
        Likelihood-ratio test
        ❖ Score test
        F test
        Confidence intervals
    2.7 ❖ Model as data-generating mechanism
    2.8 Fixed versus random effects
    2.9 Crossed versus nested effects
    2.10 Parameter estimation
      2.10.1 Model assumptions
        Mean structure and covariance structure
        Distributional assumptions
      2.10.2 Different estimation methods
      2.10.3 Inference for β
        Estimate and standard error: Balanced case
        Estimate: Unbalanced case
    2.11 Assigning values to the random intercepts
      2.11.1 Maximum “likelihood” estimation
        Implementation via OLS
        Implementation via the mean total residual
      2.11.2 Empirical Bayes prediction
      2.11.3 Empirical Bayes standard errors
        Posterior and comparative standard errors
        Diagnostic standard errors
        Accounting for uncertainty in β
      2.11.4 ❖ Bayesian interpretation of REML estimation and prediction
    2.12 Summary and further reading
    2.13 Exercises
  3 Random-intercept models with covariates
    3.1 Introduction
    3.2 Does smoking during pregnancy affect birthweight?
      3.2.1 Data structure and descriptive statistics
    3.3 The linear random-intercept model with covariates
      3.3.1 Model specification
      3.3.2 Model assumptions
      3.3.3 Mean structure
      3.3.4 Residual covariance structure
      3.3.5 Graphical illustration of random-intercept model
    3.4 Estimation using Stata
      3.4.1 Using xtreg
      3.4.2 Using mixed
    3.5 Coefficients of determination or variance explained
    3.6 Hypothesis tests and confidence intervals
      3.6.1 Hypothesis tests for individual regression coefficients
      3.6.2 Joint hypothesis tests for several regression coefficients
      3.6.3 Predicted means and confidence intervals
      3.6.4 Hypothesis test for random-intercept variance
    3.7 Between and within effects of level-1 covariates
      3.7.1 Between-mother effects
      3.7.2 Within-mother effects
      3.7.3 ❖ Relations among within estimator, between estimator, and estimator for random-intercept model
      3.7.4 Level-2 endogeneity and cluster-level confounding
      3.7.5 Conventional Hausman test
      3.7.6 Allowing for different within and between effects
      3.7.7 Robust Hausman test
    3.8 Fixed versus random effects revisited
    3.9 Assigning values to random effects: Residual diagnostics
    3.10 More on statistical inference
      3.10.1 ❖ Overview of estimation methods
        Pooled OLS
        Feasible generalized least squares (FGLS)
        ML by iterative GLS (IGLS)
        ML by Newton–Raphson and Fisher scoring
        ML by the expectation-maximization (EM) algorithm
        REML
      3.10.2 Consequences of using standard regression modeling for clustered data
        Purely between-cluster covariate
        Purely within-cluster covariate
      3.10.3 ❖ Power and sample-size determination
        Purely between-cluster covariate
        Purely within-cluster covariate
    3.11 Summary and further reading
    3.12 Exercises
  4 Random-coefficient models
    4.1 Introduction
    4.2 How effective are different schools?
    4.3 Separate linear regressions for each school
    4.4 Specification and interpretation of a random-coefficient model
      4.4.1 Specification of a random-coefficient model
      4.4.2 Interpretation of the random-effects variances and covariances
    4.5 Estimation using mixed
      4.5.1 Random-intercept model
      4.5.2 Random-coefficient model
    4.6 Testing the slope variance
    4.7 Interpretation of estimates
    4.8 Assigning values to the random intercepts and slopes
      4.8.1 Maximum “likelihood” estimation
      4.8.2 Empirical Bayes prediction
      4.8.3 Model visualization
      4.8.4 Residual diagnostics
      4.8.5 Inferences for individual schools
    4.9 Two-stage model formulation
    4.10 Some warnings about random-coefficient models
      4.10.1 Meaningful specification
      4.10.2 Many random coefficients
      4.10.3 Convergence problems
      4.10.4 Lack of identification
    4.11 Summary and further reading
    4.12 Exercises

III Models for longitudinal and panel data
  5 Subject-specific effects, endogeneity, and unobserved confounding
    5.1 Introduction
    5.2 Random-effects approach: No endogeneity
    5.3 Fixed-effects approach: Level-2 endogeneity
      5.3.1 De-meaning and subject dummies
        De-meaning
        Subject dummies
      5.3.2 Hausman test
      5.3.3 Mundlak approach and robust Hausman test
      5.3.4 First-differencing
    5.4 Difference-in-differences and repeated-measures ANOVA
      5.4.1 Does raising the minimum wage reduce employment?
      5.4.2 ❖ Repeated-measures ANOVA
    5.5 Subject-specific coefficients
      5.5.1 Random-coefficient model: No endogeneity
      5.5.2 Fixed-coefficient model: Level-2 endogeneity
    5.6 Hausman–Taylor: Level-2 endogeneity for level-1 and level-2 covariates
    5.7 Instrumental-variable methods: Level-1 (and level-2) endogeneity
      5.7.1 Do deterrents decrease crime rates?
      5.7.2 Conventional fixed-effects approach
      5.7.3 Fixed-effects IV estimator
      5.7.4 Random-effects IV estimator
      5.7.5 More Hausman tests
    5.8 Dynamic models
      5.8.1 Dynamic model without subject-specific intercepts
      5.8.2 Dynamic model with subject-specific intercepts
    5.9 Missing data and dropout
      5.9.1 ❖ Maximum likelihood estimation under MAR: A simulation
    5.10 Summary and further reading
    5.11 Exercises
  6 Marginal models
    6.1 Introduction
    6.2 Mean structure
    6.3 Covariance structures
      6.3.1 Unstructured covariance matrix
      6.3.2 Random-intercept or compound symmetric/exchangeable structure
      6.3.3 Random-coefficient structure
      6.3.4 Autoregressive and exponential structures
      6.3.5 Moving-average residual structure
      6.3.6 Banded and Toeplitz structures
    6.4 Hybrid and complex marginal models
      6.4.1 Random effects and correlated level-1 residuals
      6.4.2 Heteroskedastic level-1 residuals over occasions
      6.4.3 Heteroskedastic level-1 residuals over groups
      6.4.4 Different covariance matrices over groups
    6.5 Comparing the fit of marginal models
    6.6 Generalized estimating equations (GEE)
    6.7 Marginal modeling with few units and many occasions
      6.7.1 Is a highly organized labor market beneficial for economic growth?
      6.7.2 Marginal modeling for long panels
      6.7.3 Fitting marginal models for long panels in Stata
    6.8 Summary and further reading
    6.9 Exercises
  7 Growth-curve models
    7.1 Introduction
    7.2 How do children grow?
      7.2.1 Observed growth trajectories
    7.3 Models for nonlinear growth
      7.3.1 Polynomial models
        Estimation using mixed
        Predicting the mean trajectory
        Predicting trajectories for individual children
      7.3.2 Piecewise linear models
        Estimation using mixed
        Predicting the mean trajectory
    7.4 Two-stage model formulation and cross-level interaction
    7.5 Heteroskedasticity
      7.5.1 Heteroskedasticity at level 1
      7.5.2 Heteroskedasticity at level 2
    7.6 How does reading improve from kindergarten through third grade?
    7.7 Growth-curve model as a structural equation model
      7.7.1 Estimation using sem
      7.7.2 Estimation using mixed
    7.8 Summary and further reading
    7.9 Exercises

IV Models with nested and crossed random effects
  8 Higher-level models with nested random effects
    8.1 Introduction
    8.2 Do peak-expiratory-flow measurements vary between methods within subjects?
    8.3 Inspecting sources of variability
    8.4 Three-level variance-components models
    8.5 Different types of intraclass correlation
    8.6 Estimation using mixed
    8.7 Empirical Bayes prediction
    8.8 Testing variance components
    8.9 Crossed versus nested random effects revisited
    8.10 Does nutrition affect cognitive development of Kenyan children?
    8.11 Describing and plotting three-level data
      8.11.1 Data structure and missing data
      8.11.2 Level-1 variables
      8.11.3 Level-2 variables
      8.11.4 Level-3 variables
      8.11.5 Plotting growth trajectories
    8.12 Three-level random-intercept model
      8.12.1 Model specification: Reduced form
      8.12.2 Model specification: Three-stage formulation
      8.12.3 Estimation using mixed
    8.13 Three-level random-coefficient models
      8.13.1 Random coefficient at the child level
        Estimation using mixed
      8.13.2 Random coefficient at the child and school levels
        Estimation using mixed
    8.14 Residual diagnostics and predictions
    8.15 Summary and further reading
    8.16 Exercises
  9 Crossed random effects
    9.1 Introduction
    9.2 How does investment depend on expected profit and capital stock?
    9.3 A two-way error-components model
      9.3.1 Model specification
      9.3.2 Residual variances, covariances, and intraclass correlations
        Longitudinal correlations
        Cross-sectional correlations
      9.3.3 Estimation using mixed
      9.3.4 Prediction
    9.4 How much do primary and secondary schools affect attainment at age 16?
    9.5 Data structure
    9.6 Additive crossed random-effects model
      9.6.1 Specification
      9.6.2 Intraclass correlations
      9.6.3 Estimation using mixed
    9.7 Crossed random-effects model with random interaction
      9.7.1 Model specification
      9.7.2 Intraclass correlations
      9.7.3 Estimation using mixed
      9.7.4 Testing variance components
      9.7.5 Some diagnostics
    9.8 ❖ A trick requiring fewer random effects
    9.9 Summary and further reading
    9.10 Exercises

A Useful Stata commands

V Models for categorical responses
  10 Dichotomous or binary responses
    10.1 Introduction
    10.2 Single-level logit and probit regression models for dichotomous responses
      10.2.1 Generalized linear model formulation
        Labor-participation data
        Estimation using logit
        Estimation using glm
      10.2.2 Latent-response formulation
        Logistic regression
        Probit regression
        Estimation using probit
    10.3 Which treatment is best for toenail infection?
    10.4 Longitudinal data structure
    10.5 Proportions and fitted population-averaged or marginal probabilities
        Estimation using logit
    10.6 Random-intercept logistic regression
      10.6.1 Model specification
        Reduced-form specification
        Two-stage formulation
      10.6.2 Model assumptions
      10.6.3 Estimation
        Using xtlogit
        Using melogit
        Using gllamm
    10.7 Subject-specific or conditional versus population-averaged or marginal relationships
    10.8 Measures of dependence and heterogeneity
      10.8.1 Conditional or residual intraclass correlation of the latent responses
      10.8.2 Median odds ratio
      10.8.3 ❖ Measures of association for observed responses at median fixed part of the model
    10.9 Inference for random-intercept logistic models
      10.9.1 Tests and confidence intervals for odds ratios
      10.9.2 Tests of variance components
    10.10 Maximum likelihood estimation
      10.10.1 ❖ Adaptive quadrature
      10.10.2 Some speed and accuracy considerations
        Integration methods and number of quadrature points
        Starting values
        Using melogit and gllamm for collapsible data
        Spherical quadrature in gllamm
    10.11 Assigning values to random effects
      10.11.1 Maximum “likelihood” estimation
      10.11.2 Empirical Bayes prediction
      10.11.3 Empirical Bayes modal prediction
    10.12 Different kinds of predicted probabilities
      10.12.1 Predicted population-averaged or marginal probabilities
      10.12.2 Predicted subject-specific probabilities
        Predictions for hypothetical subjects: Conditional probabilities
        Predictions for the subjects in the sample: Posterior mean probabilities
    10.13 Other approaches to clustered dichotomous data
      10.13.1 Conditional logistic regression
        Estimation using clogit
      10.13.2 Generalized estimating equations (GEE)
        Estimation using xtgee
    10.14 Summary and further reading
    10.15 Exercises
  11 Ordinal responses
    11.1 Introduction
    11.2 Single-level cumulative models for ordinal responses
      11.2.1 Generalized linear model formulation
      11.2.2 Latent-response formulation
      11.2.3 Proportional odds
      11.2.4 ❖ Identification
    11.3 Longitudinal data structure and graphs
      11.3.1 Longitudinal data structure
      11.3.2 Plotting cumulative proportions
      11.3.3 Plotting cumulative sample logits and transforming the time scale
    11.4 Single-level proportional-odds model
      11.4.1 Model specification
        Estimation using ologit
    11.5 Random-intercept proportional-odds model
      11.5.1 Model specification
        Estimation using meologit
        Estimation using gllamm
      11.5.2 Measures of dependence and heterogeneity
        Residual intraclass correlation of latent responses
        Median odds ratio
    11.6 Random-coefficient proportional-odds model
      11.6.1 Model specification
        Estimation using meologit
        Estimation using gllamm
    11.7 Different kinds of predicted probabilities
      11.7.1 Predicted population-averaged or marginal probabilities
      11.7.2 Predicted subject-specific probabilities: Posterior mean
    11.8 Do experts differ in their grading of student essays?
    11.9 A random-intercept probit model with grader bias
      11.9.1 Model specification
        Estimation using gllamm
    11.10 ❖ Including grader-specific measurement-error variances
      11.10.1 Model specification
        Estimation using gllamm
    11.11 ❖ Including grader-specific thresholds
      11.11.1 Model specification
        Estimation using gllamm
    11.12 ❖ Other link functions
        Cumulative complementary log–log model
        Continuation-ratio logit model
        Adjacent-category logit model
        Baseline-category logit and stereotype models
    11.13 Summary and further reading
    11.14 Exercises
  12 Nominal responses and discrete choice
    12.1 Introduction
    12.2 Single-level models for nominal responses
      12.2.1 Multinomial logit models
        Transport data version 1
        Estimation using mlogit
      12.2.2 Conditional logit models with alternative-specific covariates
        Transport data version 2: Expanded form
        Estimation using clogit
        Estimation using cmclogit
      12.2.3 Conditional logit models with alternative- and unit-specific covariates
        Estimation using clogit
        Estimation using cmclogit
    12.3 Independence from irrelevant alternatives
    12.4 Utility-maximization formulation
    12.5 Does marketing affect choice of yogurt?
    12.6 Single-level conditional logit models
      12.6.1 Conditional logit models with alternative-specific intercepts
        Estimation using clogit
        Estimation using cmclogit
    12.7 Multilevel conditional logit models
      12.7.1 Preference heterogeneity: Brand-specific random intercepts
        Estimation using cmxtmixlogit
        Estimation using gllamm
      12.7.2 Response heterogeneity: Marketing variables with random coefficients
        Estimation using cmxtmixlogit
        Estimation using gllamm
      12.7.3 ❖ Preference and response heterogeneity
        Estimation using cmxtmixlogit
        Estimation using gllamm
    12.8 Prediction of marginal choice probabilities
    12.9 Prediction of random effects and household-specific choice probabilities
    12.10 Summary and further reading
    12.11 Exercises

VI Models for counts
  13 Counts
    13.1 Introduction
    13.2 What are counts?
      13.2.1 Counts versus proportions
      13.2.2 Counts as aggregated event-history data
    13.3 Single-level Poisson models for counts
    13.4 Did the German healthcare reform reduce the number of doctor visits?
    13.5 Longitudinal data structure
    13.6 Single-level Poisson regression
      13.6.1 Model specification
        Estimation using poisson
        Estimation using glm
    13.7 Random-intercept Poisson regression
      13.7.1 Model specification
      13.7.2 Measures of dependence and heterogeneity
      13.7.3 Estimation
        Using xtpoisson
        Using mepoisson
        Using gllamm
    13.8 Random-coefficient Poisson regression
      13.8.1 Model specification
        Estimation using mepoisson
        Estimation using gllamm
    13.9 Overdispersion in single-level models
      13.9.1 Normally distributed random intercept
        Estimation using xtpoisson
      13.9.2 Negative binomial models
        Mean dispersion or NB2
        Constant dispersion or NB1
      13.9.3 Quasilikelihood
        Estimation using glm
    13.10 Level-1 overdispersion in two-level models
      13.10.1 Random-intercept Poisson model with robust standard errors
        Estimation using mepoisson
      13.10.2 Three-level random-intercept model
      13.10.3 Negative binomial models with random intercepts
        Estimation using menbreg
      13.10.4 The HHG model
    13.11 Other approaches to two-level count data
      13.11.1 Conditional Poisson regression
        Estimation using xtpoisson, fe
        Estimation using Poisson regression with dummy variables for clusters
      13.11.2 Conditional negative binomial regression
      13.11.3 Generalized estimating equations
        Estimation using xtgee
    13.12 Estimating marginal and conditional effects when responses are missing at random
        ❖ Simulation
    13.13 Which Scottish counties have a high risk of lip cancer?
    13.14 Standardized mortality ratios
    13.15 Random-intercept Poisson regression
      13.15.1 Model specification
        Estimation using gllamm
      13.15.2 Prediction of standardized mortality ratios
    13.16 ❖ Nonparametric maximum likelihood estimation
      13.16.1 Specification
        Estimation using gllamm
      13.16.2 Prediction
    13.17 Summary and further reading
    13.18 Exercises

VII Models for survival or duration data
  14 Discrete-time survival
    14.1 Introduction
    14.2 Single-level models for discrete-time survival data
      14.2.1 Discrete-time hazard and discrete-time survival
        Promotions data
      14.2.2 Data expansion for discrete-time survival analysis
      14.2.3 Estimation via regression models for dichotomous responses
        Estimation using logit
      14.2.4 Including time-constant covariates
        Estimation using logit
      14.2.5 Including time-varying covariates
        Estimation using logit
      14.2.6 Multiple absorbing events and competing risks
        Estimation using mlogit
      14.2.7 Handling left-truncated data
    14.3 How does mother’s birth history affect child mortality?
    14.4 Data expansion
    14.5 ❖ Proportional hazards and interval-censoring
    14.6 Complementary log–log models
      14.6.1 Marginal baseline hazard
        Estimation using cloglog
      14.6.2 Including covariates
        Estimation using cloglog
    14.7 Random-intercept complementary log–log model
      14.7.1 Model specification
        Estimation using mecloglog
    14.8 ❖ Population-averaged or marginal vs. cluster-specific or conditional survival probabilities
    14.9 Summary and further reading
    14.10 Exercises
  15 Continuous-time survival
    15.1 Introduction
    15.2 What makes marriages fail?
    15.3 Hazards and survival
    15.4 Proportional hazards models
      15.4.1 Piecewise exponential model
        Estimation using streg
        Estimation using poisson
      15.4.2 Cox regression model
        Estimation using stcox
      15.4.3 Cox regression via Poisson regression for expanded data
        Estimation using xtpoisson, fe
      15.4.4 Approximate Cox regression: Poisson regression with smooth baseline hazard
        Estimation using poisson
    15.5 Accelerated failure-time models
      15.5.1 Log-normal model
        Estimation using streg
        Estimation using stintreg
    15.6 Time-varying covariates
        Estimation using streg
    15.7 Does nitrate reduce the risk of angina pectoris?
    15.8 Marginal modeling
      15.8.1 Cox regression with occasion-specific dummy variables
        Estimation using stcox
      15.8.2 Cox regression with occasion-specific baseline hazards
        Estimation using stcox, strata
      15.8.3 Approximate Cox regression
        Estimation using poisson
    15.9 Multilevel proportional hazards models
      15.9.1 Cox regression with gamma shared frailty
        Estimation using stcox, shared
      15.9.2 Approximate Cox regression with log-normal shared frailty
        Estimation using mepoisson
      15.9.3 Approximate Cox regression with normal random intercept and random coefficient
        Estimation using mepoisson
    15.10 Multilevel accelerated failure-time models
      15.10.1 Log-normal model with gamma shared frailty
        Estimation using streg
      15.10.2 Log-normal model with log-normal shared frailty
        Estimation using mestreg
      15.10.3 Log-normal model with normal random intercept and random coefficient
        Estimation using mestreg
    15.11 Fixed-effects approach
      15.11.1 Stratified Cox regression with subject-specific baseline hazards
        Estimation using stcox, strata
    15.12 ❖ Different approaches to recurrent-event data
      15.12.1 Total-time risk interval
      15.12.2 Counting-process risk interval
      15.12.3 Gap-time risk interval
    15.13 Summary and further reading
    15.14 Exercises

VIII Models with nested and crossed random effects
  16 Models with nested and crossed random effects
    16.1 Introduction
    16.2 Did the Guatemalan-immunization campaign work?
    16.3 A three-level random-intercept logistic regression model
      16.3.1 Model specification
      16.3.2 Measures of dependence and heterogeneity
        Types of residual intraclass correlations of the latent responses
        Types of median odds ratios
      16.3.3 Three-stage formulation
      16.3.4 Estimation
        Using melogit
        Using gllamm
    16.4 A three-level random-coefficient logistic regression model
      16.4.1 Estimation
        Using melogit
        Using gllamm
    16.5 Prediction of random effects
      16.5.1 Empirical Bayes prediction
      16.5.2 Empirical Bayes modal prediction
    16.6 Different kinds of predicted probabilities
      16.6.1 Predicted population-averaged or marginal probabilities: New clusters
      16.6.2 Predicted median or conditional probabilities
      16.6.3 Predicted posterior mean probabilities: Existing clusters
    16.7 Do salamanders from different populations mate successfully?
    16.8 Crossed random-effects logistic regression
      16.8.1 Setup for estimating crossed random-effects model using melogit
      16.8.2 Approximate maximum likelihood estimation
        Estimation using melogit
      16.8.3 Bayesian estimation
        Brief introduction to Bayesian inference
        Priors for the salamander data
        Estimation using bayes: melogit
      16.8.4 Estimates compared
      16.8.5 Fully Bayesian versus empirical Bayesian inference for random effects
    16.9 Summary and further reading
    16.10 Exercises

B Syntax for gllamm, eq, and gllapred: The bare essentials
C Syntax for gllamm
D Syntax for gllapred
E Syntax for gllasim

References
Author index
Subject index