2023 FDA Science Forum
Assessment of a Modified Sandwich Estimator for Generalized Estimating Equations with Application to Opioid Poisoning in MIMIC-IV ICU Patients
- Contributing Office: National Center for Toxicological Research
Abstract
Background: Longitudinal regression models for correlated binary outcomes are frequently fit using generalized estimating equations (GEE). The Liang and Zeger sandwich estimator is often used in GEE because it yields consistent standard error estimates for the regression coefficients in large samples, even when the working covariance structure is misspecified. It performs best in balanced designs with many participants and few repeated measurements. Its asymptotic properties do not hold in small-sample and rare-event settings, where it is biased downward and underestimates the variances.
Purpose: The goal of this research is to construct a hybrid sandwich estimator that correctly estimates the variances in rare-outcome and finite-sample situations. Few statisticians have attempted to improve the performance of the sandwich estimator under these conditions. Here, the performance of a modified sandwich estimator is compared with the traditional Liang-Zeger estimator and with alternative forms proposed by Morel, Pan, and Mancl and DeRouen. Each estimator was assessed by the coverage probabilities of 95% confidence intervals for the regression coefficients, using data simulated under various combinations of sample size and outcome prevalence with independence and autoregressive correlation structures.
Results: In simulations with 100 subjects and an autoregressive covariance structure at the higher correlation settings (0.10 and 0.15), all sandwich estimators produced coverage probabilities below 95%. This was not observed in our earlier simulations with low correlation values. As the sample size dropped under these same correlation conditions, the Liang-Zeger estimator continued to perform poorly, while the Rogers-Stoner and Pan estimators adjusted. As the sample size decreased under a 0.10 correlation with 10% and 5% outcome prevalences, the coverage probabilities of the Liang-Zeger estimator continued to deteriorate, while those of the Rogers-Stoner and Pan estimators recovered, nearly reaching 95% at 40 subjects and fewer.
Conclusion: In our limited simulation settings, the Rogers-Stoner sandwich estimator outperformed the Liang-Zeger estimator and typically outperformed all other estimators as both the prevalence and the sample size decreased. This approach provides a method for modeling rare events in finite samples when studying the effects of medications, drugs, and poisons.
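To make the idea of a coverage-probability comparison concrete, the following is a minimal sketch and not the authors' simulation design. It assumes an intercept-only logistic GEE under working independence, for which the Liang-Zeger variance and the Mancl-DeRouen correction both reduce to closed forms. It generates AR(1)-type correlated binary data from a two-state Markov chain. The function names, the number of visits per subject, and the choice to drop replicates with no events are all illustrative assumptions. The Rogers-Stoner, Morel, and Pan estimators are not implemented here because the abstract does not give their forms.

```python
import math
import random
from statistics import NormalDist


def simulate_cluster(n_visits, prevalence, rho, rng):
    """Draw one subject's binary series from a stationary two-state Markov chain.

    The chain has marginal P(Y=1) = prevalence and lag-k correlation rho**k,
    which is an AR(1)-type correlation structure.
    """
    p_after_1 = prevalence + rho * (1.0 - prevalence)  # P(Y_t=1 | Y_{t-1}=1)
    p_after_0 = prevalence * (1.0 - rho)               # P(Y_t=1 | Y_{t-1}=0)
    y = [1 if rng.random() < prevalence else 0]
    for _ in range(n_visits - 1):
        p = p_after_1 if y[-1] == 1 else p_after_0
        y.append(1 if rng.random() < p else 0)
    return y


def intercept_gee(clusters):
    """Intercept-only logistic GEE under working independence (illustrative).

    Returns (beta_hat, var_lz, var_md), or None when the sample has no events
    or only events, since the logit is then undefined.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    n_total = sum(len(y) for y in clusters)
    events = sum(sum(y) for y in clusters)
    if events == 0 or events == n_total:
        return None
    mu = events / n_total
    beta_hat = math.log(mu / (1.0 - mu))
    bread = n_total * mu * (1.0 - mu)  # sum_i D_i' V_i^{-1} D_i
    meat_lz = 0.0
    meat_md = 0.0
    for y in clusters:
        score = sum(y) - len(y) * mu  # D_i' V_i^{-1} (y_i - mu)
        meat_lz += score ** 2
        # Mancl-DeRouen inflates residuals by (I - H_ii)^{-1}; in this
        # one-parameter model that is the scalar factor N / (N - n_i).
        meat_md += (score * n_total / (n_total - len(y))) ** 2
    return beta_hat, meat_lz / bread ** 2, meat_md / bread ** 2


def coverage(n_subjects, n_visits, prevalence, rho, n_reps, seed=0):
    """Estimate coverage of nominal 95% Wald intervals for the intercept."""
    rng = random.Random(seed)
    beta_true = math.log(prevalence / (1.0 - prevalence))
    z = NormalDist().inv_cdf(0.975)
    hits_lz = hits_md = used = 0
    for _ in range(n_reps):
        clusters = [simulate_cluster(n_visits, prevalence, rho, rng)
                    for _ in range(n_subjects)]
        fit = intercept_gee(clusters)
        if fit is None:  # assumption: drop non-estimable replicates
            continue
        beta_hat, v_lz, v_md = fit
        used += 1
        err = abs(beta_hat - beta_true)
        hits_lz += err <= z * math.sqrt(v_lz)
        hits_md += err <= z * math.sqrt(v_md)
    if used == 0:
        raise RuntimeError("no estimable replicates")
    return hits_lz / used, hits_md / used, used


if __name__ == "__main__":
    # Hypothetical setting: 40 subjects, 4 visits, 10% prevalence, rho = 0.10.
    cov_lz, cov_md, used = coverage(40, 4, 0.10, 0.10, n_reps=1000)
    print(f"Liang-Zeger coverage:   {cov_lz:.3f}")
    print(f"Mancl-DeRouen coverage: {cov_md:.3f} ({used} replicates used)")
```

Because the corrected variance is never smaller than the uncorrected one in this sketch, its interval always contains the Liang-Zeger interval, so its coverage cannot be lower. Whether either estimator reaches the nominal 95% depends on the sample size, prevalence, and correlation chosen.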