| 8:30 | Breakfast |
| 9:00–9:10 | Opening remarks, Tian Zheng (Columbia) |
| 9:10–9:50 |
Extended Talk A
Jennifer Hill (Professor of Applied Statistics, NYU)
Multiple comparisons: Why even Bayesians may need to worry
Researchers asking causal questions are often interested not only in the average treatment effect but also subgroup-specific treatment effects that allow for a more nuanced understanding of who benefits from an intervention. However, this pursuit can lead to issues with unwarranted "researcher degrees of freedom" and failure to properly adjust for multiple comparisons. While previous research has demonstrated that Bayesian methods with regularizing prior distributions are more conservative than their frequentist counterparts and can lead to more appropriate assessments of uncertainty than either ignoring the issue or using corrections (Bonferroni, FDR), the extent to which Bayesian methods eliminate the problem of multiple comparisons still depends critically on the context and specifics of the method. Critically, we demonstrate a setting common in social sciences where standard Bayesian regularizing priors are not sufficient to control false positive claims and additionally can lead to sign errors. We characterize this setting as dominated by “shrinkage to the wrong place” and discuss potential remedies.
Chair: Ben Goodrich (Columbia)
|
| 10:00–11:00 |
Session 1: The Talented Mr. P
Stephen Ansolabehere (Frank G. Thompson Professor of Government, Harvard)
TBD
Yajuan Si (Research Associate Professor, Michigan)
MRP is turning 30: Are we entering the golden age, or just getting started?
Developed by Gelman and Little (1997), Multilevel Regression and Poststratification (MRP) was designed to obtain desirable subgroup estimates using complex sample survey data. In many ways, MRP perfectly embodies Andrew’s favorite maxims: "fit multilevel models as defaults," "statisticians cannot avoid adjustment," and notably, "survey weighting is a mess." Over the past 30 years, MRP has proven practically successful across disciplines. Yet MRP is frequently critiqued for lacking theoretical guarantees like design consistency or double robustness. As a statistical model that accounts for data collection, MRP is subject to model misspecification and cannot fix low data quality or bad study design. In this talk, I will trace the evolution of MRP, discussing recent methodological improvements and current efforts to extend the framework into the AI era, ensuring it continues to yield reproducible and reliable findings in an increasingly complex data landscape.
Qixuan Chen (Associate Professor of Biostatistics, Columbia University)
Predictive Inference for Non-Probability Samples Using Bayesian Machine Learning
Probability surveys are the gold standard for population inference but are increasingly costly and subject to declining response rates. Non-probability samples, though more accessible, raise concerns about generalizability. In this talk, I present Bayesian predictive inference methods that integrate non-probability samples with administrative data or electronic health records in data-rich settings with high-dimensional auxiliary information. We first consider estimation of population means using non-probability surveys and then extend the framework to generalizability and transportability of causal effects from randomized trials. Our methods model high-dimensional covariates via Bayesian Additive Regression Trees and incorporate the propensity score for sample inclusion using natural cubic splines, along with a balancing transformation to better align propensity score distributions between trial and target populations. Simulation studies show improved performance over existing methods, with smaller root mean squared error and coverage closer to nominal levels. We illustrate the approaches with real-world applications.
Chair: Shigeo Hirano (Columbia)
|
| 11:20–12:20 |
Session 2: 0.234 — Theory in Computing
Charles Margossian (Assistant Professor, University of British Columbia)
TBD
Collin Cademartori (Assistant Professor, Wake Forest)
Can Probabilistic Programming Make Workflows Work?
Probabilistic programming has represented a major step in the separation of concerns between Bayesian modeling and inference, with corresponding gains in the efficiency and flexibility of model building. As model building has become easier, a growing literature has emerged to tackle the central question of workflow: When so many models can be easily built, how do we decide what to build, how to build it, and how to evaluate the end product? This literature has offered up numerous tools to handle pieces of this question, including tools for validation of computation (SBC), specification of priors (elicitation), evaluation of model predictions (LOO), and assessment of model fit (PPCs), among other tasks. While such tools are often designed to fit into a coherent workflow pipeline, substantial hurdles remain for integrating them in practice. Different tools leverage different pieces of the underlying model (e.g. log likelihoods for LOO, hyperparameters for elicitation). Currently, furnishing the right metadata and gluing the pieces together often requires manual work from the user and repetition of intent across different blocks of code. Existing probabilistic programming languages can offer little in the way of automation for these tasks. Languages like Stan do not guarantee any concrete model structure until runtime, while more restricted languages define some model properties statically, but unnecessarily limit the class of models which can be implemented. Both of these are poor fits for integrating a variety of workflow tools, each with their own requirements of the underlying model. This talk aspirationally considers possible future developments in probabilistic programming to address the integration of disparate tooling into customizable workflow pipelines, with the aim of defining a flexible software platform which current and future methodologies can plug into.
Matt Hoffman
Running Markov Chain Monte Carlo on Modern Hardware and Software
Today, cheap numerical hardware offers huge amounts of parallel computing power, much of which is used for the task of fitting and applying neural networks. Adoption of this hardware to accelerate statistical Markov chain Monte Carlo (MCMC) applications has been slower. We suggest some patterns for speeding up MCMC workloads using the hardware (e.g., GPUs, TPUs) and software (e.g., PyTorch, JAX) that have driven progress in deep learning over the last fifteen years or so. We offer some intuitions for why these new systems are so well suited to MCMC, and show some examples where we use them to achieve dramatic speedups over a CPU-based workflow. Finally, we discuss some potential pitfalls to watch out for.
Chair: Ruobin Gong (Rutgers)
|
| 12:20–2:00 | Lunch |
| 2:00–3:00 |
Session 3: The Science of Defaults
Susan Gelman (Heinz Werner Distinguished University Professor of Psychology and Linguistics, University of Michigan)
Andrew Gelman: The Early Years (and Beyond)
This talk will share personal reflections on Andrew's early years and his important influence on the field of psychology, from my perspective as a psychologist who studies cognitive development, and as Andrew's sister.
Upmanu Lall (Arizona State/Columbia)
TBD
Tian Zheng (Professor of Statistics, Columbia University)
Statistical Thinking and AI Education
As AI becomes more common in education and practice, statistical thinking remains essential. In my talk, I will discuss the importance of core ideas such as uncertainty, model validation, and data interpretation in AI Education across disciplines. Integrating these concepts helps students move beyond using tools to understanding how and why models work. This approach supports more reliable, transparent, and responsible use of AI, and highlights the role of statisticians in shaping effective AI education.
Chair: Rahul Dodhia (Microsoft Research)
|
| 3:10–4:10 |
Session 4: Regression and Other Stories
Jonathan Auerbach (Assistant Professor, George Mason)
How temperature regimes near the equinox synchronize spring biological events?
Many biological processes, including plant leafout and flowering, occur once cumulative temperatures reach a threshold (the thermal-sum model). In this way, temperatures are thought to coordinate the timing of biological events. But growing evidence suggests that as climates warm, both the advancement of spring has slowed (declining sensitivity) and the variance in the timing of spring events has increased (declining synchrony), raising questions about the resilience of temperature-based coordination to anthropogenic climate change. To answer these questions, researchers have complicated the thermal-sum model, introducing additional factors and mechanisms. We consider whether such complexity is necessary. Using results from the theory of stopped random walks, we show that sensitivity and synchrony are exactly as predicted by the basic thermal-sum model. The theory suggests a nonlinear relationship between temperatures and both the timing and synchrony of biological events. In particular, it predicts that as temperatures increase and springtime events shift from the equinox toward the solstice, the events themselves become less coordinated and more variable. We verify these predictions using experimental and real-world data, including 10,000 observations of common lilacs (United States, 1956-2025). We conclude that the theory provides a powerful tool for understanding the thermal-sum model, particularly when considering additional complexity.
Rob Trangucci (Assistant Professor, Oregon State University)
Identified vaccine efficacy for post-infection outcomes
In order to meet regulatory approval, a new vaccine must show that it reduces the risk of an outcome like symptomatic disease, severe illness, or death in a randomized clinical trial. Because infection is necessary for these outcomes, one may be interested in the causal effect on a post-infection outcome, namely an outcome conditional on infection. Conditioning on a post-treatment outcome affected by the treatment leads to selection bias, but one can use principal stratification to do valid causal inference; this method partitions the total causal effect of vaccination into two causal effects: vaccine efficacy against infection, and the principal effect of vaccine efficacy on post-infection outcomes in patients who would be infected under both placebo and vaccination. Despite the importance of such principal effects to policymakers, these estimands are generally unidentifiable, even under strong assumptions that are rarely satisfied in real-world trials. We develop a novel method to point identify these principal effects while eliminating the monotonicity assumption and allowing for measurement error. Furthermore, our results allow for multiple treatments, and are general enough to be applicable outside of vaccine efficacy. Our method relies on the fact that many vaccine trials are run at geographically disparate health centers, and measure biologically-relevant categorical pretreatment covariates. We show that our method can be applied to a variety of clinical trial settings where vaccine efficacy against infection and a post-infection outcome can be jointly inferred. This methodology can yield new insights from existing vaccine efficacy trial data and will aid researchers in designing new multi-arm clinical trials.
David Rothschild (Economist, Microsoft Research)
Survey Research from MRP to AI: Applying What We Learned from the Last Disruption to Guiding the Next
Early work on multilevel regression and poststratification (MRP), including collaborations using non-probability data such as Xbox samples, demonstrated that credible population inference could be recovered from unconventional data through modeling. This work shifted attention from sampling to the full survey workflow. In this talk, I reflect on how decomposing surveys into ideation, design, target population, administration, processing, and reporting reveals where assumptions enter and where error accumulates. I also reflect on how, as these methods became mainstream through the persistence of many in this room, they not only transformed survey research directly but also reshaped how the field responds to disruption. That shift is now driving how survey research is confronting the new transformation driven by AI.
Chair: Shira Mitchell (Blue Rose Research)
|
| 4:20–5:00 |
Extended Talk B
Sophia Rabe-Hesketh (Professor of Educational Statistics and Biostatistics, UC Berkeley)
Simple suggestions for missing data and the DIC
Missing data methods that ignore the missingness process, such as multiple imputation or joint modeling of the response variable(s) and partially observed covariates, assume that data are missing at random (MAR). My first simple suggestion is to “make” the missingness MAR under certain MAR violations by deleting more data (Rabe-Hesketh & Skrondal, Psychometrika, 2023). The deviance information criterion (DIC) is not invariant to reparameterization and can be unstable with a negative effective number of parameters, for instance in finite mixture models. My second simple suggestion is to define a new version of the DIC that does not suffer from these problems (Xiao & Rabe-Hesketh, in progress), making use of an alternative definition of the effective number of parameters (Gelman, Hwang, & Vehtari. Stat Comput, 2014).
|
| 5:00–6:00 | Light refreshments (Faculty House) |
| 6:00–9:30 | Dinner banquet (Faculty House) |