Blue Waters User Portal | Science Teams

A Computational Model for Causal Inference via Subset Selection

Wendy Cho, University of Illinois at Urbana-Champaign

Yan Liu, Wendy Cho

Researchers in all disciplines seek to identify causal relationships. Randomized experimental designs isolate the treatment effect and thus permit causal inferences. However, experiments are often impractical because resources may be unavailable or the research question may not lend itself to an experimental design. In these cases, a researcher must rely on observational data. To make causal inferences from observational data, one must adjust the data so that they resemble data that might have emerged from an experiment. This adjustment can proceed through a subset selection procedure that identifies treatment and control groups that are statistically indistinguishable. Identifying optimal subsets is a computationally complex and challenging problem, but it is a powerful tool for discovering scientific insights in a wide variety of fields.
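
To make the idea of subset selection concrete, the following is a minimal sketch, not the project's actual method. It assumes a single numeric covariate, measures balance with a standardized mean difference, and uses a simple greedy heuristic in place of a true optimal search. The function names `standardized_mean_difference` and `greedy_control_subset`, and the toy data, are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def standardized_mean_difference(treated, control):
    """Absolute difference in group means, scaled by the pooled standard deviation."""
    pooled_sd = ((pstdev(treated) ** 2 + pstdev(control) ** 2) / 2) ** 0.5
    if pooled_sd == 0:
        return 0.0 if mean(treated) == mean(control) else float("inf")
    return abs(mean(treated) - mean(control)) / pooled_sd


def greedy_control_subset(treated, pool, size):
    """Pick `size` controls from `pool`, one at a time, so that the running
    mean of the chosen controls stays as close as possible to the treated mean.

    This is a greedy heuristic (an assumption of this sketch); it does not
    guarantee the optimal subset.
    """
    if size > len(pool):
        raise ValueError("size cannot exceed the number of available controls")
    target = mean(treated)
    remaining = list(pool)
    chosen = []
    total = 0.0
    for _ in range(size):
        # Choose the candidate that brings the running mean closest to the target.
        best = min(
            remaining,
            key=lambda x: abs((total + x) / (len(chosen) + 1) - target),
        )
        chosen.append(best)
        remaining.remove(best)
        total += best
    return chosen


if __name__ == "__main__":
    # Toy covariate values (hypothetical): treated units and a larger control pool.
    treated = [4.0, 5.0, 6.0, 5.0]
    pool = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

    subset = greedy_control_subset(treated, pool, size=4)
    print("chosen controls:", subset)
    print("imbalance, full pool:", standardized_mean_difference(treated, pool))
    print("imbalance, subset:   ", standardized_mean_difference(treated, subset))
```

The greedy pass considers each remaining control once per pick, whereas an exhaustive search over all subsets grows combinatorially with the size of the pool, which is one way to see why identifying optimal subsets is computationally demanding.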