Making ancestral trees using Bayesian inference to identify disease-causing genetic variants

Don Armstrong, University of Illinois at Urbana-Champaign

Bayesian inference and maximum likelihood are the most accurate methods available for identifying ancestral relationships between individuals. They can identify genetic regions associated with heritable human diseases and help answer questions about human migration and ancestry over time. Unfortunately, generating consensus trees from the whole genomes of large numbers of individuals with these approaches requires extensive floating-point computation and memory, which is feasible only on large clusters such as Blue Waters. In this exploratory allocation, we will demonstrate the feasibility of running these analyses on Blue Waters and identify any bottlenecks that prevent full use of the GPUs across thousands of nodes. The allocation will provide the preliminary data necessary for a full allocation, which will generate trees from the whole genomes of 200,000 individuals.