Learning Objectives

  • Identify effects of population structure on GWAS
  • Interpret principal components of the genotype matrix
  • Correct for population structure



I’ve also uploaded a PowerPoint version to Canvas (Module/Lecture 7) as a backup.


  • Mills, Melinda C.; Barban, Nicola; Tropf, Felix C. An Introduction to Statistical Genetic Data Analysis (p. 237). MIT Press. Kindle Edition.

  • Novembre, J., et al. (2008). Genes mirror geography within Europe. Nature, 456(7218), 98–101.

  • Auton, A., Altshuler, D. M., Durbin, R. M., Chakravarti, A., Clark, A. G., Donnelly, P., et al. (2015). A global reference for human genetic variation. Nature, 526(7571), 68–74. http://doi.org/10.1038/nature15393

  • Check out this masterful explanation of SVD in 24 tweets by Daniela Witten: https://twitter.com/WomenInStat/status/1285611207530098688

  • Alexander, D. H., Novembre, J., Lange, K. (2009). Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19, 1655–1664.


Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The source code is licensed under MIT.



For attribution, please cite this work as

Haky Im (2022). Lecture 7 - Population Structure. HGEN 471 Class Notes. /post/2022/02/01/lecture-7-population-structure/

BibTeX citation

  title = "Lecture 7 - Population Structure",
  author = "Haky Im",
  year = "2022",
  journal = "HGEN 471 Class Notes",
  note = "/post/2022/02/01/lecture-7-population-structure/"