Total of 40 points

Download the HapMap 3 data in PLINK format from

  1. (5 points) Check the population composition.
  2. (10 points) Test for Hardy-Weinberg equilibrium (HWE) in all populations combined, using the SNPs on chr22. Draw a QQ plot of the resulting p-values.
  3. (10 points) Test for Hardy-Weinberg equilibrium in CEU, YRI, CHB, and ASW separately, using the SNPs on chr22. Draw a QQ plot for each population and explain why these plots differ from the one in question 2.
  4. (10 points) Calculate principal components using the SNPs on chr22.
  5. (5 points) Plot PC1 vs. PC2, using a different color for each population. Keep only CEU, YRI, ASW, and CHB before plotting.
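
As a reminder of what questions 2 and 3 compute, here is a minimal Python sketch of a 1-degree-of-freedom chi-square HWE test on genotype counts, together with the expected and observed coordinates of a QQ plot. It is only an illustration under stated assumptions: the function names `hwe_chisq_pvalue` and `qq_points` are hypothetical, the genotype counts are made up, and PLINK's own HWE test may use a different (exact) method, so the p-values need not match.

```python
import math


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) HWE test from counts of the AA, AB, and BB genotypes.

    Hypothetical helper for illustration; assumes at least one genotyped sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expected count (monomorphic SNPs).
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2))


def qq_points(pvalues):
    """Return (expected, observed) pairs of -log10 p-values, sorted ascending."""
    m = len(pvalues)
    # Clamp at a tiny positive value so log10 never receives zero.
    observed = sorted(-math.log10(max(p, 1e-300)) for p in pvalues)
    # Under the null, p-values are uniform; use the quantiles (i - 0.5) / m.
    expected = sorted(-math.log10((i - 0.5) / m) for i in range(1, m + 1))
    return list(zip(expected, observed))


# Made-up genotype counts (AA, AB, BB) for three SNPs.
counts = [(25, 50, 25), (40, 20, 40), (50, 0, 50)]
pvals = [hwe_chisq_pvalue(*c) for c in counts]
points = qq_points(pvals)
```

Plotting the second element of each pair against the first gives the QQ plot. Points that rise above the diagonal indicate more small p-values than expected under the null.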

Hint: you can borrow code with the appropriate modifications from


Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The source code is licensed under MIT.


For attribution, please cite this work as

Haky Im (2022). Homework 4 - Population Structure. HGEN 471 Class Notes. /post/2022/02/03/homework-4-population-structure/

BibTeX citation

  title = "Homework 4 - Population Structure",
  author = "Haky Im",
  year = "2022",
  journal = "HGEN 471 Class Notes",
  note = "/post/2022/02/03/homework-4-population-structure/"