Q2 Plot the \(\chi^2\) statistic vs. LD score
To get an intuitive sense of how LD score regression works, download the pre-calculated LD scores for chromosome 22 and the summary statistics for a height GWAS (both are listed under Data below) and do the following:
(10 points) Plot the histogram of LD-score
(10 points) Plot \(\chi^2\) statistic vs. LD-score
(10 points) Regress \(\chi^2\) on LD-score
Feel free to reuse the code from https://hakyimlab.github.io/hgen471/L8-LD-score.html to complete these tasks.
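As a rough illustration of the regression step, the sketch below fits an ordinary (unweighted) least-squares line of \(\chi^2\) on LD score. It runs on simulated values rather than the real chr22 files, and the function names `chi2_from_z` and `fit_chi2_on_ldscore` are made up for this example. Note that the LDSC software itself uses a weighted regression, so this is only the simple version asked for in this question.

```python
import random


def chi2_from_z(z_scores):
    """Convert GWAS Z-scores to 1-df chi-square statistics."""
    return [z * z for z in z_scores]


def fit_chi2_on_ldscore(ld_scores, chi2):
    """Unweighted least-squares fit of chi2 on LD score.

    Returns (slope, intercept).
    """
    n = len(ld_scores)
    mean_x = sum(ld_scores) / n
    mean_y = sum(chi2) / n
    sxx = sum((x - mean_x) ** 2 for x in ld_scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ld_scores, chi2))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Simulated stand-in for the chr22 data (assumption: NOT the real files).
rng = random.Random(0)
ld = [rng.uniform(1, 200) for _ in range(20000)]
true_slope, true_intercept = 0.01, 1.0
# Each Z-score is drawn with variance intercept + slope * LD score,
# so its square has that expected value.
z = [rng.gauss(0, (true_intercept + true_slope * l) ** 0.5) for l in ld]

slope, intercept = fit_chi2_on_ldscore(ld, chi2_from_z(z))
print(f"estimated slope={slope:.4f}, intercept={intercept:.3f}")
```

With the real data, `ld` would come from the `L2` column of chr22.l2.ldscore.gz, and the \(\chi^2\) values would be computed from the GWAS file after matching SNPs between the two files. The column names in gwas_giant_chr22.txt are not shown here, so check its header first.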
Q3 Calculate heritability and genetic correlation with LD score regression
This question is optional; we will return to it after lab 8 on stratified LDSC.
In this exercise, you will use the LD score regression method to calculate the chip heritability of two GWAS phenotypes from the UK Biobank data and look for evidence of population stratification.
(20 points) Install the LDSC regression software
To install the LDSC regression software, go to the GitHub repository (https://github.com/bulik/ldsc) and follow the installation instructions (https://github.com/bulik/ldsc#getting-started).
The installation requires conda to be pre-installed. conda is a package manager for command-line tools, Python, and R. We highly encourage you to try conda if you haven’t done so.
If you have trouble installing the software, please consult with your TA and/or instructors.
Go through the LDSC documentation
To get started on using the software for this problem, follow the tutorial here to get the reference files needed (https://github.com/bulik/ldsc/wiki/Heritability-and-Genetic-Correlation).
a. (10 points) LD score formula.
Write down the relationship between the expected value of the \(\chi^2\) statistic and the heritability, number of markers, sample size, and population stratification effect.
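For reference, one common way to write this relationship, following the original LD score regression paper (Bulik-Sullivan et al., 2015), is shown below. The symbols are introduced here and do not appear elsewhere in this assignment: \(N\) is the sample size, \(M\) the number of markers, \(h^2\) the heritability, \(\ell_j\) the LD score of variant \(j\), and \(a\) the contribution of confounding such as population stratification.

\[
E\left[\chi^2_j \mid \ell_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1
\]

Read as a regression of \(\chi^2_j\) on \(\ell_j\), the slope is \(N h^2 / M\) and the intercept is \(N a + 1\).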
b. Pick and download two phenotypes from (https://nealelab.github.io/UKBB_ldsc/downloads.html). You will probably want to download the ones that are already in LDSC format.
c. (10 points) Calculate heritability using LDSC
What is the intercept for each trait? How do you interpret the values?
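One quantity that can help when interpreting the intercept is the ratio (intercept − 1) / (mean \(\chi^2\) − 1), which LDSC also reports alongside the intercept. It gives a rough sense of how much of the inflation in the test statistics is attributable to confounding rather than polygenic signal. The sketch below computes it. The function name `confounding_ratio` and the input numbers are hypothetical and are not results from any real trait.

```python
def confounding_ratio(intercept, mean_chi2):
    """Share of the mean chi-square inflation attributed to the intercept.

    Only meaningful when the mean chi-square is above 1.
    """
    if mean_chi2 <= 1.0:
        raise ValueError("ratio is undefined when mean chi2 <= 1")
    return (intercept - 1.0) / (mean_chi2 - 1.0)


# Hypothetical values, for illustration only.
ratio = confounding_ratio(intercept=1.02, mean_chi2=1.5)
print(f"ratio = {ratio:.3f}")
```

An intercept close to 1 (ratio close to 0) suggests that most of the inflation comes from polygenicity, while an intercept well above 1 points to confounding such as population stratification.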
d. (10 points) Calculate genetic correlation between the two UKB traits.
Did you expect the results you got? Comment.
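The LDSC runs for parts c and d can be assembled along the following lines. This sketch only builds and prints the command lines. The flags `--h2`, `--rg`, `--ref-ld-chr`, `--w-ld-chr`, and `--out` are the ones used in the LDSC tutorial linked above, while the helper name `ldsc_command`, the summary statistics file names, and the output prefixes are placeholders to replace with your own.

```python
def ldsc_command(mode, sumstats, ld_dir="eur_w_ld_chr/", out="ldsc_out",
                 script="ldsc.py"):
    """Build the argument list for an LDSC heritability or rg run.

    mode     -- "h2" (one sumstats file) or "rg" (two or more files)
    sumstats -- a file name, or a list of file names for "rg"
    ld_dir   -- LD score directory from the tutorial (assumed location)
    """
    if mode not in ("h2", "rg"):
        raise ValueError("mode must be 'h2' or 'rg'")
    files = [sumstats] if isinstance(sumstats, str) else list(sumstats)
    if mode == "h2" and len(files) != 1:
        raise ValueError("--h2 takes exactly one sumstats file")
    if mode == "rg" and len(files) < 2:
        raise ValueError("--rg needs at least two sumstats files")
    return ["python", script, f"--{mode}", ",".join(files),
            "--ref-ld-chr", ld_dir, "--w-ld-chr", ld_dir, "--out", out]


# Placeholder file names: substitute the two traits you downloaded.
trait_a = "traitA.sumstats.gz"
trait_b = "traitB.sumstats.gz"

print(" ".join(ldsc_command("h2", trait_a, out="traitA_h2")))
print(" ".join(ldsc_command("h2", trait_b, out="traitB_h2")))
print(" ".join(ldsc_command("rg", [trait_a, trait_b], out="traitA_traitB_rg")))
```

To execute a command instead of printing it, the returned list can be passed to `subprocess.run` from within the activated ldsc conda environment.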
Hint: take a look at this old solution on a different platform (https://bios25328.hakyimlab.org/post/2021/04/15/homework-ldsc-regression/). If you use the pre-formatted summary statistics, you may not need to run the munging script.
Data: You can download the following files
- chr22.l2.ldscore.gz,
- gwas_giant_chr22.txt,
- hapmap_chr22.bed,
- hapmap_chr22.bim,
- hapmap_chr22.fam
from https://uchicago.box.com/s/iqxg6yo7pi50hyudnfcv2xnhtfp6euv8