This function runs the S-LDSC pipeline using PolyFun and LDSC scripts for:
Munging summary statistics ("munge_polyfun_sumstats.py"),
Generating annotation files from a user-supplied BED file (hg19) via "make_annot.py",
Computing LD scores with "ldsc.py --l2",
Performing final heritability estimation ("ldsc.py --h2") by combining the custom annotation with baseline v2.2 annotations.
run_sldsc(
chrs,
polyfun_path,
ldsc_path,
sumstats_path,
n,
trait,
onekg_path,
bed_dir,
baseline_dir,
frqfile_pref,
hm3_snps,
weights_pref,
out_dir
)
chrs: Chromosomes on which to run S-LDSC (default: 1:22).
sumstats_path: Directory containing the cleaned GWAS summary statistics file (format: <trait>_sumstats.txt.gz).
n: Sample size of the GWAS.
trait: GWAS trait name (e.g., "SIM", "IBD").
onekg_path: Prefix of the 1000G PLINK files in hg19 (e.g., "/LDSCORE/zenodo/1000G_EUR_Phase3_plink/1000G.EUR.QC.").
bed_dir: Directory containing the user-supplied BED file(s) in hg19. Only the first .bed file found is used.
baseline_dir: Prefix of the 1000G baseline v2.2 annotation files in hg19 (e.g., "/LDSCORE/zenodo/1000G_Phase3_baselineLD_v2.2_ldscores/baselineLD.").
frqfile_pref: Prefix of the 1000G .frq or .afreq files in hg19 (e.g., "/LDSCORE/zenodo/1000G_Phase3_frq/1000G.EUR.QC.").
hm3_snps: Path to the HapMap3 no-MHC SNP list in hg19 (e.g., "/LDSCORE/zenodo/hm3_no_MHC.list.txt").
weights_pref: Prefix of the 1000G weights files in hg19 (e.g., "/LDSCORE/zenodo/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC.").
out_dir: Directory where outputs are saved. The function creates annotations/<trait> and results subdirectories.
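Several of these arguments are prefixes rather than complete file paths, because the reference files are split by chromosome. The sketch below illustrates how such prefixes could expand for a single chromosome, and how "the first .bed found" might be selected. The helper names resolve_chr_inputs and first_bed are hypothetical, and the file suffixes (.bim, .frq, .l2.ldscore.gz) are assumptions based on the usual LDSC file layout, not taken from the function's source.

```r
# Hypothetical helper: expand per-chromosome prefixes into file paths.
# The suffixes are assumed from the usual LDSC reference-file layout.
resolve_chr_inputs <- function(chr, onekg_path, frqfile_pref,
                               baseline_dir, weights_pref) {
  list(
    bim      = paste0(onekg_path, chr, ".bim"),
    frq      = paste0(frqfile_pref, chr, ".frq"),
    baseline = paste0(baseline_dir, chr, ".l2.ldscore.gz"),
    weights  = paste0(weights_pref, chr, ".l2.ldscore.gz")
  )
}

# Hypothetical helper: return the first .bed file in a directory.
# list.files() returns names in alphabetical order.
first_bed <- function(bed_dir) {
  beds <- list.files(bed_dir, pattern = "\\.bed$", full.names = TRUE)
  if (length(beds) == 0) stop("No .bed file found in ", bed_dir)
  beds[1]
}

inputs <- resolve_chr_inputs(
  chr          = 1,
  onekg_path   = "/LDSCORE/zenodo/1000G_EUR_Phase3_plink/1000G.EUR.QC.",
  frqfile_pref = "/LDSCORE/zenodo/1000G_Phase3_frq/1000G.EUR.QC.",
  baseline_dir = "/LDSCORE/zenodo/1000G_Phase3_baselineLD_v2.2_ldscores/baselineLD.",
  weights_pref = "/LDSCORE/zenodo/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC."
)
print(inputs$bim)
```

Checking that each resolved path exists (for example with file.exists) before launching the pipeline can surface a wrong prefix early, since a missing trailing dot in a prefix is an easy mistake to make.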
Step-by-step:
1. Munge the summary statistics file <trait>_sumstats.txt.gz into a parquet file with munge_polyfun_sumstats.py.
2. For each chromosome (1..22):
   - Create an annotation (make_annot.py) from the user .bed plus the .bim for that chromosome.
   - Compute LD scores (ldsc.py --l2) for the annotation.
3. Finally, run ldsc.py --h2 referencing both the newly created annotation (out_dir/annotations/<trait>/<trait>.<chr>) and baseline_dir, providing frqfile_pref for the allele frequency files and hm3_snps for SNP filtering.
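The steps above can be sketched as command-line argument vectors built in R. This is a minimal illustration, not the function's actual implementation: the helper names build_chr_commands and build_h2_command are hypothetical, the flags are those documented for the LDSC scripts (make_annot.py and ldsc.py), and the exact set of flags that run_sldsc passes is an assumption. The BED file name and the munged summary statistics file name are placeholders. Nothing is executed here; the code only assembles the arguments.

```r
# Hypothetical helper: arguments for the two per-chromosome steps.
build_chr_commands <- function(chr, trait, ldsc_path, onekg_path,
                               bed_file, hm3_snps, out_dir) {
  annot_prefix <- file.path(out_dir, "annotations", trait,
                            paste0(trait, ".", chr))
  annot_file <- paste0(annot_prefix, ".annot.gz")
  list(
    # Annotation from the BED file plus the .bim for this chromosome.
    make_annot = c(file.path(ldsc_path, "make_annot.py"),
                   "--bed-file", bed_file,
                   "--bimfile", paste0(onekg_path, chr, ".bim"),
                   "--annot-file", annot_file),
    # LD scores for that annotation (flag choice is an assumption).
    l2 = c(file.path(ldsc_path, "ldsc.py"), "--l2",
           "--bfile", paste0(onekg_path, chr),
           "--ld-wind-cm", "1",
           "--annot", annot_file,
           "--thin-annot",
           "--print-snps", hm3_snps,
           "--out", annot_prefix)
  )
}

# Hypothetical helper: arguments for the final heritability step.
build_h2_command <- function(trait, ldsc_path, sumstats_file, baseline_dir,
                             frqfile_pref, weights_pref, out_dir) {
  annot_chr_prefix <- file.path(out_dir, "annotations", trait,
                                paste0(trait, "."))
  c(file.path(ldsc_path, "ldsc.py"),
    "--h2", sumstats_file,
    # Custom annotation and baseline are passed as comma-separated prefixes.
    "--ref-ld-chr", paste(annot_chr_prefix, baseline_dir, sep = ","),
    "--w-ld-chr", weights_pref,
    "--frqfile-chr", frqfile_pref,
    "--overlap-annot",
    "--out", file.path(out_dir, "results", trait))
}

cmds <- build_chr_commands(
  chr = 1, trait = "SIM", ldsc_path = "/path/ldsc",
  onekg_path = "/LDSCORE/zenodo/1000G_EUR_Phase3_plink/1000G.EUR.QC.",
  bed_file = "/path/my_bedfiles/example.bed",  # placeholder file name
  hm3_snps = "/LDSCORE/zenodo/hm3_no_MHC.list.txt",
  out_dir = "results"
)
h2 <- build_h2_command(
  trait = "SIM", ldsc_path = "/path/ldsc",
  sumstats_file = "munged_sumstats_placeholder",  # placeholder name
  baseline_dir = "/LDSCORE/zenodo/1000G_Phase3_baselineLD_v2.2_ldscores/baselineLD.",
  frqfile_pref = "/LDSCORE/zenodo/1000G_Phase3_frq/1000G.EUR.QC.",
  weights_pref = "/LDSCORE/zenodo/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC.",
  out_dir = "results"
)
cat(paste(c("python", cmds$l2), collapse = " "), "\n")
```

A vector built this way could be handed to system2("python", args = ...), assuming "python" resolves to an interpreter compatible with the scripts. Keeping command construction separate from execution makes it easy to print and inspect each call before running a 22-chromosome job.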
polyfun_path: Directory containing the PolyFun scripts (e.g., munge_polyfun_sumstats.py); the scripts must be Python 3 compatible.
ldsc_path: Directory containing the LDSC scripts (e.g., make_annot.py, ldsc.py).
No object is returned. All intermediate files (.annot.gz, .ldscore.gz) and final
LDSC results (.log, .results) are written to out_dir.
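Because nothing is returned, results have to be picked up from disk. The following is a minimal sketch of how the files mentioned above could be collected after a run; list_sldsc_outputs is a hypothetical helper, not part of the package, and it only matches on the file extensions named in this section.

```r
# Hypothetical helper: group the files under out_dir by the extensions
# named in the documentation (.annot.gz, .ldscore.gz, .log, .results).
list_sldsc_outputs <- function(out_dir) {
  all_files <- list.files(out_dir, recursive = TRUE, full.names = TRUE)
  patterns <- c(
    annot   = "\\.annot\\.gz$",
    ldscore = "\\.ldscore\\.gz$",
    log     = "\\.log$",
    results = "\\.results$"
  )
  # One character vector of matching paths per file type.
  lapply(patterns, function(p) grep(p, all_files, value = TRUE))
}
```

For example, outs <- list_sldsc_outputs("results/") would give a named list whose results element holds the paths of the final .results files, which can then be read with a tabular reader of your choice.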
if (FALSE) { # \dontrun{
run_sldsc(chrs = 1:22,
          polyfun_path = "/path/polyfun/",
          ldsc_path = "/path/ldsc/",
          sumstats_path = "/path/sumstats/",
          n = 60000, trait = "SIM",
          onekg_path = "/LDSCORE/zenodo/1000G_EUR_Phase3_plink/1000G.EUR.QC.",
          bed_dir = "/path/my_bedfiles/",
          baseline_dir = "/LDSCORE/zenodo/1000G_Phase3_baselineLD_v2.2_ldscores/baselineLD.",
          frqfile_pref = "/LDSCORE/zenodo/1000G_Phase3_frq/1000G.EUR.QC.",
          hm3_snps = "/LDSCORE/zenodo/hm3_no_MHC.list.txt",
          weights_pref = "/LDSCORE/zenodo/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC.",
          out_dir = "results/")
} # }