scads provides a statistical framework for integrating GWAS and scATAC-seq data to quantify genetic disease risk in single cells.
Skip this step if you already have conda available.
macOS (Apple Silicon):
curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh -o /tmp/miniconda.sh
bash /tmp/miniconda.sh -b -p ~/miniconda3
~/miniconda3/bin/conda init zsh # or bash, depending on your shell
macOS (Intel):
curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -o /tmp/miniconda.sh
bash /tmp/miniconda.sh -b -p ~/miniconda3
~/miniconda3/bin/conda init zsh
Linux:
curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o /tmp/miniconda.sh
bash /tmp/miniconda.sh -b -p ~/miniconda3
~/miniconda3/bin/conda init bash
Restart your shell after running conda init.
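The three platform-specific variants above differ only in the installer file name, so the choice can be scripted. The following is a minimal sketch, assuming `uname -s` and `uname -m` report the values shown in the `case` patterns; `miniconda_url` is a hypothetical helper introduced here for illustration, not part of conda or scads.

```shell
#!/usr/bin/env bash
# Sketch: choose the Miniconda installer URL from OS and CPU architecture.
# miniconda_url is a hypothetical helper, not part of conda or scads.
miniconda_url() {
  local os="$1" arch="$2" base="https://repo.anaconda.com/miniconda"
  case "${os}-${arch}" in
    Darwin-arm64)  echo "${base}/Miniconda3-latest-MacOSX-arm64.sh" ;;
    Darwin-x86_64) echo "${base}/Miniconda3-latest-MacOSX-x86_64.sh" ;;
    Linux-x86_64)  echo "${base}/Miniconda3-latest-Linux-x86_64.sh" ;;
    *) echo "unsupported platform: ${os}-${arch}" >&2; return 1 ;;
  esac
}

# Usage (commented out so that sourcing this file downloads nothing):
# curl -fsSL "$(miniconda_url "$(uname -s)" "$(uname -m)")" -o /tmp/miniconda.sh
```

Platforms not listed above return a non-zero status rather than guessing an installer name.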
Create a conda environment with the required Python dependencies:
conda create -n polyfun python=3.11 -y
conda activate polyfun
pip install numpy scipy pandas scikit-learn pyarrow tqdm bitarray networkx pandas-plink packaging pybedtools
Verify the installation:
Both commands should print usage information without errors.
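As a complementary check on the pip step, a short script can confirm that each package imports in the active environment. This is a minimal sketch: `missing_modules` and the `PYTHON` variable are hypothetical names introduced for illustration, and the usage line assumes the usual import names (`sklearn` for scikit-learn, `pandas_plink` for pandas-plink).

```shell
#!/usr/bin/env bash
# Sketch: list Python modules that fail to import in the active environment.
# missing_modules and PYTHON are hypothetical names used for illustration.
PYTHON="${PYTHON:-python3}"

missing_modules() {
  local m
  for m in "$@"; do
    # Print the module name only when the import fails.
    "$PYTHON" -c "import $m" >/dev/null 2>&1 || echo "$m"
  done
}

# Usage: import names for the packages installed with pip above
# (scikit-learn imports as sklearn, pandas-plink as pandas_plink).
# missing_modules numpy scipy pandas sklearn pyarrow tqdm bitarray \
#   networkx pandas_plink packaging pybedtools
```

Empty output means every listed module imported; any name printed points to a package to reinstall.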
scads depends on several Bioconductor packages that are not available on CRAN. Install them first from within R:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c(
  "SummarizedExperiment",
  "rtracklayer",
  "plyranges",
  "IRanges",
  "GenomicRanges",
  "BSgenome",
  "Biostrings"
))
Note: If plyranges fails to install due to a vctrs version conflict, install vctrs from source first:
install.packages("vctrs", type = "source")
Then retry BiocManager::install("plyranges").
Option A: Install directly from GitHub (recommended):
install.packages("remotes")
remotes::install_github("szhaolab/scads")
Option B: Install from a local clone:
remotes::install_local("path/to/scads")
scads requires reference files from the S-LDSC baseline LD v2.2 model (hg19). Download and extract them into a local directory (e.g., ~/LDSCORE/).
Disk usage: The compressed downloads total ~1 GB. After extraction, the reference data requires ~4 GB.
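Because the download and extraction need several gigabytes, it can help to check free space first. The sketch below is one way to do that with `df`; `has_free_gb` is a hypothetical helper, and the 5 GB threshold in the usage line is an assumed margin (about 1 GB of archives plus about 4 GB of extracted data), not a figure stated by scads.

```shell
#!/usr/bin/env bash
# Sketch: check that a directory's filesystem has at least N GB available.
# has_free_gb is a hypothetical helper; 5 GB below is an assumed margin
# covering ~1 GB of archives plus ~4 GB of extracted data.
has_free_gb() {
  local dir="$1" need_gb="$2" avail_kb
  # df -P prints POSIX-format output; column 4 of row 2 is available 1K blocks.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Usage:
# has_free_gb "$HOME" 5 || echo "less than 5 GB free under $HOME" >&2
```

The check runs against `$HOME` rather than `~/LDSCORE` because the target directory may not exist yet.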
mkdir -p ~/LDSCORE && cd ~/LDSCORE
# Download from Zenodo (~1 GB total)
curl -fSL -o 1000G_Phase3_plinkfiles.tgz \
"https://zenodo.org/records/10515792/files/1000G_Phase3_plinkfiles.tgz?download=1"
curl -fSL -o 1000G_Phase3_baselineLD_v2.2_ldscores.tgz \
"https://zenodo.org/records/10515792/files/1000G_Phase3_baselineLD_v2.2_ldscores.tgz?download=1"
curl -fSL -o 1000G_Phase3_frq.tgz \
"https://zenodo.org/records/10515792/files/1000G_Phase3_frq.tgz?download=1"
curl -fSL -o 1000G_Phase3_weights_hm3_no_MHC.tgz \
"https://zenodo.org/records/10515792/files/1000G_Phase3_weights_hm3_no_MHC.tgz?download=1"
curl -fSL -o hm3_no_MHC.list.txt \
"https://zenodo.org/records/10515792/files/hm3_no_MHC.list.txt?download=1"
# Extract
tar -xzf 1000G_Phase3_plinkfiles.tgz
tar -xzf 1000G_Phase3_frq.tgz
tar -xzf 1000G_Phase3_weights_hm3_no_MHC.tgz
# Baseline LD extracts flat files, so create a subdirectory first
mkdir -p 1000G_Phase3_baselineLD_v2.2_ldscores
tar -xzf 1000G_Phase3_baselineLD_v2.2_ldscores.tgz -C 1000G_Phase3_baselineLD_v2.2_ldscores
When running scads, you must activate the polyfun conda environment before starting R, so that python3 resolves to the environment with the required dependencies:
conda activate polyfun
R
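A small guard can make this ordering harder to get wrong. The sketch below relies on conda exporting `CONDA_DEFAULT_ENV` with the name of the active environment; `in_conda_env` is a hypothetical helper introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch: confirm the expected conda environment is active before starting R.
# conda sets CONDA_DEFAULT_ENV to the active environment's name on activation;
# in_conda_env is a hypothetical helper.
in_conda_env() {
  [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

# Usage, after `conda activate polyfun`:
# in_conda_env polyfun && R
```

With this guard, R starts only when the polyfun environment is the active one.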
In your R session, point scads to the PolyFun and LDSC code directories:
scads_res <- scads(
...,
polyfun_code_dir = "~/polyfun/",
ldsc_code_dir = "~/ldsc/",
...
)
See the tutorial for a complete example.
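Before calling `scads()`, it may be worth confirming that the two code directories exist at the paths you intend to pass. This is a minimal sketch: `missing_dirs` is a hypothetical helper, and it only tests that each path is a directory without inspecting what is inside.

```shell
#!/usr/bin/env bash
# Sketch: report which of the given directories are missing.
# missing_dirs is a hypothetical helper; it only tests that each path is a
# directory and does not inspect its contents.
missing_dirs() {
  local d
  for d in "$@"; do
    [ -d "$d" ] || echo "$d"
  done
}

# Usage with the paths passed to scads() above:
# missing_dirs "$HOME/polyfun" "$HOME/ldsc"
```

Empty output means both directories were found.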