prep_gcEffects_input.Rd
This function computes the GC content across genomic peak regions, applying a weighting scheme (either uniform "ladder" or smoothed "tricube") over each region of fixed width. It prepares a list of GC content, genomic regions, and the provided read count information for downstream GC-effect correction or modeling.
prep_gcEffects_input(
rc,
peak_names,
genome = "hg19",
peakwidth = 501,
gctype = c("ladder", "tricube"),
verbose = TRUE
)
rc
A numeric vector, matrix, or data frame representing read counts (e.g., from ATAC-seq or ChIP-seq).
This can be sparse or dense, and should align with the provided peak_names.
peak_names
A character vector of peak names, typically in the format
"chr_start_end", e.g., "chr1_12345_12845".
genome
A character string specifying the genome build to use. Must be a build for which
a BSgenome package is available (e.g., "hg19", "hg38", "mm10"). Default is "hg19".
peakwidth
An integer specifying the width of each peak (default: 501).
Used to determine the smoothing window for GC content calculation.
gctype
A character string specifying the GC weighting scheme.
Options are "ladder" (uniform weights) or "tricube" (smooth kernel weights).
Default is "ladder".
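The "chr_start_end" naming convention for peak names can be unpacked with base R alone. The sketch below is illustrative only: parse_peak_names() is a hypothetical helper, not part of this package, and it assumes that start and end are inclusive coordinates, so that a peak of the default width spans end - start + 1 = 501 positions.

```r
# Hypothetical helper (not part of the package): split "chr_start_end"
# names into their components. The pattern anchors on the two trailing
# numeric fields, so chromosome names that contain underscores
# (e.g. "chr1_gl000191_random") are kept intact.
parse_peak_names <- function(peak_names) {
  pattern <- "^(.+)_([0-9]+)_([0-9]+)$"
  ok <- grepl(pattern, peak_names)
  if (!all(ok)) {
    stop("Unparseable peak names: ", paste(peak_names[!ok], collapse = ", "))
  }
  data.frame(
    chr   = sub(pattern, "\\1", peak_names),
    start = as.integer(sub(pattern, "\\2", peak_names)),
    end   = as.integer(sub(pattern, "\\3", peak_names)),
    stringsAsFactors = FALSE
  )
}

peak_names <- c("chr1_12345_12845", "chrX_500_1000",
                "chr1_gl000191_random_100_600")
peaks <- parse_peak_names(peak_names)

# Assumption: coordinates are inclusive, so width = end - start + 1
widths <- peaks$end - peaks$start + 1L
```

Checking widths against peakwidth before calling the function is a cheap way to catch peaks that were not resized to a common width.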
A list containing:
A numeric vector of GC content values for each region.
A GRanges object representing the genomic regions.
The input read count data, returned for convenience.
The function extracts sequences for the given genomic regions using getSeq()
from a BSgenome object and computes the GC content in each region by summing weights
at positions corresponding to G or C nucleotides. The result is scaled to sum to 1.
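The weighted sum described above can be sketched in base R on plain character strings, without BSgenome. This is not the package's implementation: gc_weights() and weighted_gc() are hypothetical names, the tricube kernel is taken in its standard form (1 - |u|^3)^3 with u the scaled distance from the region centre, and the weights are assumed to be normalised so that they sum to 1.

```r
# Hypothetical helper: position weights over a region of fixed width.
# "ladder" weights every position equally; "tricube" down-weights
# positions far from the centre. Weights are normalised to sum to 1
# (an assumption about how the scaling is applied).
gc_weights <- function(peakwidth = 501, gctype = c("ladder", "tricube")) {
  gctype <- match.arg(gctype)
  if (gctype == "ladder") {
    w <- rep(1, peakwidth)
  } else {
    centre <- (peakwidth + 1) / 2
    # Scaled distance from the centre; stays below 1 at the edges
    u <- abs(seq_len(peakwidth) - centre) / (peakwidth / 2)
    w <- (1 - u^3)^3
  }
  w / sum(w)
}

# Hypothetical helper: sum the weights at G or C positions of each sequence.
weighted_gc <- function(seqs, weights) {
  vapply(seqs, function(s) {
    bases <- strsplit(toupper(s), "")[[1]]
    stopifnot(length(bases) == length(weights))
    sum(weights[bases %in% c("G", "C")])
  }, numeric(1), USE.NAMES = FALSE)
}

# Toy sequences of width 8 standing in for getSeq() output
seqs <- c("GGCCAATT", "AAGCGCTT", "ATATATAT")
w_ladder  <- gc_weights(8, "ladder")
w_tricube <- gc_weights(8, "tricube")
gc_ladder  <- weighted_gc(seqs, w_ladder)
gc_tricube <- weighted_gc(seqs, w_tricube)
```

With uniform weights the first two toy sequences score the same, since both contain four G/C bases out of eight. Under the tricube weights the second sequence scores higher because its G/C bases sit at the centre of the region, which is the practical difference between the two schemes.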