pycmplot.resources
Centralised configuration for external reference files (liftover chain,
gene-info TSVs). Paths are resolved from environment variables or the
bundled pycmplot/data/ directory; individual paths can be overridden
by passing a custom ResourceConfig instance
to any function that accepts a resources argument.
Centralised configuration for external reference files that cannot be bundled with the package distribution (large gene-info TSVs, liftover chain files, etc.).
Resolution order
Resource paths are resolved in the following priority order for each attribute:
1. Explicit argument: pass a ResourceConfig instance with the desired path directly to any function that accepts a resources parameter.
2. Environment variable: set the corresponding variable before running pycmplot:
   export PYCMPLOT_CHAIN_HG19_HG38=/path/to/hg19ToHg38.over.chain.gz
   export PYCMPLOT_CHAIN_HG18_HG38=/path/to/hg18ToHg38.over.chain.gz
   export PYCMPLOT_GENEINFO_HG38=/path/to/Homo_sapiens.GRCh38.geneinfo.tsv.gz
   export PYCMPLOT_GENEINFO_HG19=/path/to/Homo_sapiens.GRCh37.geneinfo.tsv.gz
3. Bundled default: pycmplot ships with the required files in the pycmplot/data/ package directory; they are used automatically when neither of the above is set.
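The priority order above can be sketched in a few lines. This is only an illustration of the documented behaviour, not pycmplot's actual code: resolve_path and its parameters are hypothetical names, and treating an empty environment variable as unset is an assumption.

```python
import os
from typing import Optional


def resolve_path(explicit: Optional[str], env_var: str, bundled_default: str) -> str:
    """Hypothetical helper mirroring the documented priority order."""
    # 1. An explicitly supplied path always wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise use the environment variable, if set and non-empty
    #    (treating an empty value as unset is an assumption).
    from_env = os.environ.get(env_var)
    if from_env:
        return from_env
    # 3. Otherwise fall back to the bundled default.
    return bundled_default
```

With PYCMPLOT_CHAIN_HG19_HG38 unset and no explicit path, the helper returns whatever bundled path it was given; setting the variable or passing an explicit path takes precedence in that order.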
Examples
Override a single resource while using defaults for the rest:
>>> from pycmplot.resources import ResourceConfig
>>> cfg = ResourceConfig(chain_hg19_hg38="/my/custom.over.chain.gz")
>>> # pass cfg to any function that accepts a resources argument:
>>> from pycmplot.liftover import liftover_position
>>> df_lifted = liftover_position(df, resources=cfg)
- class pycmplot.resources.ResourceConfig(chain_hg19_hg38: str | None = <factory>, chain_hg18_hg38: str | None = <factory>, geneinfo_hg38: str | None = <factory>, geneinfo_hg19: str | None = <factory>)
Bases: object
Paths to external reference files used by pycmplot.
Dataclass grouping the on-disk resources required by pycmplot:
- chain_hg19_hg38 – UCSC LiftOver chain file for hg19 to hg38 conversion. Resolved from PYCMPLOT_CHAIN_HG19_HG38 or the bundled hg19ToHg38.over.chain.gz.
- chain_hg18_hg38 – UCSC LiftOver chain file for hg18 to hg38 conversion. Resolved from PYCMPLOT_CHAIN_HG18_HG38 or the bundled hg18ToHg38.over.chain.gz. Only required when any input summary statistics file carries a hg18 build label.
- geneinfo_hg38 – Ensembl gene-info TSV for GRCh38, used for nearest-gene annotation. Resolved from PYCMPLOT_GENEINFO_HG38 or the bundled Homo_sapiens.GRCh38.geneinfo.tsv.gz.
- geneinfo_hg19 – Ensembl gene-info TSV for GRCh37, used when input data carry a hg19 build label. Resolved from PYCMPLOT_GENEINFO_HG19 or the bundled Homo_sapiens.GRCh37.geneinfo.tsv.gz.
All four attributes default to values resolved from environment variables or the bundled pycmplot/data/ directory via importlib.resources.files(). Override individual attributes to use custom file locations.
Examples
Use all bundled defaults:
>>> from pycmplot.resources import ResourceConfig
>>> cfg = ResourceConfig()
Override the hg38 gene-info file:
>>> cfg = ResourceConfig(
...     geneinfo_hg38="/data/custom_GRCh38_genes.tsv.gz"
... )
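The <factory> defaults in the class signature indicate dataclass fields declared with a default factory, which is evaluated each time an instance is created. The sketch below shows how such a field could pick up an environment variable at instantiation time. SketchConfig, _from_env, and BUNDLED_PLACEHOLDER are hypothetical names for illustration only, and the placeholder path stands in for whatever importlib.resources.files() would return in the real package.

```python
import os
from dataclasses import dataclass, field
from typing import Optional

# Placeholder for the bundled file location (an assumption, not the real path).
BUNDLED_PLACEHOLDER = "/bundled/Homo_sapiens.GRCh38.geneinfo.tsv.gz"


def _from_env(env_var: str, fallback: Optional[str]) -> Optional[str]:
    # Hypothetical helper: prefer the environment variable, else the fallback.
    return os.environ.get(env_var) or fallback


@dataclass
class SketchConfig:
    # Hypothetical stand-in for ResourceConfig with a single attribute.
    # The factory runs on every instantiation, so changes to the
    # environment are picked up by instances created afterwards.
    geneinfo_hg38: Optional[str] = field(
        default_factory=lambda: _from_env("PYCMPLOT_GENEINFO_HG38", BUNDLED_PLACEHOLDER)
    )
```

Because the factory only supplies a default, passing geneinfo_hg38 explicitly to the constructor bypasses both the environment variable and the fallback.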
- require(attr: str) -> str
Return the path for attr, raising a clear FileNotFoundError if the attribute is unset or the path does not exist.
First checks whether the attribute value is None; if so, raises FileNotFoundError with a message indicating which environment variable to set. Then verifies that the resolved path exists on disk, falling back to importlib.resources.files() package-data resolution before raising if neither succeeds.
- Parameters:
attr (str) – Name of the ResourceConfig attribute to retrieve, e.g. 'chain_hg19_hg38', 'geneinfo_hg38', 'geneinfo_hg19'.
- Returns:
Absolute file path as a string.
- Return type:
str
- Raises:
FileNotFoundError – If the attribute is None or the resolved path does not exist.
Examples
>>> from pycmplot.resources import ResourceConfig
>>> cfg = ResourceConfig()
>>> chain = cfg.require("chain_hg19_hg38")
>>> chain.endswith(".over.chain.gz")
True
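The two checks that require() is described as performing can be illustrated with a stand-alone function. This is a minimal sketch under stated assumptions, not the library's implementation: require_path and its parameters are hypothetical, the error messages are invented for illustration, and the importlib.resources.files() fallback mentioned above is deliberately left out.

```python
import os
from typing import Optional


def require_path(name: str, value: Optional[str], env_var: str) -> str:
    """Hypothetical stand-alone version of the None check and existence check."""
    # Check 1: the attribute was never configured.
    if value is None:
        raise FileNotFoundError(
            f"Resource '{name}' is not configured; set {env_var} or pass an explicit path."
        )
    # Check 2: the configured path must exist on disk.
    if not os.path.exists(value):
        raise FileNotFoundError(f"Resource '{name}' not found at: {value}")
    # Return an absolute path as a string, matching the documented return value.
    return os.path.abspath(value)
```

Failing early with the name of the relevant environment variable means a misconfigured path surfaces at lookup time with an actionable message, rather than later as an opaque I/O error inside a liftover or annotation step.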