pycmplot.plotting

The plotting subpackage contains three modules: one for linear (stacked) Manhattan plots, one for circular (Circos-style) Manhattan plots, and one for QQ plots.

pycmplot.plotting.linear

Generates single- and multi-track stacked linear Manhattan plots with optional significance lines, locus highlighting, cluster-aware label spreading, and intelligent arrow-angle calculation for gene annotations.

Multi-track stacked linear Manhattan plot.

The module exposes two public functions:

plot_linear() — the user-facing entry point. Accepts the sumstats_loaded dict produced by get_sumstats_and_merged_sector_list(), resolves output paths, and delegates rendering to plot_linearm().
plot_linearm() — the core rendering engine. Accepts a list of DataFrames and a fully resolved set of plotting parameters, builds the matplotlib figure, draws all tracks, and saves the file.

Internal helpers:

_draw_annotation_arrows() — places angled FancyArrowPatch arrows from spread gene labels down to their corresponding signal positions.
_draw_annotation_arrows_multirail() — places angled FancyArrowPatch arrows from spread gene labels down to their corresponding signal positions with multirail capability and single sort + rank-reassignment to avoid arrow crossing.

pycmplot.plotting.linear.plot_linear(sumstats_loaded: list[str], track_heights: list[float] = None, logp: bool = False, point_size: float | None = 8, highlight: bool = False, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', hits_table: DataFrame | None = None, annotate: str = None, annotation_size: float = 8, label_col: str | None = None, chr_spacing: float | None = 9000000.0, linear_track_spacing: float | None = None, annot_rail_frac: float | None = 0.98, colors: list[str] = ['steelblue', 'silver'], signif_lines: dict | None = None, plot_title: str | None = None, no_track_labels: bool = False, ylabel: str | None = None, dpi: int | None = None, output_format: str | None = 'png', output_dir: str | None = '.', figsize: list[float] | None = [10, 4])

Generate a multi-track stacked linear Manhattan plot.

This is the primary user-facing entry point for linear Manhattan plots. It extracts DataFrames and labels from sumstats_loaded, resolves output file paths, then delegates all rendering to plot_linearm().

Parameters:

sumstats_loaded (dict) – Mapping of label → [DataFrame, n_chroms] as returned by get_sumstats_and_merged_sector_list(). One track is created per key; the DataFrames must have canonical columns CHR, POS, P, and logP (when logp is True).
trim_pval (float, optional) – Reserved; trimming is applied upstream. Default None.
track_heights (list of float, optional) – Comma-parsed relative track heights (e.g. [2.0, 2.0, 1.5]). Passed directly to plot_linearm() as the gridspec height ratios for the data tracks. When None, all tracks are equal.
logp (bool, optional) – Plot –log₁₀(p) on the y-axis. Default False.
point_size (float, optional) – Scatter-plot point size. Default 8.
highlight (bool, optional) – Render significant-locus variants in highlight_color ('brown' by default). Default False.
highlight_thresh (float, optional) – P-value threshold for locus highlighting. Default 5e-8.
hits_table (pandas.DataFrame, optional) – Annotation DataFrame (hits summary table from get_hits_summary_table()). Must contain columns CHR, POS, and label_col. Passing None or an empty DataFrame suppresses annotations.
label_col (str, optional) – Column in hits_table to use as annotation text (e.g. 'top_gene'). Default None (falls back to plot_linearm() default 'label').
chr_spacing (float, optional) – Horizontal gap between chromosomes in base-pairs. Default 9e6.
linear_track_spacing (float, optional) – Vertical space between tracks as a fraction of average track height. Default 0.10.
annot_rail_frac (float, optional) – Fraction of the annotation track's width, centred horizontally, within which annotation texts are placed. Default 0.98 (annotation texts span 98% of the annotation track's width).
colors (list of str, optional) – Two alternating chromosome colours. Default ['steelblue', 'silver'].
signif_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track, in the same order as sumstats_loaded. Produced by get_sumstats_and_merged_sector_list().
plot_title (str, optional) – Human-readable title used as the output file-name stem. Passed to get_output_paths().
no_track_labels (bool, optional) – Suppress per-track labels. Default False.
ylabel (str, optional) – Override the shared y-axis label (left margin). Useful for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is "-log₁₀(p-value)" if logp is True and "P" otherwise.
dpi (int, optional) – Output resolution in dots per inch. Default 300.
output_format (str, optional) – Image format ('png', 'pdf', 'svg', 'jpg'). Default 'png'.
output_dir (str or pathlib.Path, optional) – Directory in which to save the output files. Default '.'.
figsize (list of float, optional) – Figure dimensions [width, height] in inches. Default [10, 4].
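The track_heights and linear_track_spacing parameters correspond closely to matplotlib's gridspec options (height_ratios and hspace, the latter also being a fraction of the average axes height). The following is a minimal sketch of that correspondence, not pycmplot's actual layout code; the relative height of the annotation panel (annot_height) and all variable names are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

track_heights = [2.0, 2.0, 1.5]   # relative heights of three data tracks
annot_height = 1.0                # hypothetical relative height of the annotation panel
linear_track_spacing = 0.10       # gap as a fraction of the average axes height

fig, axes = plt.subplots(
    nrows=len(track_heights) + 1,
    ncols=1,
    sharex=True,
    figsize=(10, 4),
    gridspec_kw={
        "height_ratios": [annot_height, *track_heights],
        "hspace": linear_track_spacing,
    },
)

# axes[0] plays the role of the annotation panel; axes[1:] are the data tracks.
heights = [ax.get_position().height for ax in axes]
```

The resulting axes heights are proportional to the ratios supplied, so the third data track here is three quarters as tall as the first two.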

Returns:

fig (matplotlib.figure.Figure) – The completed figure.
axes (list of matplotlib.axes.Axes) – All axes: axes[0] is the annotation sub-panel; axes[1:] are the per-track data axes.
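To make the role of chr_spacing concrete, the sketch below converts per-chromosome positions into one genome-wide x coordinate, leaving a fixed gap between consecutive chromosomes. It is an illustration only: the helper name add_genome_x, the x column, and the choice to start every chromosome at position 0 are assumptions, and pycmplot's internal coordinate calculation may differ.

```python
import pandas as pd

def add_genome_x(df: pd.DataFrame, chr_spacing: float = 9e6) -> pd.DataFrame:
    """Hypothetical helper: add an 'x' column with cumulative genome positions."""
    out = df.copy()
    # Largest observed position per chromosome, in ascending chromosome order.
    chrom_max = out.groupby("CHR", sort=True)["POS"].max()
    # Offset of a chromosome = lengths of all previous chromosomes plus one gap each.
    offsets = (chrom_max + chr_spacing).cumsum().shift(1, fill_value=0)
    out["x"] = out["POS"] + out["CHR"].map(offsets)
    return out

demo = pd.DataFrame({
    "CHR": [1, 1, 2, 2, 3],
    "POS": [100, 5_000_000, 200, 3_000_000, 50],
})
result = add_genome_x(demo, chr_spacing=1_000_000)
```

With a gap of 1 Mb, chromosome 2 starts at x = 6,000,000 (the 5 Mb extent of chromosome 1 plus the gap) and chromosome 3 at x = 10,000,000.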

pycmplot.plotting.circular

Generates multi-track Circos-style circular Manhattan plots. Track radii are computed automatically to give each track proportional visual weight relative to its data range.

Circos-style multi-track circular Manhattan plot.

The module exposes two public functions and one internal per-sector helper:

plot_circular() — user-facing entry point. Configures the pycirclize.Circos canvas, computes track radii, iterates over sectors and tracks, renders gene/SNP annotations, and saves the figure.
compute_track_radii_dict() — divides the radial space between r_min and r_max into n_tracks evenly-spaced, padded bands and returns their (r_start, r_end) limits.
plot_circosm() — internal per-sector renderer called once per (sector, sumstat) pair inside the main loop of plot_circular(). Mutates the pycirclize.Sector object in place and returns None.

pycmplot.plotting.circular.compute_track_radii_dict(n_tracks: int, r_min: float = 20, r_max: float = 100, pad: float = 1, annotate: bool = False) → dict[str, tuple[float, float]]

Compute (r_start, r_end) tuples for n_tracks evenly-spaced radial bands.

Divides the usable radial space between r_min and r_max into n_tracks bands of equal height, separated by gaps of pad units. The tracks are ordered from innermost ('track_1') to outermost ('track_n').

Parameters:

n_tracks (int) – Number of data tracks to accommodate.
r_min (float, optional) – Inner boundary of the full plotting area (as a percentage of the figure radius). Default 20.
r_max (float, optional) – Outer boundary of the full plotting area. Default 100.
pad (float, optional) – Gap in the same radius units between consecutive tracks. Default 1.
annotate (bool, optional) – If True, an extra slot is reserved for the annotation ring by incrementing n_tracks before computing heights. The extra slot is always placed at the outermost position. Default False.

Returns:

Mapping of 'track_i' → (r_start, r_end) for i in 1 … n_tracks (plus one extra entry when annotate is True).

Return type:

dict

Raises:

ValueError – If the total padding pad × (n_tracks − 1) exceeds the available radial space r_max − r_min.

Examples

>>> from pycmplot.plotting.circular import compute_track_radii_dict
>>> radii = compute_track_radii_dict(n_tracks=3, r_min=20, r_max=100, pad=2)
>>> list(radii.items())
[('track_1', (20.0, 45.33...)),
('track_2', (47.33..., 72.66...)),
('track_3', (74.66..., 100.0))]
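The banding rule described above can be reproduced in a few lines. This is a hypothetical re-implementation written from the documented behaviour (equal band heights, pad-sized gaps, one extra slot when annotate is True), not the library's source; the name sketch_track_radii and the n_tracks guard are assumptions.

```python
def sketch_track_radii(n_tracks, r_min=20.0, r_max=100.0, pad=1.0, annotate=False):
    """Hypothetical re-implementation of the documented banding rule."""
    if annotate:
        n_tracks += 1  # reserve one extra (outermost) slot for the annotation ring
    if n_tracks < 1:
        raise ValueError("at least one track is required")
    total_pad = pad * (n_tracks - 1)
    if total_pad > r_max - r_min:
        raise ValueError("padding exceeds the available radial space")
    # Every band gets the same share of the space left after padding.
    height = (r_max - r_min - total_pad) / n_tracks
    radii = {}
    for i in range(n_tracks):
        start = r_min + i * (height + pad)
        radii[f"track_{i + 1}"] = (start, start + height)
    return radii

radii = sketch_track_radii(n_tracks=3, r_min=20, r_max=100, pad=2)
```

For three tracks between 20 and 100 with pad=2, each band is (80 − 4) / 3 ≈ 25.33 units tall, which matches the doctest output above.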

pycmplot.plotting.circular.plot_circosm(sector=None, sector_radius=None, assoc: DataFrame | None = None, assoc_by_chr: DataFrame = None, sector_sizes: dict | None = None, chrom_label_loc: float | None = -3, chrom_label_size: float = 6, track_label_size: float = 6, track_label_orientation: str | None = 'vertical', track_index: int = 0, assoc_label: str | None = None, logp: bool = True, signif_line: float | None = 5e-08, signif_threshold: float | None = 5e-08, suggest_line: float | None = 1e-05, suggest_threshold: float | None = 1e-05, highlight: bool = False, highlight_color: str = 'brown', colors: list[str] | None = ['steelblue', 'orange'], point_size: float = 6, no_track_labels: bool = False) → None

Plot one track of summary statistics onto a single pycirclize sector.

This is a low-level internal function called once for every (sector, sumstat) combination in the plot_circular() main loop. It adds a scatter track to sector in-place and optionally draws significance lines, y-axis ticks (on the first chromosome only), and chromosome labels. Returns None.

Parameters:

sector (pycirclize.Sector) – The pycirclize Sector object representing one chromosome arc.
sector_radius (tuple of (float, float)) – (r_start, r_end) radial limits for this track within sector, as returned by compute_track_radii_dict().
assoc (pandas.DataFrame, optional) – Full summary statistics DataFrame (all chromosomes). Filtered to the current sector’s chromosome internally. Must have columns CHR, POS, P, and logP (when logp is True).
sector_sizes (dict, optional) – Ordered mapping of chrom → [min_pos, max_pos] as returned by get_sumstats_and_merged_sector_list(). Used to identify the first and last sectors for y-axis ticks and track labels.
chrom_label_loc (float or None) – Radial position at which to draw the chromosome label. Computed in plot_circular() from chrom_label_side, r_min, and r_max.
chrom_label_size (float, optional) – Font size for chromosome labels. Default 6.
track_label_size (float, optional) – Font size for the track (sumstat) label written on the spacer sector. Default 6.
track_label_orientation ({'vertical', 'horizontal'}, optional) – Orientation of the track label text. Default 'vertical'.
track_index (int, optional) – 0-based index of the current sumstat track. Chromosome labels are only drawn on track_index == 0 (or for chromosome X). Default 0.
assoc_label (str, optional) – Track label text (sumstat name) rendered on the spacer sector.
logp (bool, optional) – If True, use the logP column for y-values and threshold comparisons. Default True.
signif_line (float, optional) – Y-value at which to draw the genome-wide significance dashed line (orange-red). Default 5e-8.
signif_threshold (float, optional) – Significance threshold used for y-axis scaling. Default 5e-8.
suggest_line (float or bool, optional) – Y-value for the suggestive significance dashed line (light blue). Pass False or None to suppress. Default 1e-5.
suggest_threshold (float, optional) – Suggestive threshold value used for y-axis scaling. Default 1e-5.
highlight (bool, optional) – If True, variants within significant loci (in_locus == True after get_highlight_snps()) are rendered in highlight_color (see below). Default False.
highlight_color (str, optional) – Color of highlighted positions when highlight is True. Default 'brown'.
colors (list of str, optional) – Two alternating colours for even/odd chromosome numbers. Default ['steelblue', 'orange'].
no_track_labels (bool, optional) – Suppress the track label on the spacer sector. Default False.
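The interplay of colors, highlight, and highlight_color can be illustrated with a small per-variant colour assignment. This is a sketch under stated assumptions rather than the library's code: the helper name point_colours is hypothetical, CHR is assumed to hold integer chromosome codes, and which of the two colours goes to even chromosomes is a guess.

```python
import numpy as np
import pandas as pd

def point_colours(assoc, colors=("steelblue", "orange"),
                  highlight=False, highlight_color="brown"):
    """Hypothetical helper: return one colour per variant."""
    # Alternate the two base colours by chromosome parity (assumed: even -> colors[0]).
    chrom = assoc["CHR"].to_numpy()
    base = np.where(chrom % 2 == 0, colors[0], colors[1])
    # Variants flagged as belonging to a significant locus override the base colour.
    if highlight and "in_locus" in assoc.columns:
        in_locus = assoc["in_locus"].to_numpy(dtype=bool)
        base = np.where(in_locus, highlight_color, base)
    return base

assoc = pd.DataFrame({
    "CHR": [1, 2, 2, 3],
    "in_locus": [False, True, False, False],
})
plain = point_colours(assoc).tolist()
marked = point_colours(assoc, highlight=True).tolist()
```

Only the second variant changes colour when highlight is enabled, because it is the only one with in_locus set.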

pycmplot.plotting.circular.plot_circular(sumstats_loaded: dict, sector_sizes: dict = None, signif_lines: dict = None, logp: bool = False, pad: float = 1, r_min: float = 20, r_max: float = 100, annotate: str = None, label_col: str = None, chrom_label_side: str = 'inside', chrom_label_size: float = 6, signif_line: float = 5e-08, highlight: bool = False, highlight_thresh: float = 5e-08, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', colors: list[str] = ['steelblue', 'silver'], point_size: float = 6, track_label_size: float = 6, track_label_orientation: str = 'vertical', hits_table: DataFrame = None, annotation_size: float = 6, plot_title: str | None = None, plot_title_size: float = 12, dpi: int | None = None, output_format: str | None = 'png', output_dir: str | None = '.', ylabel: str | None = None, no_track_labels: bool = False)

Generate a multi-track Circos-style circular Manhattan plot.

Sets up a pycirclize.Circos canvas with one arc sector per chromosome, computes radial track extents, and calls plot_circosm() once per (sector, sumstat) pair to populate each track with scatter data and significance lines. After all tracks are rendered, gene or SNP annotations from hits_table are added to a dedicated annotation ring, and a shared y-axis label is placed on the spacer sector.

Parameters:

sumstats_loaded (dict) – Mapping of label → [DataFrame, n_chroms] as returned by get_sumstats_and_merged_sector_list(). One radial track is created per key. The outermost track corresponds to the first key after reversal of the radii dict.
sector_sizes (dict, optional) – Ordered mapping of chrom → [min_pos, max_pos] defining the arc length of each chromosome sector. The last key is expected to be 'Spacer1' (automatically added by get_sumstats_and_merged_sector_list()).
signif_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track in the same order as sumstats_loaded, as returned by get_sumstats_and_merged_sector_list().
logp (bool, optional) – Plot –log₁₀(p) radially. Default False.
pad (float, optional) – Gap in radius units between consecutive tracks. Default 1.
r_min (float, optional) – Inner radius of the innermost track (as a percentage of the figure radius). Default 20.
r_max (float, optional) – Outer radius of the outermost track. Default 100.
annotate ({'SNP', 'GENE'} or falsy, optional) – Annotation content for significant loci. 'GENE' uses nearest_upstream_gene for genic hits and top_gene for intergenic hits (italic text); 'SNP' uses the SNP column (regular text). Pass None or False to disable annotations. Default None.
chrom_label_side ({'inside', 'outside'}, optional) – Radial position of chromosome labels. 'inside' places them just inside the innermost track; 'outside' places them beyond the outermost track. Default 'inside'.
signif_line (float, optional) – Genome-wide significance threshold value for the orange-red dashed line. Default 5e-8.
highlight (bool, optional) – Render significant-locus variants in highlight_color ('brown' by default). Default False.
highlight_thresh (float, optional) – P-value threshold for locus highlighting. Default 5e-8.
highlight_color (str, optional) – Color of highlighted positions when highlight is True. Default 'brown'.
colors (list of str, optional) – Two alternating chromosome colours. Default ['steelblue', 'silver'].
chrom_label_size (float, optional) – Chromosome label font size. Default 6.
track_label_size (float, optional) – Track (sumstat) label font size. Default 6.
track_label_orientation ({'vertical', 'horizontal'}, optional) – Track label text orientation. Default 'vertical'.
hits_table (pandas.DataFrame, optional) – Hits summary table from get_hits_summary_table(). Required for annotations (annotate truthy and hits_table non-empty).
annotation_size (float, optional) – Font size for annotation labels. Default 6.
highlight_line (bool, optional) – Draw a dashed radial line from the innermost track to the annotation ring for each annotated position. Default False.
highlight_line_color (str, optional) – Color of the highlight line when highlight_line is True. Default 'grey'.
plot_title (str, optional) – Text placed in the centre of the circle and used as the output file-name stem.
plot_title_size (float, optional) – Font size for the centre title. Default 12.
dpi (int, optional) – Output resolution in dots per inch. Default 300.
output_format (str, optional) – Image format ('png', 'pdf', 'svg', 'jpg'). Default 'png'.
output_dir (str or pathlib.Path, optional) – Output directory. Default '.'.
ylabel (str, optional) – Override the shared y-axis label (placed on the spacer sector). Useful for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is "-log₁₀(p-value)" if logp is True and "P" otherwise.
no_track_labels (bool, optional) – Suppress track labels on the spacer sector. Default False.

Returns:

The completed circular Manhattan figure (also saved to output_dir).

Return type:

matplotlib.figure.Figure
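The note that the first key of sumstats_loaded ends up on the outermost track can be pictured as pairing the labels with the radii dict in reverse order. The snippet below is only a sketch of that pairing: the radii are hand-written round numbers, the labels are hypothetical, and the annotation ring (present when annotate is set) is left out.

```python
# Bands as compute_track_radii_dict() would order them: innermost first.
radii = {
    "track_1": (20.0, 45.0),
    "track_2": (47.0, 72.0),
    "track_3": (74.0, 100.0),
}
labels = ["height", "bmi", "ldl"]  # hypothetical keys of sumstats_loaded, in order

# Reverse the bands so iteration runs from the outermost inwards,
# then pair each label with a band in that order.
label_to_radius = dict(zip(labels, reversed(list(radii.values()))))
```

Here "height", the first label, receives the outermost band (74.0, 100.0) and "ldl" the innermost.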

pycmplot.plotting.qq

Produces QQ plots with 95% beta-distribution confidence bands, optional genome-wide significance lines, and genomic inflation (λ) annotation. Supports log-uniform point thinning for fast plotting of large datasets. Three high-level layouts are provided: combined (grid of per-trait panels), separate (one file per trait), and overlay (all traits on one shared axes).

pycmplot.plotting.qq.plot_qq_combined(pval_dict: dict[str, ndarray | Series], colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, signif_threshold: float | None = 5e-08, show_lambda: bool = True, ncols: int = 3, figsize: tuple | None = None, dpi: int = 300, title: str | None = None, output_path: str | None = None, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → tuple[Figure, list[Axes]]

Plot all QQ plots in a single figure arranged in a grid.

Parameters:

pval_dict – Ordered dict of {label: p_value_array}.
colors – List of colours, one per track. The list is cycled if it has fewer entries than there are tracks.
ncols – Number of columns in the subplot grid (default 3).
figsize – Figure size. Auto-calculated from ncols and number of tracks if None.
output_path – If given, save the figure here.
thin – See plot_qq_single().
thin_below – See plot_qq_single().
max_points – See plot_qq_single().
rasterized – See plot_qq_single().

Return type:

(fig, axes)
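When figsize is None, the grid dimensions follow from ncols and the number of traits. The sketch below shows one plausible way to derive them; the helper name grid_layout and the 4-inch panel size are assumptions, since the actual auto-sizing rule is not documented here.

```python
import math

def grid_layout(n_panels, ncols=3, panel_size=4.0):
    """Hypothetical helper: rows, columns, figure size and spare cells of a panel grid."""
    ncols = max(1, min(ncols, n_panels))        # never more columns than panels
    nrows = math.ceil(n_panels / ncols)         # enough rows to hold every panel
    figsize = (panel_size * ncols, panel_size * nrows)
    n_unused = nrows * ncols - n_panels         # trailing grid cells left empty
    return nrows, ncols, figsize, n_unused

layout_five = grid_layout(5)   # five traits in the default three columns
layout_two = grid_layout(2)    # fewer traits than columns
```

Five traits in three columns need two rows and leave one grid cell empty, which a plotting routine would typically hide.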

pycmplot.plotting.qq.plot_qq_overlay(pval_dict: dict[str, ndarray | Series], colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, ci_alpha: float = 0.1, signif_threshold: float | None = 5e-08, show_lambda: bool = True, figsize: tuple = (6, 6), dpi: int = 300, title: str | None = None, output_path: str | None = None, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → tuple[Figure, Axes]

Plot all sumstats on a single QQ axes, each coloured differently.

Lambda (λ) values appear in the legend label for each sumstat.

Parameters:

pval_dict – Ordered dict of {label: p_value_array}.
colors – List of colours, one per sumstat. Defaults to tab10 palette.
ci_alpha – Transparency of CI bands (default 0.10 — lower than single-panel default to keep overlapping bands readable).
show_lambda – Append λ to each legend entry.
thin – See plot_qq_single().
thin_below – See plot_qq_single().
max_points – See plot_qq_single().
rasterized – See plot_qq_single().

Return type:

(fig, ax)

pycmplot.plotting.qq.plot_qq_separate(pval_dict: dict[str, ndarray | Series], base_name: str = None, output_path: str = '.', colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, signif_threshold: float | None = 5e-08, show_lambda: bool = True, figsize: tuple = (5, 5), dpi: int = 300, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → list[str]

Save one QQ plot per sumstat as individual files.

Parameters:

pval_dict – Ordered dict of {label: p_value_array}.
output_path – Directory to save files in. Default '.'.
base_name – Prefix for output filenames.
colors – List of colours, one per track.
thin – See plot_qq_single().
thin_below – See plot_qq_single().
max_points – See plot_qq_single().
rasterized – See plot_qq_single().

Return type:

List of output file paths.

pycmplot.plotting.qq.plot_qq_single(pvals: ndarray | Series, ax: Axes, label: str | None = None, color: str = 'steelblue', point_size: float = 8, ci: float = 0.95, ci_alpha: float = 0.15, signif_threshold: float | None = 5e-08, show_lambda: bool = True, title: str | None = None, thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → Axes

Draw a single QQ plot onto ax.

Parameters:

pvals – Array or Series of raw p-values (not −log10).
ax – Matplotlib Axes to draw on.
label – Legend label for the scatter points.
color – Colour for points and CI fill.
point_size – Scatter point size.
ci – Confidence interval level (default 0.95).
ci_alpha – Transparency of the CI band.
signif_threshold – If given, draw a horizontal dashed line at −log10(threshold).
show_lambda – Annotate the plot with the genomic inflation factor λ.
title – Axes title.
thin – Enable p-value thinning for speed (default True).
thin_below – P-value threshold below which all points are always kept. Points above this threshold are downsampled.
max_points – Maximum number of points to plot after thinning (default 50 000).
rasterized – Render the scatter as a bitmap inside vector output formats — greatly reduces PDF/SVG file size (default True).

Return type:

matplotlib.axes.Axes
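The beta-distribution confidence band mentioned for this module follows from a standard result: under the null, the k-th smallest of n uniform p-values is distributed as Beta(k, n − k + 1). The sketch below computes expected quantiles and the band on the −log₁₀ scale using that result. The helper name and the expected-quantile convention k / (n + 1) are assumptions; the library may use a different plotting position.

```python
import numpy as np
from scipy.stats import beta

def qq_expected_and_band(ranks, n_full, ci=0.95):
    """Hypothetical helper: expected -log10(p) and CI band for 1-based ranks."""
    ranks = np.asarray(ranks, dtype=float)
    # Mean of the k-th order statistic of n uniform draws is k / (n + 1).
    expected = -np.log10(ranks / (n_full + 1))
    a, b = ranks, n_full - ranks + 1
    # A low p-value quantile becomes the upper edge after the -log10 transform.
    upper = -np.log10(beta.ppf((1 - ci) / 2, a, b))
    lower = -np.log10(beta.ppf((1 + ci) / 2, a, b))
    return expected, lower, upper

ranks = np.arange(1, 1001)
expected, lower, upper = qq_expected_and_band(ranks, n_full=1000)
```

Because the function takes ranks and the full sample size separately, it also works with a thinned subset of points, provided the ranks refer to the full sorted array.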

pycmplot.plotting.qq.thin_pvals(pvals: ndarray, tail_threshold: float = 0.01, max_points: int = 50000, seed: int = 42) → tuple[ndarray, ndarray, int]

Downsample p-values for faster QQ plotting with no visible breaks.

Rather than splitting into tail / bulk regions with different sampling strategies (which produces a visible seam at the threshold), this function uses a single log-uniform thinning pass over all p-values:

1. Sort p-values ascending and convert to −log₁₀ scale.
2. Pick max_points evenly-spaced indices along the −log₁₀ axis. Because −log₁₀ compresses large p-values and expands small ones, this automatically gives dense coverage in the interesting tail and sparse coverage in the null bulk, with no hard boundary.
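The two steps above can be sketched as follows. This is one possible reading of "evenly spaced along the −log₁₀ axis" (a grid of target values, each mapped to the first observed point at or above it), written for illustration; the function name thin_log_uniform and the clipping of zero p-values are assumptions, and the library's own selection rule may differ in detail.

```python
import numpy as np

def thin_log_uniform(pvals, max_points=50_000):
    """Hypothetical sketch of log-uniform thinning; returns (kept, ranks, n_full)."""
    p = np.sort(np.asarray(pvals, dtype=float))           # step 1: ascending p-values
    n_full = p.size
    if n_full <= max_points:
        return p, np.arange(1, n_full + 1), n_full
    # -log10 of ascending p-values is descending; clip to avoid log10(0).
    logp = -np.log10(np.clip(p, np.finfo(float).tiny, 1.0))
    # Step 2: evenly spaced targets between the smallest and largest -log10(p).
    targets = np.linspace(logp[-1], logp[0], max_points)
    asc = logp[::-1]                                       # ascending view for searchsorted
    pos = np.clip(np.searchsorted(asc, targets, side="left"), 0, n_full - 1)
    idx = np.unique(n_full - 1 - pos)                      # back to indices into p, de-duplicated
    return p[idx], idx + 1, n_full

rng = np.random.default_rng(1)
p_null = rng.uniform(size=200_000)
kept, kept_ranks, n_full = thin_log_uniform(p_null, max_points=1000)
```

Several targets can map to the same observation in the sparse extreme tail, so the number of points returned may be smaller than max_points.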

Parameters:

pvals – Full array of raw p-values.
tail_threshold – Kept for API compatibility; no longer used as a hard split point. All points above −log₁₀(tail_threshold) are always represented because the log-uniform spacing naturally keeps them.
max_points – Maximum number of points to return (default 50 000).
seed – Unused (kept for API compatibility — log-uniform selection is deterministic).

Returns:

kept_pvals – thinned p-values in ascending order.
kept_ranks – 1-based ranks in the full sorted array.
n_full – total SNP count before thinning (for expected quantiles).

Return type:

(kept_pvals, kept_ranks, n_full)

Notes

Lambda (λ) must be computed on the full pvals array before calling this function — thinning changes the empirical distribution.
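For reference, the conventional definition of the genomic inflation factor is the median of the 1-degree-of-freedom chi-square statistics implied by the p-values, divided by the median of that distribution under the null (about 0.455). The sketch below applies that definition to the full p-value array; the helper name genomic_inflation is hypothetical, and it is an assumption that pycmplot computes λ in exactly this way.

```python
import numpy as np
from scipy.stats import chi2

def genomic_inflation(pvals):
    """Hypothetical helper: lambda from the full, unthinned p-value array."""
    p = np.asarray(pvals, dtype=float)
    p = p[np.isfinite(p) & (p > 0) & (p <= 1)]   # drop missing or out-of-range values
    chisq = chi2.isf(p, df=1)                    # p-value -> chi-square statistic (1 df)
    return float(np.median(chisq) / chi2.ppf(0.5, df=1))

rng = np.random.default_rng(0)
lam_null = genomic_inflation(rng.uniform(size=100_000))
```

For uniformly distributed p-values the result is close to 1. Computing it on a log-uniformly thinned subset instead would over-represent small p-values and inflate the estimate.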