pycmplot.plotting

The plotting subpackage contains three modules: one for linear (stacked) Manhattan plots, one for circular (Circos-style) Manhattan plots, and one for QQ plots.

pycmplot.plotting.linear

Generates single- and multi-track stacked linear Manhattan plots with optional significance lines, locus highlighting, cluster-aware label spreading, and intelligent arrow-angle calculation for gene annotations.

Multi-track stacked linear Manhattan plot.

The module exposes two public functions:

  • plot_linear() — the user-facing entry point. Accepts the sumstats_loaded dict produced by get_sumstats_and_merged_sector_list(), resolves output paths, and delegates rendering to plot_linearm().

  • plot_linearm() — the core rendering engine. Accepts a list of DataFrames and a fully resolved set of plotting parameters, builds the matplotlib figure, draws all tracks, and saves the file.

Internal helpers:

  • _draw_annotation_arrows() — places angled FancyArrowPatch arrows from spread gene labels down to their corresponding signal positions.

  • _draw_annotation_arrows_multirail() – multi-rail variant of _draw_annotation_arrows(): places angled FancyArrowPatch arrows from spread gene labels down to their corresponding signal positions across several label rails, using a single sort followed by rank reassignment so that arrows do not cross.

pycmplot.plotting.linear.plot_linear(sumstats_loaded: list[str], track_heights: list[float] = None, logp: bool = False, point_size: float | None = 8, highlight: bool = False, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', hits_table: DataFrame | None = None, annotate: str = None, annotation_size: float = 8, label_col: str | None = None, chr_spacing: float | None = 9000000.0, linear_track_spacing: float | None = None, annot_rail_frac: float | None = 0.98, colors: list[str] = ['steelblue', 'silver'], signif_lines: dict | None = None, plot_title: str | None = None, no_track_labels: bool = False, ylabel: str | None = None, dpi: int | None = None, output_format: str | None = 'png', output_dir: str | None = '.', figsize: list[float] | None = [10, 4])

Generate a multi-track stacked linear Manhattan plot.

This is the primary user-facing entry point for linear Manhattan plots. It extracts DataFrames and labels from sumstats_loaded, resolves output file paths, then delegates all rendering to plot_linearm().

Parameters:
  • sumstats_loaded (dict) – Mapping of label → [DataFrame, n_chroms] as returned by get_sumstats_and_merged_sector_list(). One track is created per key; the DataFrames must have canonical columns CHR, POS, P, and logP (when logp is True).

  • trim_pval (float, optional) – Reserved; trimming is applied upstream. Default None.

  • track_heights (list of float, optional) – Comma-parsed relative track heights (e.g. [2.0, 2.0, 1.5]). Passed directly to plot_linearm() as the gridspec height ratios for the data tracks. When None, all tracks are equal.

  • logp (bool, optional) – Plot –log₁₀(p) on the y-axis. Default False.

  • point_size (float, optional) – Scatter-plot point size. Default 8.

  • highlight (bool, optional) – Render significant-locus variants in highlight_color (default 'brown'). Default False.

  • highlight_thresh (float, optional) – P-value threshold for locus highlighting. Default 5e-8.

  • hits_table (pandas.DataFrame, optional) – Annotation DataFrame (hits summary table from get_hits_summary_table()). Must contain columns CHR, POS, and label_col. Passing None or an empty DataFrame suppresses annotations.

  • label_col (str, optional) – Column in hits_table to use as annotation text (e.g. 'top_gene'). Default None (falls back to plot_linearm() default 'SNP').

  • chr_spacing (float, optional) – Horizontal gap between chromosomes in base-pairs. Default 9e6.

  • linear_track_spacing (float, optional) – Vertical space between tracks as a fraction of average track height. Default None; plot_linearm() itself defaults to 0.10.

  • annot_rail_frac (float, optional) – Fraction of the annotation track's width, centred horizontally, within which annotation texts are placed. Default 0.98 (annotation texts span 98% of the annotation track horizontally).

  • colors (list of str, optional) – Two alternating chromosome colours. Default ['steelblue', 'silver'].

  • signif_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track, in the same order as sumstats_loaded. Produced by get_sumstats_and_merged_sector_list().

  • plot_title (str, optional) – Human-readable title used as the output file-name stem. Passed to get_output_paths().

  • no_track_labels (bool, optional) – Suppress per-track labels. Default False.

  • ylabel (str, optional) – Override the shared y-axis label (left margin). Useful for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is "-log₁₀(p-value)" if logp is True and "P" otherwise.

  • dpi (int, optional) – Output resolution in dots per inch. Default 300.

  • output_format (str, optional) – Image format ('png', 'pdf', 'svg', 'jpg'). Default 'png'.

  • output_dir (str or pathlib.Path, optional) – Directory in which to save the output files. Default '.'.

  • figsize (list of float, optional) – Figure dimensions [width, height] in inches. Default [10, 4].

Returns:

  • fig (matplotlib.figure.Figure) – The completed figure.

  • axes (list of matplotlib.axes.Axes) – All axes: axes[0] is the annotation sub-panel; axes[1:] are the per-track data axes.

See also

plot_linearm

The underlying rendering engine called by this function.

pycmplot.io.get_sumstats_and_merged_sector_list

Produces sumstats_loaded and signif_lines.

Examples

>>> from pycmplot.plotting.linear import plot_linear
>>> fig, axes = plot_linear(
...     sumstats_loaded=loaded,
...     logp=True,
...     highlight=True,
...     hits_table=hits,
...     label_col="top_gene",
...     signif_lines=sig_lines,
...     plot_title="RBC_Traits",
...     output_dir="./results",
... )
pycmplot.plotting.linear.plot_linearm(tracks: list, track_labels: list[str] | None = None, annot_df: DataFrame = None, annotate: bool = False, annotation_size: float = 8, highlight: bool = False, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', logp: bool = True, label_col: str | None = 'SNP', chr_order: list[str] | None = None, chr_spacing: float = 9000000.0, track_heights: list[float] | None = None, linear_track_spacing: float = 0.1, annot_rail_frac: float = 0.95, point_size: float = 8, colors: list[str] | None = ['steelblue', 'silver'], sig_lines: list[dict] | None = None, plt_name: str | None = None, no_track_labels: bool = False, ylabel: str | None = None, fig_format: str | None = None, dpi: int = 300, figsize: list[float] | None = [10, 4])

Core rendering engine for the multi-track stacked linear Manhattan plot.

Builds a Figure with one annotation sub-panel at the top (for gene/SNP labels and connecting arrows) and n data tracks below it, one per element of tracks. All tracks share the same cumulative x-axis.

This function is called by the higher-level plot_linear() wrapper and is not normally invoked directly.
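The shared cumulative x-axis can be pictured with a small sketch. It is not the library's implementation: the helper name cumulative_offsets and the use of each chromosome's maximum position as its length are assumptions made for illustration. Only chr_spacing, chr_order, and the CHR/POS columns come from the description here.

```python
def cumulative_offsets(max_pos_by_chr, chr_order, chr_spacing=9e6):
    """Return {chrom: x-offset} so chromosomes sit side by side on one axis.

    Hypothetical helper for illustration; not part of pycmplot.
    """
    offsets = {}
    running = 0.0
    for chrom in chr_order:
        if chrom not in max_pos_by_chr:
            continue  # skip chromosomes that are absent from the data
        offsets[chrom] = running
        # The next chromosome starts after this one's extent plus the gap.
        running += max_pos_by_chr[chrom] + chr_spacing
    return offsets


# Toy example: three short "chromosomes" separated by a gap of 10 units.
max_pos = {"1": 100, "2": 80, "3": 50}
offsets = cumulative_offsets(max_pos, ["1", "2", "3"], chr_spacing=10)

# A variant at CHR=2, POS=30 would be drawn at offsets["2"] + 30.
x = offsets["2"] + 30
```

Because every track uses the same offsets, a given locus lines up vertically across all stacked tracks.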

Parameters:
  • tracks (list of pandas.DataFrame) – One DataFrame per GWAS trait. Each must have columns CHR, POS, P, and logP (when logp is True). The DataFrames are pre-processed internally (chromosome normalisation, highlighting, cumulative x-axis computation).

  • track_labels (list of str, optional) – Y-axis label for each track, in the same order as tracks. Defaults to ['Track 1', 'Track 2', …].

  • annot_df (pandas.DataFrame, optional) – Annotation DataFrame (typically the hits summary table). Must have columns CHR, POS, and label_col. When provided, a dashed vertical guide line is drawn through every annotated position across all tracks, and gene/SNP label arrows are drawn in the top sub-panel.

  • highlight (bool, optional) – If True, variants within 500 kb of a lead SNP (as determined by get_highlight_snps()) are rendered in highlight_color (default 'brown'). Default False.

  • highlight_thresh (float, optional) – P-value threshold used for locus highlighting. Default 5e-8.

  • trim_pval (float, optional) – Reserved for future use; trimming is currently handled upstream in get_sumstats_and_merged_sector_list().

  • logp (bool, optional) – If True, plot –log₁₀(p) on the y-axis; requires a logP column in each DataFrame. Default True.

  • label_col (str, optional) – Column in annot_df to use as annotation text labels (e.g. 'top_gene'). Default 'SNP'.

  • chr_order (list of str, optional) – Chromosome display order. Defaults to CHROM_ORDER (autosomes 1–22, X, Y, MT).

  • chr_spacing (float, optional) – Gap in base-pairs inserted between consecutive chromosomes on the x-axis. Default 9e6.

  • track_heights (list of float, optional) – Relative height ratios for the gridspec rows. The first element controls the annotation sub-panel; subsequent elements control the data tracks. When None, the annotation panel is given a weight of 1 and each data track a weight of 3.

  • linear_track_spacing (float, optional) – Vertical hspace between tracks as a fraction of average track height. Default 0.10.

  • annot_rail_frac (float, optional) – Fraction of the annotation track's width, centred horizontally, within which annotation texts are placed. Default 0.95 (annotation texts span 95% of the annotation track horizontally).

  • point_size (float, optional) – Scatter-plot point size passed to matplotlib.axes.Axes.scatter(). Default 8.

  • colors (list of str, optional) – Two alternating matplotlib colour strings for even/odd chromosomes. Default ['steelblue', 'silver'].

  • sig_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track. 'genome' values are drawn as red dashed lines; 'suggestive' values as grey dashed lines.

  • plt_name (str, optional) – Full output file path (including extension). When provided the figure is saved to disk.

  • no_track_labels (bool, optional) – If True, per-track labels are suppressed. Default False.

  • ylabel (str, optional) – Shared y-axis label rendered on the left of the figure. Use this to set a sensible label for non-p-value statistics such as iHS, FST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is derived automatically: "-log₁₀(p-value)" when logp is True, otherwise the p-value column name ("P").

  • fig_format (str, optional) – Output image format ('png', 'pdf', 'svg'). Inferred from plt_name’s extension when None.

  • dpi (int, optional) – Output resolution in dots per inch. Default 300.

  • figsize (list of float, optional) – Figure dimensions [width, height] in inches. Default [10, 4].
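The default row weighting described for track_heights can be written out as a short sketch. The helper name default_height_ratios is hypothetical and the code only mirrors the documented rule (weight 1 for the annotation panel, weight 3 per data track); it is not taken from the library source.

```python
def default_height_ratios(n_tracks, track_heights=None):
    """Return gridspec-style height ratios, annotation panel first.

    Hypothetical helper; follows the documented default only.
    """
    if track_heights is not None:
        # Caller-supplied ratios are used unchanged.
        return list(track_heights)
    # Annotation sub-panel gets weight 1, every data track weight 3.
    return [1.0] + [3.0] * n_tracks


ratios = default_height_ratios(3)
custom = default_height_ratios(2, track_heights=[1, 2, 4])
```

A list of this shape is what matplotlib's GridSpec accepts through its height_ratios argument.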

Returns:

  • fig (matplotlib.figure.Figure) – The completed figure.

  • axes (list of matplotlib.axes.Axes) – All axes in the figure: axes[0] is the annotation sub-panel; axes[1:] are the data-track axes in the same order as tracks.

See also

plot_linear

User-facing wrapper that extracts DataFrames from the sumstats_loaded dict and resolves output paths before calling this function.

_draw_annotation_arrows

Called internally to render gene/SNP label arrows in axes[0].

pycmplot.stats.get_highlight_snps

Called internally when highlight is True.
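As a rough illustration of the 500 kb highlighting window, the sketch below flags variants that fall near a lead SNP on the same chromosome. How get_highlight_snps() actually selects lead SNPs and whether the window boundary is inclusive are not stated in this reference, so the function name flag_in_locus, the tuple inputs, and the inclusive comparison are all assumptions.

```python
def flag_in_locus(variants, lead_snps, window=500_000):
    """Return one bool per variant: True if it lies within `window` bp
    of any lead SNP on the same chromosome.

    Hypothetical helper; variants and lead_snps are (chrom, pos) tuples.
    """
    flags = []
    for chrom, pos in variants:
        near = any(
            chrom == lead_chrom and abs(pos - lead_pos) <= window
            for lead_chrom, lead_pos in lead_snps
        )
        flags.append(near)
    return flags


variants = [("1", 1_000_000), ("1", 1_800_000), ("2", 1_000_000)]
lead_snps = [("1", 1_200_000)]
flags = flag_in_locus(variants, lead_snps)
```

Variants flagged True would be the ones drawn in highlight_color; the rest keep the alternating chromosome colours.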

pycmplot.plotting.circular

Generates multi-track Circos-style circular Manhattan plots. Track radii are computed automatically to give each track proportional visual weight relative to its data range.

Circos-style multi-track circular Manhattan plot.

The module exposes two public functions and one internal per-sector helper:

  • plot_circular() — user-facing entry point. Configures the pycirclize.Circos canvas, computes track radii, iterates over sectors and tracks, renders gene/SNP annotations, and saves the figure.

  • compute_track_radii_dict() — divides the radial space between r_min and r_max into n_tracks evenly-spaced, padded bands and returns their (r_start, r_end) limits.

  • plot_circosm() — internal per-sector renderer called once per (sector, sumstat) pair inside the main loop of plot_circular(). Mutates the pycirclize.Sector object in place and returns None.

pycmplot.plotting.circular.compute_track_radii_dict(n_tracks: int, r_min: float = 20, r_max: float = 100, pad: float = 1, annotate: bool = False) → dict[str, tuple[float, float]]

Compute (r_start, r_end) tuples for n_tracks evenly-spaced radial bands.

Divides the usable radial space between r_min and r_max into n_tracks bands of equal height, separated by gaps of pad units. The tracks are ordered from innermost ('track_1') to outermost ('track_n').

Parameters:
  • n_tracks (int) – Number of data tracks to accommodate.

  • r_min (float, optional) – Inner boundary of the full plotting area (as a percentage of the figure radius). Default 20.

  • r_max (float, optional) – Outer boundary of the full plotting area. Default 100.

  • pad (float, optional) – Gap in the same radius units between consecutive tracks. Default 1.

  • annotate (bool, optional) – If True, an extra slot is reserved for the annotation ring by incrementing n_tracks before computing heights. The extra slot is always placed at the outermost position. Default False.

Returns:

Mapping of 'track_i' → (r_start, r_end) for i in 1 … n_tracks (plus one extra entry when annotate is True).

Return type:

dict

Raises:

ValueError – If the total padding pad × (n_tracks − 1) exceeds the available radial space r_max − r_min.

Examples

>>> from pycmplot.plotting.circular import compute_track_radii_dict
>>> radii = compute_track_radii_dict(n_tracks=3, r_min=20, r_max=100, pad=2)
>>> list(radii.items())
[('track_1', (20.0, 45.33...)),
('track_2', (47.33..., 72.66...)),
('track_3', (74.66..., 100.0))]
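The arithmetic behind these values can be reproduced with a short re-derivation. This is a sketch written from the description above, not the library source; the function name radial_bands is hypothetical.

```python
def radial_bands(n_tracks, r_min=20.0, r_max=100.0, pad=1.0, annotate=False):
    """Split [r_min, r_max] into equal-height bands separated by `pad`.

    Illustrative only; mirrors the documented behaviour of
    compute_track_radii_dict() without being its implementation.
    """
    if annotate:
        n_tracks += 1  # reserve one extra (outermost) slot
    total_pad = pad * (n_tracks - 1)
    if total_pad > r_max - r_min:
        raise ValueError("total padding exceeds the available radial space")
    height = ((r_max - r_min) - total_pad) / n_tracks
    bands = {}
    for i in range(n_tracks):
        start = r_min + i * (height + pad)
        bands[f"track_{i + 1}"] = (start, start + height)
    return bands


bands = radial_bands(3, r_min=20, r_max=100, pad=2)
```

With three tracks and pad=2, the usable space is 80 − 4 = 76 units, so each band is 76 / 3 ≈ 25.33 units high, which matches the boundaries listed in the example.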
pycmplot.plotting.circular.plot_circosm(sector=None, sector_radius=None, assoc: DataFrame | None = None, assoc_by_chr: DataFrame = None, sector_sizes: dict | None = None, chrom_label_loc: float | None = -3, chrom_label_size: float = 6, track_label_size: float = 6, track_label_orientation: str | None = 'vertical', track_index: int = 0, assoc_label: str | None = None, logp: bool = True, signif_line: float | None = 5e-08, signif_threshold: float | None = 5e-08, suggest_line: float | None = 1e-05, suggest_threshold: float | None = 1e-05, highlight: bool = False, highlight_color: str = 'brown', colors: list[str] | None = ['steelblue', 'orange'], point_size: float = 6, no_track_labels: bool = False) → None

Plot one track of summary statistics onto a single pycirclize sector.

This is a low-level internal function called once for every (sector, sumstat) combination in the plot_circular() main loop. It adds a scatter track to sector in-place and optionally draws significance lines, y-axis ticks (on the first chromosome only), and chromosome labels. Returns None.

Parameters:
  • sector (pycirclize.Sector) – The pycirclize Sector object representing one chromosome arc.

  • sector_radius (tuple of (float, float)) – (r_start, r_end) radial limits for this track within sector, as returned by compute_track_radii_dict().

  • assoc (pandas.DataFrame, optional) – Full summary statistics DataFrame (all chromosomes). Filtered to the current sector’s chromosome internally. Must have columns CHR, POS, P, and logP (when logp is True).

  • sector_sizes (dict, optional) – Ordered mapping of chrom → [min_pos, max_pos] as returned by get_sumstats_and_merged_sector_list(). Used to identify the first and last sectors for y-axis ticks and track labels.

  • chrom_label_loc (float or None) – Radial position at which to draw the chromosome label. Computed in plot_circular() from chrom_label_side, r_min, and r_max.

  • chrom_label_size (float, optional) – Font size for chromosome labels. Default 6.

  • track_label_size (float, optional) – Font size for the track (sumstat) label written on the spacer sector. Default 6.

  • track_label_orientation ({'vertical', 'horizontal'}, optional) – Orientation of the track label text. Default 'vertical'.

  • track_index (int, optional) – 0-based index of the current sumstat track. Chromosome labels are only drawn on track_index == 0 (or for chromosome X). Default 0.

  • assoc_label (str, optional) – Track label text (sumstat name) rendered on the spacer sector.

  • logp (bool, optional) – If True, use the logP column for y-values and threshold comparisons. Default True.

  • signif_line (float, optional) – Y-value at which to draw the genome-wide significance dashed line (orange-red). Default 5e-8.

  • signif_threshold (float, optional) – Significance threshold used for y-axis scaling. Default 5e-8.

  • suggest_line (float or bool, optional) – Y-value for the suggestive significance dashed line (light blue). Pass False or None to suppress. Default 1e-5.

  • suggest_threshold (float, optional) – Suggestive threshold value used for y-axis scaling. Default 1e-5.

  • highlight (bool, optional) – If True, variants within significant loci (in_locus == True after get_highlight_snps()) are rendered in highlight_color (see below). Default False.

  • highlight_color (str, optional) – Color of highlighted positions when highlight is True. Default brown.

  • colors (list of str, optional) – Two alternating colours for even/odd chromosome numbers. Default ['steelblue', 'orange'].

  • no_track_labels (bool, optional) – Suppress the track label on the spacer sector. Default False.

pycmplot.plotting.circular.plot_circular(sumstats_loaded: dict, sector_sizes: dict = None, signif_lines: dict = None, logp: bool = False, pad: float = 1, r_min: float = 20, r_max: float = 100, annotate: str = None, label_col: str = None, chrom_label_side: str = 'inside', chrom_label_size: float = 6, signif_line: float = 5e-08, highlight: bool = False, highlight_thresh: float = 5e-08, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', colors: list[str] = ['steelblue', 'silver'], point_size: float = 6, track_label_size: float = 6, track_label_orientation: str = 'vertical', hits_table: DataFrame = None, annotation_size: float = 6, plot_title: str | None = None, plot_title_size: float = 12, dpi: int | None = None, output_format: str | None = 'png', output_dir: str | None = '.', ylabel: str | None = None, no_track_labels: bool = False)

Generate a multi-track Circos-style circular Manhattan plot.

Sets up a pycirclize.Circos canvas with one arc sector per chromosome, computes radial track extents, and calls plot_circosm() once per (sector, sumstat) pair to populate each track with scatter data and significance lines. After all tracks are rendered, gene or SNP annotations from hits_table are added to a dedicated annotation ring, and a shared y-axis label is placed on the spacer sector.

Parameters:
  • sumstats_loaded (dict) – Mapping of label → [DataFrame, n_chroms] as returned by get_sumstats_and_merged_sector_list(). One radial track is created per key. The outermost track corresponds to the first key after reversal of the radii dict.

  • sector_sizes (dict, optional) – Ordered mapping of chrom → [min_pos, max_pos] defining the arc length of each chromosome sector. The last key is expected to be 'Spacer1' (automatically added by get_sumstats_and_merged_sector_list()).

  • signif_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track in the same order as sumstats_loaded, as returned by get_sumstats_and_merged_sector_list().

  • logp (bool, optional) – Plot –log₁₀(p) radially. Default False.

  • pad (float, optional) – Gap in radius units between consecutive tracks. Default 1.

  • r_min (float, optional) – Inner radius of the innermost track (as a percentage of the figure radius). Default 20.

  • r_max (float, optional) – Outer radius of the outermost track. Default 100.

  • annotate ({'SNP', 'GENE'} or falsy, optional) – Annotation content for significant loci. 'GENE' uses nearest_upstream_gene for genic hits and top_gene for intergenic hits (italic text); 'SNP' uses the SNP column (regular text). Pass None or False to disable annotations. Default None.

  • chrom_label_side ({'inside', 'outside'}, optional) – Radial position of chromosome labels. 'inside' places them just inside the innermost track; 'outside' places them beyond the outermost track. Default 'inside'.

  • signif_line (float, optional) – Genome-wide significance threshold value for the orange-red dashed line. Default 5e-8.

  • highlight (bool, optional) – Render significant-locus variants in highlight_color (default 'brown'). Default False.

  • highlight_thresh (float, optional) – P-value threshold for locus highlighting. Default 5e-8.

  • highlight_color (str, optional) – Color of highlighted positions when highlight is True. Default brown.

  • colors (list of str, optional) – Two alternating chromosome colours. Default ['steelblue', 'silver'].

  • chrom_label_size (float, optional) – Chromosome label font size. Default 6.

  • track_label_size (float, optional) – Track (sumstat) label font size. Default 6.

  • track_label_orientation ({'vertical', 'horizontal'}, optional) – Track label text orientation. Default 'vertical'.

  • hits_table (pandas.DataFrame, optional) – Hits summary table from get_hits_summary_table(). Required for annotations (annotate truthy and hits_table non-empty).

  • annotation_size (float, optional) – Font size for annotation labels. Default 6.

  • highlight_line (bool, optional) – Draw a dashed radial line from the innermost track to the annotation ring for each annotated position. Default False.

  • highlight_line_color (str, optional) – Color of the highlight line when highlight_line is True. Default 'grey'.

  • plot_title (str, optional) – Text placed in the centre of the circle and used as the output file-name stem.

  • plot_title_size (float, optional) – Font size for the centre title. Default 12.

  • dpi (int, optional) – Output resolution in dots per inch. Default 300.

  • output_format (str, optional) – Image format ('png', 'pdf', 'svg', 'jpg'). Default 'png'.

  • output_dir (str or pathlib.Path, optional) – Output directory. Default '.'.

  • ylabel (str, optional) – Override the shared y-axis label (placed on the spacer sector). Useful for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is "-log₁₀(p-value)" if logp is True and "P" otherwise.

  • no_track_labels (bool, optional) – Suppress track labels on the spacer sector. Default False.

Returns:

The completed circular Manhattan figure (also saved to output_dir).

Return type:

matplotlib.figure.Figure

See also

pycmplot.plotting.linear.plot_linear

Linear (stacked) counterpart to this function.

compute_track_radii_dict

Computes the (r_start, r_end) limits for each track.

pycmplot.io.get_sumstats_and_merged_sector_list

Produces sumstats_loaded, sector_sizes, and signif_lines.

Examples

>>> from pycmplot.plotting.circular import plot_circular
>>> fig = plot_circular(
...     sumstats_loaded=loaded,
...     sector_sizes=sectors,
...     signif_lines=sig_lines,
...     logp=True,
...     highlight=True,
...     annotate="GENE",
...     hits_table=hits,
...     plot_title="RBC_Traits",
...     output_dir="./results",
... )

pycmplot.plotting.qq

Produces QQ plots with 95 % beta-distribution confidence bands, optional genome-wide significance lines, and genomic inflation (λ) annotation. Supports log-uniform point thinning for fast plotting of large datasets. Three high-level layouts are provided: combined (grid of per-trait panels), separate (one file per trait), and overlay (all traits on one shared axes).

pycmplot.plotting.qq.plot_qq_combined(pval_dict: dict[str, ndarray | Series], colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, signif_threshold: float | None = 5e-08, show_lambda: bool = True, ncols: int = 3, figsize: tuple | None = None, dpi: int = 300, title: str | None = None, output_path: str | None = None, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → tuple[Figure, list[Axes]]

Plot all QQ plots in a single figure arranged in a grid.

Parameters:
  • pval_dict – Ordered dict of {label: p_value_array}.

  • colors – List of colours, one per track. Cycles if fewer than tracks.

  • ncols – Number of columns in the subplot grid (default 3).

  • figsize – Figure size. Auto-calculated from ncols and number of tracks if None.

  • output_path – If given, save the figure here.

  • thin – See plot_qq_single().

  • thin_below – See plot_qq_single().

  • max_points – See plot_qq_single().

  • rasterized – See plot_qq_single().

Return type:

(fig, axes)
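How a grid could be sized from ncols and the number of traits is sketched below. The reference only says that figsize is auto-calculated, so the helper names, the clamping of ncols to the number of panels, and the 4-inch panel size are assumptions made for illustration.

```python
import math


def grid_shape(n_panels, ncols=3):
    """Return (nrows, ncols) for a subplot grid holding n_panels panels.

    Hypothetical helper; never uses more columns than there are panels.
    """
    ncols = max(1, min(ncols, n_panels))
    nrows = math.ceil(n_panels / ncols)
    return nrows, ncols


def auto_figsize(n_panels, ncols=3, panel_inches=4.0):
    """Return (width, height) in inches, assuming square panels."""
    nrows, ncols = grid_shape(n_panels, ncols)
    return (ncols * panel_inches, nrows * panel_inches)


shape = grid_shape(5, ncols=3)   # five traits in a three-column grid
size = auto_figsize(5, ncols=3)
```

With five traits and three columns the grid has two rows, leaving one empty cell in the last row.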

pycmplot.plotting.qq.plot_qq_overlay(pval_dict: dict[str, ndarray | Series], colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, ci_alpha: float = 0.1, signif_threshold: float | None = 5e-08, show_lambda: bool = True, figsize: tuple = (6, 6), dpi: int = 300, title: str | None = None, output_path: str | None = None, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → tuple[Figure, Axes]

Plot all sumstats on a single QQ axes, each coloured differently.

Lambda (λ) values appear in the legend label for each sumstat.

Parameters:
  • pval_dict – Ordered dict of {label: p_value_array}.

  • colors – List of colours, one per sumstat. Defaults to tab10 palette.

  • ci_alpha – Transparency of CI bands (default 0.10 — lower than single-panel default to keep overlapping bands readable).

  • show_lambda – Append λ to each legend entry.

  • thin – See plot_qq_single().

  • thin_below – See plot_qq_single().

  • max_points – See plot_qq_single().

  • rasterized – See plot_qq_single().

Return type:

(fig, ax)

pycmplot.plotting.qq.plot_qq_separate(pval_dict: dict[str, ndarray | Series], base_name: str = None, output_path: str = '.', colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, signif_threshold: float | None = 5e-08, show_lambda: bool = True, figsize: tuple = (5, 5), dpi: int = 300, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → list[str]

Save one QQ plot per sumstat as individual files.

Parameters:
  • pval_dict – Ordered dict of {label: p_value_array}.

  • output_path – Directory to save files in. Default '.'.

  • base_name – Prefix for output filenames.

  • colors – List of colours, one per track.

  • thin – See plot_qq_single().

  • thin_below – See plot_qq_single().

  • max_points – See plot_qq_single().

  • rasterized – See plot_qq_single().

Return type:

List of output file paths.

pycmplot.plotting.qq.plot_qq_single(pvals: ndarray | Series, ax: Axes, label: str | None = None, color: str = 'steelblue', point_size: float = 8, ci: float = 0.95, ci_alpha: float = 0.15, signif_threshold: float | None = 5e-08, show_lambda: bool = True, title: str | None = None, thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → Axes

Draw a single QQ plot onto ax.

Parameters:
  • pvals – Array or Series of raw p-values (not −log10).

  • ax – Matplotlib Axes to draw on.

  • label – Legend label for the scatter points.

  • color – Colour for points and CI fill.

  • point_size – Scatter point size.

  • ci – Confidence interval level (default 0.95).

  • ci_alpha – Transparency of the CI band.

  • signif_threshold – If given, draw a horizontal dashed line at −log10(threshold).

  • show_lambda – Annotate the plot with the genomic inflation factor λ.

  • title – Axes title.

  • thin – Enable p-value thinning for speed (default True).

  • thin_below – P-value threshold below which all points are always kept. Points above this threshold are downsampled.

  • max_points – Maximum number of points to plot after thinning (default 50 000).

  • rasterized – Render the scatter as a bitmap inside vector output formats — greatly reduces PDF/SVG file size (default True).

Return type:

matplotlib.axes.Axes

pycmplot.plotting.qq.thin_pvals(pvals: ndarray, tail_threshold: float = 0.01, max_points: int = 50000, seed: int = 42) → tuple[ndarray, ndarray, int]

Downsample p-values for faster QQ plotting with no visible breaks.

Rather than splitting into tail / bulk regions with different sampling strategies (which produces a visible seam at the threshold), this function uses a single log-uniform thinning pass over all p-values:

  1. Sort p-values ascending and convert to −log₁₀ scale.

  2. Pick max_points evenly-spaced indices along the −log₁₀ axis. Because −log₁₀ compresses large p-values and expands small ones, this automatically gives dense coverage in the interesting tail and sparse coverage in the null bulk — with no hard boundary.
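The two steps above can be sketched in plain Python. This is an illustration of log-uniform thinning under stated assumptions, not the source of thin_pvals(): the name thin_log_uniform is hypothetical, all p-values are assumed strictly positive, max_points is assumed to be at least 2, and each evenly spaced target is mapped to the first sorted value at or above it.

```python
import bisect
import math


def thin_log_uniform(pvals, max_points=50_000):
    """Illustrative log-uniform thinning; NOT the source of thin_pvals().

    Returns (kept_pvals, kept_ranks, n_full), with 1-based ranks.
    """
    sorted_p = sorted(pvals)  # step 1: ascending p-values
    n_full = len(sorted_p)
    if n_full <= max_points:
        # Nothing to thin: keep every point with its rank.
        return sorted_p, list(range(1, n_full + 1)), n_full

    # log10(p) is ascending, which bisect needs; it is the negated
    # -log10 axis, so even spacing is the same on both.
    log_p = [math.log10(p) for p in sorted_p]
    lo, hi = log_p[0], log_p[-1]
    step = (hi - lo) / (max_points - 1)

    keep = set()
    for k in range(max_points):
        target = lo + k * step  # step 2: evenly spaced targets
        idx = min(bisect.bisect_left(log_p, target), n_full - 1)
        keep.add(idx)  # duplicates collapse, so fewer points may remain

    kept_idx = sorted(keep)
    kept_pvals = [sorted_p[i] for i in kept_idx]
    kept_ranks = [i + 1 for i in kept_idx]
    return kept_pvals, kept_ranks, n_full


# Toy input: 1000 evenly spaced p-values thinned to at most 50 points.
toy = [(i + 1) / 1001 for i in range(1000)]
kept_pvals, kept_ranks, n_full = thin_log_uniform(toy, max_points=50)
```

Because the targets are spaced evenly in log space, each decade of p-values receives roughly the same number of targets, which is why the small-p tail stays dense while the null bulk is sampled sparsely.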

Parameters:
  • pvals – Full array of raw p-values.

  • tail_threshold – Kept for API compatibility; no longer used as a hard split point. All points above −log₁₀(tail_threshold) are always represented because the log-uniform spacing naturally keeps them.

  • max_points – Maximum number of points to return (default 50 000).

  • seed – Unused (kept for API compatibility — log-uniform selection is deterministic).

Returns:

  • kept_pvals – thinned p-values in ascending order.

  • kept_ranks – 1-based ranks in the full sorted array.

  • n_full – total SNP count before thinning (for expected quantiles).

Return type:

(kept_pvals, kept_ranks, n_full)
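The returned ranks and n_full are what allow expected quantiles to be computed against the full dataset rather than the thinned one. A minimal sketch follows; the helper name qq_coordinates and the plotting position rank / (n_full + 1) are assumptions, since this reference does not state which plotting position the library uses.

```python
import math


def qq_coordinates(kept_pvals, kept_ranks, n_full):
    """Return (expected, observed) -log10 values for a QQ scatter.

    Hypothetical helper; assumes expected p = rank / (n_full + 1).
    """
    expected = [-math.log10(rank / (n_full + 1)) for rank in kept_ranks]
    observed = [-math.log10(p) for p in kept_pvals]
    return expected, observed


# Three thinned points taken from a full set of 999 p-values.
expected, observed = qq_coordinates([0.001, 0.01, 0.1], [1, 10, 100], 999)
```

Using the thinned count in place of n_full would shift every expected quantile and make a well-calibrated test look deflated.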

Notes

Lambda (λ) must be computed on the full pvals array before calling this function — thinning changes the empirical distribution.
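For reference, the conventional genomic inflation factor is the median observed 1-d.f. chi-square statistic divided by the median of the chi-square distribution with one degree of freedom (about 0.455). The sketch below computes it from raw p-values using only the standard library; whether pycmplot computes λ in exactly this way is an assumption, and the function name genomic_inflation is hypothetical.

```python
from statistics import NormalDist, median

_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 d.f. (approximately 0.455).
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(pvals):
    """Conventional lambda: median observed chi-square over null median.

    Hypothetical helper; assumes every p-value satisfies 0 < p <= 1.
    """
    # For 1 d.f., chi2 = z**2 where z is the two-sided normal quantile.
    chi2 = [_NORMAL.inv_cdf(p / 2.0) ** 2 for p in pvals]
    return median(chi2) / CHI2_1DF_MEDIAN


# Uniform p-values (the null) give lambda close to 1.
null_p = [(i + 0.5) / 1001 for i in range(1001)]
lam_null = genomic_inflation(null_p)

# Squaring shrinks every p-value, which inflates lambda above 1.
lam_inflated = genomic_inflation([p * p for p in null_p])
```

Computing this on the full array first, and only then thinning for display, keeps the reported λ independent of max_points.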