pycmplot.plotting
The plotting subpackage contains three modules: one for linear (stacked) Manhattan plots, one for circular (Circos-style) Manhattan plots, and one for QQ plots.
pycmplot.plotting.linear
Generates single- and multi-track stacked linear Manhattan plots with optional significance lines, locus highlighting, cluster-aware label spreading, and intelligent arrow-angle calculation for gene annotations.
Multi-track stacked linear Manhattan plot.
The module exposes two public functions:
plot_linear() – the user-facing entry point. Accepts the sumstats_loaded dict produced by get_sumstats_and_merged_sector_list(), resolves output paths, and delegates rendering to plot_linearm().
plot_linearm() – the core rendering engine. Accepts a list of DataFrames and a fully resolved set of plotting parameters, builds the matplotlib figure, draws all tracks, and saves the file.
Internal helpers:
_draw_annotation_arrows() – places angled FancyArrowPatch arrows from spread gene labels down to their corresponding signal positions.
_draw_annotation_arrows_multirail() – does the same with multi-rail support, using a single sort plus rank reassignment to keep arrows from crossing.
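The rank-reassignment idea can be illustrated with a minimal sketch. It is not the library's implementation: assign_label_slots and its parameters are hypothetical, and it assumes a single rail of evenly spaced label slots inside the central rail_frac of the x-range. Sorting the signals once and giving the i-th signal the i-th slot keeps label order equal to signal order, so straight arrows cannot cross.

```python
def assign_label_slots(signal_xs, x_min, x_max, rail_frac=0.98):
    """Pair each signal x-position with an evenly spaced label slot.

    Hypothetical helper (not part of pycmplot): slots are laid out left to
    right inside the central `rail_frac` of [x_min, x_max], and signals are
    matched to slots by rank.
    """
    n = len(signal_xs)
    if n == 0:
        return []
    margin = (x_max - x_min) * (1.0 - rail_frac) / 2.0
    left, right = x_min + margin, x_max - margin
    if n == 1:
        slots = [(left + right) / 2.0]
    else:
        step = (right - left) / (n - 1)
        slots = [left + i * step for i in range(n)]
    # Single sort: rank signals by x, then give the i-th signal the i-th slot.
    order = sorted(range(n), key=lambda i: signal_xs[i])
    pairs = [None] * n
    for rank, idx in enumerate(order):
        pairs[idx] = (signal_xs[idx], slots[rank])
    return pairs


# Each tuple is (signal_x, label_x), in the input order of the signals.
pairs = assign_label_slots([50.0, 10.0, 52.0], x_min=0.0, x_max=100.0, rail_frac=0.9)
```

Two nearby signals (50 and 52 here) get well-separated label positions while the left-to-right order is preserved.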
- pycmplot.plotting.linear.plot_linear(sumstats_loaded: list[str], track_heights: list[float] = None, logp: bool = False, point_size: float | None = 8, highlight: bool = False, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', hits_table: DataFrame | None = None, annotate: str = None, annotation_size: float = 8, label_col: str | None = None, chr_spacing: float | None = 9000000.0, linear_track_spacing: float | None = None, annot_rail_frac: float | None = 0.98, colors: list[str] = ['steelblue', 'silver'], signif_lines: dict | None = None, plot_title: str | None = None, no_track_labels: bool = False, ylabel: str | None = None, dpi: int | None = None, output_format: str | None = 'png', output_dir: str | None = '.', figsize: list[float] | None = [10, 4])
Generate a multi-track stacked linear Manhattan plot.
This is the primary user-facing entry point for linear Manhattan plots. It extracts DataFrames and labels from sumstats_loaded, resolves output file paths, then delegates all rendering to plot_linearm().
- Parameters:
sumstats_loaded (dict) – Mapping of label → [DataFrame, n_chroms] as returned by get_sumstats_and_merged_sector_list(). One track is created per key; the DataFrames must have canonical columns CHR, POS, P, and logP (when logp is True).
trim_pval (float, optional) – Reserved; trimming is applied upstream. Default None.
track_heights (list of float, optional) – Relative track heights, parsed from a comma-separated value (e.g. [2.0, 2.0, 1.5]). Passed directly to plot_linearm() as the gridspec height ratios for the data tracks. When None, all tracks are equal.
logp (bool, optional) – Plot −log₁₀(p) on the y-axis. Default False.
point_size (float, optional) – Scatter-plot point size. Default 8.
highlight (bool, optional) – Render significant-locus variants in highlight_color (brown by default). Default False.
highlight_thresh (float, optional) – P-value threshold for locus highlighting. Default 5e-8.
hits_table (pandas.DataFrame, optional) – Annotation DataFrame (hits summary table from get_hits_summary_table()). Must contain columns CHR, POS, and label_col. Passed to plot_linearm() as annot_df; None or an empty DataFrame suppresses annotations.
label_col (str, optional) – Column in hits_table to use as annotation text (e.g. 'top_gene'). Default None (falls back to the plot_linearm() default, 'SNP').
chr_spacing (float, optional) – Horizontal gap between chromosomes in base-pairs. Default 9e6.
linear_track_spacing (float, optional) – Vertical space between tracks as a fraction of average track height. Default 0.10.
annot_rail_frac (float, optional) – Fraction of the annotation track's width, centred horizontally, within which annotation texts are placed. Default 0.98 (texts span the central 98% of the track).
colors (list of str, optional) – Two alternating chromosome colours. Default ['steelblue', 'silver'].
signif_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track, in the same order as sumstats_loaded. Produced by get_sumstats_and_merged_sector_list().
plot_title (str, optional) – Human-readable title used as the output file-name stem. Passed to get_output_paths().
no_track_labels (bool, optional) – Suppress per-track labels. Default False.
ylabel (str, optional) – Override the shared y-axis label (left margin). Useful for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is "-log₁₀(p-value)" if logp is True and "P" otherwise.
dpi (int, optional) – Output resolution in dots per inch. Default 300.
output_format (str, optional) – Image format ('png', 'pdf', 'svg', 'jpg'). Default 'png'.
output_dir (str or pathlib.Path, optional) – Directory in which to save the output files. Default '.'.
figsize (tuple of (float, float), optional) – Figure dimensions (width, height) in inches. Default [10, 4].
- Returns:
fig (matplotlib.figure.Figure) – The completed figure.
axes (list of matplotlib.axes.Axes) – All axes: axes[0] is the annotation sub-panel; axes[1:] are the per-track data axes.
See also
plot_linearm – The underlying rendering engine called by this function.
pycmplot.io.get_sumstats_and_merged_sector_list – Produces sumstats_loaded and signif_lines.
Examples
>>> from pycmplot.plotting.linear import plot_linear
>>> fig, axes = plot_linear(
...     sumstats_loaded=loaded,
...     logp=True,
...     highlight=True,
...     hits_table=hits,
...     label_col="top_gene",
...     signif_lines=sig_lines,
...     plot_title="RBC_Traits",
...     output_dir="./results",
... )
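The shape of the signif_lines argument used above can be sketched as follows. make_signif_lines is a hypothetical helper, not part of pycmplot (the real list comes from get_sumstats_and_merged_sector_list()), and the sketch assumes that when logp is True the thresholds are expressed on the −log₁₀ scale so they line up with the plotted y-values.

```python
import math


def make_signif_lines(n_tracks, genome=5e-8, suggestive=1e-5, logp=True):
    """Build one {'genome', 'suggestive'} dict per track.

    Assumption: with logp=True the lines live on the -log10 scale, matching
    the plotted y-values; otherwise the raw p-value thresholds are kept.
    """
    def scale(p):
        return -math.log10(p) if logp else p

    return [
        {"genome": scale(genome), "suggestive": scale(suggestive)}
        for _ in range(n_tracks)
    ]


# Two tracks, thresholds converted to the -log10 scale.
sig_lines = make_signif_lines(n_tracks=2)
```

Each track gets its own dict, so per-track thresholds can be adjusted independently before plotting.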
- pycmplot.plotting.linear.plot_linearm(tracks: list, track_labels: list[str] | None = None, annot_df: DataFrame = None, annotate: bool = False, annotation_size: float = 8, highlight: bool = False, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', logp: bool = True, label_col: str | None = 'SNP', chr_order: list[str] | None = None, chr_spacing: float = 9000000.0, track_heights: list[float] | None = None, linear_track_spacing: float = 0.1, annot_rail_frac: float = 0.95, point_size: float = 8, colors: list[str] | None = ['steelblue', 'silver'], sig_lines: list[dict] | None = None, plt_name: str | None = None, no_track_labels: bool = False, ylabel: str | None = None, fig_format: str | None = None, dpi: int = 300, figsize: list[float] | None = [10, 4])
Core rendering engine for the multi-track stacked linear Manhattan plot.
Builds a Figure with one annotation sub-panel at the top (for gene/SNP labels and connecting arrows) and n data tracks below it, one per element of tracks. All tracks share the same cumulative x-axis.
This function is called by the higher-level plot_linear() wrapper and is not normally invoked directly.
- Parameters:
tracks (list of pandas.DataFrame) – One DataFrame per GWAS trait. Each must have columns CHR, POS, P, and logP (when logp is True). The DataFrames are pre-processed internally (chromosome normalisation, highlighting, cumulative x-axis computation).
track_labels (list of str, optional) – Y-axis label for each track, in the same order as tracks. Defaults to ['Track 1', 'Track 2', …].
annot_df (pandas.DataFrame, optional) – Annotation DataFrame (typically the hits summary table). Must have columns CHR, POS, and label_col. When provided, a dashed vertical guide line is drawn through every annotated position across all tracks, and gene/SNP label arrows are drawn in the top sub-panel.
highlight (bool, optional) – If True, variants within 500 kb of a lead SNP (as determined by get_highlight_snps()) are rendered in highlight_color (brown by default). Default False.
highlight_thresh (float, optional) – P-value threshold used for locus highlighting. Default 5e-8.
trim_pval (float, optional) – Reserved for future use; trimming is currently handled upstream in get_sumstats_and_merged_sector_list().
logp (bool, optional) – If True, plot −log₁₀(p) on the y-axis; requires a logP column in each DataFrame. Default True.
label_col (str, optional) – Column in annot_df to use as annotation text labels (e.g. 'top_gene'). Default 'SNP'.
chr_order (list of str, optional) – Chromosome display order. Defaults to CHROM_ORDER (autosomes 1–22, X, Y, MT).
chr_spacing (float, optional) – Gap in base-pairs inserted between consecutive chromosomes on the x-axis. Default 9e6.
track_heights (list of float, optional) – Relative height ratios for the gridspec rows. The first element controls the annotation sub-panel; subsequent elements control the data tracks. When None, the annotation panel is given a weight of 1 and each data track a weight of 3.
linear_track_spacing (float, optional) – Vertical hspace between tracks as a fraction of average track height. Default 0.10.
annot_rail_frac (float, optional) – Fraction of the annotation track's width, centred horizontally, within which annotation texts are placed. Default 0.95 (texts span the central 95% of the track).
point_size (float, optional) – Scatter-plot point size passed to matplotlib.axes.Axes.scatter(). Default 8.
colors (list of str, optional) – Two alternating matplotlib colour strings for even/odd chromosomes. Default ['steelblue', 'silver'].
sig_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track. 'genome' values are drawn as red dashed lines; 'suggestive' values as grey dashed lines.
plt_name (str, optional) – Full output file path (including extension). When provided, the figure is saved to disk.
no_track_labels (bool, optional) – If True, per-track labels are suppressed. Default False.
ylabel (str, optional) – Shared y-axis label rendered on the left of the figure. Use this to set a sensible label for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is derived automatically: "-log₁₀(p-value)" when logp is True, otherwise the p-value column name ("P").
fig_format (str, optional) – Output image format ('png', 'pdf', 'svg'). Inferred from the extension of plt_name when None.
dpi (int, optional) – Output resolution in dots per inch. Default 300.
figsize (tuple of (float, float), optional) – Figure dimensions (width, height) in inches. Default [10, 4].
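The track_heights default described above can be sketched as a small helper. resolve_height_ratios is hypothetical, and the length check is an assumption of this sketch rather than documented behaviour; only the 1 : 3 default weighting comes from the parameter description.

```python
def resolve_height_ratios(n_tracks, track_heights=None):
    """Return gridspec height ratios: annotation panel first, then data tracks."""
    if track_heights is None:
        # Documented default: annotation panel weight 1, each data track weight 3.
        return [1.0] + [3.0] * n_tracks
    if len(track_heights) != n_tracks + 1:
        # Assumed validation (not confirmed by the source).
        raise ValueError("expected one ratio for the annotation panel plus one per track")
    return [float(h) for h in track_heights]


default_ratios = resolve_height_ratios(3)
custom_ratios = resolve_height_ratios(2, [1, 2, 4])
```

A list like this is what matplotlib's Figure.add_gridspec(nrows, ncols, height_ratios=..., hspace=...) accepts for stacked rows.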
- Returns:
fig (matplotlib.figure.Figure) – The completed figure.
axes (list of matplotlib.axes.Axes) – All axes in the figure: axes[0] is the annotation sub-panel; axes[1:] are the data-track axes in the same order as tracks.
See also
plot_linear – User-facing wrapper that extracts DataFrames from the sumstats_loaded dict and resolves output paths before calling this function.
_draw_annotation_arrows – Called internally to render gene/SNP label arrows in axes[0].
pycmplot.stats.get_highlight_snps – Called internally when highlight is True.
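The shared cumulative x-axis with a chr_spacing gap can be sketched as below. This is an illustration under stated assumptions, not the library's code: cumulative_offsets is a hypothetical helper, and chromosome extents are taken as plain maximum positions.

```python
def cumulative_offsets(chrom_lengths, chr_order, chr_spacing=9e6):
    """Map each chromosome to the x-offset where it starts on a shared axis."""
    offsets = {}
    cursor = 0.0
    for chrom in chr_order:
        if chrom not in chrom_lengths:
            continue  # skip chromosomes absent from the data
        offsets[chrom] = cursor
        # Next chromosome starts after this one plus the spacing gap.
        cursor += chrom_lengths[chrom] + chr_spacing
    return offsets


# Illustrative (rounded) extents in base-pairs.
lengths = {"1": 248_000_000, "2": 242_000_000, "X": 156_000_000}
offsets = cumulative_offsets(lengths, chr_order=["1", "2", "X"])

# A variant at chr2:1,000,000 is plotted at its position plus the chr2 offset.
x_plot = offsets["2"] + 1_000_000
```

Because every track uses the same offsets, a locus sits at the same x-coordinate in all stacked panels.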
pycmplot.plotting.circular
Generates multi-track Circos-style circular Manhattan plots. Track radii are computed automatically to give each track proportional visual weight relative to its data range.
Circos-style multi-track circular Manhattan plot.
The module exposes two public functions and one internal per-sector helper:
plot_circular() – user-facing entry point. Configures the pycirclize.Circos canvas, computes track radii, iterates over sectors and tracks, renders gene/SNP annotations, and saves the figure.
compute_track_radii_dict() – divides the radial space between r_min and r_max into n_tracks evenly-spaced, padded bands and returns their (r_start, r_end) limits.
plot_circosm() – internal per-sector renderer called once per (sector, sumstat) pair inside the main loop of plot_circular(). Mutates the pycirclize.Sector object in place and returns None.
- pycmplot.plotting.circular.compute_track_radii_dict(n_tracks: int, r_min: float = 20, r_max: float = 100, pad: float = 1, annotate: bool = False) → dict[str, tuple[float, float]]
Compute (r_start, r_end) tuples for n_tracks evenly-spaced radial bands.
Divides the usable radial space between r_min and r_max into n_tracks bands of equal height, separated by gaps of pad units. The tracks are ordered from innermost ('track_1') to outermost ('track_n').
- Parameters:
n_tracks (int) – Number of data tracks to accommodate.
r_min (float, optional) – Inner boundary of the full plotting area (as a percentage of the figure radius). Default 20.
r_max (float, optional) – Outer boundary of the full plotting area. Default 100.
pad (float, optional) – Gap in the same radius units between consecutive tracks. Default 1.
annotate (bool, optional) – If True, an extra slot is reserved for the annotation ring by incrementing n_tracks before computing heights. The extra slot is always placed at the outermost position. Default False.
- Returns:
Mapping of 'track_i' → (r_start, r_end) for i in 1 … n_tracks (plus one extra entry when annotate is True).
- Return type:
dict[str, tuple[float, float]]
- Raises:
ValueError – If the total padding pad × (n_tracks − 1) exceeds the available radial space r_max − r_min.
Examples
>>> from pycmplot.plotting.circular import compute_track_radii_dict
>>> radii = compute_track_radii_dict(n_tracks=3, r_min=20, r_max=100, pad=2)
>>> list(radii.items())
[('track_1', (20.0, 45.33...)), ('track_2', (47.33..., 72.66...)), ('track_3', (74.66..., 100.0))]
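The arithmetic behind these numbers can be reproduced with a short stand-alone sketch. sketch_track_radii is a hypothetical re-derivation from the description above, not the library function; the n_tracks < 1 guard is an extra assumption.

```python
def sketch_track_radii(n_tracks, r_min=20.0, r_max=100.0, pad=1.0, annotate=False):
    """Equal-height radial bands separated by `pad`, innermost first."""
    if annotate:
        n_tracks += 1  # reserve one extra (outermost) slot for annotations
    if n_tracks < 1:
        raise ValueError("need at least one track")  # assumed guard
    total_pad = pad * (n_tracks - 1)
    if total_pad > r_max - r_min:
        raise ValueError("total padding exceeds the available radial space")
    height = (r_max - r_min - total_pad) / n_tracks
    radii = {}
    for i in range(n_tracks):
        start = r_min + i * (height + pad)
        radii[f"track_{i + 1}"] = (start, start + height)
    return radii


# Same inputs as the doctest above: 80 units minus 2 gaps of 2 leaves
# 76 units, i.e. about 25.33 per track.
radii = sketch_track_radii(n_tracks=3, r_min=20, r_max=100, pad=2)
```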
- pycmplot.plotting.circular.plot_circosm(sector=None, sector_radius=None, assoc: DataFrame | None = None, assoc_by_chr: DataFrame = None, sector_sizes: dict | None = None, chrom_label_loc: float | None = -3, chrom_label_size: float = 6, track_label_size: float = 6, track_label_orientation: str | None = 'vertical', track_index: int = 0, assoc_label: str | None = None, logp: bool = True, signif_line: float | None = 5e-08, signif_threshold: float | None = 5e-08, suggest_line: float | None = 1e-05, suggest_threshold: float | None = 1e-05, highlight: bool = False, highlight_color: str = 'brown', colors: list[str] | None = ['steelblue', 'orange'], point_size: float = 6, no_track_labels: bool = False) → None
Plot one track of summary statistics onto a single pycirclize sector.
This is a low-level internal function called once for every (sector, sumstat) combination in the plot_circular() main loop. It adds a scatter track to sector in-place and optionally draws significance lines, y-axis ticks (on the first chromosome only), and chromosome labels. Returns None.
- Parameters:
sector (pycirclize.Sector) – The pycirclize Sector object representing one chromosome arc.
sector_radius (tuple of (float, float)) – (r_start, r_end) radial limits for this track within sector, as returned by compute_track_radii_dict().
assoc (pandas.DataFrame, optional) – Full summary statistics DataFrame (all chromosomes). Filtered to the current sector's chromosome internally. Must have columns CHR, POS, P, and logP (when logp is True).
sector_sizes (dict, optional) – Ordered mapping of chrom → [min_pos, max_pos] as returned by get_sumstats_and_merged_sector_list(). Used to identify the first and last sectors for y-axis ticks and track labels.
chrom_label_loc (float or None) – Radial position at which to draw the chromosome label. Computed in plot_circular() from chrom_label_side, r_min, and r_max.
chrom_label_size (float, optional) – Font size for chromosome labels. Default 6.
track_label_size (float, optional) – Font size for the track (sumstat) label written on the spacer sector. Default 6.
track_label_orientation ({'vertical', 'horizontal'}, optional) – Orientation of the track label text. Default 'vertical'.
track_index (int, optional) – 0-based index of the current sumstat track. Chromosome labels are only drawn when track_index == 0 (or for chromosome X). Default 0.
assoc_label (str, optional) – Track label text (sumstat name) rendered on the spacer sector.
logp (bool, optional) – If True, use the logP column for y-values and threshold comparisons. Default True.
signif_line (float, optional) – Y-value at which to draw the genome-wide significance dashed line (orange-red). Default 5e-8.
signif_threshold (float, optional) – Significance threshold used for y-axis scaling. Default 5e-8.
suggest_line (float or bool, optional) – Y-value for the suggestive significance dashed line (light blue). Pass False or None to suppress. Default 1e-5.
suggest_threshold (float, optional) – Suggestive threshold value used for y-axis scaling. Default 1e-5.
highlight (bool, optional) – If True, variants within significant loci (in_locus == True after get_highlight_snps()) are rendered in highlight_color. Default False.
highlight_color (str, optional) – Colour of highlighted positions when highlight is True. Default 'brown'.
colors (list of str, optional) – Two alternating colours for even/odd chromosome numbers. Default ['steelblue', 'orange'].
no_track_labels (bool, optional) – Suppress the track label on the spacer sector. Default False.
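The in_locus flag used for highlighting can be pictured with a stand-in for get_highlight_snps(). The real function's lead-SNP rule is not described here, so this sketch assumes a greedy one: significant variants are visited from the smallest p-value upward, and a variant becomes a lead SNP unless it already lies within the window of an earlier lead. The 500 kb window follows the plot_linearm() description; flag_in_locus itself is hypothetical.

```python
def flag_in_locus(variants, thresh=5e-8, window=500_000):
    """Return one bool per variant: True if within `window` bp of a lead SNP.

    variants: list of (chrom, pos, p) tuples.
    Assumed lead-SNP rule (not taken from pycmplot): greedy by ascending p.
    """
    significant = sorted((v for v in variants if v[2] < thresh), key=lambda v: v[2])
    leads = []
    for chrom, pos, _ in significant:
        near_existing = any(c == chrom and abs(pos - lp) <= window for c, lp in leads)
        if not near_existing:
            leads.append((chrom, pos))
    return [
        any(c == chrom and abs(pos - lp) <= window for c, lp in leads)
        for chrom, pos, _ in variants
    ]


variants = [
    ("1", 1_000_000, 1e-9),   # lead SNP
    ("1", 1_200_000, 0.3),    # non-significant, but inside the lead's window
    ("1", 5_000_000, 0.01),   # too far from the lead
    ("2", 1_000_000, 1e-6),   # other chromosome, below genome-wide significance
]
flags = flag_in_locus(variants)
```

Variants flagged True would be drawn in highlight_color; the rest keep the alternating chromosome colours.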
- pycmplot.plotting.circular.plot_circular(sumstats_loaded: dict, sector_sizes: dict = None, signif_lines: dict = None, logp: bool = False, pad: float = 1, r_min: float = 20, r_max: float = 100, annotate: str = None, label_col: str = None, chrom_label_side: str = 'inside', chrom_label_size: float = 6, signif_line: float = 5e-08, highlight: bool = False, highlight_thresh: float = 5e-08, highlight_color: str = 'brown', highlight_line: bool = False, highlight_line_color: str = 'grey', colors: list[str] = ['steelblue', 'silver'], point_size: float = 6, track_label_size: float = 6, track_label_orientation: str = 'vertical', hits_table: DataFrame = None, annotation_size: float = 6, plot_title: str | None = None, plot_title_size: float = 12, dpi: int | None = None, output_format: str | None = 'png', output_dir: str | None = '.', ylabel: str | None = None, no_track_labels: bool = False)
Generate a multi-track Circos-style circular Manhattan plot.
Sets up a pycirclize.Circos canvas with one arc sector per chromosome, computes radial track extents, and calls plot_circosm() once per (sector, sumstat) pair to populate each track with scatter data and significance lines. After all tracks are rendered, gene or SNP annotations from hits_table are added to a dedicated annotation ring, and a shared y-axis label is placed on the spacer sector.
- Parameters:
sumstats_loaded (dict) – Mapping of label → [DataFrame, n_chroms] as returned by get_sumstats_and_merged_sector_list(). One radial track is created per key. The outermost track corresponds to the first key after reversal of the radii dict.
sector_sizes (dict, optional) – Ordered mapping of chrom → [min_pos, max_pos] defining the arc length of each chromosome sector. The last key is expected to be 'Spacer1' (automatically added by get_sumstats_and_merged_sector_list()).
signif_lines (list of dict, optional) – One {'genome': float, 'suggestive': float} dict per track in the same order as sumstats_loaded, as returned by get_sumstats_and_merged_sector_list().
logp (bool, optional) – Plot −log₁₀(p) radially. Default False.
pad (float, optional) – Gap in radius units between consecutive tracks. Default 1.
r_min (float, optional) – Inner radius of the innermost track (as a percentage of the figure radius). Default 20.
r_max (float, optional) – Outer radius of the outermost track. Default 100.
annotate ({'SNP', 'GENE'} or falsy, optional) – Annotation content for significant loci. 'GENE' uses nearest_upstream_gene for genic hits and top_gene for intergenic hits (italic text); 'SNP' uses the SNP column (regular text). Pass None or False to disable annotations. Default None.
chrom_label_side ({'inside', 'outside'}, optional) – Radial position of chromosome labels. 'inside' places them just inside the innermost track; 'outside' places them beyond the outermost track. Default 'inside'.
signif_line (float, optional) – Genome-wide significance threshold value for the orange-red dashed line. Default 5e-8.
highlight (bool, optional) – Render significant-locus variants in highlight_color (brown by default). Default False.
highlight_thresh (float, optional) – P-value threshold for locus highlighting. Default 5e-8.
highlight_color (str, optional) – Colour of highlighted positions when highlight is True. Default 'brown'.
colors (list of str, optional) – Two alternating chromosome colours. Default ['steelblue', 'silver'].
chrom_label_size (float, optional) – Chromosome label font size. Default 6.
track_label_size (float, optional) – Track (sumstat) label font size. Default 6.
track_label_orientation ({'vertical', 'horizontal'}, optional) – Track label text orientation. Default 'vertical'.
hits_table (pandas.DataFrame, optional) – Hits summary table from get_hits_summary_table(). Required for annotations (annotate truthy and hits_table non-empty).
annotation_size (float, optional) – Font size for annotation labels. Default 6.
highlight_line (bool, optional) – Draw a dashed radial line from the innermost track to the annotation ring for each annotated position. Default False.
highlight_line_color (str, optional) – Colour of the highlight line when highlight_line is True. Default 'grey'.
plot_title (str, optional) – Text placed in the centre of the circle and used as the output file-name stem.
plot_title_size (float, optional) – Font size for the centre title. Default 12.
dpi (int, optional) – Output resolution in dots per inch. Default 300.
output_format (str, optional) – Image format ('png', 'pdf', 'svg', 'jpg'). Default 'png'.
output_dir (str or pathlib.Path, optional) – Output directory. Default '.'.
ylabel (str, optional) – Override the shared y-axis label (left margin). Useful for non-p-value statistics such as iHS, F_ST or XP-EHH (e.g. ylabel="iHS"). When None (the default), the label is "-log₁₀(p-value)" if logp is True and "P" otherwise.
no_track_labels (bool, optional) – Suppress track labels on the spacer sector. Default False.
- Returns:
The completed circular Manhattan figure (also saved to output_dir).
- Return type:
See also
pycmplot.plotting.linear.plot_linear – Linear (stacked) counterpart to this function.
compute_track_radii_dict – Computes the (r_start, r_end) limits for each track.
pycmplot.io.get_sumstats_and_merged_sector_list – Produces sumstats_loaded, sector_sizes, and signif_lines.
Examples
>>> from pycmplot.plotting.circular import plot_circular
>>> fig = plot_circular(
...     sumstats_loaded=loaded,
...     sector_sizes=sectors,
...     signif_lines=sig_lines,
...     logp=True,
...     highlight=True,
...     annotate="GENE",
...     hits_table=hits,
...     plot_title="RBC_Traits",
...     output_dir="./results",
... )
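The sector_sizes mapping passed above is normally produced by get_sumstats_and_merged_sector_list(). Its documented shape (ordered chrom → [min_pos, max_pos], ending with 'Spacer1') can be sketched as follows; build_sector_sizes is a hypothetical stand-in, and the spacer width is an arbitrary illustrative value.

```python
def build_sector_sizes(variants, chr_order, spacer_width=20_000_000):
    """Build an ordered chrom -> [min_pos, max_pos] mapping.

    variants: list of (chrom, pos) tuples.
    """
    sizes = {}
    for chrom in chr_order:
        positions = [pos for c, pos in variants if c == chrom]
        if positions:
            sizes[chrom] = [min(positions), max(positions)]
    # Trailing spacer sector: documented key name, hypothetical width.
    sizes["Spacer1"] = [0, spacer_width]
    return sizes


variants = [("2", 500), ("1", 100), ("1", 900), ("2", 50)]
sizes = build_sector_sizes(variants, chr_order=["1", "2", "X"])
```

Chromosomes with no variants are skipped, and the dict's insertion order fixes the order of the arcs around the circle.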
pycmplot.plotting.qq
Produces QQ plots with 95 % beta-distribution confidence bands, optional genome-wide significance lines, and genomic inflation (λ) annotation. Supports log-uniform point thinning for fast plotting of large datasets. Three high-level layouts are provided: combined (grid of per-trait panels), separate (one file per trait), and overlay (all traits on one shared axes).
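The x-axis of a QQ plot comes from order statistics of the uniform null. A minimal sketch, assuming the plotting position k / (n + 1); pycmplot's exact convention is not stated here, and expected_neglog10 is a hypothetical helper.

```python
import math


def expected_neglog10(ranks, n_full):
    """Expected -log10(p) for 1-based ranks under the uniform null.

    The k-th smallest of n uniform p-values follows Beta(k, n - k + 1),
    whose mean is k / (n + 1).
    """
    return [-math.log10(k / (n_full + 1)) for k in ranks]


pvals = [0.5, 0.001, 0.2, 0.04]
sorted_p = sorted(pvals)
observed = [-math.log10(p) for p in sorted_p]
expected = expected_neglog10(range(1, len(sorted_p) + 1), len(sorted_p))
```

A confidence band would take the lower and upper quantiles of the same Beta(k, n − k + 1) distribution at each rank, for example with scipy.stats.beta.ppf if SciPy is available.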
- pycmplot.plotting.qq.plot_qq_combined(pval_dict: dict[str, ndarray | Series], colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, signif_threshold: float | None = 5e-08, show_lambda: bool = True, ncols: int = 3, figsize: tuple | None = None, dpi: int = 300, title: str | None = None, output_path: str | None = None, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → tuple[Figure, list[Axes]]
Plot all QQ plots in a single figure arranged in a grid.
- Parameters:
pval_dict – Ordered dict of {label: p_value_array}.
colors – List of colours, one per track. Cycled if there are fewer colours than tracks.
ncols – Number of columns in the subplot grid (default 3).
figsize – Figure size. Auto-calculated from ncols and the number of tracks if None.
output_path – If given, save the figure here.
thin – See plot_qq_single().
thin_below – See plot_qq_single().
max_points – See plot_qq_single().
rasterized – See plot_qq_single().
- Return type:
(fig, axes)
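The grid and colour-cycling behaviour described for ncols, figsize and colors can be sketched as follows. The helpers are hypothetical; the 4-inch panel size and the clamping of ncols to the number of panels are assumptions of this sketch, not documented defaults.

```python
import math
from itertools import cycle, islice


def grid_layout(n_panels, ncols=3, panel_inches=4.0):
    """Return (nrows, ncols, figsize) for a grid holding n_panels subplots."""
    ncols = max(1, min(ncols, n_panels))  # assumed: no empty columns
    nrows = math.ceil(n_panels / ncols)
    return nrows, ncols, (ncols * panel_inches, nrows * panel_inches)


def cycled_colors(colors, n_panels):
    """Repeat the colour list as often as needed to cover every panel."""
    return list(islice(cycle(colors), n_panels))


layout = grid_layout(5)                                    # five traits, three columns
palette = cycled_colors(["steelblue", "silver"], 5)
```

With five traits the last row holds two panels; the unused third cell would typically be hidden.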
- pycmplot.plotting.qq.plot_qq_overlay(pval_dict: dict[str, ndarray | Series], colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, ci_alpha: float = 0.1, signif_threshold: float | None = 5e-08, show_lambda: bool = True, figsize: tuple = (6, 6), dpi: int = 300, title: str | None = None, output_path: str | None = None, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → tuple[Figure, Axes]
Plot all sumstats on a single QQ axes, each coloured differently.
Lambda (λ) values appear in the legend label for each sumstat.
- Parameters:
pval_dict – Ordered dict of {label: p_value_array}.
colors – List of colours, one per sumstat. Defaults to the tab10 palette.
ci_alpha – Transparency of CI bands (default 0.10, lower than the single-panel default of 0.15, to keep overlapping bands readable).
show_lambda – Append λ to each legend entry.
thin – See plot_qq_single().
thin_below – See plot_qq_single().
max_points – See plot_qq_single().
rasterized – See plot_qq_single().
- Return type:
(fig, ax)
- pycmplot.plotting.qq.plot_qq_separate(pval_dict: dict[str, ndarray | Series], base_name: str = None, output_path: str = '.', colors: list[str] | None = None, point_size: float = 8, ci: float = 0.95, signif_threshold: float | None = 5e-08, show_lambda: bool = True, figsize: tuple = (5, 5), dpi: int = 300, fig_format: str = 'png', thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → list[str]
Save one QQ plot per sumstat as individual files.
- Parameters:
pval_dict – Ordered dict of {label: p_value_array}.
output_path – Directory to save files in. Default '.'.
base_name – Prefix for output filenames.
colors – List of colours, one per track.
thin – See plot_qq_single().
thin_below – See plot_qq_single().
max_points – See plot_qq_single().
rasterized – See plot_qq_single().
- Return type:
List of output file paths.
- pycmplot.plotting.qq.plot_qq_single(pvals: ndarray | Series, ax: Axes, label: str | None = None, color: str = 'steelblue', point_size: float = 8, ci: float = 0.95, ci_alpha: float = 0.15, signif_threshold: float | None = 5e-08, show_lambda: bool = True, title: str | None = None, thin: bool = True, thin_below: float = 0.01, max_points: int = 50000, fontsize: float = 8, rasterized: bool = True) → Axes
Draw a single QQ plot onto ax.
- Parameters:
pvals – Array or Series of raw p-values (not −log10).
ax – Matplotlib Axes to draw on.
label – Legend label for the scatter points.
color – Colour for points and CI fill.
point_size – Scatter point size.
ci – Confidence interval level (default 0.95).
ci_alpha – Transparency of the CI band.
signif_threshold – If given, draw a horizontal dashed line at −log10(threshold).
show_lambda – Annotate the plot with the genomic inflation factor λ.
title – Axes title.
thin – Enable p-value thinning for speed (default True).
thin_below – P-value threshold below which all points are always kept. Points above this threshold are downsampled.
max_points – Maximum number of points to plot after thinning (default 50 000).
rasterized – Render the scatter as a bitmap inside vector output formats; this greatly reduces PDF/SVG file size (default True).
- Return type:
plt.Axes
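The λ shown when show_lambda is True is the genomic inflation factor. A minimal sketch of its conventional definition (median observed 1-df χ² divided by the null median, about 0.455), using only the standard library; whether pycmplot computes it exactly this way is an assumption, and genomic_inflation is a hypothetical name.

```python
from statistics import NormalDist, median


def genomic_inflation(pvals):
    """lambda_GC = median observed 1-df chi-square / median of the null chi-square."""
    nd = NormalDist()
    # For 1 df, chi2 = z**2 where z is the normal quantile of p / 2
    # (using the lower tail avoids rounding 1 - p/2 up to exactly 1.0).
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in pvals if 0.0 < p <= 1.0]
    if not chi2:
        raise ValueError("no valid p-values")
    null_median = nd.inv_cdf(0.75) ** 2  # median of chi-square with 1 df
    return median(chi2) / null_median


# Evenly spread p-values mimic the null, so lambda is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
```

Values well above 1 suggest inflation of the test statistics, for example from population stratification.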
- pycmplot.plotting.qq.thin_pvals(pvals: ndarray, tail_threshold: float = 0.01, max_points: int = 50000, seed: int = 42) → tuple[ndarray, ndarray, int]
Downsample p-values for faster QQ plotting with no visible breaks.
Rather than splitting into tail / bulk regions with different sampling strategies (which produces a visible seam at the threshold), this function uses a single log-uniform thinning pass over all p-values:
1. Sort p-values ascending and convert to −log₁₀ scale.
2. Pick max_points evenly-spaced indices along the −log₁₀ axis. Because −log₁₀ compresses large p-values and expands small ones, this automatically gives dense coverage in the interesting tail and sparse coverage in the null bulk, with no hard boundary.
- Parameters:
pvals – Full array of raw p-values.
tail_threshold – Kept for API compatibility; no longer used as a hard split point. All points above −log₁₀(tail_threshold) are always represented because the log-uniform spacing naturally keeps them.
max_points – Maximum number of points to return (default 50 000).
seed – Unused (kept for API compatibility — log-uniform selection is deterministic).
- Returns:
kept_pvals – thinned p-values in ascending order.
kept_ranks – 1-based ranks in the full sorted array.
n_full – total SNP count before thinning (for expected quantiles).
- Return type:
(kept_pvals, kept_ranks, n_full)
Notes
Lambda (λ) must be computed on the full pvals array before calling this function — thinning changes the empirical distribution.
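The log-uniform thinning pass can be sketched with the standard library alone. This is an illustration of the idea, not the library's code: thin_log_uniform is a hypothetical name, and mapping each evenly spaced −log₁₀ target to the first value at or above it is one assumed way to "pick indices along the axis".

```python
import math
from bisect import bisect_left


def thin_log_uniform(pvals, max_points=50_000):
    """Return (kept_pvals, kept_ranks, n_full) after log-uniform thinning."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    sorted_p = sorted(p for p in pvals if 0.0 < p <= 1.0)
    n_full = len(sorted_p)
    if n_full <= max_points:
        return sorted_p, list(range(1, n_full + 1)), n_full
    # Ascending -log10 values, i.e. from the largest p to the smallest.
    neglog = [-math.log10(p) for p in reversed(sorted_p)]
    lo, hi = neglog[0], neglog[-1]
    keep = set()
    for i in range(max_points):
        # Evenly spaced targets on the -log10 axis; pin the last one to hi.
        target = hi if i == max_points - 1 else lo + (hi - lo) * i / (max_points - 1)
        j = min(bisect_left(neglog, target), n_full - 1)
        keep.add(n_full - 1 - j)  # index back into ascending-p order
    idx = sorted(keep)
    return [sorted_p[i] for i in idx], [i + 1 for i in idx], n_full


demo = [(i + 1) / 10001 for i in range(10000)]
kept, ranks, n_full = thin_log_uniform(demo, max_points=200)
```

Several targets can land on the same p-value, so fewer than max_points points may be returned. The kept ranks together with n_full are what the expected quantiles need, so the thinned points stay on the same curve as the full data.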