| Title: | Power Analysis for Phosphopeptide Abundance Hypothesis Test |
|---|---|
| Description: | Estimate best fit distributions and do power analysis for hypothesis tests on phosphopeptide abundance data. |
| Authors: | Dan MacLean [aut, cre] |
| Maintainer: | Dan MacLean <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-16 08:54:25 UTC |
| Source: | https://github.com/teammaclean/peppwr |
Calculates Spearman correlation between log(mean_abundance) and NA rate across peptides to detect whether low-abundance peptides have more missing values than high-abundance peptides.
compute_dataset_mnar(missingness)compute_dataset_mnar(missingness)
missingness |
A tibble with columns na_rate and mean_abundance (as produced by fit_distributions) |
A list with:
correlation: Spearman correlation coefficient (negative = MNAR pattern)
p_value: p-value for the correlation
n_peptides: Number of peptides with missing data used in calculation
interpretation: Human-readable interpretation string
MNAR (Missing Not At Random) in mass spectrometry typically manifests as low-abundance peptides having higher rates of missing values due to detection limits. This function detects this pattern by correlating mean abundance with NA rate across all peptides.
A negative correlation indicates that low-abundance peptides have more missing values - the hallmark of detection-limit-driven MNAR.
| Correlation (r) | Interpretation |
| r > -0.1 | No evidence of MNAR |
| -0.3 < r < -0.1 | Weak evidence |
| -0.5 < r < -0.3 | Moderate evidence |
| r < -0.5 | Strong evidence of MNAR |
Calculates the number and proportion of missing (NA) values.
compute_missingness(values)compute_missingness(values)
values |
Numeric vector (may contain NAs) |
List with:
n_total: Total number of values
n_missing: Number of NA values
na_rate: Proportion missing (0-1)
compute_dataset_mnar() for dataset-level MNAR detection
Fits candidate distributions to each peptide's abundance values and selects the best fit by AIC. Also computes missingness statistics including dataset-level MNAR detection.
fit_distributions(data, id, group, value, distributions = "continuous")fit_distributions(data, id, group, value, distributions = "continuous")
data |
A data frame containing peptide abundance data |
id |
Column name for peptide identifier |
group |
Column name for group/condition |
value |
Column name for abundance values |
distributions |
Which distributions to fit: "continuous" (default), "counts", "all", or a character vector of distribution names |
A peppwr_fits object containing:
$data: Nested tibble with original data and fit results
$best: Best-fitting distribution for each peptide
$missingness: Per-peptide missingness statistics
$dataset_mnar: Dataset-level MNAR correlation and interpretation
The returned object includes:
Per-peptide NA rates (in $missingness)
Dataset-level MNAR correlation (in $dataset_mnar)
The dataset-level MNAR metric correlates log(mean_abundance) with NA rate across peptides. A negative correlation indicates low-abundance peptides have more missing values - typical of detection-limit-driven MNAR.
Print the result to see both metrics. For small sample sizes (N < 15), the dataset-level correlation is more reliable than per-peptide scores.
compute_dataset_mnar() for dataset-level MNAR details,
plot_missingness() to visualize missingness patterns
Get distributions for a preset
get_distribution_preset(preset = "continuous")get_distribution_preset(preset = "continuous")
preset |
One of "continuous" (default), "counts", or "all" |
Character vector of distribution names
Check if data appears to be count data (non-negative integers)
is_count_data(x)is_count_data(x)
x |
Numeric vector |
TRUE if data looks like counts, FALSE otherwise
Create a new peppwr_fits object
new_peppwr_fits( data, fits, best, call, missingness = NULL, dataset_mnar = NULL )new_peppwr_fits( data, fits, best, call, missingness = NULL, dataset_mnar = NULL )
data |
Original data (nested tibble) |
fits |
Fit results per peptide (list of tibbles with dist, loglik, aic) |
best |
Best-fitting distribution per peptide (character vector) |
call |
Original function call |
missingness |
Tibble with missingness statistics per peptide (optional) |
dataset_mnar |
Dataset-level MNAR metric (optional, from compute_dataset_mnar) |
A peppwr_fits object
Create a new peppwr_power object
new_peppwr_power(mode, question, answer, simulations, params, call)new_peppwr_power(mode, question, answer, simulations, params, call)
mode |
Either "aggregate" or "per_peptide" |
question |
What was solved for: "power", "sample_size", or "effect_size" |
answer |
The computed answer |
simulations |
List of simulation details |
params |
List of input parameters used |
call |
Original function call |
A peppwr_power object
Plot density overlay: observed histogram with fitted density curve
plot_density_overlay(fits, peptide_id = NULL, n_overlay = 6)plot_density_overlay(fits, peptide_id = NULL, n_overlay = 6)
fits |
A peppwr_fits object |
peptide_id |
Specific peptide to plot (NULL for multiple) |
n_overlay |
Number of peptides to overlay when peptide_id is NULL |
A ggplot object
Creates a visualization of missing data patterns with two panels:
plot_missingness(fits)plot_missingness(fits)
fits |
A peppwr_fits object |
Panel 1: Distribution of NA rates across peptides
Panel 2: Mean abundance vs NA rate scatter plot showing the dataset-level MNAR correlation. The subtitle displays the Spearman correlation coefficient and p-value. A negative correlation indicates that low-abundance peptides have more missing values - the hallmark of detection-limit-driven MNAR.
A ggplot object or gtable (combined panels)
MNAR (Missing Not At Random) in mass spectrometry typically manifests as low-abundance peptides having higher rates of missing values due to detection limits. Panel 2 visualizes this relationship and reports the correlation coefficient.
compute_dataset_mnar() for the underlying correlation calculation
Plot distribution of fitted parameters across peptidome
plot_param_distribution(fits)plot_param_distribution(fits)
fits |
A peppwr_fits object |
A ggplot object
Plot power heatmap: N x effect size grid
plot_power_heatmap( distribution, params, n_range, effect_range, n_steps = 6, n_sim = 100, test = "wilcoxon", alpha = 0.05 )plot_power_heatmap( distribution, params, n_range, effect_range, n_steps = 6, n_sim = 100, test = "wilcoxon", alpha = 0.05 )
distribution |
Distribution name |
params |
Distribution parameters |
n_range |
Range of sample sizes (vector of length 2) |
effect_range |
Range of effect sizes (vector of length 2) |
n_steps |
Number of grid points per dimension |
n_sim |
Number of simulations per grid cell |
test |
Statistical test to use |
alpha |
Significance level |
A ggplot object
Plot power vs effect size at fixed N
plot_power_vs_effect(power_result, effect_range, n_steps = 10, n_sim = NULL)plot_power_vs_effect(power_result, effect_range, n_steps = 10, n_sim = NULL)
power_result |
A peppwr_power object |
effect_range |
Range of effect sizes to explore |
n_steps |
Number of effect size values to compute |
n_sim |
Number of simulations per point |
A ggplot object
Plot QQ plots for goodness-of-fit
plot_qq(fits, peptide_id = NULL, n_plots = 6)plot_qq(fits, peptide_id = NULL, n_plots = 6)
fits |
A peppwr_fits object |
peptide_id |
Specific peptide to plot (NULL for multiple) |
n_plots |
Number of peptides to plot when peptide_id is NULL |
A ggplot object
Creates a bar chart showing the count of best-fit distributions
## S3 method for class 'peppwr_fits' plot(x, ...)## S3 method for class 'peppwr_fits' plot(x, ...)
x |
A peppwr_fits object |
... |
Additional arguments (ignored) |
A ggplot object
Creates power curves or % peptides at threshold plots
## S3 method for class 'peppwr_power' plot(x, ...)## S3 method for class 'peppwr_power' plot(x, ...)
x |
A peppwr_power object |
... |
Additional arguments (ignored) |
A ggplot object
Power analysis for peptide abundance data
power_analysis(distribution, ...)power_analysis(distribution, ...)
distribution |
Distribution name (character) or peppwr_fits object for per-peptide mode |
... |
Additional arguments |
A peppwr_power object
Power analysis with specified distribution (aggregate mode)
Default power analysis method (aggregate mode with defaults)
## S3 method for class 'character' power_analysis( distribution, params, effect_size = NULL, n_per_group = NULL, target_power = NULL, alpha = 0.05, test = "wilcoxon", find = "power", n_sim = 1000, ... ) ## Default S3 method: power_analysis(distribution, ...)## S3 method for class 'character' power_analysis( distribution, params, effect_size = NULL, n_per_group = NULL, target_power = NULL, alpha = 0.05, test = "wilcoxon", find = "power", n_sim = 1000, ... ) ## Default S3 method: power_analysis(distribution, ...)
distribution |
Distribution name (e.g., "norm", "gamma", "lnorm") |
params |
List of distribution parameters |
effect_size |
Fold change to detect |
n_per_group |
Sample size per group (required for find="power") |
target_power |
Target power (required for find="sample_size" or find="effect_size") |
alpha |
Significance level (default 0.05) |
test |
Statistical test to use (default "wilcoxon") |
find |
What to solve for: "power", "sample_size", or "effect_size" |
n_sim |
Number of simulations (default 1000) |
... |
Additional arguments (ignored) |
A peppwr_power object
Power analysis for per-peptide mode using fitted distributions
## S3 method for class 'peppwr_fits' power_analysis( distribution, effect_size = NULL, n_per_group = NULL, target_power = NULL, alpha = 0.05, test = "wilcoxon", find = "power", n_sim = 1000, on_fit_failure = "exclude", proportion_threshold = 0.5, include_missingness = FALSE, apply_fdr = FALSE, prop_null = 0.9, fdr_threshold = 0.05, ... )## S3 method for class 'peppwr_fits' power_analysis( distribution, effect_size = NULL, n_per_group = NULL, target_power = NULL, alpha = 0.05, test = "wilcoxon", find = "power", n_sim = 1000, on_fit_failure = "exclude", proportion_threshold = 0.5, include_missingness = FALSE, apply_fdr = FALSE, prop_null = 0.9, fdr_threshold = 0.05, ... )
distribution |
A peppwr_fits object from fit_distributions() |
effect_size |
Fold change to detect |
n_per_group |
Sample size per group (required for find="power") |
target_power |
Target power (required for find="sample_size") |
alpha |
Significance level (default 0.05) |
test |
Statistical test to use (default "wilcoxon") |
find |
What to solve for: "power" or "sample_size" |
n_sim |
Number of simulations per peptide (default 1000) |
on_fit_failure |
How to handle failed fits: "exclude", "empirical", or "lognormal" |
proportion_threshold |
Proportion of peptides that must reach target_power (default 0.5) |
include_missingness |
If TRUE, incorporate peptide-specific NA rates into simulations |
apply_fdr |
If TRUE, use FDR-aware simulation with Benjamini-Hochberg correction.
Note: not compatible with |
prop_null |
Proportion of true null peptides (default 0.9 = 90% unchanged) |
fdr_threshold |
FDR threshold for calling discoveries (default 0.05) |
... |
Additional arguments (ignored) |
A peppwr_power object
Print method for peppwr_fits
## S3 method for class 'peppwr_fits' print(x, ...)## S3 method for class 'peppwr_fits' print(x, ...)
x |
A peppwr_fits object |
... |
Additional arguments (ignored) |
The object invisibly
Print method for peppwr_power
## S3 method for class 'peppwr_power' print(x, ...)## S3 method for class 'peppwr_power' print(x, ...)
x |
A peppwr_power object |
... |
Additional arguments (ignored) |
The object invisibly
Print method for summary.peppwr_fits
## S3 method for class 'summary.peppwr_fits' print(x, ...)## S3 method for class 'summary.peppwr_fits' print(x, ...)
x |
A summary.peppwr_fits object |
... |
Additional arguments (ignored) |
The object invisibly
Print method for summary.peppwr_power
## S3 method for class 'summary.peppwr_power' print(x, ...)## S3 method for class 'summary.peppwr_power' print(x, ...)
x |
A summary.peppwr_power object |
... |
Additional arguments (ignored) |
The object invisibly
Run power simulation
run_power_sim( distribution, params, n_per_group, effect_size, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )run_power_sim( distribution, params, n_per_group, effect_size, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )
distribution |
Distribution name |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment |
alpha |
Significance level |
test |
Statistical test to use ("wilcoxon", "bootstrap_t", "bayes_t", "rankprod") |
n_sim |
Number of simulations |
Power estimate (proportion of significant results)
Estimates power by repeatedly resampling from observed data rather than sampling from fitted distributions.
run_power_sim_empirical( raw_data, n_per_group, effect_size, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )run_power_sim_empirical( raw_data, n_per_group, effect_size, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )
raw_data |
Numeric vector of observed values |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment |
alpha |
Significance level |
test |
Statistical test to use |
n_sim |
Number of simulations |
Power estimate (proportion of significant results)
Simulates an entire peptidome experiment with a mix of true nulls and true alternatives, then applies Benjamini-Hochberg correction to compute FDR-adjusted power.
run_power_sim_fdr( fits, effect_size, n_per_group, prop_null = 0.9, fdr_threshold = 0.05, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )run_power_sim_fdr( fits, effect_size, n_per_group, prop_null = 0.9, fdr_threshold = 0.05, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )
fits |
A peppwr_fits object |
effect_size |
Fold change for treatment |
n_per_group |
Number of samples per group |
prop_null |
Proportion of peptides that are true nulls (no effect). Default 0.9 (90% unchanged). |
fdr_threshold |
FDR threshold for calling discoveries. Default 0.05. |
alpha |
Nominal significance level (used for simulation). Default 0.05. |
test |
Statistical test to use. Default "wilcoxon". |
n_sim |
Number of simulation iterations. Default 1000. |
FDR-adjusted power estimate (proportion of true alternatives detected)
Estimates power accounting for realistic missing data patterns.
run_power_sim_with_missingness( distribution, params, n_per_group, effect_size, na_rate = 0, mnar_score = 0, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )run_power_sim_with_missingness( distribution, params, n_per_group, effect_size, na_rate = 0, mnar_score = 0, alpha = 0.05, test = "wilcoxon", n_sim = 1000 )
distribution |
Distribution name |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment |
na_rate |
Proportion of values that will be NA |
mnar_score |
MNAR intensity (0 = MCAR) |
alpha |
Significance level |
test |
Statistical test to use |
n_sim |
Number of simulations |
Power estimate (proportion of significant results)
Resamples from observed data instead of using parametric distributions. Useful when distribution fitting fails or for non-parametric analysis.
simulate_empirical(raw_data, n_per_group, effect_size)simulate_empirical(raw_data, n_per_group, effect_size)
raw_data |
Numeric vector of observed values |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment (multiplicative effect) |
List with control and treatment vectors
Simulate an experiment with control and treatment groups
simulate_experiment(distribution, params, n_per_group, effect_size)simulate_experiment(distribution, params, n_per_group, effect_size)
distribution |
Distribution name (e.g., "norm", "gamma", "lnorm") |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment (multiplicative effect) |
List with control and treatment vectors
Generates control and treatment samples from a distribution, then introduces missing values according to the specified rate and MNAR pattern.
simulate_with_missingness( distribution, params, n_per_group, effect_size, na_rate = 0, mnar_score = 0 )simulate_with_missingness( distribution, params, n_per_group, effect_size, na_rate = 0, mnar_score = 0 )
distribution |
Distribution name (e.g., "norm", "gamma", "lnorm") |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment (multiplicative effect) |
na_rate |
Proportion of values to make NA (0-1) |
mnar_score |
MNAR intensity: 0 = MCAR, positive = low values more likely to be missing. Typical values: 0-3. |
List with control and treatment vectors (may contain NAs)
Summary method for peppwr_fits
## S3 method for class 'peppwr_fits' summary(object, ...)## S3 method for class 'peppwr_fits' summary(object, ...)
object |
A peppwr_fits object |
... |
Additional arguments (ignored) |
A list with summary statistics
Summary method for peppwr_power
## S3 method for class 'peppwr_power' summary(object, ...)## S3 method for class 'peppwr_power' summary(object, ...)
object |
A peppwr_power object |
... |
Additional arguments (ignored) |
A list with summary statistics
Computes Bayes factor for difference between two groups
test_bayes_t(control, treatment)test_bayes_t(control, treatment)
control |
Control group values |
treatment |
Treatment group values |
Bayes factor (BF10: evidence for alternative over null)
Performs a bootstrap-t test comparing two groups
test_bootstrap_t(control, treatment, n_boot = 1000)test_bootstrap_t(control, treatment, n_boot = 1000)
control |
Control group values |
treatment |
Treatment group values |
n_boot |
Number of bootstrap iterations (default 1000) |
p-value from the test
Performs rank products test comparing two groups Simplified implementation for two-group comparison
test_rankprod(control, treatment, n_perm = 1000)test_rankprod(control, treatment, n_perm = 1000)
control |
Control group values |
treatment |
Treatment group values |
n_perm |
Number of permutations for p-value estimation (default 1000) |
p-value from the test
Wilcoxon rank-sum test
test_wilcoxon(control, treatment)test_wilcoxon(control, treatment)
control |
Control group values |
treatment |
Treatment group values |
p-value from the test
Validate a peppwr_fits object
validate_peppwr_fits(x)validate_peppwr_fits(x)
x |
Object to validate |
The validated object (invisibly)
Validate a peppwr_power object
validate_peppwr_power(x)validate_peppwr_power(x)
x |
Object to validate |
The validated object (invisibly)