API

Boosting

boosting.boost(csm_fdr=(0.0, 1.0), pep_fdr=(0.0, 1.0), prot_fdr=(0.0, 1.0), link_fdr=(0.0, 1.0), ppi_fdr=(0.0, 1.0), boost_cols=None, neg_boost_cols=None, boost_level='ppi', boost_between=True, method='manhattan', decoy_adjunct='REV_', countdown=3, points=10, n_jobs=-1, **kwargs)

Find the best FDR cutoffs to optimize results for a certain FDR level.

Parameters:

df (polars.dataframe.frame.DataFrame) – CSM DataFrame
csm_fdr ((float, float)) – Search range for CSM FDR level cutoff
pep_fdr ((float, float)) – Search range for peptide FDR level cutoff
prot_fdr ((float, float)) – Search range for protein FDR level cutoff
link_fdr ((float, float)) – Search range for residue link FDR level cutoff
ppi_fdr ((float, float)) – Search range for protein pair FDR level cutoff
boost_cols (list) – Columns in which to look for lower cutoffs
neg_boost_cols (list) – Columns in which to look for upper cutoffs
boost_level (str) – FDR level to boost for
boost_between (bool) – Whether to boost for between links
method (str) – Search algorithm to use
countdown (int) – Number of iterations without improvement before the search stops
points (int) – Number of FDR cutoffs to search in one iteration
n_jobs (int) – Number of threads to use

Return type:

(float, float, float, float, float)

Returns:

Returns a tuple with the optimal FDR levels.
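The interplay of points and countdown can be illustrated with a toy one-dimensional search. This is only a sketch under stated assumptions, not the library's search: search_cutoff and its score callback are hypothetical names, the zoom-in strategy is an assumption, and boost itself searches several FDR levels at once with the selected method.

```python
def search_cutoff(score, fdr_range=(0.0, 1.0), points=10, countdown=3):
    """Toy 1-D cutoff search (hypothetical helper, not part of the library).

    score: callable mapping a cutoff to a number to maximise.
    points: number of cutoffs evaluated per iteration.
    countdown: iterations without improvement before stopping.
    """
    if points < 2:
        raise ValueError("points must be at least 2")
    lower_bound, upper_bound = fdr_range
    low, high = lower_bound, upper_bound
    best_cutoff, best_score = low, score(low)
    stale = 0  # iterations since the last improvement
    while stale < countdown:
        # Evaluate an evenly spaced grid of `points` cutoffs in the window
        step = (high - low) / (points - 1)
        grid = [low + i * step for i in range(points)]
        candidate = max(grid, key=score)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_cutoff, best_score = candidate, candidate_score
            stale = 0
        else:
            stale += 1
        # Assumed refinement: shrink the window around the best cutoff so far
        low = max(lower_bound, best_cutoff - step)
        high = min(upper_bound, best_cutoff + step)
    return best_cutoff, best_score


# Toy objective with a single peak at a cutoff of 0.3
best, value = search_cutoff(lambda c: -(c - 0.3) ** 2)
```

A larger points value samples each window more densely at the cost of more evaluations per iteration, while a larger countdown tolerates more unproductive iterations before giving up.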

Multi-Level FDR calculation

fdr.full_fdr(csm_fdr=1.0, pep_fdr=1.0, prot_fdr=1.0, link_fdr=1.0, ppi_fdr=1.0, min_len=5, decoy_adjunct='REV_', unique_csm=True, filter_back=True, prepare_column=True, td_prob=2, td_prot_prob=10, td_dd_ratio=1.0, custom_aggs=None)

Parameters:

df (typing.Union[polars.dataframe.frame.DataFrame, pandas.DataFrame]) – Input CSM dataframe
csm_fdr (float) – CSM level FDR cutoff
pep_fdr (float) – Peptide level FDR cutoff
prot_fdr (float) – Protein level FDR cutoff
link_fdr (float) – Link level FDR cutoff
ppi_fdr (float) – Protein pair level FDR cutoff
min_len (int) – Minimum peptide sequence length
decoy_adjunct (str) – Prefix/Suffix indicating a decoy match
unique_csm (bool) – Make CSMs unique
filter_back (bool) – Filter lower levels to include only matches that also pass on higher levels
prepare_column (bool) – Perform preparation of aggregation columns like sorting ambiguous proteins and swapping protein 1/2
td_prob (int) – Minimum theoretical TD matches for the FDR levels (except protein level)
td_prot_prob (int) – Minimum theoretical TD matches for the protein FDR level
td_dd_ratio (float) – Minimum ratio of TD/DD
custom_aggs (dict) – Custom aggregation functions for the FDR levels

Return type:

dict[str, polars.dataframe.frame.DataFrame]

Returns:

Returns a dict with the keys ‘csm’, ‘pep’, ‘prot’, ‘link’ and ‘ppi’, each mapping to the resulting polars DataFrame for that FDR level.
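The idea behind a per-level cutoff combined with filter_back can be sketched in plain Python. This is an illustration under assumptions, not the function's implementation: the estimator (TD - DD) / TT is a commonly used target-decoy estimate for crosslinks and is assumed here, since the entry above does not state the formula; the helpers passing_by_fdr and filter_back_sketch, the TT/TD/DD labels and the dict fields are all hypothetical.

```python
def passing_by_fdr(matches, max_fdr):
    """Return the top-scoring matches up to the lowest-scoring position
    at which the estimated FDR is still <= max_fdr.

    Assumed estimator: (TD - DD) / TT over all matches at or above a score.
    """
    ranked = sorted(matches, key=lambda m: m["score"], reverse=True)
    counts = {"TT": 0, "TD": 0, "DD": 0}
    keep = 0
    for i, match in enumerate(ranked, start=1):
        counts[match["label"]] += 1
        if counts["TT"] == 0:
            continue  # no targets yet, FDR is undefined
        fdr = (counts["TD"] - counts["DD"]) / counts["TT"]
        if fdr <= max_fdr:
            keep = i
    return ranked[:keep]


def filter_back_sketch(lower, higher, key):
    """Keep lower-level matches whose `key` also passed the higher level."""
    passed = {m[key] for m in higher}
    return [m for m in lower if m[key] in passed]


# Made-up toy data: CSMs and the residue links they aggregate into
csms = [
    {"score": 10, "label": "TT", "link": "A-B"},
    {"score": 9, "label": "TT", "link": "A-B"},
    {"score": 8, "label": "TT", "link": "C-D"},
    {"score": 7, "label": "TD", "link": "E-F"},
    {"score": 6, "label": "TT", "link": "G-H"},
]
links = [
    {"score": 10, "label": "TT", "link": "A-B"},
    {"score": 8, "label": "TT", "link": "C-D"},
    {"score": 7, "label": "TD", "link": "E-F"},
    {"score": 6, "label": "TT", "link": "G-H"},
]

passing_csms = passing_by_fdr(csms, max_fdr=0.25)
passing_links = passing_by_fdr(links, max_fdr=0.0)
filtered_csms = filter_back_sketch(passing_csms, passing_links, key="link")
```

In this toy data, CSMs that pass their own cutoff are still dropped when their link fails the stricter link-level cutoff, which is the behaviour the filter_back description refers to.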