API

Boosting

boosting.boost(csm_fdr=(0.0, 1.0), pep_fdr=(0.0, 1.0), prot_fdr=(0.0, 1.0), link_fdr=(0.0, 1.0), ppi_fdr=(0.0, 1.0), boost_cols=None, neg_boost_cols=None, boost_level='ppi', boost_between=True, method='manhattan', decoy_adjunct='REV_', countdown=3, points=10, n_jobs=-1, **kwargs)

Find the FDR cutoffs that optimize the results at the chosen FDR level (see boost_level).

Parameters:
  • df (polars.dataframe.frame.DataFrame) – CSM DataFrame

  • csm_fdr ((float, float)) – Search range for CSM FDR level cutoff

  • pep_fdr ((float, float)) – Search range for peptide FDR level cutoff

  • prot_fdr ((float, float)) – Search range for protein FDR level cutoff

  • link_fdr ((float, float)) – Search range for residue link FDR level cutoff

  • ppi_fdr ((float, float)) – Search range for protein pair FDR level cutoff

  • boost_cols (list) – Columns in which to look for lower cutoffs

  • neg_boost_cols (list) – Columns in which to look for upper cutoffs

  • boost_level (str) – FDR level to boost for

  • boost_between (bool) – Whether to boost for between links

  • method (str) – Search algorithm to use

  • countdown (int) – Number of iterations without improvement before the search stops

  • points (int) – Number of FDR cutoffs to search in one iteration

  • n_jobs (int) – Number of threads to use

Return type:

(float, float, float, float, float)

Returns:

A tuple with the optimal FDR cutoffs.
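The interplay of the search ranges, points and countdown can be illustrated with a small, self-contained sketch. This is not the library's implementation: coordinate_search, linspace, toy_score and max_iter are hypothetical names, and the axis-by-axis strategy is only an assumption about what a grid-based search with these parameters might look like. Each pass tries points candidate cutoffs per axis, narrows the range around the current best, and stops after countdown passes without improvement.

```python
from typing import Callable, Sequence


def linspace(lo: float, hi: float, points: int) -> list[float]:
    """Return `points` evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (points - 1)
    return [lo + i * step for i in range(points)]


def coordinate_search(
    score: Callable[[Sequence[float]], float],
    ranges: Sequence[tuple[float, float]],
    points: int = 10,
    countdown: int = 3,
    max_iter: int = 50,  # hypothetical safety cap, not a library parameter
) -> tuple[float, ...]:
    """Hypothetical axis-by-axis grid search; maximizes `score`."""
    if points < 2:
        raise ValueError("points must be at least 2")
    bounds = [tuple(r) for r in ranges]
    # Start from the most permissive cutoff (upper bound) on every axis.
    best = [hi for _, hi in ranges]
    best_score = score(best)
    stale = 0  # passes without improvement
    for _ in range(max_iter):
        if stale >= countdown:
            break
        improved = False
        for axis, (lo, hi) in enumerate(bounds):
            for value in linspace(lo, hi, points):
                candidate = best[:axis] + [value] + best[axis + 1:]
                s = score(candidate)
                if s > best_score:
                    best, best_score, improved = candidate, s, True
            # Narrow this axis to one grid step around the current best,
            # clipped to the range the caller originally allowed.
            step = (hi - lo) / (points - 1)
            bounds[axis] = (
                max(ranges[axis][0], best[axis] - step),
                min(ranges[axis][1], best[axis] + step),
            )
        stale = 0 if improved else stale + 1
    return tuple(best)


def toy_score(cutoffs: Sequence[float]) -> float:
    """Toy objective with its maximum at cutoffs (0.3, 0.7)."""
    a, b = cutoffs
    return -((a - 0.3) ** 2 + (b - 0.7) ** 2)


best_cutoffs = coordinate_search(toy_score, [(0.0, 1.0), (0.0, 1.0)], points=11)
```

In a real boosting run the objective would be the number of results passing at the boosted level rather than a toy function, and each evaluation would involve a full FDR calculation, which is why the number of points per iteration and the countdown directly control the runtime.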

Multi-Level FDR calculation

fdr.full_fdr(csm_fdr=1.0, pep_fdr=1.0, prot_fdr=1.0, link_fdr=1.0, ppi_fdr=1.0, min_len=5, decoy_adjunct='REV_', unique_csm=True, filter_back=True, prepare_column=True, td_prob=2, td_prot_prob=10, td_dd_ratio=1.0, custom_aggs=None)

Parameters:
  • df (typing.Union[polars.dataframe.frame.DataFrame, pandas.DataFrame]) – Input CSM dataframe

  • csm_fdr (float) – CSM level FDR cutoff

  • pep_fdr (float) – Peptide level FDR cutoff

  • prot_fdr (float) – Protein level FDR cutoff

  • link_fdr (float) – Link level FDR cutoff

  • ppi_fdr (float) – Protein pair level FDR cutoff

  • min_len (int) – Minimum peptide sequence length

  • decoy_adjunct (str) – Prefix/Suffix indicating a decoy match

  • unique_csm (bool) – Make CSMs unique

  • filter_back (bool) – Filter lower levels to include only matches that also pass on higher levels

  • prepare_column (bool) – Perform preparation of aggregation columns like sorting ambiguous proteins and swapping protein 1/2

  • td_prob (int) – Minimum theoretical TD matches for the FDR levels (except protein level)

  • td_prot_prob (int) – Minimum theoretical TD matches for the protein FDR level

  • td_dd_ratio (float) – Minimum ratio of TD/DD

  • custom_aggs (dict) – Custom aggregation functions for the FDR levels

Return type:

dict[str, polars.dataframe.frame.DataFrame]

Returns:

A dict with keys ‘csm’, ‘pep’, ‘prot’, ‘link’, ‘ppi’, each mapping to the resulting polars DataFrame for that FDR level.
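To make the TD/DD terminology and the role of decoy_adjunct concrete, here is a minimal sketch of a single-level target-decoy FDR filter for crosslink matches. It is not the library's code: match_class and fdr_filter are hypothetical helpers, matches are plain (score, protein1, protein2) tuples instead of DataFrame rows, ambiguous protein assignments are ignored, and the estimate FDR = (TD - DD) / TT is the commonly used crosslink formula, assumed here rather than taken from this reference.

```python
def match_class(protein1: str, protein2: str, decoy_adjunct: str = "REV_") -> str:
    """Classify a match as 'TT', 'TD' or 'DD' by counting decoy proteins.

    A protein counts as a decoy if its name starts or ends with the adjunct.
    """
    n_decoys = sum(
        p.startswith(decoy_adjunct) or p.endswith(decoy_adjunct)
        for p in (protein1, protein2)
    )
    return ("TT", "TD", "DD")[n_decoys]


def fdr_filter(matches, fdr_cutoff, decoy_adjunct="REV_"):
    """Return the longest score-ranked prefix of matches within the cutoff.

    matches: iterable of (score, protein1, protein2); higher score is better.
    The returned list still contains the decoy matches inside that prefix.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    counts = {"TT": 0, "TD": 0, "DD": 0}
    keep = 0
    for i, (_, p1, p2) in enumerate(ranked):
        counts[match_class(p1, p2, decoy_adjunct)] += 1
        # Assumed estimate: (TD - DD) / TT, clamped at zero.
        fdr = max(counts["TD"] - counts["DD"], 0) / max(counts["TT"], 1)
        if fdr <= fdr_cutoff:
            keep = i + 1  # remember the last rank that still passes
    return ranked[:keep]


example = [
    (10.0, "A", "B"),
    (9.0, "A", "C"),
    (8.0, "REV_A", "B"),
    (7.0, "B", "C"),
]
passing = fdr_filter(example, fdr_cutoff=0.05)
```

In this toy input the first decoy hit appears at rank three, so only the two top-scoring matches pass a 5% cutoff. A multi-level calculation would repeat this kind of filter after aggregating matches to the peptide, protein, link and protein pair levels.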