constants

Shared constants for column names and sentinel values used across the package. All names live on the Constants class so they can be imported from a single location without risk of typos.

class casanovoutils.constants.Constants

Bases: object

Global constants for column names and sentinel values.

ground_truth_sequence_column: str

Name of the column holding ground truth peptide sequences.

aa_scores_column: str

Name of the column holding per-amino-acid score strings.

pep_score_column: str

Name of the column holding peptide-level search engine scores.

aa_idx_column: str

Name of the column holding per-amino-acid positional indices, added during alignment and explosion.

precision_column: str

Name of the column holding cumulative precision values computed by calc_precision_coverage.

coverage_column: str

Name of the column holding cumulative coverage values computed by calc_precision_coverage.

min_score: float

Sentinel score assigned to gap positions during sequence alignment.
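
As an illustration of how a sentinel such as min_score might be used, the sketch below fills gap positions in an aligned token list with the documented value of -1.0. The helper name pad_gap_scores and the "-" gap symbol are hypothetical and are not taken from the package; the actual alignment code may represent gaps differently.

```python
from typing import List, Sequence

# Documented value of Constants.min_score; redefined here so the sketch is self-contained.
MIN_SCORE = -1.0


def pad_gap_scores(
    aligned_tokens: Sequence[str],
    scores: Sequence[float],
    gap: str = "-",  # hypothetical gap symbol, assumed for illustration
) -> List[float]:
    """Return one score per aligned position, using MIN_SCORE at gaps."""
    score_iter = iter(scores)
    # next() is only evaluated for non-gap positions, so real scores stay in order.
    return [MIN_SCORE if tok == gap else next(score_iter) for tok in aligned_tokens]


aligned = ["P", "E", "-", "T"]
padded = pad_gap_scores(aligned, [0.9, 0.8, 0.7])
# Gap positions can later be recognised by comparing against the sentinel.
real_positions = [i for i, s in enumerate(padded) if s != MIN_SCORE]
```

Because the sentinel lies below any score in the illustrative data, gap positions sort last and are easy to filter out.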

aa_idx_column: str = 'pc_aa_idx'
aa_scores_column: str = 'mztab_opt_ms_run[1]_aa_scores'
coverage_column: str = 'pc_coverage'
static get_pred_sequence_column(df: DataFrame) -> str

Determine the name of the predicted sequence column.

Checks for the presence of a ProForma-formatted prediction column first, falling back to the plain mzTab sequence column if it is absent.

Parameters:

df (pl.DataFrame) – A DataFrame expected to contain either "mztab_opt_ms_run[1]_proforma" or "mztab_sequence".

Returns:

The name of the predicted sequence column.

Return type:

str
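
The fallback described above can be sketched roughly as follows. This is not the package's implementation: to stay dependency-free it takes a plain sequence of column names (what a DataFrame's columns attribute would provide) rather than a pl.DataFrame, and the function name pick_pred_sequence_column is hypothetical.

```python
from typing import Sequence

# Column names quoted in the parameter description above.
PROFORMA_COLUMN = "mztab_opt_ms_run[1]_proforma"
PLAIN_COLUMN = "mztab_sequence"


def pick_pred_sequence_column(columns: Sequence[str]) -> str:
    """Prefer the ProForma column; otherwise fall back to the plain mzTab one."""
    if PROFORMA_COLUMN in columns:
        return PROFORMA_COLUMN
    # Assumption: no further validation is done, so the plain name is
    # returned even if that column is also missing.
    return PLAIN_COLUMN


with_proforma = pick_pred_sequence_column(["mgf_seq", PROFORMA_COLUMN, PLAIN_COLUMN])
without_proforma = pick_pred_sequence_column(["mgf_seq", PLAIN_COLUMN])
```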

ground_truth_sequence_column: str = 'mgf_seq'
ground_truth_tokens: str = 'mgf_tokens'
min_score: float = -1.0
pep_score_column: str = 'mztab_search_engine_score[1]'
precision_column: str = 'pc_precision'
predicted_tokens: str = 'mztab_tokens'
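
A minimal sketch of the intended access pattern, assuming only the values listed above. The stand-in class mirrors a few documented attributes so the example runs without the package installed; in real code one would presumably import Constants from casanovoutils.constants instead. The rows and the helper sort_by_peptide_score are made up for illustration.

```python
from typing import Dict, List


class Constants:
    """Stand-in mirroring documented values; not the package's own class."""

    ground_truth_sequence_column: str = "mgf_seq"
    pep_score_column: str = "mztab_search_engine_score[1]"


# Hypothetical records keyed by the shared column names.
rows: List[Dict[str, object]] = [
    {Constants.ground_truth_sequence_column: "SEQK", Constants.pep_score_column: 0.41},
    {Constants.ground_truth_sequence_column: "PEPTIDE", Constants.pep_score_column: 0.93},
]


def sort_by_peptide_score(records: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Order records from highest to lowest peptide-level score."""
    return sorted(records, key=lambda r: r[Constants.pep_score_column], reverse=True)


ranked = sort_by_peptide_score(rows)
top_sequence = ranked[0][Constants.ground_truth_sequence_column]
```

Referring to columns through attributes means a misspelled name fails immediately with an AttributeError instead of silently selecting nothing.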