massdash.loaders.access.ResultsTSVDataAccess

class massdash.loaders.access.ResultsTSVDataAccess(filename: str, verbose: bool = False)

Bases: GenericResultsAccess

Class for generic access to a TSV file containing the results. Currently, only DIA-NN TSV files are supported.

detectResultsType(columns) Literal['OpenSWATH', 'DIA-NN', 'DreamDIA']

Detects the type of results file by looking at the column names

getExactRunName(run_basename_wo_ext: str) str

Returns the run name from the filename

getIdentifiedPeptides(qvalue: float = 0.01, run: str | None = None) set | Dict[str, set]

Get the identified peptides at a certain q-value.

Parameters:
  • qvalue – (float) The q-value threshold for identification

  • run – (str) The run name for which to get the identified peptides, if None, get for all runs

Returns:

The identified peptides across all runs (Dict[str, set]) or for a single run (set)
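The two return shapes can be illustrated with a minimal standard-library sketch. It does not use MassDash itself: the helper name `identified_peptides`, the DIA-NN-style column names (`Run`, `Modified.Sequence`, `Q.Value`) and the inclusive `<=` comparison are all assumptions made for illustration, not the library's confirmed behaviour.

```python
import csv
import io
from typing import Dict, Optional, Set, Union

# Tiny stand-in for a DIA-NN report; column names are assumed, values are made up.
REPORT = (
    "Run\tModified.Sequence\tQ.Value\n"
    "run1\tPEPTIDEK\t0.001\n"
    "run1\tSAMPLER\t0.05\n"
    "run2\tPEPTIDEK\t0.002\n"
    "run2\tANOTHERK\t0.009\n"
)


def identified_peptides(
    tsv_text: str, qvalue: float = 0.01, run: Optional[str] = None
) -> Union[Set[str], Dict[str, Set[str]]]:
    """Hypothetical helper mirroring the documented return shapes."""
    per_run: Dict[str, Set[str]] = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Register every run, even if none of its peptides pass the threshold.
        peptides = per_run.setdefault(row["Run"], set())
        # Assumption: a peptide is identified when its q-value is <= the threshold.
        if float(row["Q.Value"]) <= qvalue:
            peptides.add(row["Modified.Sequence"])
    if run is None:
        return per_run  # Dict[str, set]: one set per run
    return per_run.get(run, set())  # set: a single run


all_runs = identified_peptides(REPORT)
single_run = identified_peptides(REPORT, run="run1")
```

Here `all_runs` is a dictionary keyed by run name, while `single_run` is a plain set, matching the `set | Dict[str, set]` annotation in the signature.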

getIdentifiedPrecursorIntensities(qvalue: float = 0.01, run: str | None = None, precursorLevel=False) DataFrame

Get a dataframe of identified precursors and their intensities from the results file.

Parameters:
  • qvalue (float) – Q-value threshold

  • run (str) – Run name

  • precursorLevel (bool) – If True, do not filter by protein Q.Value (filter on the precursor level only). False is only supported for the DIA-NN results type; for other results types it is automatically treated as True.

getIdentifiedPrecursors(qvalue: float = 0.01, run: str | None = None, precursorLevel=False) set | Dict[str, set]

Get identified precursors from the results file.

Parameters:
  • qvalue (float) – Q-value threshold

  • run (str) – Run name

  • precursorLevel (bool) – If True, do not filter by protein Q.Value (filter on the precursor level only). False is only supported for the DIA-NN results type; for other results types it is automatically treated as True.
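The effect of precursorLevel can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the MassDash implementation: the helper `passing_rows`, the DIA-NN-style columns (`Precursor.Id`, `Q.Value`, `Protein.Q.Value`, `Precursor.Quantity`) and the reuse of one threshold for both q-values are hypothetical.

```python
import csv
import io
from typing import Dict, List

# Made-up rows in an assumed DIA-NN-style layout.
REPORT = (
    "Run\tPrecursor.Id\tQ.Value\tProtein.Q.Value\tPrecursor.Quantity\n"
    "run1\tPEPTIDEK2\t0.001\t0.002\t1500.0\n"
    "run1\tSAMPLER3\t0.004\t0.20\t800.0\n"
    "run1\tANOTHERK2\t0.30\t0.001\t50.0\n"
)


def passing_rows(
    tsv_text: str, qvalue: float = 0.01, precursor_level: bool = False
) -> List[Dict[str, str]]:
    """Hypothetical filter: precursor q-value always, protein q-value optionally."""
    kept = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if float(row["Q.Value"]) > qvalue:
            continue  # fails the precursor-level threshold
        # Assumption: the same threshold is applied to the protein q-value.
        if not precursor_level and float(row["Protein.Q.Value"]) > qvalue:
            continue  # fails the protein-level threshold
        kept.append(row)
    return kept


both_levels = [r["Precursor.Id"] for r in passing_rows(REPORT)]
precursor_only = [r["Precursor.Id"] for r in passing_rows(REPORT, precursor_level=True)]
```

With the protein-level filter active only `PEPTIDEK2` survives; with `precursor_level=True`, `SAMPLER3` is also kept because its precursor q-value passes even though its protein q-value does not.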

getIdentifiedProteins(qvalue: float = 0.01, run: str | None = None) set | Dict[str, set]

Get the identified proteins at a certain q-value.

Parameters:
  • qvalue – (float) The q-value threshold for identification

  • run – (str) The run name for which to get the identified proteins, if None, get for all runs

Returns:

The identified proteins across all runs (Dict[str, set]) or for a single run (set)

getRunNames() List[str]

Get run names without the file extension

Returns:

List of run names

Return type:

list
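As a rough illustration of what "run names without the file extension" means, the sketch below strips directories and the final suffix from a list of file names. The helper `run_names` and the example paths are hypothetical, and how MassDash derives run names internally is not confirmed by this page.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def run_names(file_names: Iterable[str]) -> List[str]:
    """Hypothetical helper: unique basenames without the last extension."""
    names: List[str] = []
    for name in file_names:
        stem = PurePosixPath(name).stem  # drops the directory and final suffix
        if stem not in names:  # keep first-seen order, skip duplicates
            names.append(stem)
    return names


names = run_names(["/data/run1.mzML", "/data/run1.mzML", "run2.raw"])
```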

getTopTransitionGroupFeature(runname: str, pep: str, charge: int) TransitionGroupFeature

Loads the top TransitionGroupFeature from the results file.

Parameters:
  • pep (str) – Peptide ID

  • charge (int) – Charge

Returns:

TransitionGroupFeature object containing peak boundaries, intensity and confidence

Return type:

TransitionGroupFeature
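A minimal sketch of the lookup, assuming a DIA-NN-style table: select the row for one run, peptide and charge, then collect peak boundaries, intensity and confidence. The `FeatureSketch` dataclass is a hypothetical stand-in for TransitionGroupFeature, and the column names (`RT.Start`, `RT.Stop`, `Precursor.Quantity`, `Q.Value`) are assumptions about which fields feed the real object.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureSketch:
    """Hypothetical stand-in for TransitionGroupFeature."""
    left_boundary: float
    right_boundary: float
    intensity: float
    qvalue: float


# Made-up rows in an assumed DIA-NN-style layout.
REPORT = (
    "Run\tModified.Sequence\tPrecursor.Charge\tRT.Start\tRT.Stop\t"
    "Precursor.Quantity\tQ.Value\n"
    "run1\tPEPTIDEK\t2\t30.1\t30.9\t1500.0\t0.001\n"
    "run1\tPEPTIDEK\t3\t30.2\t30.8\t400.0\t0.004\n"
    "run2\tPEPTIDEK\t2\t31.0\t31.7\t1200.0\t0.002\n"
)


def top_feature(
    tsv_text: str, runname: str, pep: str, charge: int
) -> Optional[FeatureSketch]:
    """Return the first matching row as a feature, or None if nothing matches."""
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if (
            row["Run"] == runname
            and row["Modified.Sequence"] == pep
            and int(row["Precursor.Charge"]) == charge
        ):
            return FeatureSketch(
                left_boundary=float(row["RT.Start"]),
                right_boundary=float(row["RT.Stop"]),
                intensity=float(row["Precursor.Quantity"]),
                qvalue=float(row["Q.Value"]),
            )
    return None


feature = top_feature(REPORT, "run1", "PEPTIDEK", 2)
```

Returning `None` for a missing precursor is a design choice of this sketch only; the behaviour of the real method in that case is not stated here.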

getTopTransitionGroupFeatureDf(runname: str, pep_id: str, charge: int) DataFrame

Get a pandas dataframe with the top TransitionGroupFeatures found in the results file. Since there is only one feature this is the same as getTransitionGroupFeaturesDf

Parameters:
  • pep_id (str) – Peptide ID

  • charge (int) – Charge

Returns:

Dataframe with the TransitionGroupFeatures

Return type:

pd.DataFrame

getTransitionGroupFeatures(runname: str, peptide: str, charge: int)

Loads a PeakFeature object from the results file.

Parameters:
  • peptide (str) – Peptide ID

  • charge (int) – Charge

Returns:

TransitionGroupFeature object containing peak boundaries, intensity and confidence

Return type:

TransitionGroupFeature

getTransitionGroupFeaturesDf(runname: str, pep_id: str, charge: int) DataFrame

Loads a TransitionGroupFeature object from the results file to a pandas dataframe. Since there is only one feature this is the same as getTopTransitionGroupFeatureDf()

get_top_rank_precursor_features_across_runs()

Get the top ranked precursor features across all runs

loadData() DataFrame

This method loads the data from self.filename into a pandas dataframe