massdash.loaders.access.OSWDataAccess
- class massdash.loaders.access.OSWDataAccess(*args, mode: Literal['module', 'gui'] = 'module', **kwargs)
Bases:
GenericResultsAccess
A class for accessing data from an OpenSWATH SQLite database.
- conn
A connection to the SQLite database.
- Type:
sqlite3.Connection
- c
A cursor for executing SQL statements on the database.
- Type:
sqlite3.Cursor
- verbose
Whether to print verbose output.
- Type:
bool
- mode
The mode to use when initiating the data access object; it controls which attributes are initialized.
- Type:
str
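The `conn` and `c` attributes follow the standard `sqlite3` connection and cursor pattern. The following is a minimal sketch of that pattern only. It uses an in-memory database and a simplified, assumed `RUN` table rather than a real OpenSWATH file, and it does not go through the massdash constructor:

```python
import sqlite3

# An in-memory database stands in for an .osw file (assumption for illustration).
conn = sqlite3.connect(":memory:")
c = conn.cursor()

# Simplified, assumed table layout; a real OpenSWATH file has a richer schema.
c.execute("CREATE TABLE RUN (ID INTEGER PRIMARY KEY, FILENAME TEXT)")
c.executemany(
    "INSERT INTO RUN (ID, FILENAME) VALUES (?, ?)",
    [(1, "run_a.mzML"), (2, "run_b.mzML")],
)
conn.commit()

# The cursor executes SQL statements; the connection owns the transaction.
c.execute("SELECT FILENAME FROM RUN ORDER BY ID")
filenames = [row[0] for row in c.fetchall()]
print(filenames)
```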
- getAllTopTransitionGroupFeaturesDf() DataFrame
Retrieves all the top ranking features from the database.
- Returns:
The top ranking features per assay.
- Return type:
pandas.DataFrame
- getIdentifiedPeptides(qvalue: float = 0.01, run: str | None = None) set | Dict[str, set]
Get the identified peptides at a certain q-value.
- Parameters:
qvalue – (float) The q-value threshold for identification
run – (str) The run name for which to get the identified peptides, if None, get for all runs
- Returns:
The identified peptides across all runs (Dict[str, set]) or for a single run (set)
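Because the return value is either a `set` (single run) or a `Dict[str, set]` (all runs), calling code may need to handle both shapes. A minimal sketch of one way to do that follows. `count_identifications` is a hypothetical helper, and the sample values are made up for illustration rather than produced by massdash:

```python
from typing import Dict, Optional, Set, Union


def count_identifications(
    result: Union[Set[str], Dict[str, Set[str]]],
    run: Optional[str] = None,
) -> Dict[str, int]:
    """Return a {run_name: count} mapping for either return shape."""
    if isinstance(result, dict):
        # All-runs shape: one set of identifications per run name.
        return {name: len(ids) for name, ids in result.items()}
    # Single-run shape: a bare set; label it with the run that was requested.
    label = run if run is not None else "<single run>"
    return {label: len(result)}


# Made-up values mimicking the two documented return shapes.
all_runs = {"run_a": {"PEPTIDEK", "ANOTHERR"}, "run_b": {"PEPTIDEK"}}
single_run = {"PEPTIDEK", "ANOTHERR"}

print(count_identifications(all_runs))
print(count_identifications(single_run, run="run_a"))
```

The same approach applies to `getIdentifiedProteins`, which documents the same pair of return shapes.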
- getIdentifiedPrecursorIntensities(qvalue: float = 0.01, run: str | None = None, precursorLevel=False)
Get the identified precursor intensities at a certain q-value.
- Parameters:
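qvalue (float) – The q-value threshold.
run (str) – The run name; if None, intensities are returned for all runs.
precursorLevel (bool) – True indicates that q-value filtering is done only on the precursor level.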
- Returns:
The identified precursor intensities across all runs (DataFrame with columns: Precursor, runName, Intensity) or for a single run (DataFrame with columns: Precursor, Intensity)
- Return type:
pandas.DataFrame
- getIdentifiedPrecursors(qvalue: float = 0.01, run: str | None = None, precursorLevel=False)
Retrieves a set of identified precursors.
- Parameters:
run (str) – The run name.
qvalue (float) – The q-value threshold.
precursorLevel (bool) – True indicates that q-value filtering is done only on the precursor level.
- getIdentifiedProteins(qvalue: float = 0.01, run: str | None = None) set | Dict[str, set]
Get the identified proteins at a certain q-value.
- Parameters:
qvalue – (float) The q-value threshold for identification
run – (str) The run name for which to get the identified proteins, if None, get for all runs
- Returns:
The identified proteins across all runs (Dict[str, set]) or for a single run (set)
- getPeptideTable(remove_ipf_peptides=True)
Retrieves the peptide table from the database.
- Parameters:
remove_ipf_peptides (bool) – Whether to remove IPF peptides from the table.
- Returns:
The peptide table.
- Return type:
pandas.DataFrame
- getPeptideTableFromProteinID(protein_id, remove_ipf_peptide=True)
Retrieves the peptide table from the database for a given protein ID.
- Parameters:
protein_id (int) – The protein ID.
remove_ipf_peptide (bool) – Whether to remove IPF peptides from the table.
- Returns:
The peptide table.
- Return type:
pandas.DataFrame
- getPeptideTransitionInfo(fullpeptidename, charge)
Retrieves transition information for a given peptide and charge.
- Parameters:
fullpeptidename (str) – The full modified sequence of the peptide.
charge (int) – The precursor charge.
- Returns:
The transition information.
- Return type:
pandas.DataFrame
- getPrecursorCharges(fullpeptidename)
Retrieves the precursor charges for a given peptide.
- Parameters:
fullpeptidename (str) – The full modified sequence of the peptide.
- Returns:
The precursor charges.
- Return type:
pandas.DataFrame
- getProteinTable(include_decoys=False)
Retrieves the protein table from the database.
- Parameters:
include_decoys (bool) – Whether to include decoy proteins in the table.
- Returns:
The protein table.
- Return type:
pandas.DataFrame
- getRunNames() List[str]
Infers the run names from the results file; file extensions are removed.
- Returns:
The run names
- Return type:
list
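How the extensions are removed is not specified here. As a minimal sketch of one plausible approach, the hypothetical helper `strip_extensions` below drops the directory and every trailing extension from a stored file name. This is an assumption for illustration, not the massdash implementation:

```python
import os


def strip_extensions(filename: str) -> str:
    """Drop the directory and every trailing extension from a file name."""
    name = os.path.basename(filename)
    # Split repeatedly so that "sample.mzML.gz" is reduced to "sample".
    while True:
        root, ext = os.path.splitext(name)
        if not ext:
            return name
        name = root


# Made-up file names for illustration.
stored = ["/data/run_a.mzML", "run_b.mzML.gz", "run_c"]
run_names = [strip_extensions(f) for f in stored]
print(run_names)
```

Note that this sketch also truncates names containing interior dots (for example, `run.1.mzML` becomes `run`), so a real implementation may prefer to strip only known extensions.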
- getScoreTable(score_table: Literal['SCORE_MS2', 'SCORE_MS1', 'SCORE_TRANSITION', 'SCORE_PEPTIDE', 'SCORE_PROTEIN', 'SCORE_IPF', 'FEATURE_MS2', 'FEATURE_MS1'], score: str, context: Literal['run-specific', 'experiment-wide', 'global'] = None) DataFrame
Get a Pandas DataFrame of target and decoy scores for a given score table and score.
- Parameters:
score_table (Literal['SCORE_MS2', 'SCORE_MS1', 'SCORE_TRANSITION', 'SCORE_PEPTIDE', 'SCORE_PROTEIN', 'SCORE_IPF', 'FEATURE_MS2', 'FEATURE_MS1']) – The table in which the score is found
score (str) – The score to retrieve
- Raises:
ValueError – If the score is not a valid score for plotting
- Returns:
A pandas DataFrame with 3 columns: Decoy, Score, and Run Name
- Return type:
pandas.DataFrame
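The `score_table` and `context` arguments are restricted to the literal values shown in the signature. A caller-side check can be sketched with `typing.get_args`, as below. `check_score_request` is a hypothetical helper; it is not part of massdash and does not reproduce the method's own `ValueError` condition, which concerns the `score` argument:

```python
from typing import Literal, Optional, get_args

# Allowed values copied from the getScoreTable signature.
ScoreTable = Literal[
    "SCORE_MS2", "SCORE_MS1", "SCORE_TRANSITION", "SCORE_PEPTIDE",
    "SCORE_PROTEIN", "SCORE_IPF", "FEATURE_MS2", "FEATURE_MS1",
]
Context = Literal["run-specific", "experiment-wide", "global"]


def check_score_request(score_table: str, context: Optional[str] = None) -> None:
    """Raise ValueError if an argument is outside the documented literals."""
    if score_table not in get_args(ScoreTable):
        raise ValueError(f"Unknown score table: {score_table!r}")
    if context is not None and context not in get_args(Context):
        raise ValueError(f"Unknown context: {context!r}")


check_score_request("SCORE_MS2", context="global")  # passes silently

try:
    check_score_request("SCORE_MS3")
except ValueError as err:
    print(err)
```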
- getTransitionIDAnnotationFromSequence(fullpeptidename, charge)
Retrieves transition information for a given peptide and charge.
- Parameters:
fullpeptidename (str) – The full modified sequence of the peptide.
charge (int) – The precursor charge.
- Returns:
The transition information.
- Return type:
pandas.DataFrame
- get_score_distribution(score_table: str, context: Literal['run-specific', 'experiment-wide', 'global'] = None)
Retrieves the score distribution for a given score table.
- Parameters:
score_table (str) – The score table.
- Returns:
The score distribution.
- Return type:
pandas.DataFrame
- get_score_table_contexts(score_table: str)
Retrieves the score contexts from the database.
- Returns:
The score contexts.
- Return type:
list
- get_score_tables()
Retrieves the score tables from the database.
- Returns:
The score tables.
- Return type:
list
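One way such a list could be obtained from a SQLite file is to query `sqlite_master` for table names, as sketched below. The in-memory database, the toy tables, and the `SCORE_` prefix filter are all assumptions for illustration; how massdash itself selects score tables is not stated here:

```python
import sqlite3

# In-memory stand-in for an .osw file with a few empty, made-up tables.
conn = sqlite3.connect(":memory:")
for table in ("RUN", "SCORE_MS2", "SCORE_PEPTIDE", "FEATURE_MS2"):
    conn.execute(f"CREATE TABLE {table} (ID INTEGER)")

# sqlite_master lists every table defined in the database.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()

# Assumed filter: keep only tables whose name starts with "SCORE_".
score_tables = [name for (name,) in rows if name.startswith("SCORE_")]
print(score_tables)
conn.close()
```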
- get_top_rank_precursor_feature(fullpeptidename, charge)
Retrieves the top ranking precursor feature for a given peptide and charge.
- Parameters:
fullpeptidename (str) – The full modified sequence of the peptide.
charge (int) – The precursor charge.
- Returns:
The top ranking precursor feature.
- Return type:
pandas.DataFrame
- get_top_rank_precursor_features_across_runs()
Retrieves the top ranking precursor features across runs from the database.
- Returns:
The top ranking precursor features.
- Return type:
pandas.DataFrame
- load_data() DataFrame
Retrieves all the top ranking features from the database.
- Returns:
The top ranking features per assay.
- Return type:
pandas.DataFrame
- validateSQL()
Validates that the connection is a true SQLite connection.
- Returns:
None. Raises an error if the connection is not valid.
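The kind of check involved can be sketched with the standard library. `sqlite3.connect` opens files lazily, so a file that is not a SQLite database only fails on the first query. The hypothetical helper `is_sqlite_database` below relies on that behaviour; it is an illustration, not the massdash implementation:

```python
import os
import sqlite3
import tempfile


def is_sqlite_database(path: str) -> bool:
    """Return True if the file at `path` can be queried as a SQLite database."""
    conn = sqlite3.connect(path)
    try:
        # The first query forces SQLite to read and verify the file header.
        conn.execute("SELECT name FROM sqlite_master LIMIT 1")
        return True
    except sqlite3.DatabaseError:
        return False
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.osw")
    bad = os.path.join(tmp, "bad.osw")

    # A real (if tiny) SQLite file.
    db = sqlite3.connect(good)
    db.execute("CREATE TABLE RUN (ID INTEGER PRIMARY KEY, FILENAME TEXT)")
    db.commit()
    db.close()

    # A plain text file with a misleading extension.
    with open(bad, "wb") as fh:
        fh.write(b"not a database\n" * 100)

    valid_result = is_sqlite_database(good)
    invalid_result = is_sqlite_database(bad)

print(valid_result, invalid_result)
```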