massdash.loaders.access.OSWPQResultsAccess
- class massdash.loaders.access.OSWPQResultsAccess(filename: str, verbose: bool = False)
Bases:
GenericResultsAccess
Class for accessing a .oswpq directory containing precursors_features.parquet and transition_features.parquet files.
The OSWPQResultsAccess class provides memory-efficient parsing of OpenSWATH results stored in Parquet format. See https://pyprophet.readthedocs.io/en/latest/file_formats.html#split-parquet-format-parquet-oswpq-oswpqd for details. It uses PyArrow datasets for lazy evaluation, avoiding loading entire files into memory and enabling efficient filtering and column projection at the parquet level.
- Parameters:
filename (str) – Path to the .oswpq directory containing the required parquet files
verbose (bool, optional) – Enable verbose logging (default: False)
- Raises:
ValueError – If filename is not a directory
FileNotFoundError – If required parquet files are missing
RuntimeError – If parquet files cannot be loaded or PyArrow is not available
Notes
The .oswpq directory must contain exactly these two files:
- precursors_features.parquet: Precursor-level features and scoring
- transition_features.parquet: Transition-level features and intensities
This class requires PyArrow for lazy evaluation using PyArrow datasets.
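The directory requirements and the documented exceptions suggest a simple pre-flight check before constructing the class. The sketch below uses only the standard library; `check_oswpq_dir` is a hypothetical helper written for this example, not part of massdash, and its error messages are illustrative.

```python
import tempfile
from pathlib import Path

# File names taken from the Notes above.
REQUIRED_FILES = ("precursors_features.parquet", "transition_features.parquet")


def check_oswpq_dir(filename):
    """Return the directory as a Path, or raise if the layout is wrong."""
    path = Path(filename)
    if not path.is_dir():
        raise ValueError(f"{filename} is not a directory")
    missing = [name for name in REQUIRED_FILES if not (path / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"Missing required parquet files: {', '.join(missing)}"
        )
    return path


# Demonstrate the failure mode with an incomplete directory.
with tempfile.TemporaryDirectory() as tmp:
    oswpq = Path(tmp) / "results.oswpq"
    oswpq.mkdir()
    (oswpq / "precursors_features.parquet").touch()
    try:
        check_oswpq_dir(oswpq)
        outcome = "ok"
    except FileNotFoundError as err:
        outcome = str(err)
```

Running such a check first gives a clear message about which file is absent before any Parquet reading is attempted.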
Examples
>>> access = OSWPQResultsAccess('/path/to/results.oswpq')
>>> runs = access.getRunNames()
>>> precursors = access.getIdentifiedPrecursors(qvalue=0.01)
>>> has_im = access.has_im
- getIdentifiedPeptides(qvalue: float = 0.01, run: str | None = None, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') set | Dict[str, set]
Get identified peptides
- getIdentifiedPrecursorIntensities(qvalue: float = 0.01, run: str | None = None, precursorLevel: bool = False, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') DataFrame
Get identified precursor intensities
- getIdentifiedPrecursors(qvalue: float = 0.01, run: str | None = None, precursorLevel: bool = False, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') set | Dict[str, set]
Get identified precursors at specified q-value threshold
- getIdentifiedProteins(qvalue: float = 0.01, run: str | None = None, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') set | Dict[str, set]
Get identified proteins
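The getIdentified* methods above are annotated as returning either a set or a Dict[str, set]. The page does not state when each shape is returned, so calling code may want to handle both. The sketch below assumes that a dict maps run names to sets of identifiers; `as_run_dict`, `count_per_run`, and the fallback key `"all"` are hypothetical names introduced for illustration.

```python
from typing import Dict, Set, Union

Identified = Union[Set[str], Dict[str, Set[str]]]


def as_run_dict(result: Identified, label: str = "all") -> Dict[str, Set[str]]:
    """Normalize either return shape to a dict of sets."""
    if isinstance(result, dict):
        return result
    # A bare set is wrapped under a single placeholder key.
    return {label: set(result)}


def count_per_run(result: Identified) -> Dict[str, int]:
    """Count identifications per key, whatever shape was returned."""
    return {name: len(ids) for name, ids in as_run_dict(result).items()}


single = count_per_run({"PEPTIDEA", "PEPTIDEB"})
multi = count_per_run({"run1": {"PEPTIDEA"}, "run2": {"PEPTIDEA", "PEPTIDEB"}})
```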
- getPrecursorID(pep: str, charge: int) int | None
Get precursor ID for a given peptide and charge
- getSoftware() str
Return software name
- getTopTransitionGroupFeature(runname: str, pep: str, charge: int) TransitionGroupFeature
Get the top (best q-value) transition group feature
- getTopTransitionGroupFeatureDf(runname: str, pep: str, charge: int) DataFrame
Get the top (best q-value) transition group feature as DataFrame
- getTransitionGroupFeatures(runname: str, pep: str, charge: int) List[TransitionGroupFeature]
Get transition group features for a specific peptide and charge
- getTransitionGroupFeaturesDf(runname: str, pep: str, charge: int) DataFrame
Get transition group features as DataFrame
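The "top (best q-value)" selection can be pictured as taking the feature with the lowest q-value from the list of features for a precursor. The sketch below illustrates that idea with a stand-in dataclass; `FeatureStub`, its fields, and `pick_top_feature` are assumptions for this example and do not reflect the real TransitionGroupFeature attributes or the library's internal logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FeatureStub:
    """Hypothetical stand-in for a scored peak-group feature."""
    apex_rt: float
    qvalue: float


def pick_top_feature(features: List[FeatureStub]) -> Optional[FeatureStub]:
    """Return the feature with the lowest q-value, or None for an empty list."""
    return min(features, key=lambda f: f.qvalue, default=None)


candidates = [
    FeatureStub(apex_rt=1203.4, qvalue=0.04),
    FeatureStub(apex_rt=1251.9, qvalue=0.002),
    FeatureStub(apex_rt=1310.0, qvalue=0.3),
]
top = pick_top_feature(candidates)
```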
- property has_im: bool
Check if the data contains ion mobility information
- populateTransitionGroupFeature(transition_group_feature: TransitionGroupFeature) TransitionGroupFeature
Appends library information to a TransitionGroupFeature object
- Parameters:
transition_group_feature (TransitionGroupFeature) – The TransitionGroupFeature object to append library information to.
- Returns:
The TransitionGroupFeature object with appended library information.
- Return type: