massdash.loaders.access.OSWPQResultsAccess

class massdash.loaders.access.OSWPQResultsAccess(filename: str, verbose: bool = False)

Bases: GenericResultsAccess

Class for accessing a .oswpq directory containing the precursors_features.parquet and transition_features.parquet files.

The OSWPQResultsAccess class provides memory-efficient parsing of OpenSWATH results stored in Parquet format. See https://pyprophet.readthedocs.io/en/latest/file_formats.html#split-parquet-format-parquet-oswpq-oswpqd for details. It uses PyArrow datasets for lazy evaluation, which avoids loading entire files into memory and allows filtering and column projection to be applied at the Parquet level.

Parameters:
  • filename (str) – Path to the .oswpq directory containing the required parquet files

  • verbose (bool, optional) – Enable verbose logging (default: False)

Raises:
  • ValueError – If filename is not a directory

  • FileNotFoundError – If required parquet files are missing

  • RuntimeError – If parquet files cannot be loaded or PyArrow is not available

Notes

The .oswpq directory must contain exactly these two files:
  • precursors_features.parquet: Precursor-level features and scoring

  • transition_features.parquet: Transition-level features and intensities
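
The directory requirements and the documented exceptions can be checked before constructing the class. The following is a minimal sketch, not the library's own validation: check_oswpq_dir is a hypothetical helper, and it only confirms that the two named files are present, not that they are the only files or that they are readable Parquet.

```python
from pathlib import Path

# File names as listed in the Notes for the .oswpq directory.
REQUIRED_FILES = ("precursors_features.parquet", "transition_features.parquet")


def check_oswpq_dir(path):
    """Hypothetical pre-flight check; not part of massdash."""
    root = Path(path)
    # Mirrors the documented ValueError for a non-directory path.
    if not root.is_dir():
        raise ValueError(f"{root} is not a directory")
    # Mirrors the documented FileNotFoundError for missing parquet files.
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{root} is missing: {', '.join(missing)}")
    return root
```

Calling check_oswpq_dir('/path/to/results.oswpq') before OSWPQResultsAccess(...) surfaces a layout problem early; whether the class reports the same messages is not stated here.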

This class requires PyArrow, whose datasets provide the lazy evaluation.

Examples

>>> access = OSWPQResultsAccess('/path/to/results.oswpq')
>>> runs = access.getRunNames()
>>> precursors = access.getIdentifiedPrecursors(qvalue=0.01)
>>> has_im = access.has_im

getIdentifiedPeptides(qvalue: float = 0.01, run: str | None = None, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') → set | Dict[str, set]

Get identified peptides

getIdentifiedPrecursorIntensities(qvalue: float = 0.01, run: str | None = None, precursorLevel: bool = False, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') → DataFrame

Get identified precursor intensities

getIdentifiedPrecursors(qvalue: float = 0.01, run: str | None = None, precursorLevel: bool = False, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') → set | Dict[str, set]

Get identified precursors at specified q-value threshold

getIdentifiedProteins(qvalue: float = 0.01, run: str | None = None, context: Literal['global', 'run_specific', 'experiment_wide'] = 'run_specific') → set | Dict[str, set]

Get identified proteins
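
getIdentifiedPeptides, getIdentifiedPrecursors and getIdentifiedProteins are all annotated as returning either a set or a Dict[str, set]. A minimal sketch of handling both shapes follows; count_identifications is a hypothetical helper, and the assumption that the dict form is keyed by run name is not stated in this reference.

```python
from typing import Dict, Union


def count_identifications(
    result: Union[set, Dict[str, set]], label: str = "all"
) -> Dict[str, int]:
    """Hypothetical helper: count identifications for either return shape."""
    if isinstance(result, dict):
        # Dict form: one set per key (assumed here to be a run name).
        return {key: len(ids) for key, ids in result.items()}
    # Set form: a single collection, reported under a caller-chosen label.
    return {label: len(result)}
```

For example, count_identifications(access.getIdentifiedProteins(qvalue=0.01)) gives one count per key without the caller needing to know which shape was returned.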

getPrecursorID(pep: str, charge: int) → int | None

Get precursor ID for a given peptide and charge

getSoftware() → str

Return software name

getTopTransitionGroupFeature(runname: str, pep: str, charge: int) → TransitionGroupFeature

Get the top (best q-value) transition group feature
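
The lookup methods can be combined to collect the top feature of one precursor across runs. This is a sketch under assumptions rather than documented behaviour: top_features_by_run is a hypothetical helper, it assumes getRunNames() yields strings that are valid runname arguments, and it assumes getPrecursorID returns None for a peptide and charge that are not in the results.

```python
from typing import Any, Dict


def top_features_by_run(access: Any, pep: str, charge: int) -> Dict[str, Any]:
    """Hypothetical helper: best-scoring feature per run for one precursor."""
    # Skip the per-run queries when the precursor is unknown (assumed None).
    if access.getPrecursorID(pep, charge) is None:
        return {}
    # One query per run; the values are whatever the access object returns.
    return {
        run: access.getTopTransitionGroupFeature(run, pep, charge)
        for run in access.getRunNames()
    }
```

The helper is duck-typed, so it works with any object exposing these three methods. How getTopTransitionGroupFeature behaves when a run has no feature for the precursor is not described here, so callers may still need to check each value.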

getTopTransitionGroupFeatureDf(runname: str, pep: str, charge: int) → DataFrame

Get the top (best q-value) transition group feature as DataFrame

getTransitionGroupFeatures(runname: str, pep: str, charge: int) → List[TransitionGroupFeature]

Get transition group features for a specific peptide and charge

getTransitionGroupFeaturesDf(runname: str, pep: str, charge: int) → DataFrame

Get transition group features as DataFrame

property has_im: bool

Check if the data contains ion mobility information

populateTransitionGroupFeature(transition_group_feature: TransitionGroupFeature) → TransitionGroupFeature

Appends library information to a TransitionGroupFeature object

Parameters:

transition_group_feature (TransitionGroupFeature) – The TransitionGroupFeature object to append library information to.

Returns:

The TransitionGroupFeature object with appended library information.

Return type:

TransitionGroupFeature