Loading Spectrum Data
Spectrum data loaders implement the same methods as Chromatogram Data Loaders, plus some additional methods, since more information can be gathered from spectrum data. Fetching raw data with spectrum loaders takes more time because the data is extracted on the fly. Additionally, a TargetedDIAConfig must be specified to instruct how the peptide should be extracted.
Initiating a Spectrum Data Loader
Most Spectrum Loaders require the following inputs.
dataFiles - a list of raw data files
rsltsFile - a .osw or DIA-NN .tsv file containing the features
libraryFile - a .tsv/.osw/.pqp file containing the library (m/z and annotations of all transitions)
We can initiate a MzMLDataLoader object as follows.
[3]:
from massdash.loaders import MzMLDataLoader
loader = MzMLDataLoader(dataFiles="mzml/ionMobilityTest.mzML",
rsltsFile=["osw/ionMobilityTest.osw", "diann/ionMobilityTest-diannReport.tsv"])
Initializing valid scores for selection
[2024-09-30 17:29:26,200] MzMLDataAccess - INFO - Opening mzml/ionMobilityTest.mzML file...: Elapsed 0.08319735527038574 ms
[2024-09-30 17:29:26,201] MzMLDataAccess - INFO - There are 50 spectra and 0 chromatograms.
[2024-09-30 17:29:26,202] MzMLDataAccess - INFO - There are 25 MS1 spectra and 25 MS2 spectra.
Note
If only a DIA-NN file is provided, a library must also be provided. If no library file is provided, MassDash will assume the .osw file should also be used as the library. The library is required for determining the extraction coordinates.
For the purpose of this tutorial we will be using the OpenSwath results; however, this approach will work with any properly initiated MzMLDataLoader.
Note
If a .osw file is provided as the results file and no library file is provided, MassDash will assume the .osw file should also be used as the library.
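The fallback rule described in the notes above can be sketched as a small function. This is only an illustration of the rule as stated in this tutorial, not MassDash's actual implementation; `resolve_library` and its error message are hypothetical.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_library(rslts_files: Sequence[str], library_file: Optional[str] = None) -> str:
    """Pick the library file following the fallback rule described in the notes.

    Hypothetical helper for illustration only; not part of MassDash.
    """
    if library_file is not None:
        return library_file  # an explicitly provided library always wins
    for path in rslts_files:
        if Path(path).suffix == ".osw":
            return path  # reuse the OpenSwath results file as the library
    # Only DIA-NN results were given, so there is nothing to fall back on
    raise ValueError("No .osw results file found: a library file must be provided")


print(resolve_library(["osw/ionMobilityTest.osw", "diann/ionMobilityTest-diannReport.tsv"]))
```

With the two results files used in this tutorial and no explicit library, the sketch selects the .osw file.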
Loading a Transition Group
To fetch the chromatograms for a particular transition group, we can call the loadTransitionGroups() method. In addition to the modified peptide sequence and charge state, this method requires a TargetedDIAConfig, which specifies the extraction parameters. The method loads the transition group across all runs, so it can take a while because it fetches the data for every experiment from disk.
In this example we will visualize the peptide AFVDFLSDEIK with a charge state of 2.
First, we can create a TargetedDIAConfig.
[4]:
from massdash.structs.TargetedDIAConfig import TargetedDIAConfig
extraction_config = TargetedDIAConfig()
extraction_config.im_window = 0.2
extraction_config.rt_window = 50
extraction_config.mz_tol = 20
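To make the m/z tolerance set above more concrete, the sketch below converts a tolerance into absolute m/z bounds. It assumes that `mz_tol` is expressed in ppm and that the value is applied symmetrically on each side of the target m/z; this tutorial states neither the unit nor whether the value is a half or full width, so treat both as assumptions. `ppm_window` is a hypothetical helper, not part of MassDash.

```python
def ppm_window(target_mz: float, tol_ppm: float) -> tuple[float, float]:
    """Return (lower, upper) m/z bounds for a symmetric ppm tolerance.

    Assumption: tol_ppm is a half-width applied on each side of target_mz.
    """
    delta = target_mz * tol_ppm * 1e-6  # convert ppm to an absolute m/z offset
    return target_mz - delta, target_mz + delta


# Precursor m/z of the example peptide (taken from the feature map output in this tutorial)
lower, upper = ppm_window(642.3295, 20)
print(f"{lower:.4f} - {upper:.4f}")
```

Because the offset scales with the target m/z, the same ppm tolerance yields a wider absolute window for heavier fragments.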
Then we can invoke the loadTransitionGroups() method with the target sequence, charge and extraction config. Note: the extraction will always occur
[5]:
transitionGroup = loader.loadTransitionGroups("AFVDFLSDEIK", 2, extraction_config)
transitionGroup
[5]:
TransitionGroupCollection
ionMobilityTest: -------- TransitionGroup --------
precursor data: 1
transition data: 6
data type: Chromatogram
[6]:
type(transitionGroup)
[6]:
massdash.structs.TransitionGroupCollection.TransitionGroupCollection
Like the ChromatogramLoaders, this returns a :py:class:`~structs.TransitionGroupCollection`, a dictionary in which the keys are the run names and the values are TransitionGroup objects. A TransitionGroup holds a series of chromatograms belonging to the same precursor and can be used for plotting.
Loading Chromatogram Data as a Pandas DataFrame
As with Chromatogram Data Loaders, data can be loaded into a pandas dataframe using the loadTransitionGroupsDf() method.
[7]:
transitionGroupDf = loader.loadTransitionGroupsDf("AFVDFLSDEIK", 2, extraction_config)
transitionGroupDf
[7]:
| | run | Annotation | rt | int |
|---|---|---|---|---|
| 0 | ionMobilityTest | prec | 6225.005106 | 229.011734 |
| 1 | ionMobilityTest | prec | 6226.792950 | 26.001631 |
| 2 | ionMobilityTest | prec | 6228.580932 | 57.999416 |
| 3 | ionMobilityTest | prec | 6230.367189 | 826.008179 |
| 4 | ionMobilityTest | prec | 6232.156436 | 1589.015259 |
| ... | ... | ... | ... | ... |
| 163 | ionMobilityTest | y9^1 | 6259.292755 | 4355.988281 |
| 164 | ionMobilityTest | y9^1 | 6261.101406 | 1168.029907 |
| 165 | ionMobilityTest | y9^1 | 6262.909095 | 1286.014038 |
| 166 | ionMobilityTest | y9^1 | 6264.711573 | 413.995209 |
| 167 | ionMobilityTest | y9^1 | 6266.515136 | 1217.012207 |
168 rows × 4 columns
This dataframe contains the intensities and retention times for all transitions across all files. Transitions can be differentiated by the Annotation column, and the run column identifies the run from which each chromatogram originates. If ion mobility was present in the original file, intensities are summed across all ion mobility values.
Note
If a pandas dataframe is required, it is recommended to use the FeatureMap object directly as described below.
Loading a Feature Map
The primary data type fetched by spectrum loaders is the FeatureMap, which contains a pandas dataframe of the extracted signal across all precursors and transitions. Under the hood, the loadTransitionGroups() method fetches a FeatureMap and converts it into a TransitionGroup. Due to this conversion step, if a pandas dataframe is required, it is generally faster to work with the FeatureMap directly.
The FeatureMap object can be loaded using the loadFeatureMaps() method as demonstrated below.
[8]:
featureMap = loader.loadFeatureMaps("AFVDFLSDEIK", 2, extraction_config)
print(type(featureMap))
featureMap
<class 'massdash.structs.FeatureMapCollection.FeatureMapCollection'>
[8]:
{'ionMobilityTest': <massdash.structs.FeatureMap.FeatureMap at 0x75219b42d430>}
Similar to the loadTransitionGroups() method, this method returns a :py:class:`~structs.FeatureMapCollection` where the keys are the run names and the values are the corresponding :py:class:`~structs.FeatureMap` objects.
The FeatureMap object has two important properties:
.feature_df property, which returns the dataframe
.config property, which returns the TargetedDIAConfig that was used to generate this FeatureMap
[9]:
featureMap['ionMobilityTest'].feature_df
[9]:
| | native_id | ms_level | precursor_mz | mz | rt | im | int | Annotation | product_mz |
|---|---|---|---|---|---|---|---|---|---|
| 0 | | 1 | 642.3295 | 642.334187 | 6225.005106 | 0.900254 | 76.000458 | prec | 642.3295 |
| 1 | | 1 | 642.3295 | 642.334187 | 6225.005106 | 0.969271 | 153.011276 | prec | 642.3295 |
| 2 | | 2 | 642.3295 | 504.262011 | 6225.110817 | 0.935281 | 68.001518 | y4^1 | 504.2664 |
| 3 | | 2 | 642.3295 | 504.262011 | 6225.110817 | 1.025902 | 41.000328 | y4^1 | 504.2664 |
| 4 | | 2 | 642.3295 | 504.262011 | 6225.110817 | 0.926001 | 43.000782 | y4^1 | 504.2664 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6812 | | 2 | 642.3295 | 1065.546118 | 6266.515136 | 0.975441 | 8.999968 | y9^1 | 1065.5463 |
| 6813 | | 2 | 642.3295 | 1065.551224 | 6266.515136 | 0.986777 | 33.001766 | y9^1 | 1065.5463 |
| 6814 | | 2 | 642.3295 | 1065.551224 | 6266.515136 | 0.923945 | 84.003464 | y9^1 | 1065.5463 |
| 6815 | | 2 | 642.3295 | 1065.556331 | 6266.515136 | 0.910546 | 63.997871 | y9^1 | 1065.5463 |
| 6816 | | 2 | 642.3295 | 1065.556331 | 6266.515136 | 0.921891 | 54.000694 | y9^1 | 1065.5463 |
6817 rows × 9 columns
[10]:
featureMap['ionMobilityTest'].config
[10]:
<massdash.structs.TargetedDIAConfig.TargetedDIAConfig at 0x75217a365bb0>
Converting a FeatureMap to 1D data
A :py:class:`~structs.FeatureMap` can be difficult to work with due to its high dimensionality. Thus, MassDash has built-in methods to convert a :py:class:`~structs.FeatureMap` into a :py:class:`~structs.Chromatogram` (retention time vs intensity), a :py:class:`~structs.Spectrum` (m/z vs intensity) or, if ion mobility is present, a :py:class:`~structs.Mobilogram` (ion mobility vs intensity). To accomplish this we can use the :py:func:`~structs.FeatureMap.to_chromatograms`, :py:func:`~structs.FeatureMap.to_spectra` and :py:func:`~structs.FeatureMap.to_mobilograms` methods, respectively.
[11]:
chromatograms = featureMap['ionMobilityTest'].to_chromatograms()
chromatograms
[11]:
<massdash.structs.TransitionGroup.TransitionGroup at 0x752179ff22b0>
[12]:
spectra = featureMap['ionMobilityTest'].to_spectra()
spectra
[12]:
<massdash.structs.TransitionGroup.TransitionGroup at 0x75217a365100>
[13]:
mobilograms = featureMap['ionMobilityTest'].to_mobilograms()
mobilograms
[13]:
<massdash.structs.TransitionGroup.TransitionGroup at 0x75219aaa41f0>
Note
When converting a FeatureMap, a TransitionGroup is always returned; however, the underlying data type differs depending on the conversion method used.
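Conceptually, each conversion keeps one axis of the feature map and sums intensity over the others. The dependency-free sketch below illustrates that idea using the first five rows of the feature_df output shown earlier; `project` and `AXES` are hypothetical names, and the real MassDash methods may apply additional processing, so this is not their implementation.

```python
from collections import defaultdict

# Rows copied from the feature_df output above: (Annotation, mz, rt, im, int)
rows = [
    ("prec", 642.334187, 6225.005106, 0.900254, 76.000458),
    ("prec", 642.334187, 6225.005106, 0.969271, 153.011276),
    ("y4^1", 504.262011, 6225.110817, 0.935281, 68.001518),
    ("y4^1", 504.262011, 6225.110817, 1.025902, 41.000328),
    ("y4^1", 504.262011, 6225.110817, 0.926001, 43.000782),
]
AXES = {"mz": 1, "rt": 2, "im": 3}  # tuple position of each axis


def project(rows, axis):
    """Sum intensity per (annotation, axis value), collapsing the other axes."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row[0], row[AXES[axis]])] += row[4]
    return dict(totals)


chromatogram = project(rows, "rt")  # retention time vs intensity
mobilogram = project(rows, "im")    # ion mobility vs intensity

print(round(chromatogram[("prec", 6225.005106)], 6))
print(len(mobilogram))
```

Summing the two precursor rows over ion mobility gives 229.011734, which matches the first row of the dataframe returned by loadTransitionGroupsDf() earlier in this tutorial. The ion mobility projection keeps all five points separate because each row has a distinct ion mobility value.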