Peak Picking
One of the powerful tools of MassDash is the ability to perform peak picking on the chromatograms.
All supported peak pickers are located in the peakPickers module.
This notebook covers two of them: MRMTransitionGroupPicker and pyMRMTransitionGroupPicker.
The general outline for any MassDash peak picker is to:
1. Initiate the peak picking object
2. Set the parameters (specific to each peak picker)
3. Use the pick() function to perform peak picking
4. Visualize results
[3]:
from massdash.loaders import SqMassLoader
import os
pep = "NKESPT(UniMod:21)KAIVR(UniMod:267)"
charge = 3
loader = SqMassLoader(dataFiles=["xics/test_chrom_1.sqMass"], rsltsFile="osw/test_data.osw")
transitionGroup = list(loader.loadTransitionGroups(pep, charge).values())[0]
transitionGroupFeatures = loader.loadTransitionGroupFeaturesDf(pep, charge)
If the above code does not look familiar, please look at the previous notebooks.
MRMTransitionGroupPicker
The MRMTransitionGroupPicker class is a wrapper around the pyopenms MRMTransitionGroupPicker object. This is the same peak picker that is used in OpenSwath.
1. Initiate MRMTransitionGroupPicker Object
Different peak pickers are initiated differently.
The MRMTransitionGroupPicker requires one argument, ‘smoother’, which specifies the smoothing applied to the transition group before picking. It can be one of sgolay, gauss or original (no smoothing). Additional keyword arguments that configure the chosen smoother can follow it. Below are some examples of initiating an MRMTransitionGroupPicker object.
[4]:
from massdash.peakPickers import MRMTransitionGroupPicker
noSmoothing = MRMTransitionGroupPicker("original")  # No smoothing
gaussSmoothing = MRMTransitionGroupPicker("gauss", gauss_width=50.0)  # Gaussian smoothing
sgolaySmoothing = MRMTransitionGroupPicker("sgolay", sgolay_frame_length=11, sgolay_polynomial_order=3)  # Savitzky-Golay smoothing
For the following example we will use the sgolay smoother as this is the default for OpenSwath.
Additional parameters can be set using the setGeneralParameters() method.
- MRMTransitionGroupPicker.setGeneralParameters(**kwargs)
Set a supported parameter
- Parameters:
stop_after_feature (int) – Stop after feature
stop_after_intensity_ratio (float) – Stop after intensity ratio
min_peak_width (float) – Minimum peak width
recalculate_peaks_max_z (float) – Recalculate peaks max z
resample_boundary (float) – Resample boundary
recalculate_peaks (bool) – Recalculate peaks
background_subtraction (str) – Background subtraction
use_precursors (bool) – Use precursors
signal_to_noise (float) – Signal to noise
minimal_quality (float) – Minimal quality (if set, automatically sets compute_peak_quality to true)
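Because setGeneralParameters() takes keyword arguments, a misspelled name (for example use_precursor instead of use_precursors) is easy to introduce. The text does not say how MassDash reacts to an unknown name, so the following is only a minimal sketch of a pre-check you could run yourself. The helper find_unsupported and the SUPPORTED_PARAMETERS tuple are hypothetical and not part of MassDash; the names in the tuple are copied from the list above.

```python
from difflib import get_close_matches

# Parameter names copied from the setGeneralParameters() list above.
SUPPORTED_PARAMETERS = (
    "stop_after_feature",
    "stop_after_intensity_ratio",
    "min_peak_width",
    "recalculate_peaks_max_z",
    "resample_boundary",
    "recalculate_peaks",
    "background_subtraction",
    "use_precursors",
    "signal_to_noise",
    "minimal_quality",
)


def find_unsupported(**kwargs):
    """Hypothetical helper: map each unsupported keyword to the closest
    supported name, or to None if nothing similar exists."""
    unsupported = {}
    for name in kwargs:
        if name not in SUPPORTED_PARAMETERS:
            close = get_close_matches(name, SUPPORTED_PARAMETERS, n=1)
            unsupported[name] = close[0] if close else None
    return unsupported


# "use_precursor" is missing its trailing "s" and is reported with a suggestion.
problems = find_unsupported(stop_after_feature=5, use_precursor="true")
print(problems)
```

If the returned dictionary is empty, the same keyword arguments can be passed on to setGeneralParameters().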
2. Customize Parameters
To cap the output at a reasonable number of features, we will change stop_after_feature to 5. Since we know that the precursor signal is reasonable in this example, we will also turn on the use_precursors parameter. Because this precursor is quite high in intensity, we can also change the signal_to_noise cutoff.
[5]:
sgolaySmoothing.setGeneralParameters(stop_after_feature=5, signal_to_noise=0.001, use_precursors='true')
3. Pick Precursor
After we have set the parameters, we can use the pick() function to perform peak picking on the precursor.
All peak pickers implement the pick() function, which requires a TransitionGroup object and outputs a list of TransitionGroupFeature objects.
[6]:
features = sgolaySmoothing.pick(transitionGroup)
For easier inspection of the features, we can convert them to a pandas dataframe
[7]:
from massdash.structs import TransitionGroupFeature
TransitionGroupFeature.toPandasDf(features)
[7]:
| | leftBoundary | rightBoundary | areaIntensity | qvalue | consensusApex | consensusApexIntensity |
|---|---|---|---|---|---|---|
| 0 | 843.5 | 901.700012 | 223735.421875 | None | 865.628726 | None |
| 1 | 818.099976 | 843.900024 | 65524.886719 | None | 839.389013 | None |
| 2 | 1177.900024 | 1207.0 | 32929.136719 | None | 1196.055723 | None |
| 3 | 1051.099976 | 1087.5 | 48799.734375 | None | 1069.299988 | None |
| 4 | 978.400024 | 1011.099976 | 19260.796875 | None | 995.780439 | None |
| 5 | 1112.5 | 1152.5 | 36867.34375 | None | 1133.06705 | None |
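Once the features are in a dataframe, ordinary pandas operations apply. As a minimal sketch, the snippet below rebuilds three columns of the table above by hand so that it runs without MassDash; in the notebook you would work on the dataframe returned by toPandasDf() directly. It then adds a peak width column and ranks the features by area. The peakWidth column name is an illustrative choice, not something MassDash produces.

```python
import pandas as pd

# Values copied from the MRMTransitionGroupPicker output table above.
df = pd.DataFrame(
    {
        "leftBoundary": [843.5, 818.099976, 1177.900024, 1051.099976, 978.400024, 1112.5],
        "rightBoundary": [901.700012, 843.900024, 1207.0, 1087.5, 1011.099976, 1152.5],
        "areaIntensity": [223735.421875, 65524.886719, 32929.136719,
                          48799.734375, 19260.796875, 36867.34375],
    }
)

# Width of each feature along the retention-time axis (illustrative column name).
df["peakWidth"] = df["rightBoundary"] - df["leftBoundary"]

# Rank the features from largest to smallest integrated area.
ranked = df.sort_values("areaIntensity", ascending=False)
print(ranked)
```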
4. Visualize Results
As shown in the plotting1D notebook, the chromatogram can easily be visualized directly from the transitionGroup object. Here instead of linking OpenSwath or DIA-NN found features, we can use the features that we just computed.
[8]:
transitionGroup.plot(transitionGroupFeatures=features)
pyMRMTransitionGroupPicker
The pyMRMTransitionGroupPicker is a Python reimplementation based on the pyopenms MRMTransitionGroupPicker used above. Reading the source code of this peak picker can give insight into how the OpenSwath peak picker works, in a readable Python format.
1. Initiate the Peak Picker
This peak picker requires no arguments for initiating the peak picking object.
[9]:
from massdash.peakPickers.pyMRMTransitionGroupPicker import pyMRMTransitionGroupPicker
picker = pyMRMTransitionGroupPicker()
2. Customize Parameters
For this peak picker, no parameters can be customized.
3. Pick Precursor
[10]:
features = picker.pick(transitionGroup)
TransitionGroupFeature.toPandasDf(features)
[10]:
| | leftBoundary | rightBoundary | areaIntensity | qvalue | consensusApex | consensusApexIntensity |
|---|---|---|---|---|---|---|
| 0 | 843.5 | 901.700012 | 1537205.913208 | None | None | 235346.499471 |
| 1 | 818.099976 | 843.5 | 425795.080078 | None | None | 78247.752761 |
| 2 | 1189.199951 | 1203.800049 | 2428.022644 | None | None | 64678.458586 |
| 3 | 1167.400024 | 1189.199951 | 226734.0625 | None | None | 28407.271623 |
| 4 | 1131.099976 | 1156.5 | 2091.005585 | None | None | 10654.984463 |
As above, we can inspect the features in a pandas dataframe. The features differ slightly from those found by the MRMTransitionGroupPicker; however, the two most intense features are essentially the same.
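One way to make this comparison concrete is to match features between the two pickers by how much their boundaries overlap. The sketch below is not part of MassDash: overlap_fraction and best_matches are hypothetical helpers, and the overlap measure (intersection divided by the width of the narrower feature) is just one reasonable choice. The boundaries are copied from the two result tables in this notebook.

```python
# (leftBoundary, rightBoundary) pairs copied from the two result tables.
openms_features = [
    (843.5, 901.700012), (818.099976, 843.900024), (1177.900024, 1207.0),
    (1051.099976, 1087.5), (978.400024, 1011.099976), (1112.5, 1152.5),
]
py_features = [
    (843.5, 901.700012), (818.099976, 843.5), (1189.199951, 1203.800049),
    (1167.400024, 1189.199951), (1131.099976, 1156.5),
]


def overlap_fraction(a, b):
    """Intersection of two intervals divided by the width of the narrower one."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    narrower = min(a[1] - a[0], b[1] - b[0])
    return intersection / narrower


def best_matches(query, reference):
    """For each query feature, return (query index, best reference index, overlap)."""
    matches = []
    for q_idx, q in enumerate(query):
        scores = [overlap_fraction(q, r) for r in reference]
        r_idx = max(range(len(reference)), key=scores.__getitem__)
        matches.append((q_idx, r_idx, scores[r_idx]))
    return matches


matches = best_matches(py_features, openms_features)
for py_idx, openms_idx, score in matches:
    print(f"py feature {py_idx} -> MRMTransitionGroupPicker feature {openms_idx} "
          f"(overlap {score:.2f})")
```

An overlap close to 1 means the narrower feature lies almost entirely inside the other, while lower values point to features whose boundaries were drawn differently by the two pickers.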
4. Visualization
[11]:
transitionGroup.plot(transitionGroupFeatures=features)
Note: Some of the intensities of the features are so small they cannot be seen.
Implementing Your Own Peak Picker
To implement your own peak picker, create a custom Python class that inherits from GenericPeakPicker. The only required method to implement is pick(), which performs peak picking.
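The shape of such a class can be sketched without MassDash installed. Everything below is an assumption made for illustration: GenericPeakPickerStandIn stands in for the real GenericPeakPicker, FeatureStandIn stands in for TransitionGroupFeature (reusing three column names from the tables above), and the chromatogram is a plain list of (retention time, intensity) pairs rather than a TransitionGroup. The picking rule itself, a simple relative intensity threshold, is a toy technique and not the algorithm used by any MassDash picker.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence, Tuple


class GenericPeakPickerStandIn(ABC):
    """Stand-in for MassDash's GenericPeakPicker (hypothetical interface)."""

    @abstractmethod
    def pick(self, chromatogram):
        """Return a list of picked features."""


@dataclass
class FeatureStandIn:
    """Stand-in for TransitionGroupFeature, keeping only three fields."""
    leftBoundary: float
    rightBoundary: float
    areaIntensity: float


class ThresholdPeakPicker(GenericPeakPickerStandIn):
    """Toy picker: each contiguous run of points above a cutoff is one feature."""

    def __init__(self, rel_threshold: float = 0.1):
        self.rel_threshold = rel_threshold

    def pick(self, chromatogram: Sequence[Tuple[float, float]]) -> List[FeatureStandIn]:
        if not chromatogram:
            return []
        # Cutoff is a fraction of the most intense point.
        cutoff = self.rel_threshold * max(i for _, i in chromatogram)
        features, region = [], []
        for point in chromatogram:
            if point[1] > cutoff:
                region.append(point)
            elif region:
                features.append(self._to_feature(region))
                region = []
        if region:  # close a run that extends to the last point
            features.append(self._to_feature(region))
        return features

    @staticmethod
    def _to_feature(region):
        # Trapezoidal integration over the points inside the region.
        area = sum((rt2 - rt1) * (i1 + i2) / 2
                   for (rt1, i1), (rt2, i2) in zip(region, region[1:]))
        return FeatureStandIn(region[0][0], region[-1][0], area)


# Synthetic chromatogram with two peaks (made-up numbers).
chrom = [(0.0, 0.0), (1.0, 1.0), (2.0, 5.0), (3.0, 1.0), (4.0, 0.0),
         (5.0, 0.0), (6.0, 2.0), (7.0, 8.0), (8.0, 2.0), (9.0, 0.0)]
features = ThresholdPeakPicker(rel_threshold=0.1).pick(chrom)
print(features)
```

In a real implementation the class would inherit from MassDash's GenericPeakPicker, and pick() would accept a TransitionGroup and return TransitionGroupFeature objects, as described for the built-in pickers above.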