Peak Picking

One of MassDash's most useful features is the ability to perform peak picking on chromatograms.

All supported peak pickers are located in the peakPickers module.

Currently supported peak pickers are:

  1. MRMTransitionGroupPicker

  2. pyMRMTransitionGroupPicker

  3. ConformerPeakPicker

The general outline for any MassDash peak picker is to:

  1. Initiate the peak picking object

  2. Set the parameters (specific to each peak picker)

  3. Use the pick() function to perform peak picking

  4. Visualize results

[3]:
from massdash.loaders import SqMassLoader
import os
pep = "NKESPT(UniMod:21)KAIVR(UniMod:267)"
charge = 3
loader = SqMassLoader(dataFiles=["xics/test_chrom_1.sqMass"], rsltsFile="osw/test_data.osw")
transitionGroup = list(loader.loadTransitionGroups(pep, charge).values())[0]
transitionGroupFeatures = loader.loadTransitionGroupFeaturesDf(pep, charge)

If the above code does not look familiar, please refer to the previous notebooks.

MRMTransitionGroupPicker

The MRMTransitionGroupPicker picker class is a wrapper around the pyopenms MRMTransitionGroupPicker object. This is the same peak picker that is used in OpenSwath.

1. Initiate MRMTransitionGroupPicker Object

Different peak pickers are initiated differently. The MRMTransitionGroupPicker requires one argument, ‘smoother’, which specifies the smoothing applied to the transition group before picking. This can be one of sgolay, gauss or original (no smoothing). Additional keyword arguments that configure the chosen smoother can be supplied after it. Below are some examples of initiating an MRMTransitionGroupPicker object.

[4]:
from massdash.peakPickers import MRMTransitionGroupPicker

noSmoothing = MRMTransitionGroupPicker("original") # No smoothing
gaussSmoothing = MRMTransitionGroupPicker("gauss", gauss_width=50.0) # Gaussian smoothing
sgolaySmoothing = MRMTransitionGroupPicker("sgolay", sgolay_frame_length=11, sgolay_polynomial_order=3) # Savitzky-Golay smoothing
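
As a rough illustration of what smoothing does to a chromatogram before picking, here is a standalone sketch of Gaussian smoothing on a toy intensity trace. It does not use MassDash or pyopenms: the function name gaussian_smooth and its sigma argument (given in numbers of data points) are hypothetical, and the sketch makes no claim about how pyopenms interprets gauss_width.

```python
import math


def gaussian_smooth(intensities, sigma=2.0):
    """Smooth a list of intensities with a truncated, normalized Gaussian kernel.

    Illustrative only: sigma is in units of data points and is NOT the same
    thing as the gauss_width argument accepted by MRMTransitionGroupPicker.
    """
    radius = max(1, int(math.ceil(3 * sigma)))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(intensities)
    smoothed = []
    for i in range(n):
        total = 0.0
        weight = 0.0
        for offset, w in zip(offsets, kernel):
            j = i + offset
            if 0 <= j < n:  # renormalize at the edges instead of padding
                total += w * intensities[j]
                weight += w
        smoothed.append(total / weight)
    return smoothed


# Toy, made-up trace: a jagged peak that smoothing turns into a single hump
noisy = [0.0, 1.0, 0.0, 2.0, 10.0, 3.0, 12.0, 2.0, 0.0, 1.0, 0.0]
smooth = gaussian_smooth(noisy, sigma=1.5)
print([round(v, 2) for v in smooth])
```

Smoothing of this kind reduces point-to-point jitter so that a picker is less likely to split one chromatographic peak into several features, at the cost of slightly lowering and broadening the apex.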

For the following example we will use the sgolay smoother as this is the default for OpenSwath.

Additional parameters can be set using the setGeneralParameters() method.

MRMTransitionGroupPicker.setGeneralParameters(**kwargs)

Set a supported parameter

Parameters:
  • stop_after_feature (int) – Stop after feature

  • stop_after_intensity_ratio (float) – Stop after intensity ratio

  • min_peak_width (float) – Minimum peak width

  • recalculate_peaks_max_z (float) – Recalculate peaks max z

  • resample_boundary (float) – Resample boundary

  • recalculate_peaks (bool) – Recalculate peaks

  • background_subtraction (str) – Background subtraction

  • use_precursors (bool) – Use precursors

  • signal_to_noise (float) – Signal to noise

  • minimal_quality (float) – Minimal quality (if set, automatically sets compute_peak_quality to true)

2. Customize Parameters

Here, to cap ourselves at a reasonable number of features, we will change stop_after_feature to 5. Since we know that the precursor signal is reasonable for this example, we will also turn on the use_precursors parameter. Because this precursor is quite high in intensity, we also adjust the signal_to_noise cutoff.

[5]:
sgolaySmoothing.setGeneralParameters(stop_after_feature=5, signal_to_noise=0.001, use_precursors='true')

3. Pick Precursor

After setting the parameters, we can use the pick() function to pick the precursor.

All peak pickers implement the pick() function, which requires a TransitionGroup object and outputs a list of TransitionGroupFeature objects.

[6]:
features = sgolaySmoothing.pick(transitionGroup)

For easier inspection of the features, we can convert them to a pandas dataframe.

[7]:
from massdash.structs import TransitionGroupFeature
TransitionGroupFeature.toPandasDf(features)
[7]:
   leftBoundary  rightBoundary  areaIntensity  qvalue  consensusApex  consensusApexIntensity
0  843.5         901.700012     223735.421875  None    865.628726     None
1  818.099976    843.900024     65524.886719   None    839.389013     None
2  1177.900024   1207.0         32929.136719   None    1196.055723    None
3  1051.099976   1087.5         48799.734375   None    1069.299988    None
4  978.400024    1011.099976    19260.796875   None    995.780439     None
5  1112.5        1152.5         36867.34375    None    1133.06705     None

4. Visualize Results

As shown in the plotting1D notebook, the chromatogram can easily be visualized directly from the transitionGroup object. Here, instead of linking features found by OpenSwath or DIA-NN, we can use the features that we just computed.

[8]:
transitionGroup.plot(transitionGroupFeatures=features)

pyMRMTransitionGroupPicker

The pyMRMTransitionGroupPicker is a Python reimplementation of the pyopenms MRMTransitionGroupPicker described above. Reading the source code of this peak picker can give insight into how the OpenSwath peak picker works, in readable Python.

1. Initiate the Peak Picker

This peak picker requires no arguments to initiate the peak picking object.

[9]:
from massdash.peakPickers.pyMRMTransitionGroupPicker import pyMRMTransitionGroupPicker
picker = pyMRMTransitionGroupPicker()

2. Customize Parameters

For this peak picker, no parameters can be customized.

3. Pick Precursor

[10]:
features = picker.pick(transitionGroup)
TransitionGroupFeature.toPandasDf(features)
[10]:
   leftBoundary  rightBoundary  areaIntensity   qvalue  consensusApex  consensusApexIntensity
0  843.5         901.700012     1537205.913208  None    None           235346.499471
1  818.099976    843.5          425795.080078   None    None           78247.752761
2  1189.199951   1203.800049    2428.022644     None    None           64678.458586
3  1167.400024   1189.199951    226734.0625     None    None           28407.271623
4  1131.099976   1156.5         2091.005585     None    None           10654.984463

As above, we can inspect the features in a pandas dataframe. The features differ slightly from those found by the MRMTransitionGroupPicker; however, the two most intense features are the same.
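
To make this comparison concrete, the following standalone sketch ranks each picker's features by areaIntensity and measures how closely the boundaries of the top two agree. The numbers are copied from the two feature tables in this notebook; interval_overlap and top_n are hypothetical helper names, not part of MassDash.

```python
def interval_overlap(a, b):
    """Jaccard overlap of two (left, right) intervals; 1.0 means identical boundaries."""
    intersection = min(a[1], b[1]) - max(a[0], b[0])
    union = max(a[1], b[1]) - min(a[0], b[0])
    return max(0.0, intersection) / union


def top_n(features, n=2):
    """Return the n features with the largest areaIntensity (third tuple element)."""
    return sorted(features, key=lambda f: f[2], reverse=True)[:n]


# (leftBoundary, rightBoundary, areaIntensity) from the MRMTransitionGroupPicker table
mrm_features = [
    (843.5, 901.700012, 223735.421875),
    (818.099976, 843.900024, 65524.886719),
    (1177.900024, 1207.0, 32929.136719),
    (1051.099976, 1087.5, 48799.734375),
    (978.400024, 1011.099976, 19260.796875),
    (1112.5, 1152.5, 36867.34375),
]

# (leftBoundary, rightBoundary, areaIntensity) from the pyMRMTransitionGroupPicker table
py_features = [
    (843.5, 901.700012, 1537205.913208),
    (818.099976, 843.5, 425795.080078),
    (1189.199951, 1203.800049, 2428.022644),
    (1167.400024, 1189.199951, 226734.0625),
    (1131.099976, 1156.5, 2091.005585),
]

# Pair the top two features of each picker by rank and compare their boundaries
overlaps = [
    interval_overlap(a[:2], b[:2])
    for a, b in zip(top_n(mrm_features), top_n(py_features))
]
for rank, overlap in enumerate(overlaps, start=1):
    print(f"rank {rank}: boundary overlap = {overlap:.3f}")
```

An overlap close to 1.0 for both ranks indicates that the two pickers agree on where the most intense features lie, even though the reported areas differ.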

4. Visualize Results

[11]:
transitionGroup.plot(transitionGroupFeatures=features)

Note: Some of the intensities of the features are so small they cannot be seen.

Implementing Your Own Peak Picker

To implement your own peak picker, create a custom Python class that inherits from GenericPeakPicker. The only required method to implement is pick(), which performs the peak picking.
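
The pattern can be sketched without MassDash installed. In the example below, ToyGenericPeakPicker and ToyFeature are hypothetical stand-ins for GenericPeakPicker and TransitionGroupFeature, and pick() takes plain retention time and intensity lists rather than a TransitionGroup object. The real base class may have a different constructor and signature, so treat this as an outline of the idea rather than the library's API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ToyFeature:
    """Hypothetical stand-in for massdash's TransitionGroupFeature."""
    leftBoundary: float
    rightBoundary: float
    areaIntensity: float  # here: summed intensity, a crude proxy for area


class ToyGenericPeakPicker(ABC):
    """Hypothetical stand-in for massdash's GenericPeakPicker."""

    @abstractmethod
    def pick(self, rt: Sequence[float], intensity: Sequence[float]) -> List[ToyFeature]:
        """Return the features found in a single trace."""


class ThresholdPeakPicker(ToyGenericPeakPicker):
    """Report every contiguous run of points above a fixed intensity threshold."""

    def __init__(self, threshold: float):
        self.threshold = threshold

    def pick(self, rt, intensity):
        features = []
        start = None  # index where the current above-threshold run began
        last = len(intensity) - 1
        for i, value in enumerate(intensity):
            above = value > self.threshold
            if above and start is None:
                start = i
            # Close the run when the signal drops or the trace ends
            if start is not None and (not above or i == last):
                end = i if above else i - 1
                area = float(sum(intensity[start:end + 1]))
                features.append(ToyFeature(rt[start], rt[end], area))
                start = None
        return features


# Toy, made-up trace with two separate runs above the threshold
rt = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
intensity = [0.0, 5.0, 9.0, 4.0, 0.0, 0.0, 7.0, 8.0]
features = ThresholdPeakPicker(threshold=3.0).pick(rt, intensity)
for feature in features:
    print(feature)
```

In a real MassDash picker, the class would inherit from GenericPeakPicker, and pick() would accept a TransitionGroup and return a list of TransitionGroupFeature objects, as described for the built-in pickers above.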