Quick Start
This page gives an overview of MassDash's functionality and highlights a few convenience functions to help you get started with the Python interface. The Python interface is designed to be run in a Jupyter notebook so that plots can be reproducibly generated for a particular peptide of interest.
Plotting a Chromatogram
Here, we demonstrate how to plot a chromatogram of the peptide NKESPT(UniMod:21)KAIVR(UniMod:267) with a charge state of 3 from several different file inputs. In general, a loader object is first initialized, and then its plotChromatogram() method is called to fetch an interactive plot for the specified precursor.
SqMass Visualization
Running OpenSwath with an out_chrom filename ending in .sqMass outputs the extracted chromatograms alongside the OpenSwath results. Since extraction has already been performed, this is the quickest way to visualize plots.
Since the .sqMass file does not contain any metadata, we must link an .osw file with the .sqMass file to associate chromatograms with their corresponding peptide sequences. In MassDash, this is done by initializing a SqMassLoader, as shown below.
[3]:
from massdash.loaders import SqMassLoader
sqMassLoader = SqMassLoader(rsltsFile='osw/test_data.osw', dataFiles='xics/test_chrom_1.sqMass')
Initializing valid scores for selection
Now that the SqMassLoader is initialized, we can use it to plot a chromatogram with the plotChromatogram() function.
[4]:
sqMassLoader.plotChromatogram("NKESPT(UniMod:21)KAIVR(UniMod:267)", 3)
By default, the plot is smoothed and the boundaries of the features found in the OpenSwath file are plotted. The length of each boundary line indicates the intensity of the feature. Both can be turned off, as shown below.
[5]:
sqMassLoader.plotChromatogram("NKESPT(UniMod:21)KAIVR(UniMod:267)", 3, includeBoundaries=False, smooth=False)
Furthermore, the MS1 elution can also be shown.
[6]:
sqMassLoader.plotChromatogram("NKESPT(UniMod:21)KAIVR(UniMod:267)", 3, include_ms1=True)
Note
Boundaries shown are copied directly from the OpenSwath output and are not affected by changing smoothing parameters.
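To illustrate why the boundaries are independent of smoothing, the sketch below smooths a synthetic trace with a simple moving average. This is an illustration only: the moving_average helper, the data, and the boundary values are made up, and the smoothing MassDash actually applies may differ. Smoothing changes the plotted intensities, while the boundaries are stored retention times that are only read and drawn.

```python
def moving_average(intensities, window=3):
    """Smooth a trace with a centered moving average.

    Illustrative only: this is not necessarily the smoothing MassDash applies.
    Edge points are averaged over the samples that are available.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i in range(len(intensities)):
        chunk = intensities[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Synthetic chromatogram: retention times (s) and intensities (made-up values).
rt = [100, 101, 102, 103, 104, 105, 106]
raw = [0, 10, 40, 100, 40, 10, 0]

# Boundaries as they might be read from a results file (made-up values).
left_boundary, right_boundary = 101, 105

smooth = moving_average(raw, window=3)
print(max(raw), max(smooth))          # the apex intensity drops after smoothing
print(left_boundary, right_boundary)  # the stored boundaries are unchanged
```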
MzML Visualization
Chromatograms can also be extracted on the fly from .mzML files. This is useful for visualizing how a chromatogram would appear under different extraction parameters, or when chromatograms were not saved during data analysis.
First, we must initialize an MzMLDataLoader, which links a .mzML file with a results file and a library file. Currently supported results files include .osw files and DIA-NN .tsv reports. We can link multiple results files in a single loader object to compare the features across software tools, as shown below.
[7]:
from massdash.loaders import MzMLDataLoader
mzml_loader = MzMLDataLoader(rsltsFile=["example_dia/diann/report/test_1_diann_report.tsv", "example_dia/openswath/osw/test.osw"],
                             dataFiles="example_dia/raw/test_raw_1.mzML",
                             libraryFile="example_dia/diann/lib/test_1_lib.tsv")
Initializing valid scores for selection
[2024-10-10 13:02:19,790] MzMLDataAccess - INFO - Opening example_dia/raw/test_raw_1.mzML file...: Elapsed 0.3596317768096924 ms
[2024-10-10 13:02:19,791] MzMLDataAccess - INFO - There are 3867 spectra and 1 chromatograms.
[2024-10-10 13:02:19,805] MzMLDataAccess - INFO - There are 117 MS1 spectra and 3750 MS2 spectra.
Now we can plot each of these using the plotChromatogram() function. Here, we must also specify extraction parameters such as mz_tol (in ppm) and rt_window (in seconds).
[8]:
mzml_loader.plotChromatogram("DYASIDAAPEER", 2, mz_tol=20, rt_window=200, im_window=None, smooth=True)
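To make the ppm unit concrete, the sketch below converts a ppm tolerance into an absolute m/z window. The helper ppm_window is hypothetical and not part of MassDash. It assumes the tolerance is applied symmetrically on each side of the target m/z; whether MassDash treats mz_tol as a half-width or a full width is not stated on this page.

```python
def ppm_window(target_mz: float, tol_ppm: float) -> tuple[float, float]:
    """Return (lower, upper) m/z bounds for a symmetric ppm tolerance.

    Hypothetical helper for illustration; assumes the tolerance is a
    half-width applied on each side of the target m/z.
    """
    delta = target_mz * tol_ppm / 1e6  # ppm = parts per million
    return target_mz - delta, target_mz + delta


lower, upper = ppm_window(500.0, 20)
print(f"{lower:.4f} - {upper:.4f}")  # prints 499.9900 - 500.0100
```

Because the tolerance is relative, the same ppm value corresponds to a wider absolute window at higher m/z.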
Visualizing Chromatograms Across Files
A loader can also be supplied with multiple results files and data files. However, one of the results files supplied must always be an .osw file.
[9]:
from massdash.loaders import SqMassLoader
sqMassLoader = SqMassLoader(rsltsFile='osw/test_data.osw', dataFiles=['xics/test_chrom_1.sqMass', 'xics/test_chrom_2.sqMass'])
Initializing valid scores for selection
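The requirement that one results file be an .osw file can be checked before a loader is constructed. The sketch below is a hypothetical pre-flight check, not part of MassDash; it only inspects file extensions and accepts either a single path or a list of paths, mirroring how rsltsFile is passed in the cells on this page.

```python
from pathlib import Path


def has_osw(results_files) -> bool:
    """Return True if at least one results file has the .osw extension.

    Hypothetical helper for illustration. The comparison is
    case-insensitive, which is an assumption of this sketch.
    """
    if isinstance(results_files, (str, Path)):
        results_files = [results_files]
    return any(Path(f).suffix.lower() == ".osw" for f in results_files)


print(has_osw("osw/test_data.osw"))
print(has_osw(["example_dia/diann/report/test_1_diann_report.tsv"]))
```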
Currently, plotChromatogram() only supports visualization of a single run. Thus, if multiple data files are supplied as above, the runName must also be specified. The runName is the name of the file without the extension, e.g. test_chrom_2.
[10]:
sqMassLoader.plotChromatogram("NKESPT(UniMod:21)KAIVR(UniMod:267)", 3, runName='test_chrom_2', includeBoundaries=False)
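Since each call plots a single run, one way to cover every run is to derive the run names from the data file paths and call plotChromatogram() once per run. The sketch below assumes the run name equals the file name with its last extension removed, as described above. The helpers run_name and plot_each_run are hypothetical and not part of MassDash; the loader argument is any object exposing plotChromatogram() with a runName keyword.

```python
from pathlib import Path


def run_name(data_file: str) -> str:
    """Return the file name without its extension, e.g. 'test_chrom_2'."""
    return Path(data_file).stem


def plot_each_run(loader, data_files, peptide, charge, **plot_kwargs):
    """Call loader.plotChromatogram() once per run and collect the results.

    Hypothetical helper: `loader` is any object exposing plotChromatogram()
    with a runName keyword, as the loaders on this page do.
    """
    plots = {}
    for data_file in data_files:
        name = run_name(data_file)
        # Extra keyword arguments (e.g. includeBoundaries) are passed through.
        plots[name] = loader.plotChromatogram(peptide, charge, runName=name, **plot_kwargs)
    return plots


print(run_name("xics/test_chrom_2.sqMass"))  # prints test_chrom_2
```

With the loader above, plot_each_run(sqMassLoader, ['xics/test_chrom_1.sqMass', 'xics/test_chrom_2.sqMass'], "NKESPT(UniMod:21)KAIVR(UniMod:267)", 3, includeBoundaries=False) would make one call per run. What the returned objects are, and how they display in a notebook, depends on MassDash. Note that Path.stem strips only the last extension, so a compressed file such as run.mzML.gz would yield run.mzML.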
MzMLDataLoader can also be supplied with multiple .mzML files, as shown below.
[11]:
from massdash.loaders import MzMLDataLoader
mzml_loader = MzMLDataLoader(rsltsFile=["example_dia/diann/report/test_diann_report_combined.tsv", "example_dia/openswath/osw/test.osw"],
                             dataFiles=["example_dia/raw/test_raw_1.mzML", "example_dia/raw/test_raw_2.mzML"],
                             libraryFile="example_dia/diann/lib/test_2_lib.tsv")
Initializing valid scores for selection
[2024-10-10 13:02:20,817] MzMLDataAccess - INFO - Opening example_dia/raw/test_raw_1.mzML file...: Elapsed 0.31288671493530273 ms
[2024-10-10 13:02:20,818] MzMLDataAccess - INFO - There are 3867 spectra and 1 chromatograms.
[2024-10-10 13:02:20,836] MzMLDataAccess - INFO - There are 117 MS1 spectra and 3750 MS2 spectra.
[2024-10-10 13:02:21,181] MzMLDataAccess - INFO - Opening example_dia/raw/test_raw_2.mzML file...: Elapsed 0.33651185035705566 ms
[2024-10-10 13:02:21,182] MzMLDataAccess - INFO - There are 3867 spectra and 1 chromatograms.
[2024-10-10 13:02:21,198] MzMLDataAccess - INFO - There are 117 MS1 spectra and 3750 MS2 spectra.
[12]:
mzml_loader.plotChromatogram("DYASIDAAPEER", 2,
                             mz_tol=20, rt_window=200, im_window=None,
                             runName='test_raw_2', includeBoundaries=True)
[2024-10-10 13:02:21,594] InteractivePlotter - WARNING - prec is empty, Returning 0.
[2024-10-10 13:02:21,598] InteractivePlotter - WARNING - prec is empty, Returning 0.
More Information
For more details on the implementation, please check out the Python API.