Aug 26, 2022

Protocol for Data Independent Acquisition - Mass spectrometry analysis – a DIA-based Organelle Proteomics

  • 1Medical Research Council Protein Phosphorylation and Ubiquitylation Unit, College of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH;
  • 2Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
Protocol Citation: Raja Sekhar Nirujogi, Rotimi Fasimoye, Toan K Phung, Dario R Alessi 2022. Protocol for Data Independent Acquisition - Mass spectrometry analysis – a DIA-based Organelle Proteomics. protocols.io https://dx.doi.org/10.17504/protocols.io.kxygxzrokv8j/v1
Manuscript citation:
Fasimoye R, Dong W, Nirujogi RS, Rawat ES, Iguchi M, Nyame K, Phung TK, Bagnoli E, Prescott AR, Alessi DR, Abu-Remaileh M. Golgi-IP, a tool for multimodal analysis of Golgi molecular content. Proceedings of the National Academy of Sciences of the United States of America 120(20). doi: 10.1073/pnas.2219953120
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License,  which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Protocol status: Working
We use this protocol and it's working
Created: August 18, 2022
Last Modified: May 31, 2024
Protocol Integer ID: 68832
Keywords: Mass spectrometry, Proteomics, Orbitrap exploris, ASAPCRN
Funders Acknowledgement:
Aligning Science Across Parkinson’s (ASAP) initiative
Grant ID: 000463
UK Medical Research Council
Grant ID: MC_UU_00018/1
Abstract
Purification of intact organelles by previously described methods (dx.doi.org/10.17504/protocols.io.bybjpskn; dx.doi.org/10.17504/protocols.io.6qpvrdjrogmk/v1) allows the organelle proteome to be profiled by quantitative mass spectrometry. Here we provide a detailed protocol for the Data Independent Acquisition (DIA)-based mass spectrometry (MS) data acquisition method for proteomic profiling of the Golgi. This includes a description of how to construct the nano liquid chromatography and DIA MS methods, as well as a Data Dependent Acquisition (DDA) strategy to generate deep spectral libraries for searching the DIA data. In addition, we provide detailed database search parameters for both DDA and DIA data and for downstream MS data analysis.
Materials
Reagents/Buffers:

  1. Acetonitrile ≥99.9% (VWR Avantor #1.00030.2500)
  2. LC-grade Formic acid (Sigma-Aldrich #695076)
  3. LC-grade Water (Fisher Scientific #10777404)
  4. Ammonium formate (Sigma-Aldrich #70221-25G-F)
  5. Acetonitrile ≥99.9% (VWR Avantor #1.00030.2500)
  6. Ammonium Hydroxide (Acros #42305000)
  7. X100 20 mL Amber Glass EPA Vial (Cole Parmer # 11533750)
  8. LC- vials (VWR #548-0120)
  9. Snap ring caps (548-0016)
  10. High Recovery Vials & Inserts (Agilent Technologies #5181-8872)
  11. Acclaim 2 cm trap column (C18, 5 µm, 100 Å, 100 µm, 2 cm Nano-viper column #164564, Thermo Scientific)
  12. 50 cm analytical column (C18, 5 µm, 50 cm, 100 Å Easy nano spray column #ES903, Thermo Scientific)
  13. Waters XBridge Peptide BEH C18 Column (130 Å, 3.5 µm, 1 mm × 100 mm, Waters #186003561).
  14. 96-Well DeepWell™ Polypropylene Microplates (Thermo Fisher #12566120)
  15. pH indicator strips, mid range (VWR International Ltd #1.09584.0001)
  16. Biognosys iRT peptides mix (https://biognosys.com/product/irt-kit/)
  17. High-pH RPLC buffer: Solvent-A: 10 mM Ammonium formate (w/v) in LC-MS grade water. Solvent-B: 10 mM Ammonium Bicarbonate (w/v) in 80% ACN (v/v). Adjust pH to 10.0 using Ammonium Hydroxide
  18. LC buffer: 3% ACN (v/v) in 0.1% Formic acid (v/v).
Note
Note: Prepare a stock of 40ml in Amber glass vials and store at room temperature. The buffer can be stored up to 2 months. Avoid using any autoclaved pipette tips in aliquoting the buffer.

Equipment/Software:

  1. Dionex Ultimate 3000 nano-liquid chromatography system.
  2. Orbitrap Exploris 480 mass spectrometer.
  3. Dionex Ultimate 3000 liquid chromatography system with UV detector and fraction collector modules.
  4. Thermomixer.
  5. Speedvac (Thermo Scientific #SPD140DDA).
  6. MaxQuant software suite.
  7. Biognosys Spectronaut Software suite (Optional: DIA-NN Software suite).
High-pH Reversed-phase Liquid Chromatography fractionation of pooled Golgi-tag IP peptides to generate Spectral library:
Take ~5 µg of peptide digest from each of the Golgi-tag IP and Control-IP samples and pool them.
Vacuum dry the pooled samples.
Dissolve the peptide digest by adding 120 µL of High-pH Solvent-A (10 mM ammonium formate, pH 10.0). Place the sample on a Thermomixer with agitation at 1800 rpm for 30 min.
Centrifuge the sample at high speed (17000 x g) for 5 min at room temperature.
Take 0.5 µL of the sample to verify the pH, then transfer the sample into an LC vial.
Ensure the LC solvents are Solvent-A (10 mM ammonium formate, pH 10.0) and Solvent-B (90% ACN (v/v) in 10 mM ammonium formate, pH 10.0).
Note
Note: Adjust the pH with 30% Ammonium Hydroxide.

Prepare the LC method using the gradient below:
Time (minutes) | Nano pump flow rate (µl/min) | % of Solvent-B
0.0 | 0.100 | 3.0
5.0 | 0.100 | 7.0
5.5 | 0.100 | 7.0
10.0 | 0.100 | 10.0
50.0 | 0.100 | 40.0
55.0 | 0.100 | 90.0
62.0 | 0.100 | 90.0
62.5 | 0.100 | 3.0
70.0 | 0.100 | 3.0
70.1 | 0.0100 | 3.0
Set the fraction collection to start at 00:05:05 and end at 01:02:00 (hh:mm:ss).
Collect a total of 45 fractions, with a collection time of 1 min 15 s per fraction.
Transfer the fractions into pre-labelled 1.5 mL protein low-binding tubes.
Vacuum dry the samples and store them in a -20 °C freezer until the LC-MS/MS analysis.
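The fraction collection settings above can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the instrument method: it assumes the fractions are collected back to back from the stated start time, and the function and variable names are illustrative only.

```python
from datetime import timedelta

# Values taken from the fraction collection steps above.
START = timedelta(minutes=5, seconds=5)          # collection start (00:05:05)
END = timedelta(hours=1, minutes=2)              # collection end (01:02:00)
PER_FRACTION = timedelta(minutes=1, seconds=15)  # time per fraction
N_FRACTIONS = 45


def fraction_schedule(start, per_fraction, n):
    """Return (fraction number, start, stop) for n back-to-back fractions."""
    return [
        (i + 1, start + i * per_fraction, start + (i + 1) * per_fraction)
        for i in range(n)
    ]


schedule = fraction_schedule(START, PER_FRACTION, N_FRACTIONS)
last_stop = schedule[-1][2]

# Print the first few fractions and compare the last stop with the window end.
for number, begin, stop in schedule[:3]:
    print(f"Fraction {number:2d}: {begin} -> {stop}")
print(f"Last fraction ends at {last_stop}; collection window ends at {END}")
```

Under these assumptions the 45 fractions fit inside the collection window, which is a quick way to catch a mistyped start time or fraction duration before the run.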
Single shot DIA acquisition on Orbitrap Exploris 480:
Dissolve the vacuum-dried peptides in 60 µL of LC buffer (3% ACN in 0.1% Formic acid), place the samples on a Thermomixer and mix them at 1800 rpm at room temperature for about 30 min.
Take 4 µg equivalent of peptide digest and spike in 1 µL of iRT peptide mix. Adjust the total volume of the sample to between 5 µL and 15 µL; do not exceed 15 µL. Transfer the sample into a glass insert and place it in an LC vial.
Note
Note: Strictly avoid using any autoclaved pipette tips. If possible, use dedicated Pipette set for MS analysis.

Construct the LC and vDIA MS methods as described below using the Xcalibur software integrated in the Thermo Orbitrap Exploris 480 MS acquisition software suite.
Ensure that the 2 cm trap column (C18, 5 µm, 100 Å, 100 µm, 2 cm Nano-viper column #164564, Thermo Scientific) and the 50 cm analytical column (C18, 5 µm, 50 cm, 100 Å Easy nano spray column #ES903, Thermo Scientific) are equilibrated, and verify the column performance by injecting 50 ng of HeLa or another standard digest.

Nano LC gradient for the 2 h 25 min DIA analysis:
Time (minutes) | Nano pump flow rate (µl/min) | % of Solvent-B
0.0 | 0.250 | 3.0
12.0 | 0.250 | 7.0
115.0 | 0.250 | 25.0
129.0 | 0.250 | 37.0
130.0 | 0.250 | 95.0
135.30 | 0.250 | 95.0
135.80 | 0.250 | 3.0
145.0 | 0.250 | 3.0
145.0 | Stop Run
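To see what solvent composition the DIA gradient delivers at a given retention time, the table rows can be interpolated. This is a minimal sketch that assumes the pump ramps linearly between consecutive table rows (an assumption, not something stated in the protocol); `percent_b` is a hypothetical helper name.

```python
# (time in minutes, % Solvent-B) pairs copied from the DIA gradient table above.
DIA_GRADIENT = [
    (0.0, 3.0), (12.0, 7.0), (115.0, 25.0), (129.0, 37.0),
    (130.0, 95.0), (135.3, 95.0), (135.8, 3.0), (145.0, 3.0),
]


def percent_b(time_min, gradient=DIA_GRADIENT):
    """Linearly interpolate % Solvent-B at time_min between table rows."""
    if not gradient[0][0] <= time_min <= gradient[-1][0]:
        raise ValueError("time is outside the gradient")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


# Composition at a few time points along the run.
for t in (6.0, 63.5, 129.0, 145.0):
    print(f"{t:6.1f} min -> {percent_b(t):.1f} % B")
```

The same function can be reused for the high-pH and DDA gradients by passing their table rows as the `gradient` argument.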
Mass spectrometer parameters: refer to the settings below to construct the variable DIA method:
Method duration: 145 min
MS Global settings:
Infusion mode:Liquid Chromatography
Expected LC peak width (s):20
Advanced Peak determination:TRUE
Default charge state:3
Internal mass calibration:off
Note: If needed, enable a user-defined calibrant ion (polysiloxane, m/z 445.120025) or enable the Easy-IC option.
Full scan settings:
Orbitrap resolution:120000
Scan range (m/z):375-1500
RF lens (%):40
AGC target:Custom
Normalized AGC target (%):300
Maximum injection Time mode:Custom
Maximum injection Time (ms):30
Micro scans:1
Data type:Profile
tMS2 or DIA settings
Isolation offset:Off
Collision Energy Mode:Stepped
Collision Energy Type:Normalized
HCD Collision Energy (%):25, 28, 32
Orbitrap resolution:30000
Scan range mode:Define m/z range
Scan Range (m/z):200 - 1200
Note: Most of the matched fragment ions (b-series and y-series) fall within this range; it can be modified if needed.
RF Lens (%):50
AGC target:Custom
Normalized AGC target (%):3000
Note: It is recommended to fill the trap with the maximum accumulation of ions (3000% = 3E6 ions) for each DIA window to increase sensitivity.
Maximum injection Time mode:Custom
Maximum injection Time (ms):70
Micro scans:1
Data type:Profile
Polarity:Positive
Loop control:N
N (Number of Spectra):24
Note: We include one full MS1 scan after every 24 DIA scans to accommodate the maximum possible number of MS1 scans.
Dynamic RT:Off
Time Mode: Unscheduled
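Scheme of vDIA windows mass list table:
m/z | z | Isolation Window (m/z)
383.375 | 3 | 66.8
423 | 3 | 13.5
435 | 3 | 11.5
446.5 | 3 | 12.5
458 | 3 | 11.5
469 | 3 | 11.5
480 | 3 | 11.5
490.5 | 3 | 10.5
501 | 3 | 11.5
512 | 3 | 11.5
523 | 3 | 11.5
533.5 | 3 | 10.5
544 | 3 | 11.5
554.5 | 3 | 10.5
565 | 3 | 11.5
575.5 | 3 | 10.5
586 | 3 | 11.5
597.5 | 3 | 12.5
609.5 | 3 | 12.5
621.5 | 3 | 12.5
633 | 3 | 11.5
645 | 3 | 13.5
657.5 | 3 | 12.5
670.5 | 3 | 14.5
684 | 3 | 13.5
697 | 3 | 13.5
710.5 | 3 | 14.5
725.5 | 3 | 16.5
741 | 3 | 15.5
756.5 | 3 | 16.5
773.5 | 3 | 18.5
791 | 3 | 17.5
808.5 | 3 | 18.5
827 | 3 | 19.5
846.5 | 3 | 20.5
866.5 | 3 | 20.5
887.5 | 3 | 22.5
910.5 | 3 | 24.5
935.5 | 3 | 26.5
962.5 | 3 | 28.5
992 | 3 | 31.5
1025 | 3 | 35.5
1063 | 3 | 41.5
1108.5 | 3 | 50.5
1391.625 | 3 | 516.8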
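A quick sanity check of the window scheme is to convert each centre m/z and isolation width into lower and upper edges and confirm that neighbouring windows overlap rather than leave gaps. The sketch below does this in plain Python; the helper name `window_edges` is illustrative, and the check assumes each window is symmetric around its centre.

```python
# (centre m/z, isolation width) pairs copied from the vDIA table above.
VDIA_WINDOWS = [
    (383.375, 66.8), (423, 13.5), (435, 11.5), (446.5, 12.5), (458, 11.5),
    (469, 11.5), (480, 11.5), (490.5, 10.5), (501, 11.5), (512, 11.5),
    (523, 11.5), (533.5, 10.5), (544, 11.5), (554.5, 10.5), (565, 11.5),
    (575.5, 10.5), (586, 11.5), (597.5, 12.5), (609.5, 12.5), (621.5, 12.5),
    (633, 11.5), (645, 13.5), (657.5, 12.5), (670.5, 14.5), (684, 13.5),
    (697, 13.5), (710.5, 14.5), (725.5, 16.5), (741, 15.5), (756.5, 16.5),
    (773.5, 18.5), (791, 17.5), (808.5, 18.5), (827, 19.5), (846.5, 20.5),
    (866.5, 20.5), (887.5, 22.5), (910.5, 24.5), (935.5, 26.5), (962.5, 28.5),
    (992, 31.5), (1025, 35.5), (1063, 41.5), (1108.5, 50.5), (1391.625, 516.8),
]


def window_edges(windows):
    """Return (lower, upper) m/z edges, assuming symmetric isolation windows."""
    return [(centre - width / 2, centre + width / 2) for centre, width in windows]


edges = window_edges(VDIA_WINDOWS)

# Overlap between each window's upper edge and the next window's lower edge.
overlaps = [
    round(prev_hi - next_lo, 3)
    for (_, prev_hi), (next_lo, _) in zip(edges, edges[1:])
]

print(f"{len(edges)} windows covering m/z {edges[0][0]:.3f} to {edges[-1][1]:.3f}")
print(f"Overlap between neighbours: min {min(overlaps)}, max {max(overlaps)}")
```

A negative overlap would indicate a gap between two windows, which is usually the first thing to look for after editing a mass list table by hand.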

Export the MS raw data for database searches by library-free (direct DIA) or library-based as illustrated in the workflow with Biognosys Spectronaut software suite.
Note
Optional: Biognosys Spectronaut is a commercial software suite; if you do not have access to it, you could use an open-source software suite such as DIA-NN.

Data Dependent Acquisition (DDA) MS analysis to generate Spectral library:
Dissolve the vacuum-dried peptides of each fraction in 60 µL of LC buffer (3% ACN in 0.1% Formic acid), place the samples on a Thermomixer and mix them at 1800 rpm at room temperature for about 30 min.
Take 1 µg equivalent of peptide digest and spike in 1 µL of iRT peptide mix. Adjust the total volume of the sample to between 5 µL and 15 µL; do not exceed 15 µL. Transfer the sample into a glass insert and place it in an LC vial.
Note
Note: Strictly avoid using any autoclaved pipette tips. If possible, use dedicated Pipette set for MS analysis.

Ensure that the 2 cm trap column (C18, 5 µm, 100 Å, 100 µm, 2 cm Nano-viper column #164564, Thermo Scientific) and the 50 cm analytical column (C18, 5 µm, 50 cm, 100 Å Easy nano spray column #ES903, Thermo Scientific) are equilibrated, and verify the column performance by injecting 50 ng of HeLa or another standard digest.

Nano LC gradient for the 1 h 25 min DDA analysis:
Time (minutes) | Nano pump flow rate (µl/min) | % of Solvent-B
0.0 | 0.300 | 3.0
7.0 | 0.300 | 7.0
60.0 | 0.300 | 22.0
70.0 | 0.300 | 35.0
71.0 | 0.300 | 95.0
78.0 | 0.300 | 95.0
79.0 | 0.300 | 3.0
85.0 | 0.300 | 3.0
85.0 | Stop Run

Mass spectrometer parameters: refer to the settings below to construct the DDA method:
Method duration: 85 min
MS Global settings:
Infusion mode: Liquid Chromatography
Expected LC peak width (s):15
Advanced Peak determination:TRUE
Default charge state:2
Internal mass calibration:off
Full scan settings:
Orbitrap resolution:60000
Scan range (m/z):350-1200
RF lens (%):40
AGC target:Custom
Normalized AGC target (%):300
Maximum injection Time mode:Custom
Maximum injection Time (ms):28
Micro scans:1
Data type:Profile
Polarity:Positive
Filters:
MIPS
Monoisotopic peak determination:Peptide
Relax restrictions when too few precursors are found:FALSE
Intensity
Filter Type:Intensity Threshold
Intensity Threshold:1.00E+04
Charge State
Include charge state(s):2 to 6
Include undetermined charge states:False
Dynamic Exclusion
Dynamic Exclusion Mode:Custom
Exclude after n times:1
Exclusion duration (s):45
Mass Tolerance:ppm
Low:10
High:10
Exclude isotopes:TRUE
Perform dependent scan on single charge state per precursor only:FALSE
Data Dependent
Data Dependent Mode:Cycle Time
Time between Master Scans (sec):3
ddMS2 settings
Isolation Window (m/z):1.2
Isolation Offset:Off
Collision Energy Mode:Fixed
Collision Energy Type:Normalized
HCD Collision Energy (%):30
Orbitrap resolution:15000
Scan range mode:Auto
Scan Range (m/z):200 - 1200
AGC target:Custom
Normalized AGC target (%):100
Maximum injection Time mode:Custom
Maximum injection Time (ms):85
Micro scans:1
Data type:Centroid
Polarity:Positive
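Two of the DDA settings above, the expected LC peak width (15 s) and the time between master scans (3 s), together determine roughly how many MS1 scans sample each chromatographic peak. The short sketch below works through that arithmetic; it is a back-of-the-envelope estimate that assumes the instrument keeps exactly to the configured cycle time, and the function name is hypothetical.

```python
def master_scans_per_peak(peak_width_s, cycle_time_s):
    """Approximate number of MS1 (master) scans acquired across one LC peak."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    return peak_width_s / cycle_time_s


# Values taken from the DDA method table above.
EXPECTED_PEAK_WIDTH_S = 15
TIME_BETWEEN_MASTER_SCANS_S = 3

points = master_scans_per_peak(EXPECTED_PEAK_WIDTH_S, TIME_BETWEEN_MASTER_SCANS_S)
print(f"About {points:.0f} MS1 scans per peak")
```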
Database searches with MaxQuant for Data Dependent Acquisition (DDA) MS analysis to generate Spectral library:
Export the raw MS data to a Windows server to perform database searches using MaxQuant. Refer to the search parameters below.
Note
Note: It is recommended to have a good computational capability for a faster and successful MaxQuant search. We used the configuration: Intel® Xeon® Silver 421R CPU @ 2.40GHz and 2.39 GHz (2 processors), 384 GB RAM, 64-bit Windows OS with 1TB SSD drive.
Parameter: Value
Version: 1.6.10.0
Include contaminants: TRUE
PSM FDR: 0.01
PSM FDR Crosslink: 0.01
Protein FDR: 0.01
Site FDR: 0.01
Use Normalized Ratios for Occupancy: TRUE
Min. peptide Length: 7
Min. score for unmodified peptides: 0
Min. score for modified peptides: 40
Min. delta score for unmodified peptides: 0
Min. delta score for modified peptides: 6
Min. unique peptides: 0
Min. razor peptides: 1
Min. peptides: 1
Use only unmodified peptides and: TRUE
Modifications included in protein quantification: Oxidation (M);Acetyl (Protein N-term)
Peptides used for protein quantification: Razor
Discard unmodified counterpart peptides: TRUE
Label min. ratio count: 2
Use delta score: FALSE
iBAQ: TRUE
iBAQ log fit: TRUE
Match between runs: TRUE
Matching time window [min]: 0.7
Match ion mobility window [indices]: 0.05
Alignment time window [min]: 20
Alignment ion mobility window [indices]: 1
Find dependent peptides: FALSE
Fasta file: D:\Database\20200723-Human-Uniprot.fasta
Decoy mode: revert
Include contaminants: TRUE
Advanced ratios: TRUE
Fixed andromeda index folder:
Temporary folder:
Combined folder location:
Second peptides: TRUE
Stabilize large LFQ ratios: FALSE
Separate LFQ in parameter group: FALSE
Require MS/MS for LFQ comparisons: FALSE
Calculate peak properties: FALSE
Main search max. combinations: 200
Advanced site intensities: TRUE
Write msScans table: TRUE
Write msmsScans table: TRUE
Write ms3Scans table: FALSE
Write allPeptides table: TRUE
Write mzRange table: TRUE
Write pasefMsmsScans table: FALSE
Write accumulatedPasefMsmsScans table: FALSE
Max. peptide mass [Da]: 4600
Min. peptide length for unspecific search: 8
Max. peptide length for unspecific search: 25
Razor protein FDR: TRUE
Disable MD5: FALSE
Max mods in site table: 3
Match unidentified features: FALSE
Epsilon score for mutations:
Evaluate variant peptides separately: TRUE
Variation mode: None
MS/MS tol. (FTMS): 20 ppm
Top MS/MS peaks per Da interval. (FTMS): 12
Da interval. (FTMS): 100
MS/MS deisotoping (FTMS): TRUE
MS/MS deisotoping tolerance (FTMS): 7
MS/MS deisotoping tolerance unit (FTMS): ppm
MS/MS higher charges (FTMS): TRUE
MS/MS water loss (FTMS): TRUE
MS/MS ammonia loss (FTMS): TRUE
MS/MS dependent losses (FTMS): TRUE
MS/MS recalibration (FTMS): TRUE
MS/MS tol. (ITMS): 0.5 Da
Top MS/MS peaks per Da interval. (ITMS): 8
Da interval. (ITMS): 100
MS/MS deisotoping (ITMS): FALSE
MS/MS deisotoping tolerance (ITMS): 0.15
MS/MS deisotoping tolerance unit (ITMS): Da
MS/MS higher charges (ITMS): TRUE
MS/MS water loss (ITMS): TRUE
MS/MS ammonia loss (ITMS): TRUE
MS/MS dependent losses (ITMS): TRUE
MS/MS recalibration (ITMS): FALSE
MS/MS tol. (TOF): 40 ppm
Top MS/MS peaks per Da interval. (TOF): 10
Da interval. (TOF): 100
MS/MS deisotoping (TOF): TRUE
MS/MS deisotoping tolerance (TOF): 0.01
MS/MS deisotoping tolerance unit (TOF): Da
MS/MS higher charges (TOF): TRUE
MS/MS water loss (TOF): TRUE
MS/MS ammonia loss (TOF): TRUE
MS/MS dependent losses (TOF): TRUE
MS/MS recalibration (TOF): FALSE
MS/MS tol. (Unknown): 20 ppm
Top MS/MS peaks per Da interval. (Unknown): 12
Da interval. (Unknown): 100
MS/MS deisotoping (Unknown): TRUE
MS/MS deisotoping tolerance (Unknown): 7
MS/MS deisotoping tolerance unit (Unknown): ppm
MS/MS higher charges (Unknown): TRUE
MS/MS water loss (Unknown): TRUE
MS/MS ammonia loss (Unknown): TRUE
MS/MS dependent losses (Unknown): TRUE
MS/MS recalibration (Unknown): FALSE
Site tables: Deamidation (NQ)Sites.txt;Oxidation (M)Sites.txt;Phospho (STY)Sites.txt
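When re-running a search, it can help to confirm programmatically that the key settings match the table above. The sketch below is one possible way to do that, assuming the parameters are available as a two-column, tab-separated text with the headers `Parameter` and `Value` (an assumption about the export layout). The inline text stands in for a real file, and the helper names and the choice of which settings to check are illustrative.

```python
import csv
import io

# A few rows in a two-column, tab-separated layout (assumed export format).
# In practice this text would be read from a file on disk instead.
PARAMETERS_TXT = (
    "Parameter\tValue\n"
    "Version\t1.6.10.0\n"
    "PSM FDR\t0.01\n"
    "Protein FDR\t0.01\n"
    "Min. peptide Length\t7\n"
    "Match between runs\tTRUE\n"
)

# Settings from the table above that we want to verify.
EXPECTED = {
    "PSM FDR": "0.01",
    "Protein FDR": "0.01",
    "Match between runs": "TRUE",
}


def load_parameters(handle):
    """Read Parameter/Value rows into a dictionary of strings."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Parameter"]: row["Value"] for row in reader}


def find_mismatches(params, expected):
    """Return {name: (expected, found)} for settings that differ or are missing."""
    return {
        name: (value, params.get(name))
        for name, value in expected.items()
        if params.get(name) != value
    }


params = load_parameters(io.StringIO(PARAMETERS_TXT))
mismatches = find_mismatches(params, EXPECTED)
print(mismatches or "All checked parameters match")
```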
Database searches with Biognosys Spectronaut for Data Independent Acquisition (DIA) MS analysis (library-free and library-based search):
Import the msms.txt file from the MaxQuant search output files into Spectronaut to generate a spectral library.
Note
Make sure to provide a correct path of the DDA raw data.

Alternatively perform a Pulsar search of DDA data to generate a library.
As illustrated in the workflow, we recommend performing a direct-DIA (library-free) search using the Human UniProt FASTA file to construct a hybrid library. Enable the search archive option during the direct-DIA search.
Merge the direct-DIA search archive and DDA library to construct a hybrid library and use this library to perform library-based search of the DIA data.
Use the below settings for the library-based DIA search within Spectronaut.
Spectronaut 15.7.220308.50606
Computer Name: MRC-DRI-2
User Domain Name: LIFESCI-AD
User Name: rnirujogi
Analysis Mode: UI
Analysis Type: Peptide-Centric
Settings Used: RN_DIA_Default
Data Extraction
MS1 Mass Tolerance Strategy:Dynamic
Correction Factor:1
MS2 Mass Tolerance Strategy:Dynamic
Correction Factor:1
Intensity Extraction MS1:Maximum Intensity
Intensity Extraction MS2:Maximum Intensity
XIC Extraction
XIC IM Extraction Window:Dynamic
Correction Factor:1
XIC RT Extraction Window:Dynamic
Correction Factor:1
Calibration
Calibration Mode:Automatic
MS1 Mass Tolerance Strategy:System Default
MS2 Mass Tolerance Strategy:System Default
Precision iRT:TRUE
iRT <-> RT Regression Type:Local (Non-Linear) Regression
Exclude Deamidated Peptides:TRUE
MZ Extraction Strategy:Maximum Intensity
Allow source specific iRT Calibration:TRUE
Used Biognosys' iRT Kit:TRUE
Calibration Carry-Over:FALSE
Identification
Generate Decoys:TRUE
Decoy Limit Strategy:Dynamic
Library Size Fraction:0.1
Decoy Method:Mutated
Preferred Fragment Source:NN Predicted Fragments
Machine Learning:Per Run
Exclude Duplicate Assays:TRUE
Precursor PEP Cutoff:0.2
Protein Qvalue Cutoff (Experiment):0.01
Protein Qvalue Cutoff (Run):0.05
Exclude Single Hit Proteins:TRUE
Pvalue Estimator:Kernel Density Estimator
Precursor Qvalue Cutoff:0.01
Single Hit Definition:By Stripped Sequence
Quantification
Interference Correction:TRUE
MS1 Min:2
MS2 Min:3
Exclude All Multi-Channel Interferences:TRUE
Only Identified Peptides:TRUE
Protein LFQ Method:Automatic
Major (Protein) Grouping:by Protein Group Id
Minor (Peptide) Grouping:by Stripped Sequence
Minor Group Top N:TRUE
Min:1
Max:3
Minor Group Quantity:Mean precursor quantity
Major Group Top N:TRUE
Min:1
Max:3
Major Group Quantity:Mean peptide quantity
Quantity MS-Level:MS2
Quantity Type:Area
Proteotypicity Filter:None
Data Filtering:Qvalue
Cross Run Normalization:TRUE
Row Selection:Automatic
Normalization Strategy:None
Normalization Filter Type:None
PTM Workflow
PTM Localization: TRUE
Probability Cutoff:0.75
PTM Analysis:TRUE
Multiplicity:TRUE
Run Clustering:FALSE
PTM Consolidation:Sum
Flanking Region:7
Workflow
In-Silico Library Optimization:FALSE
Profiling Strategy:iRT Profiling
Profiling Row Selection:Minimum Qvalue Row Selection
Qvalue Threshold:0.01
Profiling Target Selection:Automatic Selection
Carry-over exact Peak Boundaries:FALSE
Unify Peptide Peaks Strategy:None
Multi-Channel Workflow Definition:From Library Annotation
Fallback Option:Labeled
Protein Inference
Protein Inference Workflow:Automatic
Inference Algorithm:IDPicker
Post Analysis
Calculate Sample Correlation Matrix:TRUE
Calculate Explained TIC:None
Gene Ontology:geneOntology/Ontologies\bgs_default_go basic.obo
Differential Abundance Grouping:Major Group (Quantification Settings)
Smallest Quantitative Unit:Major Group (Quantification Settings)
Use All MS-Level Quantities:FALSE
Differential Abundance Testing:Un-Paired t-test
Assume Equal Variance:FALSE
Group-Wise Testing Correction:FALSE
Run Clustering:TRUE
Distance Metric:Manhattan Distance
Linkage Strategy:Ward's Method
Z-score transformation:FALSE
Order Runs by Clustering:TRUE
Pipeline Mode
Post Analysis Reports:
Scoring Histograms:TRUE
Data Completeness Bar Chart:TRUE
Run Identifications Bar Chart:TRUE
CV Density Line Chart:TRUE
CVs Below X Bar Chart:TRUE
Generate SNE File:TRUE
Store Iontraces in SNE:FALSE
Report Schema:PTMSiteReport (Pivot), RN_PG_Pivot (Pivot), MSStats Report (v 3.7.3)(Normal), Protein Quant (Normal), Protein Quant (Pivot), BGS Factory Report(Normal)
Reporting Unit: Across Experiment
Data analysis of DIA data and data visualization:
Export Protein group tables from Spectronaut in PG Pivot format.
For the Golgi-tag IP data, annotate proteins using a compiled list of known Golgi proteins from a resource (e.g. https://compartments.jensenlab.org/Search) and UniProt GO terms.
Note
Note: Annotate Golgi proteins using the VLOOKUP function in Excel against the compiled list of known Golgi proteins. Similarly, for Mito-IP data use the MitoCarta resource.
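For readers who prefer scripting to Excel, the VLOOKUP-style annotation described above can be reproduced with a simple set lookup. The sketch below is illustrative only: the gene symbols are example entries, the column names `Gene` and `Golgi` are assumptions, and the "+" marker is chosen to match the convention used in the Python network script later in this protocol.

```python
# Hypothetical input: rows from a protein group table (gene symbols only).
protein_groups = [
    {"Gene": "GOLGA2"},
    {"Gene": "TMEM115"},
    {"Gene": "ACTB"},
]

# Hypothetical reference: the compiled list of known Golgi proteins.
known_golgi = {"GOLGA2", "TMEM115"}


def annotate(rows, reference, column="Golgi"):
    """Add a '+' marker to rows whose gene is in the reference set."""
    return [
        {**row, column: "+" if row["Gene"] in reference else ""}
        for row in rows
    ]


annotated = annotate(protein_groups, known_golgi)
for row in annotated:
    print(row["Gene"], row["Golgi"] or "-")
```

The same function can be reused with a different reference set (for example a mitochondrial protein list) by passing another `column` name.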

Prepare the data for differential analysis using the Perseus software suite (https://maxquant.net/perseus/). The basic functionality of the software and various workflows can be adopted from the published literature (PMID: 27348712), the available tutorials (http://coxdocs.org/doku.php?id=perseus:start) and YouTube (https://www.youtube.com/c/MaxQuantChannel/featured).

Follow the Perseus workflow illustrated in Figure 2.


Figure 2: Workflow describing the DIA data analysis using Perseus software package to identify enriched Golgi proteins and subsequent data visualization.
The t-test results can be exported and analysed with other tools such as Curtain to visualize the volcano plot and, for each protein, the raw intensities across all conditions, the protein domain architecture, STRING interaction predictions and the AlphaFold structure prediction.
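Perseus performs the statistical test internally, so no code is needed for this step. Purely as an illustration of what a "Difference" value and a t statistic represent for a single protein, the sketch below computes them with the Python standard library. It assumes log2-transformed intensities and a Welch (unequal variance) test, which is an assumption rather than the documented Perseus setting, and the numbers are toy values, not measured data.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (mean difference, Welch t statistic, Welch-Satterthwaite df)."""
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    diff = mean(group_a) - mean(group_b)
    t_stat = diff / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return diff, t_stat, dof


# Toy log2 intensities for one protein (illustrative numbers only).
golgi_ip = [24.0, 25.0, 26.0]
control_ip = [20.0, 21.0, 22.0]

diff, t_stat, dof = welch_t(golgi_ip, control_ip)
print(f"log2 difference = {diff:.2f}, t = {t_stat:.2f}, df = {dof:.1f}")
```

Converting the t statistic to a p-value needs the t distribution, which is not in the standard library; a statistics package would be used for that in practice.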

Optional: In addition to Perseus, further data quality checks can be performed using custom R or Python scripts (provided below) and other relevant packages.


Figure 1: Workflow of DIA MS data acquisition. The workflow describes the DIA data acquisition of Golgi-tag IP, control IP and whole cell extracts and the subsequent database search using Spectronaut. The library can be generated using a DDA strategy and searched using MaxQuant or Pulsar within Spectronaut.

Scripts - R correlation plot

# Pearson correlation plot of sample intensities
library(corrplot)

# Input: tab-separated table with one column per sample
filename <- "//mrc-smb.lifesci.dundee.ac.uk/mrc-group-folder/ALESSI/Toan/For Golgitag_Paper/For_Pearson_Corr_02.txt"

df <- read.table(filename, header = TRUE, sep = "\t")

# Keep the columns from the first one up to and including "HA.WCL_06"
df <- df[colnames(df)[1:which(colnames(df) == "HA.WCL_06")]]

# Pairwise Pearson correlation between samples (default method of cor)
cor_mat <- cor(as.matrix(df), use = "everything")

# Write the plot to a PDF named after the input file
pdf(paste0(filename, ".pdf"))

corrplot(cor_mat, order = "hclust", type = "lower", method = "ellipse")

dev.off()

Scripts - Python Network Interaction with Cytoscape and Plotly Dash

import dash
import dash_cytoscape as cyto
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd

cyto.load_extra_layouts()

app = dash.Dash(__name__)
server = app.server


def add_individual_protein(df, source, elements):
    # Add up to 15 proteins (nodes and edges) linked to the given group node
    highest = df["Difference"].max()
    n = 0
    for i, r in df.iterrows():
        if n < 15:
            opacity = r["Difference"]/highest
            elements.append({'data': {'id': r["Gene.names"], 'label': r["Gene.names"], 'color': f"rgba(136, 86, 167,{opacity})", "opacity": opacity}, 'classes': 'protein'})
            elements.append(
                {'data': {'source': source, 'target': r["Gene.names"], 'color': f"rgba(136, 86, 167,{opacity})", "opacity": opacity}, 'classes': 'protein-edge'},)
        else:
            break
        n += 1


def add_groups_enriched(edf, elements):
    # Group nodes for proteins annotated as Golgi
    edf = edf.sort_values(by="Difference", ascending=False)
    golgi = edf[(edf["Golgi"] == "+")]
    #golgi = edf[(edf["C: Golgi"] == "+")]
    golgi_count = len(golgi.index)
    print(golgi_count)
    glyco = golgi[golgi["Glycosylation"] == "+"]
    #glyco = golgi[golgi["Glycosylation genes"] == "+"]
    glyco_count = len(glyco.index)
    print(glyco_count)
    phospha = golgi[golgi["Phosphatases"] == "+"]
    phospha_count = len(phospha.index)
    kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark.kinase"] == "+")]
    #kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark Kinases"] == "+")]
    kinases_count = len(kinases.index)
    ubi = golgi[golgi["Ub.Pathway"] == "+"]
    ubi_count = len(ubi.index)
    l = [
        {'data': {'id': 'enriched-golgi', 'label': f'Golgi: {golgi_count}', "size": golgi_count * block}, 'classes': 'golgi enriched'},
        #{'data': {'source': 'significant', 'target': 'enriched-golgi'}, 'classes': 'significant-edge'},
        {'data': {'id': 'enriched-glyco', 'label': f'Glycosylation genes: {glyco_count}', "size": glyco_count * block}, 'classes': 'golgi enriched'},
        {'data': {'id': 'enriched-phospha', 'label': f'Phosphatases: {phospha_count}', "size": phospha_count * block}, 'classes': 'golgi enriched'},
        {'data': {'id': 'enriched-kinase', 'label': f'Kinases: {kinases_count}', "size": kinases_count * block}, 'classes': 'golgi enriched'},
        {'data': {'id': 'ubi', 'label': f'Ubiquitin components: {ubi_count}', "size": ubi_count * block}, 'classes': 'golgi enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'enriched-glyco'}, 'classes': 'golgi-edge enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'enriched-phospha'}, 'classes': 'golgi-edge enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'enriched-kinase'}, 'classes': 'golgi-edge enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'ubi'}, 'classes': 'golgi-edge enriched'},
    ]
    for i in l:
        elements.append(i)
    add_individual_protein(glyco, "enriched-glyco", elements)
    add_individual_protein(phospha, "enriched-phospha", elements)
    add_individual_protein(kinases, "enriched-kinase", elements)
    add_individual_protein(ubi, "ubi", elements)


def add_groups_not_enriched(edf, elements):
    # Group nodes for proteins not annotated as Golgi
    edf = edf.sort_values(by="Difference", ascending=False)
    golgi = edf[(edf["Golgi"] != "+")]
    #golgi = edf[(edf["C: Golgi"] != "+")]
    golgi_count = len(golgi.index)
    print(golgi_count)
    glyco = golgi[golgi["Glycosylation"] == "+"]
    #glyco = golgi[golgi["Glycosylation genes"] == "+"]
    glyco_count = len(glyco.index)
    print(glyco_count)
    phospha = golgi[golgi["Phosphatases"] == "+"]
    phospha_count = len(phospha.index)
    kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark.kinase"] == "+")]
    #kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark Kinases"] == "+")]
    kinases_count = len(kinases.index)
    ubi = golgi[golgi["Ub.Pathway"] == "+"]
    ubi_count = len(ubi.index)
    l = [
        {'data': {'id': 'non-enriched-golgi', 'label': f'Non-golgi: {golgi_count}', "size": golgi_count * block}, 'classes': 'not-golgi not-enriched'},
        #{'data': {'source': 'significant', 'target': 'non-enriched-golgi'}, 'classes': 'not-golgi significant-edge'},
        {'data': {'id': 'non-enriched-glyco', 'label': f'Glycosylation genes: {glyco_count}', "size": glyco_count * block}, 'classes': 'not-golgi not-enriched'},
        {'data': {'id': 'non-enriched-phospha', 'label': f'Phosphatases: {phospha_count}', "size": phospha_count * block}, 'classes': 'not-golgi not-enriched'},
        {'data': {'id': 'non-enriched-kinase', 'label': f'Kinases: {kinases_count}', "size": kinases_count * block}, 'classes': 'not-golgi not-enriched'},
        {'data': {'id': 'non-enriched-ubi', 'label': f'Ubiquitin components: {ubi_count}', "size": ubi_count * block}, 'classes': 'not-golgi not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-glyco'}, 'classes': 'not-golgi-edge not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-phospha'}, 'classes': 'not-golgi-edge not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-kinase'}, 'classes': 'not-golgi-edge not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-ubi'}, 'classes': 'not-golgi-edge not-enriched'},
    ]
    for i in l:
        elements.append(i)
    add_individual_protein(glyco, "non-enriched-glyco", elements)
    add_individual_protein(phospha, "non-enriched-phospha", elements)
    add_individual_protein(kinases, "non-enriched-kinase", elements)
    add_individual_protein(ubi, "non-enriched-ubi", elements)


# Scaling factor from protein count to node size
block = 0.2

#df = pd.read_csv(r"C:\Users\toanp\Downloads\All enriched_For Network.txt", sep="\t")
#df = pd.read_csv(r"C:\Users\toanp\Downloads\GT-IP_Mock-IP_tTest.txt", sep="\t")
df = pd.read_csv(r"C:\Users\toanp\Downloads\GT-IP_WCL_tTest.txt", sep="\t")

# Keep significant proteins with a difference of at least 1
df = df[(df["Significant"]=="+")&(df["Difference"] >= 1)]

elements = [
    #{'data': {'id': 'significant', 'label': f'Significant: {len(df.index)}', "size": len(df.index) * block}, 'classes': 'significant'},
]
add_groups_enriched(df, elements)
add_groups_not_enriched(df, elements)

app.layout = html.Div([
    cyto.Cytoscape(
        id='cytoscape',
        elements=elements,
        layout={'name': 'cose', 'idealEdgeLength': 20},
        style={'width': '2000px', 'height': '2000px'},
        stylesheet=[
            {'selector': '.significant', 'style': {'shape': 'ellipse', 'background-color': 'rgb(173, 218, 226)'}},
            {'selector': '.not-golgi', 'style': {'shape': 'ellipse', 'background-color': 'rgb(255, 154, 162)'}},
            {'selector': '.not-golgi-edge', 'style': {'curve-style': 'straight-triangle', "width": 5, 'line-color': 'rgb(255, 154, 162)'}},
            {'selector': '.golgi', 'style': {'shape': 'ellipse', 'background-color': 'rgb(255, 218, 193)'}},
            {'selector': '.golgi-edge', 'style': {'curve-style': 'straight-triangle', "width": 5, 'line-color': 'rgb(255, 218, 193)'}},
            {'selector': '.protein', 'style': {'shape': 'ellipse', 'background-color': 'data(color)', 'background-opacity': 'data(opacity)', 'line-color': 'black'}},
            {'selector': '.protein-edge', 'style': {'line-color': 'data(color)', 'opacity': 'data(opacity)'}},
            {'selector': 'node', 'style': {"content": "data(label)", "width": "data(size)", "height": "data(size)"}},
            {'selector': '.enriched', 'style': {'shape': 'ellipse', 'background-color': 'rgb(255, 154, 162)', 'line-color': 'rgb(255, 154, 162)'}},
            {'selector': '.not-enriched', 'style': {'shape': 'ellipse', 'background-color': 'rgb(255, 218, 193)', 'line-color': 'rgb(255, 218, 193)'}},
        ]
    ),
    html.Div([html.Button("as svg", id="btn-get-svg")])
])

print(elements)


@app.callback(
    Output('image-text', 'children'),
    Input('cytoscape', 'imageData'),
)
def put_image_string(data):
    return data


@app.callback(
    Output("cytoscape", "generateImage"),
    [
        Input("btn-get-svg", "n_clicks"),
    ])
def get_image(get_svg_clicks):
    # File type to output: 'svg', 'png', 'jpg', or 'jpeg' (alias of 'jpg')
    # 'store': Stores the image data in 'imageData' (only jpg/png are supported)
    # 'download': Downloads the image as a file with all data handling
    # 'both': Stores image data and downloads image as file.
    ctx = dash.callback_context
    if ctx.triggered:
        input_id = ctx.triggered[0]["prop_id"].split(".")[0]
        if input_id != "tabs":
            action = "download"
            ftype = input_id.split("-")[-1]
            return {
                'type': 'svg',
                'action': 'download'
            }
    return {
        'type': 'png',
        'action': 'store'
    }


if __name__ == "__main__":
    app.run_server(debug=True)
