2. Focus of the two sessions:
The aim of this course is to provide an overview
of the applications, laboratory equipment and
online bioinformatic portals for metabolomics
research.
Plant Bias – all techniques transferable to
other organisms
3. What is the metabolome?
• Total quantitative collection of chemical compounds
(metabolites) present in an organism eg. sugars,
amino acids, phenolics, lipids
• Thorough and unbiased assessment of all metabolites
within an organism
• Not proteins or peptides
• Highly complex
4. Complexity
• Physically and chemically complex
Large range of molecular weights (and sizes), from 10s to >1000 g per mol
Polar and non-polar metabolites
Volatiles
• Variation in number of known metabolites per species
Yeast Saccharomyces cerevisiae (584)
E. coli (436)
Plant kingdom (up to 200 000)
Human (2900)
• Very wide concentration range
mM to sub-pM
• Temporal changes - flux
6. Measuring many metabolites is nothing new, but…the scale of the analysis is
“Metabolomics” first appeared in the literature in 1998 (Fiehn et al. 2007. Metabolomics)
Rather than look at individual reactions to understand an organism (reductionist theory)
an attempt is made to measure the whole system (systems biology)
8. Why study the metabolome?
•Need to understand metabolites and metabolic pathways before we can exploit them
•Metabolic status of cells provides a clearer indication of health than mRNA or proteins
•Advance systems biology
Trait development in crops
eg, salt and drought tolerance;
defence; photoprotection
Genetic engineering
safety – substantial equivalence
High value products – cosmetics; medicine
Biofuels
Plant disease biomarkers
Plant population / evolutionary studies
Applications
Biomarkers for:
Disease
drug intervention
environmental stress
Nutrigenomics
Personal health assessments
Personalised medicine
Metabolic engineering
9. Examples of early applications:
Diagnosis of coronary heart disease using metabonomics
Brindle, J.T. et al.(2002). Rapid and noninvasive diagnosis of the presence and
severity of coronary heart disease using 1H-NMR-based metabonomics.
Nature Medicine, 8, 1439-1444.
Metabolite profiling for plant functional genomics
Arabidopsis thaliana – model species. Quantified and identified many metabolites
and related different genotypes to their metabolic profiles (by GC-MS)
Fiehn et al.(2000). Metabolite profiling for plant functional genomics. Nature
Biotechnology, 18, 1157-1161.
10. Natural selection (traits related to fitness eg, survival, reproduction)
Local adaptation (variation in traits)
Why is this application important in natural systems?
Plants
Abiotic
Biotic
Grow
(morphological traits)
Function
(biochemical traits)
Very little is known about variation in such metabolic traits – how do we measure them?
Time
Population
spread
11. Dunn et al. 2005
Techniques used in measuring the metabolome
12. Sample collection
STOP metabolism
• Cold methanol
• Hot Ethanol or Methanol
• Freeze clamping for plants
• Liquid nitrogen (-196 °C)
• Spike to check recovery
How to get metabolites out of the cell
Solvent extraction and storage
• Methanol (hot or cold)
• Methanol/chloroform/water
• Hot ethanol
• Ball milling or grinding with mortar/pestle
• Store at -80 °C
13. Mass Spectrometry (MS)
Nuclear Magnetic Resonance (NMR)
Fourier Transform Infra Red (FT-IR)
Detecting metabolites – Metabolic Fingerprinting
High throughput screening
for metabolic phenotypes
Direct injection of crude
extract
14. Davey et al. 2008
Arabidopsis petraea Wales
Arabidopsis petraea Sweden
Metabolite fingerprinting
15. High Performance Liquid Chromatography (HPLC) – Photodiode array (PDA) – Mass
spectrometry (MS)
Detecting metabolites – Metabolic Profiling
HPLC – PDA – MS – MS/MS
Example: Cyanidin 3-(3-malonyl glucoside), ions observed (m/z):
534.90 (intact ion)
448.9 (-malonyl)
287.04 (-hexose and malonyl)
Vignolini, S., Davey, M.P., Tratt, J., Malmgren, S., Bateman, R., Rudall, P., Steiner, U. and Glover, B.J.
The mirror crack’d: the intense blue colour of Ophrys speculum is produced by both
chemical and structural means. New Phytologist, in press.
18. Multivariate Data Analysis
Unsupervised
Principal Component Analysis (PCA)
Supervised
Partial Least Squares - Discriminant Analysis (PLS-DA)
[Figure: PCA scores plot, PC1 (30%) against PC2 (22%), both axes from -8 to 8; samples labelled by country: Iceland (IC), Ireland (IR), Norway (NO), Scotland (SC), Sweden (SW), Wales (WA)]
Hierarchical Cluster Analysis (HCA)
[Figure: HCA dendrogram, distance scale 0 to 25; clusters labelled Sweden; Sco,Wal; Ire,Nor; Norway; Iceland]
Trygg et al. 2007
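A minimal sketch of what an unsupervised PCA does with a metabolite table, assuming rows are samples and columns are metabolite intensities. It is not the SIMCA-P procedure used for the figure: the function name, the power-iteration approach and the example intensities are all illustrative assumptions, and only the first component (PC1) is extracted.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, explained variance fraction, scores) for PC1.

    rows: list of samples, each a list of metabolite intensities.
    Assumes at least two samples and some variance in the data.
    """
    n, p = len(rows), len(rows[0])
    # Mean-centre every metabolite (column).
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication converges towards the
    # eigenvector with the largest eigenvalue, i.e. the PC1 loadings.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    explained = eigenvalue / sum(cov[i][i] for i in range(p))
    # Scores: projection of each centred sample onto the loadings.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, explained, scores

# Hypothetical intensities for three metabolites in four samples.
table = [[10.0, 2.0, 5.0], [11.0, 2.5, 5.2], [20.0, 6.0, 4.9], [21.0, 6.4, 5.1]]
loadings, explained, scores = first_principal_component(table)
print(loadings, explained, scores)
```

In practice the data are usually scaled before PCA and a statistics package is used; the sketch only shows where the "PC1 (30%)" style of axis label comes from (the explained variance fraction).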
19. Metabolite matches and mapping based on mass matching
Brown et al. 2009
Updated yearly in Nucleic Acids Research
Galperin and Fernández-Suárez 2012
Large number of online databases for
metabolite and network mapping
KEGG is the most widely used/known
site for metabolic mapping
– there are errors but getting better!
Another common site is MAPMAN
Eg. Search mass 504.159
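A minimal sketch of mass matching, assuming the search mass is compared against monoisotopic masses calculated from molecular formulae. The three candidate sugars and the tolerance values are illustrative assumptions; the slides do not say which compound 504.159 corresponds to, or whether it is a neutral mass or an ion m/z.

```python
# Monoisotopic masses of the most abundant isotopes (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462}

# Illustrative candidate list (not a real database).
CANDIDATES = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "sucrose": {"C": 12, "H": 22, "O": 11},
    "raffinose": {"C": 18, "H": 32, "O": 16},
}

def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in a formula dict."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

def match_mass(query_mass, tolerance_ppm):
    """Return (name, candidate mass, error in ppm) for candidates within tolerance."""
    hits = []
    for name, formula in CANDIDATES.items():
        mass = monoisotopic_mass(formula)
        error_ppm = (query_mass - mass) / mass * 1e6
        if abs(error_ppm) <= tolerance_ppm:
            hits.append((name, round(mass, 4), round(error_ppm, 1)))
    return hits

print(match_mass(504.159, tolerance_ppm=30))  # wide tolerance
print(match_mass(504.159, tolerance_ppm=5))   # narrow tolerance
```

The choice of tolerance decides how many candidates come back, which is one reason mass matching alone gives putative rather than confirmed identifications.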
24. Summary
• Metabolomics – logical progression of genomic and post-genomic science
• Diverse range of applications – especially trait identification
•Range of fingerprinting and profiling techniques
•Large datasets require multivariate statistics
•Large number of online databases for metabolite identification and mapping
25. Introduction to Metabolomics
Overview of techniques
Targeted and non-targeted metabolomics
(metabolite extraction procedures, equipment
GC-MS, HPLC-PDA-MS)
27. Experimental design: need to consider…
Organism
•Purity of sample (eg,
is it contaminated
with bacteria, fungus)
•How much sample
(weight) do you need
for analysis?
1-10mg MS
50-100mg NMR
•How many samples
do you need for
correct biological
interpretation?
•How many samples
do you have access
to?
•Cellular
compartments -
whole cell, ER, mito?
How are you going to
stop (quench)
metabolism within
seconds ?
•Location – lab, field,
hospital ward
•How are you going to
extract the metabolites?
Which techniques do
you have available for
analysis?
MS, HPLC, NMR, GC
How are you going to
interpret the data –
uni- or multivariate
statistics?
Will this answer
your
hypothesis?
29. Extractions – chemical and physical properties
1: Molecular weight = the sum of weights of all atoms making the molecule,
H2O = 18 (18 g per mol); lipids = >1000 g per mol
2: Molecular size = the 3D size of the structure, measured as Å
3: *Polarity* = differences in electronegativity:
Polar (large difference in positive and negative charges) (hydrophilic)
non(a)-polar compounds (no or little difference in charge) (hydrophobic)
more O and H = more polar; more N = less polar
4: Volatility = depends on boiling and melting point – liquid to gas phase
(more polar = less volatile)
5: *Solubility* = related to polarity, temperature and size (like dissolves like)
To dissolve - particles need to separate and fit between the solvent spaces
Eg, in polar metabolites, a positive end of a metabolite attaches to a negative end of
solvent – cannot happen if a positive charge has no negative charge to attach to
6: Stability = thermal or oxidative instability
30. Quenching - Stop enzymatic metabolism
• Turnover rate is fast: reaction half lives < 1s
glucose to glucose-6-phosphate 0.3 to 1 mM per s
ATP used at a rate of 1.5mM per s
• Cold (< -40 °C) or hot (> 80 °C)
• Acid (pH < 2.0) or alkaline (pH > 10)
• Hot or cold ethanol/methanol; liquid nitrogen
• Perchloric acid/sodium hydroxide; cold NaCl
• Freeze dry
• Once stopped – how do you extract the metabolites?
32. Extractions – quality control
ALWAYS validate methodologies
•Pool from representative samples after extraction
•Run the pooled sample at the start and end, and every 5 or 10 samples during data acquisition
•Observe technical reproducibility
•Spike (add) extract with known amount of non-interfering substance – can you
recover all of that spike after your analysis?
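The two quality-control ideas above can be sketched as follows, assuming a pooled QC sample is injected at the start, at the end and after every fifth sample, and that spike recovery is reported as a percentage. The function names, the label "QC_pool" and the example amounts are hypothetical.

```python
def run_order(samples, qc_every=5, qc_label="QC_pool"):
    """Build an injection sequence with pooled QC runs interleaved."""
    order = [qc_label]  # QC at the start
    for position, sample in enumerate(samples, start=1):
        order.append(sample)
        # QC after every `qc_every` samples, except directly before the final QC.
        if position % qc_every == 0 and position != len(samples):
            order.append(qc_label)
    order.append(qc_label)  # QC at the end
    return order

def spike_recovery_percent(measured_amount, added_amount):
    """Percentage of a known spike that is recovered after analysis."""
    return measured_amount / added_amount * 100.0

samples = [f"sample_{i}" for i in range(1, 8)]
print(run_order(samples))
print(spike_recovery_percent(measured_amount=9.2, added_amount=10.0))
```

Recovery well below 100% would suggest losses during extraction or analysis; what counts as acceptable depends on the method and is not specified here.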
33. Very common bi-phasic metabolite extraction procedure:
Solvent mixture A = MeOH/CHCl3/H2O, 2.5:1:1, v/v/v at –20 °C;
Solvent mixture B = MeOH/CHCl3, 1:1, v/v at –20 °C
Solvent C = deionised/distilled H2O at 4 °C
What next – need to analyse metabolites
34. Dunn et al. 2005
Techniques used in measuring the metabolome
35. Separating metabolites
Basics - Thin Layer Chromatography
Paper or Silica Gel
http://www.teachengineering.org
Aim is to separate (resolve) different metabolites in a mixture
Maximum number of peaks that can be resolved is called ‘peak capacity’
Can be increased by changing ratio of liquids/solvents or temperature
36. Separation - HPLC – PDAD
High Performance Liquid Chromatography
Photodiode Array Detection
Same principle as TLC but as particles are packed tight it needs pumps to push solvents
37. Separation - HPLC – PDAD
Normal phase liquid chromatography
-column is packed full of a polar compound (eg. alkyl nitrile)
-non-polar mobile phase such as hexane
-good for lipids
Reverse-phase liquid chromatography
-column is packed with a non-polar silica compound
(eg. C8 octylsilane or C18 octadecylsilane)
-polar mobile phases such as water/methanol/acetonitrile
-changes in pH, salts, solvent affect retention times
-good for phenolics, sugars, amino acids, drugs, pesticides
Columns are packed full of resin (STATIONARY PHASE)
Solvents flow into the column and around the resin (MOBILE PHASE)
38. Separation - HPLC – PDAD
Isocratic – same solvents ratios running through column
eg. 100% Methanol
Gradient - change in solvent ratios over time
eg. start at 80% acetonitrile, 20% of 1% formic acid
finish at 60% acetonitrile, 40% of 1% formic acid
over 20 minutes
Different metabolites will have different retention times
Now need to detect the metabolites coming off the column
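A minimal sketch of the gradient example above, assuming the change from 80% to 60% acetonitrile is linear over the 20 minutes (the slide does not state the gradient shape). The function name and defaults are illustrative.

```python
def percent_acetonitrile(t_min, start=80.0, end=60.0, duration_min=20.0):
    """Solvent composition at time t for a linear gradient.

    Times before 0 or after the end of the gradient are clamped,
    so the composition holds at the start or end value.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min

for t in (0, 5, 10, 20):
    organic = percent_acetonitrile(t)
    # The remainder is the aqueous phase (1% formic acid in the example).
    print(t, organic, 100.0 - organic)
```

An isocratic run is the special case where start and end are equal, so the composition never changes.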
39. Detection - HPLC – PDAD
•Absorption of electromagnetic radiation
•Intensity of light passing through a sample falls off
exponentially as it progresses through the sample
•Usually linear with concentration
Beer-Lambert law
•Typical photo diode array detector measures
absorbance from 260 to 800 nm
•What does a typical HPLC trace look like?
SPECTROPHOTOMETRY - Absorption of UV and visible light
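The Beer-Lambert law can be written as A = ε × c × l, where A is absorbance, ε the molar absorption coefficient, c the concentration and l the path length. The sketch below shows the linear relation with concentration and the exponential fall in transmitted light; the coefficient and concentration are made-up example values, not data for any real compound.

```python
def absorbance(epsilon, conc_mol_per_l, path_cm=1.0):
    """Beer-Lambert law: A = epsilon * c * l (linear in concentration)."""
    return epsilon * conc_mol_per_l * path_cm

def transmittance(a):
    """Fraction of light transmitted: T = 10 ** (-A)."""
    return 10 ** (-a)

def concentration(a, epsilon, path_cm=1.0):
    """Rearranged law: c = A / (epsilon * l)."""
    return a / (epsilon * path_cm)

epsilon = 20000.0  # hypothetical molar absorption coefficient, L per mol per cm
for conc in (1e-6, 1e-5, 5e-5):
    a = absorbance(epsilon, conc)
    print(conc, a, transmittance(a))
```

Doubling the concentration doubles the absorbance, which is why peak area in mAU can be used for quantification within the linear range of the detector.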
40. HPLC separation – set wavelength in UV range (284 nm)
[Figure: HPLC chromatogram, absorbance (mAU) against retention time (min); signal DAD1 A, Sig=287,4 Ref=off; peaks from 2.313 to 34.474 min, including one at 10.461 min]
[Figure: UV spectrum of the peak at 10.450 min, absorbance (mAU) against wavelength (260 to 380 nm)]
followed by UV-absorbance
Multi-scan wavelength
(200 to 500nm)
What does this UV spectrum
tell us about the compound?
Detection - HPLC – PDAD
Type of flavonoid?
41. Flavone – eg. Apigenin
Band I at 304-350 nm
Band I and II close
Flavonol – eg. Quercetin
Band I at 352-385 nm
Band I more defined and further away from band II
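As a sketch, the Band I ranges above can be turned into a first-pass lookup, assuming the Band I absorbance maximum has already been read from the PDA spectrum. The function name is hypothetical and the result is only a hint towards a flavonoid class, not an identification.

```python
def flavonoid_class_from_band_i(band_i_nm):
    """Suggest a flavonoid class from the Band I maximum (nm)."""
    if 304 <= band_i_nm <= 350:
        return "flavone"   # eg. apigenin
    if 352 <= band_i_nm <= 385:
        return "flavonol"  # eg. quercetin
    return "unassigned"    # outside the two ranges given on the slide

for wavelength in (336, 370, 280):
    print(wavelength, flavonoid_class_from_band_i(wavelength))
```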
42. Separation – GC-FID
Gas Chromatography – Flame Ionisation Detection
Same principle as HPLC but use GAS rather than Liquid to separate metabolites
chemwiki.ucdavis.edu
Typically Helium is the MOBILE phase
Non-polar Fused Silica is the STATIONARY phase
Hydrogen is used for flame
Column diameter 50-500 µm
10 – 100 metres long
43. Separation – GC-FID
Gas Chromatography – Flame Ionisation Detection
chemwiki.ucdavis.edu
Samples (about 1µL) are injected
into a hot (250 ºC) glass tube where
the sample is vaporised
Vapour goes into the column
Separation based on the difference
in partition coefficients between the
solid (or liquid) stationary phase and
the mobile gas phase
Increasing temperature biases
compounds to leave the stationary
phase and enter the gas phase
Many polar compounds are NOT volatile
Sugars, amino acids, organic acids
Derivatisation – make compounds volatile
By making them more apolar
Silylation is the most widely used technique,
replaces an acidic hydrogen with an alkylsilyl group
Eg. SiMe3, to form trimethylsilyl (TMS) derivatives (reagent: MSTFA)
44. Detection – GC-FID
Gas Chromatography – Flame Ionisation Detection
Mixture of Hydrogen and Air is used
for flame
The flame jet is one electrode
Another electrode near tip of flame
Voltage output
Connected to chart recorder or PC
When sample emerges from column
it ionises and increases signal
voltage
www.chromacademy.com
45. Gas Chromatography (GC) – Flame Ionisation Detector (FID)
Lipid profiling for biofuels
Triglycerides
Free fatty acids
Polar lipids
Increasing oven temperature
47. Mass Spectrometry (MS)
Detecting metabolites – Metabolic Fingerprinting
High throughput screening
for metabolic phenotypes
Direct injection of crude
extract
48. Arabidopsis petraea Wales
Arabidopsis petraea Sweden
Metabolite fingerprinting
A mass spectrometer is an instrument that measures the masses of molecules that have been
converted into ions, ie. have been electrically charged (positive or negative).
It measures mass over charge (m/z), not just the mass of the ions: mass/1 for singly charged or mass/2 for doubly charged ions
49. Identification – MS
Ionisation
Individual metabolites in the sample are
ionised and either become positively or
negatively charged.
Acceleration (Separation)
These ions are then accelerated so that
they all have the same amount of energy.
Deflection (Separation)
The ions are then deflected by a magnetic
field according to their masses. The lighter
and more charged they are, the more they
are deflected.
Detection
The ions passing through the machine are
detected electrically.
50. Ionisation– MS
Many different types:
Atmospheric Pressure Chemical Ionisation (APCI) (good for polar compounds)
Chemical Ionisation (CI)
Electron Impact (EI) (hard)
Electrospray Ionisation (ESI) (good for polar compounds) (soft)
Fast Atom Bombardment (FAB)
Matrix Assisted Laser Desorption Ionisation (MALDI)
http://www.astbury.leeds.ac.uk/facil/MStut/mstutorial.htm
51. Electrospray Ionisation– MS
http://www.astbury.leeds.ac.uk/facil/MStut/mstutorial.htm
•Sample is dissolved in a polar, volatile solvent and pumped through a narrow, stainless steel capillary
at a flow rate of usually <1 mL per min.
•A high voltage (3 or 4 kV) is applied to the tip of the capillary
•Sample emerging from the tip is dispersed into an aerosol of highly charged droplets, aided by a
nebulising gas (usually N2 or He) that directs the spray towards the MS.
•The charged droplets diminish in size by solvent evaporation, assisted by a warm flow of N2 (drying gas)
•Eventually charged sample ions, free from solvent, are released from the droplets, some of which pass
through a sampling cone and into the MS under high vacuum.
52. Positive or Negative Ionisation? – MS
http://www.astbury.leeds.ac.uk/facil/MStut/mstutorial.htm
If the sample has functional groups that readily accept a proton (H+) then positive
ion detection is used
e.g. amines R-NH2 + H+ → R-NH3+, as in proteins or peptides.
Positive ion mode: often by addition of H+, Na+, K+
Negative ion mode: often by loss of H+ or addition of Cl-
If the sample has functional groups that readily lose a proton then negative ion
detection is used
e.g. carboxylic acids R-CO2H → R-CO2- and alcohols R-OH → R-O-, as in saccharides or
oligonucleotides
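A minimal sketch of how the ions listed above relate to the neutral mass of a metabolite. The mass shifts are commonly tabulated values for singly charged adducts (they account for the electron lost or gained); the function names and the glucose example are illustrative assumptions.

```python
PROTON = 1.007276  # mass of H+ (u)

# Mass added to the neutral monoisotopic mass M for singly charged ions.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
    "[M+Cl]-": 34.969402,
}

def adduct_mz(neutral_mass, adduct):
    """m/z of a singly charged adduct (z = 1, so m/z equals the ion mass)."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]

def protonated_mz(neutral_mass, charge):
    """m/z of an ion carrying `charge` extra protons: (M + z*H+) / z."""
    return (neutral_mass + charge * PROTON) / charge

glucose = 180.06339  # monoisotopic mass of C6H12O6
for adduct in ADDUCT_SHIFTS:
    print(adduct, round(adduct_mz(glucose, adduct), 4))
print(protonated_mz(1000.0, 1), protonated_mz(1000.0, 2))
```

The last line illustrates mass/1 against mass/2: the same molecule appears at roughly half the m/z when it carries two charges.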
53. Detection– MS
Mass accuracy: The ppm (parts per million) mass accuracy is the relative error between
the measured and calculated mass for a particular ion, expressed in parts per million.
0.1% would be equivalent to 1000 ppm error.
Standard 5 ppm error is equivalent to 0.0005% - eg: for a mass 100 m/z the error
would be +/- 0.0005 and on mass 1000 the error would be +/- 0.005.
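A short sketch of the ppm calculation, assuming the error is expressed relative to the calculated (theoretical) mass. The function names and example masses are illustrative.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Mass error in parts per million, relative to the theoretical m/z."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def mass_window(theoretical_mz, tolerance_ppm):
    """Lower and upper m/z limits for a given ppm tolerance."""
    delta = theoretical_mz * tolerance_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta

# A 5 ppm window scales with the mass: narrower at m/z 100 than at m/z 1000.
print(mass_window(100.0, 5))
print(mass_window(1000.0, 5))
print(ppm_error(measured_mz=100.0005, theoretical_mz=100.0))
```

Because the tolerance scales with mass, the same ppm setting allows a larger absolute error for heavier ions.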
The detector monitors the ion current, amplifies it and the signal is then
transmitted to the data system where it is recorded in the form of mass
spectra .
The m/z values of the ions are plotted against their intensities to show the
number of components in the sample, the molecular mass of each
component, and the relative abundance of the various components in the
sample.
54. High Performance Liquid Chromatography (HPLC) – Photodiode array (PDA) – Mass
spectrometry (MS)
Detecting metabolites – Metabolic Profiling
HPLC – PDA – MS – MS/MS
Example: Cyanidin 3-(3-malonyl glucoside), ions observed (m/z):
534.90 (intact ion)
448.9 (-malonyl)
287.04 (-hexose and malonyl)
55. Metabolic Profiling - Gas Chromatography (GC) – Mass spectrometry (MS)
Unique fragmentation patterns
60. How to obtain metabolic, reaction, protein, transcript and gene data from your
identified metabolite
•Very few good websites for metabolic mapping
•The key sites are:
•Reactome
http://www.reactome.org/
Human Metabolome Database
http://www.hmdb.ca/
Biocyc
http://biocyc.org/
KEGG
http://www.genome.jp/kegg/kegg2.html
Plantcyc
http://plantcyc.org/
MetExplore
http://metexplore.toulouse.inra.fr/metexplore/
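Several of these sites can also be queried programmatically. As a hedged sketch, the example below builds a query for the KEGG REST service, which documents a `find` operation with an `exact_mass` option for the compound database. The helper names are hypothetical, the base URL is the REST host rather than the web page listed above, and the fetch function is defined but not called because it needs network access and is subject to KEGG's usage conditions.

```python
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"  # KEGG REST host (assumed reachable)

def kegg_exact_mass_url(mass):
    """URL for a compound search by exact mass, eg. 504.159."""
    return f"{KEGG_REST}/find/compound/{mass}/exact_mass"

def fetch_compounds_by_mass(mass, timeout=30):
    """Download matching entries; each non-empty line is split on tabs.

    Assumes the plain-text, tab-separated response described in the
    KEGG REST documentation.
    """
    with urlopen(kegg_exact_mass_url(mass), timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return [line.split("\t") for line in text.splitlines() if line]

print(kegg_exact_mass_url(504.159))
```

Any hits are candidates only; they still need to be checked against retention time, fragmentation or authentic standards.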
61. Human Metabolome Database – excellent pathway mapping – eg, Aspirin
62. Biocyc – collection of 2038 Pathway/Genome Databases
Many pathways derived from genome screen – in silico pathways
63. Biocyc – very comprehensive website for many species
64. Biocyc – Photosynthesis outline
65. Biocyc – pressing more detail adds enzyme (EC) and gene data (AGI codes)
66. Biocyc – metabolite structures can also be viewed
67. Biocyc – RuBisCO – fixes CO2 in plants
68. Biocyc – RuBisCO – enzyme – relation to genes and reactions
69. Biocyc – RuBisCO – gene location on chloroplast
70. Biocyc – RuBisCO – hyperlinks to other sites for gene and protein information
71. KEGG – TCA cycle – EC numbers for reactions – colour coded per species
72. KEGG – malate formation (EC 4.2.1.2)
73. Plantcyc – similar format to Biocyc
74. Plantcyc – metabolic mapping – colour code reactions from your datasets
75. Plantcyc – zoom into individual pathways
76. Plantcyc – hyperlinks back to original pathway information
77. MetExplore
78. MetExplore – Cytoscape plugins