Open Babel
An open chemical toolbox
Noel M. O’Boyle
Open Babel development team and NextMove Software, Cambridge, UK
EMBL-EBI May 2016
MIOSS – Molecular Informatics Open-Source Software
J. Cheminf. 2011, 3, 33.
http://openbabel.org
Image credit: AJ Cann (AJC1 on Flickr)
File format A
Image credit: Jon Osborne (jonno101101 on Flickr)
File format B
What is Open Babel?
• A programming library in C++
– With access from Perl, Python, Java, Ruby, .NET/Mono,
R, PHP
• A set of command-line applications
– Most famously obabel for interconverting chemical file formats
• A graphical user interface for interconverting chemical file
formats
• Available on Win/Mac/Lin, through
conda/pip/brew/apt/yum/dnf, or from http://openbabel.org
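As a rough illustration of the command-line route, the sketch below builds an obabel invocation from Python and runs it only if obabel is found on the PATH. The file names are hypothetical, and the helper functions are invented for this example; they are not part of Open Babel. It assumes the usual `obabel <infile> -O <outfile>` form, where the formats are inferred from the file extensions.

```python
import shutil
import subprocess


def build_obabel_command(infile, outfile, extra_options=()):
    """Return the argv list for converting infile to outfile with obabel."""
    # -O names the output file; formats are inferred from the extensions.
    return ["obabel", infile, "-O", outfile, *extra_options]


def convert(infile, outfile, extra_options=()):
    """Run the conversion if obabel is installed; return True on success."""
    if shutil.which("obabel") is None:
        return False  # obabel is not on PATH, so nothing is run
    cmd = build_obabel_command(infile, outfile, extra_options)
    return subprocess.run(cmd).returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the command shape.
    print(" ".join(build_obabel_command("ligands.sdf", "ligands.smi")))
```

Extra options such as `--gen3d` (generate 3D coordinates) can be passed through `extra_options`.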
History
Sources: Andrew Dalke
http://www.dalkescientific.com/writings/diary/archive/2004/01/03/available_toolkits.html, Roger Sayle
• 1992
– Matt Stahl and Pat Walters wrote Babel (an open source
molecule converter) at the University of Arizona
• 1999
– Matt joined OpenEye Scientific and based their cheminformatics
library OELib on Babel – this was also open source
• 2001
– OpenEye decided to rewrite their cheminformatics library as a
proprietary library, OEChem
– OELib was renamed to Open Babel, and continued as a
community project led by Geoff Hutchison
• 2002 (Dec)
– First release (1.0)
Features
• Multiple chemical file formats (+ options) and utility
formats
• 2D coordinate generation and depiction (PNG and SVG)
• 3D coordinate generation, forcefield minimisation,
conformer generation
• Binary fingerprints (path-based, substructure-based) and
associated “fast search” database
• Bond perception, aromaticity detection and atom-typing
• Canonical labelling, automorphisms, alignment
• Materials science: computational chemistry, molecular
dynamics, crystal structures
• Charge models: MMFF, Gasteiger, EEM, (E)QEq, QTPIE
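To make the fingerprint bullet more concrete, here is a minimal sketch of how binary fingerprints are commonly compared, using the Tanimoto coefficient on sets of "on" bit positions. This is a generic illustration under stated assumptions: the slide does not name the similarity measure, the bit sets below are made up, and nothing here reflects how Open Babel's fast search index is actually implemented.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def search(query_fp, database, threshold=0.7):
    """Return (name, similarity) pairs at or above threshold, best first."""
    hits = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Made-up bit sets standing in for real fingerprints.
    db = {"mol_x": {1, 2, 3}, "mol_y": {2, 3, 4}, "mol_z": {9}}
    print(search({1, 2, 3}, db, threshold=0.5))
```

A linear scan like this is fine for small sets; an index is what makes such searches fast on large databases.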
Known Usage
• 45K downloads (from SF) in last 12 months
– 1.2K downloads of Windows Python bindings
• Paper published in 2011
– 984 citations (Google Scholar)
• Pybel paper published in 2008
– 117 citations
https://github.com/Magnusnorrby/MolecularRift
https://twitter.com/AstraZeneca/status/730775739264536576
Molecular Rift (as used by the King of Sweden) uses Open
Babel
Norrby, Grebner, Eriksson, Boström. J. Chem. Inf. Model., 2015, 55, 2475
Measuring the project’s pulse
• Oct 2012 – Last release and move to GitHub
– 112 “forks” on GitHub
– Commits from 59 developers (12 drive-by, 41 in the
last year)
• 37 pull requests since the start of the year
• 52 emails to the general mailing list this year
– Of these, 45 were replied to at least once
Contributors per month
Most committed developers in last 12 months
• Geoff Hutchison
– Professor, materials chemistry, Uni Pitt, Avogadro
• Dmitriy Fomichev
– PhD student, comp chemistry, Lobachevsky Uni, Russia
• Alexandr Fonari
– Assoc developer, Schrödinger, materials science, NWChem,
Quantum Espresso
• David van der Spoel
– Prof, Cell and Mol Biol, Uppsala Uni, Gromacs
• David Koes
– Assistant Prof, Comp and Sys Biology, Uni Pittsburgh,
3DMol.js, pharmit, pharmer
• Jeff Janes
– PI, Calibr (California Institute for Biomed Res), PostgreSQL
Chemistry file formats
• Chemists love inventing new file formats
• Every new chemistry application has its own file format
– Some exceptions: e.g. Avogadro
– De facto standards such as Daylight SMILES and
MDL/Symyx/Accelrys/Biovia/Dassault MOL
• The ability to read and interconvert chemical file formats is
important, both for scientific and economic reasons
– To unlock chemical data for analysis
– To avoid vendor lock-in
– To develop workflows/pipelines
Formats: most recent additions
• Siesta [read]
– ab initio molecular dynamics
• STL [write]
– (STereoLithography) 3D
printing
• Point cloud format [write]
– Write VdW surface as points
• AOForce [read]
– Turbomole vibrational freqs
• MDFF [read/write]
– MD fitting to density maps
• EXYZ [read/write]
– Extended XYZ
git log --pretty=oneline --name-status | grep "^A" | grep src/formats | grep -v inchi | grep -v libxml | less
• Orca [read/write]
– QM package
• JSON formats [read/write]
– ChemDoodle JSON
– PubChem JSON
• Confab report [write]
– Conformation generation
• Dalton [read]
– QM package
• LPMD [read/write]
– MD with interatomic potentials
• Smiley [read]
– Validating SMILES parser
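The git log pipeline shown on this slide keeps the "added file" lines under src/formats and drops anything mentioning inchi or libxml. The same filtering can be sketched in Python on captured log text. The function name and the sample paths in the usage are hypothetical; only the filter logic mirrors the command.

```python
def added_format_files(log_text, exclude=("inchi", "libxml")):
    """Mimic: grep "^A" | grep src/formats | grep -v inchi | grep -v libxml."""
    added = []
    for line in log_text.splitlines():
        if not line.startswith("A"):
            continue  # keep only "added file" status lines
        if "src/formats" not in line:
            continue  # keep only files under the formats directory
        if any(word in line for word in exclude):
            continue  # drop bundled third-party code
        # name-status lines look like "A<TAB>path"; keep just the path
        added.append(line.split("\t", 1)[-1])
    return added


if __name__ == "__main__":
    # Hypothetical log excerpt in the shape git prints with --name-status.
    sample = (
        "abc123 Add a new format\n"
        "A\tsrc/formats/exampleformat.cpp\n"
        "M\tsrc/mol.cpp\n"
        "A\tsrc/formats/libinchi/x.c\n"
    )
    print(added_format_files(sample))
```

Since git log lists newest commits first, the most recent additions come out at the top.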
Consider rolling your own plugins
• The Open Babel library itself is fairly compact and
much of the functionality is implemented as plugins
– File formats, descriptors, fingerprints, and arbitrary
operations that take molecules and do something
• Relatively straightforward to add your own plugins,
even if you have never programmed in C++ before
– Easier to add a plugin than write your own C++ application
– Can use the obabel command-line to call it
– Can optionally donate the plugin to the community
• Almost anything can be a plugin
– I have written an entire conformation generator as a plugin
(Confab)
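The plugin idea can be pictured as a registry that maps a plugin kind and ID to a piece of code, which a front end then looks up by name. The sketch below is only an analogy in Python: real Open Babel plugins are C++ classes, and every name here (`PLUGINS`, `register`, `run`, the toy plugins) is hypothetical.

```python
PLUGINS = {}  # (kind, plugin_id) -> callable


def register(kind, plugin_id):
    """Decorator that files a function under a plugin kind and ID."""
    def decorator(func):
        PLUGINS[(kind, plugin_id)] = func
        return func
    return decorator


@register("descriptor", "atomcount")
def atom_count(molecule):
    """Toy descriptor: a molecule here is just a list of element symbols."""
    return len(molecule)


@register("op", "strip_h")
def strip_hydrogens(molecule):
    """Toy operation: return the molecule without hydrogen atoms."""
    return [atom for atom in molecule if atom != "H"]


def run(kind, plugin_id, molecule):
    """Look up a plugin by kind and ID, as a front end would, and call it."""
    try:
        plugin = PLUGINS[(kind, plugin_id)]
    except KeyError:
        raise ValueError(f"no {kind} plugin named {plugin_id!r}") from None
    return plugin(molecule)


if __name__ == "__main__":
    methane = ["C", "H", "H", "H", "H"]
    print(run("descriptor", "atomcount", methane))
    print(run("op", "strip_h", methane))
```

The point of the design is that adding a new capability means registering one more entry, without touching the core or the front end.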
The GPL and industry
• Companies can use or modify Open Babel, add
plugins, and write their own code using it without any
problem
• If they distribute the resulting software outside the
company then they need to provide the source code
under the GPL
– This clause really only affects software companies
developing their own products, not end users in companies
Industry involvement
Code
• OpenEye
• eMolecules
• Silicos-IT
• Kitware
• Dalke Scientific
• Acpharis
• Astex
• Materials Design
• Schrödinger
• Vernalis
Emails to list
• Acellera
• AMRI
• ArQule
• Avant-garde materials sim
• Avesthagen
• Basilea
• Bayer
• Cambridgesoft
• Constellation Pharma
• Culgi
• Digital Chemistry
• Evotec
• Givaudan
• Global Phasing
• GreenPharma
• Inhibox
• Ingenuity
• Invitrogen (now ThermoFisher)
• Jubilant Biosys
• Lexicon
• Ligon Discovery
• LHASA
• Merck(.de)
• Molplex
• OmegaChem
• PeakDale
• Prometic
• PsychoGenics
• Specs
• Symyx/Accelrys
• Syngenta
• Takasago
• Targacept
• Thomson Reuters
Note: based on email addresses
Supporting open source
• When emailing a list, please give your affiliation
– It’s nice to know companies find it useful
• Spread the word, give credit in talks
• Give feedback
– What we’re doing right/wrong
– Can help reorder our priorities/reality check
• Bug bounty?
Future outlook
• Dude, there’s a plan??
• New features are driven by needs/interests of individuals
– Research interests
– Gaps in functionality
– Features needed ‘downstream’ by software using the library
• Avogadro is driving improved support for QM/MD
packages
• Generation of 3D structures based on distance geometry
• Housekeeping: Kekulization rewrite, implicit valency
• Improved performance? Has historically been low on the
agenda.
• Would be nice to have meetings like RDKit does
• What do *you* think we should be focusing on?
ASCII Depiction
A cry for help
Like mailing lists?
openbabel-discuss@lists.sf.net
Like forums?
http://forums.openbabel.org
Like to email a developer
directly?
Step away from the keyboard
:-)
Don’t forget to read the
docs first and Google it
http://openbabel.org/docs
Image: Tintin44 (Flickr)

Formation of low mass protostars and their circumstellar disks
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 

Open Babel project overview

  • 1. Open Babel Noel M. O’Boyle An open chemical toolbox Open Babel development team and NextMove Software, Cambridge, UK EMBL-EBI May 2016 MIOSS – Molecular Informatics Open-Source Software J. Cheminf. 2011, 3, 33. http://openbabel.org
  • 2. Image credit: AJ Cann (AJC1 on Flickr)
  • 3.
  • 4. File format A Image credit: Jon Osborne (jonno101101 on Flickr) File format B
  • 5. What is Open Babel? • A programming library in C++ – With access from Perl, Python, Java, Ruby, .NET/Mono, R, PHP • A set of command-line applications – Most famously obabel for interconverting chemical file formats • A graphical user interface for interconverting chemical file formats • Available on Win/Mac/Lin, through conda/pip/brew/apt/yum/dnf, or from http://openbabel.org
  • 6. History Sources: Andrew Dalke http://www.dalkescientific.com/writings/diary/archive/2004/01/03/available_toolkits.html, Roger Sayle • 1992 – Matt Stahl and Pat Walters wrote Babel (an open source molecule converter) at the University of Arizona • 1999 – Matt joined OpenEye Scientific and based their cheminformatics library OELib on Babel – this was also open source • 2001 – OpenEye decided to rewrite their cheminformatics library as a proprietary library, OEChem – OELib was renamed to Open Babel, and continued as a community project led by Geoff Hutchison • 2002 (Dec) – First release (1.0)
  • 7. Features • Multiple chemical file formats (+ options) and utility formats • 2D coordinate generation and depiction (PNG and SVG) • 3D coordinate generation, forcefield minimisation, conformer generation • Binary fingerprints (path-based, substructure-based) and associated “fast search” database • Bond perception, aromaticity detection and atom-typing • Canonical labelling, automorphisms, alignment • Materials science: computational chemistry, molecular dynamics, crystal structures • Charge models: MMFF, Gasteiger, EEM, (E)QEq, QTPIE
  • 8.
  • 9. Known Usage • 45K downloads (from SF) in last 12 months – 1.2K downloads of Windows Python bindings • Paper published in 2011 – 984 citations (Google Scholar) • Pybel paper published in 2008 – 117 citations
  • 10.
  • 11.
  • 12. https://github.com/Magnusnorrby/MolecularRift https://twitter.com/AstraZeneca/status/730775739264536576 Molecular Rift (as used by the King of Sweden) uses Open Babel Norrby, Grebner, Eriksson, Boström. J. Chem. Inf. Model., 2015, 55, 2475
  • 13. Measuring the project’s pulse • Oct 2012 – Last release and move to GitHub – 112 “forks” on GitHub – Commits from 59 developers (12 drive-by, 41 in the last year) • 37 pull requests since the start of the year • 52 emails to the general mailing list this year – Of these, 45 were replied to at least once (Chart: contributors per month)
  • 14. Most committed developers in last 12 months • Geoff Hutchison – Professor, materials chemistry, Uni Pitt, Avogadro • Dmitriy Fomichev – PhD student, comp chemistry, Lobachevsky Uni, Russia • Alexandr Fonari – Assoc developer, Schrödinger, materials science, NWChem, Quantum Espresso • David van der Spoel – Prof, Cell and Mol Biol, Uppsala Uni, Gromacs • David Koes – Assistant Prof, Comp and Sys Biology, Uni Pittsburgh, 3DMol.js, pharmit, pharmer • Jeff Janes – PI, Calibr (California Institute for Biomed Res), PostgreSQL
  • 15. Chemistry file formats • Chemists love inventing new file formats • Every new chemistry application has its own file format – Some exceptions: e.g. Avogadro – De facto standards such as Daylight SMILES and MDL/Symyx/Accelrys/Biovia/Dassault MOL • The ability to read and interconvert chemical file formats is important, both for scientific and economic reasons – To unlock chemical data for analysis – To avoid vendor lock-in – To develop workflows/pipelines
  • 17. Formats: most recent additions • Siesta [read] – ab initio molecular dynamics • STL [write] – (STereoLithography) 3D printing • Point cloud format [write] – Write VdW surface as points • AOForce [read] – Turbomole vibrational freqs • MDFF [read/write] – MD fitting to density maps • EXYZ [read/write] – Extended XYZ git log --pretty=oneline --name-status | grep "^A" | grep src/formats | grep -v inchi | grep -v libxml | less • Orca [read/write] – QM package • JSON formats [read/write] – ChemDoodle JSON – PubChem JSON • Confab report [write] – Conformation generation • Dalton [read] – QM package • LPMD [read/write] – MD with interatomic potentials • Smiley [read] – Validating SMILES parser
  • 18. Consider rolling your own plugins • The Open Babel library itself is fairly compact and much of the functionality is implemented as plugins – File formats, descriptors, fingerprints, and arbitrary operations that take molecules and do something • Relatively straightforward to add your own plugins, even if you have never programmed in C++ before – Easier to add a plugin than write your own C++ application – Can use the obabel command-line to call it – Can optionally donate the plugin to the community • Almost anything can be a plugin – I have written an entire conformation generator as a plugin (Confab)
  • 19. The GPL and industry • Companies can use or modify Open Babel, add plugins, and write their own code using it without any problem • If they distribute the resulting software outside the company then they need to provide the source code under the GPL – This clause really only affects software companies developing their own products, not end users in companies
  • 20. Industry involvement (note: based on email addresses) Code: • OpenEye • eMolecules • Silicos-IT • Kitware • Dalke Scientific • Acpharis • Astex • Materials Design • Schrödinger • Vernalis Emails to list: • Acellera • AMRI • ArQule • Avant-garde materials sim • Avesthagen • Basilea • Bayer • Cambridgesoft • Constellation Pharma • Culgi • Digital Chemistry • Evotec • Givaudan • Global Phasing • GreenPharma • Inhibox • Ingenuity • Invitrogen (now ThermoFisher) • Jubilant Biosys • Lexicon • Ligon Discovery • LHASA • Merck(.de) • Molplex • OmegaChem • PeakDale • Prometic • PsychoGenics • Specs • Symyx/Accelrys • Syngenta • Takasago • Targacept • Thomson Reuters
  • 21. Supporting open source • When emailing a list, please give your affiliation – It’s nice to know companies find it useful • Spread the word, give credit in talks • Give feedback – What we’re doing right/wrong – Can help reorder our priorities/reality check • Bug bounty?
  • 22. Future outlook • Dude, there’s a plan?? • New features are driven by needs/interests of individuals – Research interests – Gaps in functionality – Features needed ‘downstream’ by software using the library • Avogadro is driving improved support for QM/MD packages • Generation of 3D structures based on distance geometry • Housekeeping: Kekulization rewrite, implicit valency • Improved performance? Has historically been low on the agenda. • Would be nice to have meetings like RDKit does • What do *you* think we should be focusing on?
  • 24. A cry for help Like mailing lists? openbabel-discuss@lists.sf.net Like forums? http://forums.openbabel.org Like to email a developer directly? Step away from the keyboard :-) Don’t forget to read the docs first and Google it http://openbabel.org/docs Image: Tintin44 (Flickr)
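
The "Formats: most recent additions" slide lists its formats using a git log pipeline: keep added files ("^A"), keep paths under src/formats, and drop bundled inchi and libxml sources. The sketch below mirrors that filtering in Python on a small made-up log. The function name, the sample commits, and the file paths are hypothetical and chosen only for illustration; real input would be the output of `git log --pretty=oneline --name-status` run inside a checkout.

```python
def added_format_files(log_text):
    """Filter `git log --pretty=oneline --name-status` output.

    Mirrors the slide's pipeline: keep lines starting with "A" (added
    files), keep paths under src/formats, drop inchi and libxml entries.
    """
    added = []
    for line in log_text.splitlines():
        if not line.startswith("A"):          # grep "^A"
            continue
        if "src/formats" not in line:         # grep src/formats
            continue
        if "inchi" in line or "libxml" in line:  # grep -v inchi | grep -v libxml
            continue
        # name-status lines look like "A<TAB>path"; keep only the path
        added.append(line.split("\t", 1)[-1])
    return added


# Hypothetical sample log (invented hashes, subjects and paths).
SAMPLE_LOG = "\n".join([
    "1111111 Add a hypothetical EXYZ format",
    "A\tsrc/formats/exyzformat.cpp",
    "M\tsrc/formats/CMakeLists.txt",
    "2222222 Update bundled libraries",
    "A\tsrc/formats/libinchi/example.c",
    "A\tsrc/formats/xml/libxml_example.c",
    "3333333 Add docs",
    "A\tdoc/example.txt",
])

print(added_format_files(SAMPLE_LOG))
```

Because git log lists commits newest first, the returned paths come out in the same order, which is what makes the pipeline a quick way to see the most recent format additions.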

Editor's Notes

  1. OB is like a Swiss army knife, not a…
  2. …spork!
  3. “The 70s are calling. They want their depiction back.”