SlideShare a Scribd company logo
1 of 34
Download to read offline
The Open Chemistry Project
       February 2, 2013
          FOSDEM

     Dr. Marcus D. Hanwell
     Technical Leader, Kitware, Inc.
     marcus.hanwell@kitware.com
     http://openchemistry.org/         1	
  
Outline
•  Kitware
•  History
•  Open Chemistry
  –  Avogadro
  –  MoleQueue
  –  MongoChem
•  Final thoughts


                    2	
  
Kitware
•  Founded in 1998 by 5 former GE Research employees
•  111 employees: 38 with PhDs
•  Privately held, profitable from creation, no debt
•  Rapidly Growing: >30% in 2011, 7M web-visitors/quarter
•  Offices                                 •  2011 Small Business
                                              Administration’s
   –  Albany, NY                              Tibbetts Award
   –  Carrboro, NC
                                           •  HPCWire Readers
   –  Santa Fe, NM                            and Editor’s Choice

   –  Lyon, France                         •  Inc’s 5000 List: 2008
                                              to 2011
Kitware: Core Technologies




                CMake




                CDash




                             4	
  
Beginnings of Open Chemistry
•  The Avogadro project began in 2006
•  One of very few open source 3D chemical editors
     •  Draw/edit structure
     •  Generate input for codes
     •  Analyze output of codes
•    Open source, GPLv2 GUI
•    Google Summer of Code in 2007 (KDE)
•    Used by Kalzium in KDE educational tool
•    Over 300,000 downloads, 20+ translations


                                                     5	
  
Avogadro Paper Published 8/13/12




         http://www.jcheminf.com/content/4/1/17
                                                  6	
  
Introduction to Open Chemistry
•  User-friendly integration with
  –  Computational codes
  –  HPC/cloud resources
  –  Database/informatics resources




                                      7	
  
Vision
•  Advancing the state-of-the-art
•  Tight integration is needed
   •    Computational codes
   •    Clusters/supercomputers
   •    Data repositories
   •    Reduce, reuse, recycle!
•  Facilitate sharing and
   searching of data
•  Embracing data-centric workflows

                                      8	
  
Overview
•  Desktop chemistry application suite
  –  3D structure editor, pre- and post-processing
  –  HPC integration – easily run codes
  –  Cheminformatics to store, index, and analyze
•  Each can work independently
  –  Enhanced functionality when used together
  –  One-click HPC job submission
  –  Easily open structure found in database
  –  Coordination of job submission
                                                     9	
  
Open Chemistry Project Approach
•  Open approach to chemistry software
  –  Open source frameworks
  –  Developed openly
  –  Cross platform
  –  Tested, verified
  –  Contribution model
  –  Supported by Kitware experts
•  BSD licensed to facilitate research/reuse

                                               10	
  
Open Chemistry Development Team

•  Assembled an inter-disciplinary team
•  Domain specialists: quantum chemistry,
   biology, solid-state materials
•  Computer scientists: build systems,
   queuing, graphics, process
•  Marcus, Kyle, David, and Chris.



                                            11	
  
OpenChemistry.org
•    Central site to promote Open Chemistry
•    Hosting of project-specific pages
•    Providing an identity for related projects
•    Promoting shared ownership of projects
     –  Website
     –  Code submission/review
     –  Testing infrastructure
     –  Wiki, mailing lists, news, galleries

                                                  12	
  
http://openchemistry.org/   13	
  
Applications Being Developed
•  Three independent applications
•  Communication handled with local sockets
•  Avogadro 2 – structure editing, input
   generation, output viewing and analysis
•  MoleQueue – running local and remote
   jobs in standalone programs, management
•  MongoChem – Storage of data, searching,
   entry, annotation
                                          14	
  
Open Frameworks
•  AvogadroLibs – core data structures and
   algorithms shared across codes
  •  Split into dedicated libraries, e.g. core, io,
     rendering, qtgui, qtopengl, qtplugins, quantum
  •  Core maintains a minimal dependency set
  •  Intended for use on server, command line,
     and in a full-blown desktop application
•  VTK – specialized chemistry visualization/
   data structures, use of above
                                                  15	
  
Project Diagram: Libraries/Apps




                                  16	
  
Workflow in Open Chemistry

       Log File                         Input File
                    Avogadro2	
  



  Results	
       MongoChem	
           Job	
  Submission	
  



                  MoleQueue	
        Local

                                     Remote
                    Calcula>on	
  

                                                            17	
  
Avogadro2
•  Rewrite of Avogadro
•  Split into libraries and
application (plugin based)
•  Still one of very few open source editors
•  Still using Qt, C++, Eigen, OpenGL, CMake
•  Use AvogadroLibs for core data
•  Introduce client-server dataflow/patterns
•  New, efficient rendering code
•  More liberally licensed – from GPL to BSD

                                               18	
  
Avogadro: Visualization
•    GPU accelerated rendering
•    VTK for advanced visualization
•    Support for 2D and 3D plots of data
•    Optimized data structures
     –  Large data
     –  Streaming
•  Reworked interface
     –  Tighter database/workflow integration

                                                19	
  
Advanced Impostor Rendering
•  Using a scene, vertex buffer objects, and
   OpenGL shading language
•  Impostor techniques
  –  Sphere goes from 100s of triangles to 2!
  –  No artifacts from triangulation
  –  Scales to millions of spheres on modest GPU




                                                   20	
  
Electronic Structure Visualization
•  Read quantum output files
  –  Calculate cubes for molecular orbitals
  –  Show isosurface or volume rendering
  –  Multithreaded C++ code to perform
     calculations – scales very well




                                              21	
  
Scriptable Simulation Input Generator

•  Previous input generators were C++
•  Execute a simple Python script
  –  Script can output JSON with parameters
  –  Input is parameters specified by user
  –  Chemical JSON with full structure
•  New input generator is as simple as
   adding a new Python script
  –  Implement 2-3 entry points and done

                                              22	
  
MoleQueue: Job Management
•  Tighter integration with remote queues
•  Integration with databases
  –  Retain full log of computational jobs
  –  Trigger actions on completion
•  Plugin based system
  –  Easy addition of new codes
  –  Easy addition of new queue systems
•  Provide client API for applications

                                             23	
  
MoleQueue
•  Support configuration for a variety of
   remote clusters and queuing software
MoleQueue: Queue Types
•  Several transports implemented
  –  Command line SSH/plink (Windows)
  –  libssh2 (experimental)
  –  HTTPS (SOAP)
•  Several queue types
  –  Sun Grid Engine
  –  PBS
  –  UIT (ezHPC with largely PBS dialect)

                                            25	
  
Using JSON
•  MongoDB stores data as BSON
    –  JSON: JavaScript Object Notation
    –  BSON: Binary form, type safe
•  JSON is very compact, standardized
{
    “name”: “water”,
    “atoms”: {
      “elementType”: [“H”, “H”, “O”],
    }
    “properties”: { “molecular weight”: 18.0153 }
}

                                                    26	
  
JSON-RPC interface
  Applications can submit jobs via a local
   socket or ZeroMQ connection:
Client request:
{ "jsonrpc": "2.0",
  "method": "submitJob",
  "params": {
     "queue": "Remote cluster PBS",
     "program": "MOPAC",
     "description": "PM6 H2 optimization",
     "inputAsString": "PM6nnH 0.0 0.0 0.0nH 1.0 0.0 0.0n"
  },
  "id": "XXX” }       Server reply:
                      { "jsonrpc": "2.0",
                         "result": {
                            "moleQueueId": 17,
                            "queueId": 123456,
                            "workingDirectory": "/tmp/MoleQueue/17/"
                         },
                         "id": "XXX” }
Chemical JSON
•  Stores molecular structure,
   geometry, identifiers, and
   descriptors as a JSON object
•  Benefits:
  –  More compact than XML/CML
  –  Native language of MongoDB
     and JSON-RPC
  –  Easily converted to a binary
     representation (BSON)



                                    28	
  
MongoChem Overview
•  A desktop cheminformatics tool
  –  Chemical data exploration and analysis
  –  Interactive, editable, and searchable database
•  Leverages several open-source projects
  –  Qt, VTK, MongoDB, Chemkit, Open Babel
•  Designed to look at many molecules
•  Spot patterns, outliers, run many jobs
•  Scaling studies with ~3 million structures
Architecture Overview
•  Native, cross-platform C++ application built with Qt
•  Stores chemical data in a NoSQL MongoDB database
•  Uses VTK for 2D and 3D dataset visualization




                                                          30	
  
ParaViewWeb and Open Chemistry




                             31	
  
Software Process
•  Source code publicly hosted using Git
•  Gerrit for code review
•  CTest/CDash for testing/summary
  –  Gerrit can use CDash@Home
     •  Test proposed changes before merge
•  CDash can now provide binaries
  –  Built nightly, available for direct download
•  Wiki, mailing list, bug tracker

                                                    32	
  
Software Process




                   33	
  
Final Thoughts
•    Real opportunity to make an impact
•    Bringing best practices to chemistry
•    Improve research, industry and teaching
•    Semantic data at the center of our work
     –  Storage
     –  Search
     –  Interaction with computational codes
     –  Comparison with experimental data
•  Liberal, BSD licensed, cross platform codes

                                                 34	
  

More Related Content

What's hot

Open Chemistry, JupyterLab and data: Reproducible quantum chemistry
Open Chemistry, JupyterLab and data: Reproducible quantum chemistryOpen Chemistry, JupyterLab and data: Reproducible quantum chemistry
Open Chemistry, JupyterLab and data: Reproducible quantum chemistryMarcus Hanwell
 
Evolution of database access technologies in Java-based software projects
Evolution of database access technologies in Java-based software projectsEvolution of database access technologies in Java-based software projects
Evolution of database access technologies in Java-based software projectsTom Mens
 
LDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationLDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationTanu Malik
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...University of California, San Diego
 
PTU: Using Provenance for Repeatability
PTU: Using Provenance for RepeatabilityPTU: Using Provenance for Repeatability
PTU: Using Provenance for RepeatabilityTanu Malik
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsTanu Malik
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanBoris Glavic
 
GlobusWorld 2015
GlobusWorld 2015GlobusWorld 2015
GlobusWorld 2015Tanu Malik
 
Tim Pugh-SPEDDEXES 2014
Tim Pugh-SPEDDEXES 2014Tim Pugh-SPEDDEXES 2014
Tim Pugh-SPEDDEXES 2014aceas13tern
 
Reproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookReproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookKeiichiro Ono
 
Open-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKitOpen-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKitGreg Landrum
 
Best pratices at BGI for the Challenges in the Era of Big Genomics Data
Best pratices at BGI for the Challenges in the Era of Big Genomics DataBest pratices at BGI for the Challenges in the Era of Big Genomics Data
Best pratices at BGI for the Challenges in the Era of Big Genomics DataXing Xu
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBencht_ivanov
 
GeoDataspace: Simplifying Data Management Tasks with Globus
GeoDataspace: Simplifying Data Management Tasks with GlobusGeoDataspace: Simplifying Data Management Tasks with Globus
GeoDataspace: Simplifying Data Management Tasks with GlobusTanu Malik
 
07 data structures_and_representations
07 data structures_and_representations07 data structures_and_representations
07 data structures_and_representationsMarco Quartulli
 
Some "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data frontSome "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data frontGreg Landrum
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 

What's hot (20)

Open Chemistry, JupyterLab and data: Reproducible quantum chemistry
Open Chemistry, JupyterLab and data: Reproducible quantum chemistryOpen Chemistry, JupyterLab and data: Reproducible quantum chemistry
Open Chemistry, JupyterLab and data: Reproducible quantum chemistry
 
Evolution of database access technologies in Java-based software projects
Evolution of database access technologies in Java-based software projectsEvolution of database access technologies in Java-based software projects
Evolution of database access technologies in Java-based software projects
 
LDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationLDV: Light-weight Database Virtualization
LDV: Light-weight Database Virtualization
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use Case
 
Bioinformatics on Azure
Bioinformatics on AzureBioinformatics on Azure
Bioinformatics on Azure
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
 
PTU: Using Provenance for Repeatability
PTU: Using Provenance for RepeatabilityPTU: Using Provenance for Repeatability
PTU: Using Provenance for Repeatability
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC Programs
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, Ian
 
GlobusWorld 2015
GlobusWorld 2015GlobusWorld 2015
GlobusWorld 2015
 
Tim Pugh-SPEDDEXES 2014
Tim Pugh-SPEDDEXES 2014Tim Pugh-SPEDDEXES 2014
Tim Pugh-SPEDDEXES 2014
 
Reproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookReproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter Notebook
 
io-Chem-BD, una solució per gestionar el Big Data en Química Computacional
io-Chem-BD, una solució per gestionar el Big Data en Química Computacionalio-Chem-BD, una solució per gestionar el Big Data en Química Computacional
io-Chem-BD, una solució per gestionar el Big Data en Química Computacional
 
Open-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKitOpen-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKit
 
Best pratices at BGI for the Challenges in the Era of Big Genomics Data
Best pratices at BGI for the Challenges in the Era of Big Genomics DataBest pratices at BGI for the Challenges in the Era of Big Genomics Data
Best pratices at BGI for the Challenges in the Era of Big Genomics Data
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
 
GeoDataspace: Simplifying Data Management Tasks with Globus
GeoDataspace: Simplifying Data Management Tasks with GlobusGeoDataspace: Simplifying Data Management Tasks with Globus
GeoDataspace: Simplifying Data Management Tasks with Globus
 
07 data structures_and_representations
07 data structures_and_representations07 data structures_and_representations
07 data structures_and_representations
 
Some "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data frontSome "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data front
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 

Viewers also liked

Chemistry project on casein in mik
Chemistry project on casein in mikChemistry project on casein in mik
Chemistry project on casein in mikVirat Prasad
 
chemistry Project class 12
chemistry Project class 12chemistry Project class 12
chemistry Project class 12Vighnesh Jm
 
Chemistry project on chemistry in everyday life
Chemistry project on chemistry in everyday lifeChemistry project on chemistry in everyday life
Chemistry project on chemistry in everyday lifeShashvat Sharma
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...MongoDB
 
Atom Economy - Comenius Project
Atom Economy - Comenius ProjectAtom Economy - Comenius Project
Atom Economy - Comenius Projectclasse4ach
 
Chemistry Investigatory Project Class 12
Chemistry Investigatory Project Class 12Chemistry Investigatory Project Class 12
Chemistry Investigatory Project Class 12Rushil Aggarwal
 
Copy Of Determination Of The Contents Of Cold Drinks
Copy Of Determination Of The Contents Of Cold DrinksCopy Of Determination Of The Contents Of Cold Drinks
Copy Of Determination Of The Contents Of Cold DrinksHimanshu Sagar
 
Atom economy - "Green Chemistry Project"
Atom economy - "Green Chemistry Project"Atom economy - "Green Chemistry Project"
Atom economy - "Green Chemistry Project"classe4ach
 
Peter Ward: The True Power of SharePoint Designer Workflows
Peter Ward: The True Power of SharePoint Designer WorkflowsPeter Ward: The True Power of SharePoint Designer Workflows
Peter Ward: The True Power of SharePoint Designer WorkflowsSharePoint Saturday NY
 
Winds of change: The shifting face of leadership in business
Winds of change: The shifting face of leadership in businessWinds of change: The shifting face of leadership in business
Winds of change: The shifting face of leadership in businessThe Economist Media Businesses
 
SharePoint Branding Guidance @ SharePoint Saturday Redmond
SharePoint Branding Guidance @ SharePoint Saturday RedmondSharePoint Branding Guidance @ SharePoint Saturday Redmond
SharePoint Branding Guidance @ SharePoint Saturday RedmondKanwal Khipple
 
Universidad nacional chimborazo
Universidad nacional chimborazoUniversidad nacional chimborazo
Universidad nacional chimborazoFernando Cáceres
 
Notice flexible retractable retraflex
Notice flexible retractable retraflexNotice flexible retractable retraflex
Notice flexible retractable retraflexHomexity
 
Monografía EXPOSICION EN EVALUACION
Monografía EXPOSICION EN EVALUACION Monografía EXPOSICION EN EVALUACION
Monografía EXPOSICION EN EVALUACION Banesa Ruiz
 
HAPPYWEEK 197 - 2016.12.05. happyweek 197
HAPPYWEEK 197 - 2016.12.05. happyweek 197HAPPYWEEK 197 - 2016.12.05. happyweek 197
HAPPYWEEK 197 - 2016.12.05. happyweek 197Jiří Černák
 
Presentación de socializacion aamtic
Presentación de socializacion aamticPresentación de socializacion aamtic
Presentación de socializacion aamticPolo Apolo
 

Viewers also liked (20)

Chemistry project on casein in mik
Chemistry project on casein in mikChemistry project on casein in mik
Chemistry project on casein in mik
 
chemistry Project class 12
chemistry Project class 12chemistry Project class 12
chemistry Project class 12
 
Chemistry project on chemistry in everyday life
Chemistry project on chemistry in everyday lifeChemistry project on chemistry in everyday life
Chemistry project on chemistry in everyday life
 
C7 lesson part four
C7 lesson part fourC7 lesson part four
C7 lesson part four
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...
 
Atom Economy - Comenius Project
Atom Economy - Comenius ProjectAtom Economy - Comenius Project
Atom Economy - Comenius Project
 
Chemistry Investigatory Project Class 12
Chemistry Investigatory Project Class 12Chemistry Investigatory Project Class 12
Chemistry Investigatory Project Class 12
 
Copy Of Determination Of The Contents Of Cold Drinks
Copy Of Determination Of The Contents Of Cold DrinksCopy Of Determination Of The Contents Of Cold Drinks
Copy Of Determination Of The Contents Of Cold Drinks
 
Atom economy - "Green Chemistry Project"
Atom economy - "Green Chemistry Project"Atom economy - "Green Chemistry Project"
Atom economy - "Green Chemistry Project"
 
Chemistry project
Chemistry projectChemistry project
Chemistry project
 
Peter Ward: The True Power of SharePoint Designer Workflows
Peter Ward: The True Power of SharePoint Designer WorkflowsPeter Ward: The True Power of SharePoint Designer Workflows
Peter Ward: The True Power of SharePoint Designer Workflows
 
Winds of change: The shifting face of leadership in business
Winds of change: The shifting face of leadership in businessWinds of change: The shifting face of leadership in business
Winds of change: The shifting face of leadership in business
 
SharePoint Branding Guidance @ SharePoint Saturday Redmond
SharePoint Branding Guidance @ SharePoint Saturday RedmondSharePoint Branding Guidance @ SharePoint Saturday Redmond
SharePoint Branding Guidance @ SharePoint Saturday Redmond
 
Universidad nacional chimborazo
Universidad nacional chimborazoUniversidad nacional chimborazo
Universidad nacional chimborazo
 
Notice flexible retractable retraflex
Notice flexible retractable retraflexNotice flexible retractable retraflex
Notice flexible retractable retraflex
 
Derecho informatico
Derecho informaticoDerecho informatico
Derecho informatico
 
Monografía EXPOSICION EN EVALUACION
Monografía EXPOSICION EN EVALUACION Monografía EXPOSICION EN EVALUACION
Monografía EXPOSICION EN EVALUACION
 
Word debate
Word debateWord debate
Word debate
 
HAPPYWEEK 197 - 2016.12.05. happyweek 197
HAPPYWEEK 197 - 2016.12.05. happyweek 197HAPPYWEEK 197 - 2016.12.05. happyweek 197
HAPPYWEEK 197 - 2016.12.05. happyweek 197
 
Presentación de socializacion aamtic
Presentación de socializacion aamticPresentación de socializacion aamtic
Presentación de socializacion aamtic
 

Similar to The Open Chemistry Project

IMPACT Interoperability Framework - Clemens Neudecker
IMPACT Interoperability Framework - Clemens NeudeckerIMPACT Interoperability Framework - Clemens Neudecker
IMPACT Interoperability Framework - Clemens NeudeckerIMPACT Centre of Competence
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSSteve Wong
 
Kubernetes meetup bangalore december 2017 - v02
Kubernetes meetup bangalore   december 2017 - v02Kubernetes meetup bangalore   december 2017 - v02
Kubernetes meetup bangalore december 2017 - v02Kumar Gaurav
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Subbu Rama
 
How static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ codeHow static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ codecppfrug
 
KubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup BangaloreKubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup BangaloreKrishna-Kumar
 
Adam bosc-071114
Adam bosc-071114Adam bosc-071114
Adam bosc-071114fnothaft
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Christopher Curtin
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuantUniversity
 
Adopting OpenTelemetry
Adopting OpenTelemetryAdopting OpenTelemetry
Adopting OpenTelemetryVincent Behar
 
Kitware: Qt and Scientific Computing
Kitware: Qt and Scientific ComputingKitware: Qt and Scientific Computing
Kitware: Qt and Scientific Computingaccount inactive
 
Enterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a ServiceEnterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a ServiceMongoDB
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basicsJuraj Hantak
 
GEO Analytics Canada Overview April 2020
GEO Analytics Canada Overview April 2020GEO Analytics Canada Overview April 2020
GEO Analytics Canada Overview April 2020GEO Analytics Canada
 
HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012Hervé Ménager
 
NoSQL on the move
NoSQL on the moveNoSQL on the move
NoSQL on the moveCodemotion
 
The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...
The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...
The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...Daniel Krook
 

Similar to The Open Chemistry Project (20)

IMPACT Interoperability Framework - Clemens Neudecker
IMPACT Interoperability Framework - Clemens NeudeckerIMPACT Interoperability Framework - Clemens Neudecker
IMPACT Interoperability Framework - Clemens Neudecker
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OS
 
Kubernetes meetup bangalore december 2017 - v02
Kubernetes meetup bangalore   december 2017 - v02Kubernetes meetup bangalore   december 2017 - v02
Kubernetes meetup bangalore december 2017 - v02
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
 
How static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ codeHow static analysis supports quality over 50 million lines of C++ code
How static analysis supports quality over 50 million lines of C++ code
 
KubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup BangaloreKubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
KubeCon USA 2017 brief Overview - from Kubernetes meetup Bangalore
 
Adam bosc-071114
Adam bosc-071114Adam bosc-071114
Adam bosc-071114
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
 
Adopting OpenTelemetry
Adopting OpenTelemetryAdopting OpenTelemetry
Adopting OpenTelemetry
 
Kitware: Qt and Scientific Computing
Kitware: Qt and Scientific ComputingKitware: Qt and Scientific Computing
Kitware: Qt and Scientific Computing
 
Enterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a ServiceEnterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a Service
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
Prometheus - basics
Prometheus - basicsPrometheus - basics
Prometheus - basics
 
GEO Analytics Canada Overview April 2020
GEO Analytics Canada Overview April 2020GEO Analytics Canada Overview April 2020
GEO Analytics Canada Overview April 2020
 
HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012HPC Web overview - Mobyle Workshop - September 28, 2012
HPC Web overview - Mobyle Workshop - September 28, 2012
 
NoSQL on the move
NoSQL on the moveNoSQL on the move
NoSQL on the move
 
The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...
The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...
The Containers Ecosystem, the OpenStack Magnum Project, the Open Container In...
 

Recently uploaded

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 

Recently uploaded (20)

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 

The Open Chemistry Project

  • 1. The Open Chemistry Project February 2, 2013 FOSDEM Dr. Marcus D. Hanwell Technical Leader, Kitware, Inc. marcus.hanwell@kitware.com http://openchemistry.org/ 1  
  • 2. Outline •  Kitware •  History •  Open Chemistry –  Avogadro –  MoleQueue –  MongoChem •  Final thoughts 2  
  • 3. Kitware •  Founded in 1998 by 5 former GE Research employees •  111 employees: 38 with PhDs •  Privately held, profitable from creation, no debt •  Rapidly Growing: >30% in 2011, 7M web-visitors/quarter •  Offices •  2011 Small Business Administration’s –  Albany, NY Tibbetts Award –  Carrboro, NC •  HPCWire Readers –  Santa Fe, NM and Editor’s Choice –  Lyon, France •  Inc’s 5000 List: 2008 to 2011
  • 4. Kitware: Core Technologies CMake CDash 4  
  • 5. Beginnings of Open Chemistry •  The Avogadro project began in 2006 •  One of very few open source 3D chemical editors •  Draw/edit structure •  Generate input for codes •  Analyze output of codes •  Open source, GPLv2 GUI •  Google Summer of Code in 2007 (KDE) •  Used by Kalzium in KDE educational tool •  Over 300,000 downloads, 20+ translations 5  
  • 6. Avogadro Paper Published 8/13/12 http://www.jcheminf.com/content/4/1/17 6  
  • 7. Introduction to Open Chemistry •  User-friendly integration with –  Computational codes –  HPC/cloud resources –  Database/informatics resources 7  
  • 8. Vision •  Advancing the state-of-the-art •  Tight integration is needed •  Computational codes •  Clusters/supercomputers •  Data repositories •  Reduce, reuse, recycle! •  Facilitate sharing and searching of data •  Embracing data-centric workflows 8  
  • 9. Overview •  Desktop chemistry application suite –  3D structure editor, pre- and post-processing –  HPC integration – easily run codes –  Cheminformatics to store, index, and analyze •  Each can work independently –  Enhanced functionality when used together –  One-click HPC job submission –  Easily open structure found in database –  Coordination of job submission 9  
  • 10. Open Chemistry Project Approach •  Open approach to chemistry software –  Open source frameworks –  Developed openly –  Cross platform –  Tested, verified –  Contribution model –  Supported by Kitware experts •  BSD licensed to facilitate research/reuse 10  
  • 11. Open Chemistry Development Team •  Assembled an inter-disciplinary team •  Domain specialists: quantum chemistry, biology, solid-state materials •  Computer scientists: build systems, queuing, graphics, process •  Marcus, Kyle, David, and Chris. 11  
  • 12. OpenChemistry.org •  Central site to promote Open Chemistry •  Hosting of project-specific pages •  Providing an identity for related projects •  Promoting shared ownership of projects –  Website –  Code submission/review –  Testing infrastructure –  Wiki, mailing lists, news, galleries 12  
  • 14. Applications Being Developed •  Three independent applications •  Communication handled with local sockets •  Avogadro 2 – structure editing, input generation, output viewing and analysis •  MoleQueue – running local and remote jobs in standalone programs, management •  MongoChem – Storage of data, searching, entry, annotation 14  
  • 15. Open Frameworks •  AvogadroLibs – core data structures and algorithms shared across codes •  Split into dedicated libraries, e.g. core, io, rendering, qtgui, qtopengl, qtplugins, quantum •  Core maintains a minimal dependency set •  Intended for use on server, command line, and in a full-blown desktop application •  VTK – specialized chemistry visualization/ data structures, use of above 15  
  • 17. Workflow in Open Chemistry Log File Input File Avogadro2   Results   MongoChem   Job  Submission   MoleQueue   Local Remote Calcula>on   17  
  • 18. Avogadro2 •  Rewrite of Avogadro •  Split into libraries and application (plugin based) •  Still one of very few open source editors •  Still using Qt, C++, Eigen, OpenGL, CMake •  Use AvogadroLibs for core data •  Introduce client-server dataflow/patterns •  New, efficient rendering code •  More liberally licensed – from GPL to BSD 18  
  • 19. Avogadro: Visualization •  GPU accelerated rendering •  VTK for advanced visualization •  Support for 2D and 3D plots of data •  Optimized data structures –  Large data –  Streaming •  Reworked interface –  Tighter database/workflow integration 19  
  • 20. Advanced Impostor Rendering •  Using a scene, vertex buffer objects, and OpenGL shading language •  Impostor techniques –  Sphere goes from 100s of triangles to 2! –  No artifacts from triangulation –  Scales to millions of spheres on modest GPU 20  
  • 21. Electronic Structure Visualization •  Read quantum output files –  Calculate cubes for molecular orbitals –  Show isosurface or volume rendering –  Multithreaded C++ code to perform calculations – scales very well 21  
  • 22. Scriptable Simulation Input Generator •  Previous input generators were C++ •  Execute a simple Python script –  Script can output JSON with parameters –  Input is parameters specified by user –  Chemical JSON with full structure •  New input generator is as simple as adding a new Python script –  Implement 2-3 entry points and done 22  
  • 23. MoleQueue: Job Management •  Tighter integration with remote queues •  Integration with databases –  Retain full log of computational jobs –  Trigger actions on completion •  Plugin based system –  Easy addition of new codes –  Easy addition of new queue systems •  Provide client API for applications 23  
  • 24. MoleQueue •  Support configuration for a variety of remote clusters and queuing software
  • 25. MoleQueue: Queue Types •  Several transports implemented –  Command line SSH/plink (Windows) –  libssh2 (experimental) –  HTTPS (SOAP) •  Several queue types –  Sun Grid Engine –  PBS –  UIT (ezHPC with largely PBS dialect) 25  
  • 26. Using JSON •  MongoDB stores data as BSON –  JSON: JavaScript Object Notation –  BSON: Binary form, type safe •  JSON is very compact, standardized { “name”: “water”, “atoms”: { “elementType”: [“H”, “H”, “O”], } “properties”: { “molecular weight”: 18.0153 } } 26  
  • 27. JSON-RPC interface Applications can submit jobs via a local socket or ZeroMQ connection: Client request: { "jsonrpc": "2.0", "method": "submitJob", "params": { "queue": "Remote cluster PBS", "program": "MOPAC", "description": "PM6 H2 optimization", "inputAsString": "PM6nnH 0.0 0.0 0.0nH 1.0 0.0 0.0n" }, "id": "XXX” } Server reply: { "jsonrpc": "2.0", "result": { "moleQueueId": 17, "queueId": 123456, "workingDirectory": "/tmp/MoleQueue/17/" }, "id": "XXX” }
  • 28. Chemical JSON •  Stores molecular structure, geometry, identifiers, and descriptors as a JSON object •  Benefits: –  More compact than XML/CML –  Native language of MongoDB and JSON-RPC –  Easily converted to a binary representation (BSON) 28  
  • 29. MongoChem Overview •  A desktop cheminformatics tool –  Chemical data exploration and analysis –  Interactive, editable, and searchable database •  Leverages several open-source projects –  Qt, VTK, MongoDB, Chemkit, Open Babel •  Designed to look at many molecules •  Spot patterns, outliers, run many jobs •  Scaling studies with ~3 million structures
  • 30. Architecture Overview •  Native, cross-platform C++ application built with Qt •  Stores chemical data in a NoSQL MongoDB database •  Uses VTK for 2D and 3D dataset visualization 30  
  • 31. ParaViewWeb and Open Chemistry 31  
  • 32. Software Process •  Source code publicly hosted using Git •  Gerrit for code review •  CTest/CDash for testing/summary –  Gerrit can use CDash@Home •  Test proposed changes before merge •  CDash can now provide binaries –  Built nightly, available for direct download •  Wiki, mailing list, bug tracker 32  
  • 34. Final Thoughts •  Real opportunity to make an impact •  Bringing best practices to chemistry •  Improve research, industry and teaching •  Semantic data at the center of our work –  Storage –  Search –  Interaction with computational codes –  Comparison with experimental data •  Liberal, BSD licensed, cross platform codes 34