© Copyright Ovum. All rights reserved. Ovum is a subsidiary of Informa plc.
Hadoop, SQL & NoSQL: No longer an either/or question
Tony Baer
Hadoop Summit 2014
June 4, 2014
Agenda
• Where we’ve come from: twins separated at birth & joyous reunion
• Why/how the convergence?
• Loose ends
SQL, NoSQL, Hadoop
[Timeline figure, 1970s to 2010s. Labels: File systems; Hierarchical data stores; Network data stores; SQL RDBMS; OODBMS]
Database timeline
[Timeline figure, 1960s to 2014]
Phases: “Prehistoric”; Early Development; Commercialization; Ecosystem Formation; Ecosystem Broadens
Time markers: 1960s; 1970s; 1980s; 1990s; 2000s; 2014
Eras: Mainframe era; Midranges & PCs emerge; Client/server & n-Tier; Big Data
Milestones:
• CODASYL, IMS
• EF Codd publishes seminal RDBMS model
• IBM System R, Ingres
• DB2, Oracle, Teradata, PC-based DBMSs
• SQL becomes de facto enterprise standard data platform
• Tooling emerges
• MySQL/LAMP stack emerges
• J2EE, .NET
• SQL market consolidates: Oracle, DB2, SQL Server, Teradata
• NewSQL analytic platforms emerge
• DBMSs add multiple engines
Big Data platform timeline
[Timeline figure, 2003-2005 to 2014]
Phases: Early Development; Commercialization; Ecosystem Formation
Time markers: 2003-2005; 2009; 2011; 2012; 2013; 2014
Adoption: Internet firm early adopters; Enterprise early adopters (FS & Media); Mainstream adoption begins
Milestones:
• First Advanced SQL platforms emerge
• Hadoop emerges
• Other NoSQL platforms emerge
• MongoDB, Cassandra emerge
• Cloudera introduces commercial Hadoop support
• Hortonworks enters market
• Major vendors enter Big Data market
• Tooling emerges
• 2nd wave NewSQL platforms emerge
• Big Data Tools emerge
• Big Data Apps emerge
Platform proliferation = Data processing silos
[Figure: one silo per platform, each with its own workloads]
Platforms: SQL RDBMS; NewSQL RDBMS; NoSQL Key-Value; NoSQL JSON; Hadoop
Workload labels: OLTP (ACID); OLTP (Non-ACID), shown twice; BI Query & Report Analytics; Advanced Analytics; Operational Decision Support, shown twice; MapReduce-based Advanced Analytics
Analytic SLA requirements vary
[Figure: analytic workloads placed along a response-time spectrum]
Batch: Days/Hours
Periodic: Hours/Minutes
Interactive: Seconds
Real-time: Split seconds
Workload labels: Exploratory Analytics; Modeling; Standard reporting; Interactive query; Search; Streaming; Decision Support; Operational Decision Support
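A minimal sketch of how the four response-time bands on this slide might be encoded, so that a required response time maps to a band. The slide gives only rough ranges (Days/Hours, Hours/Minutes, Seconds, Split seconds); the numeric cut-offs and the names `SlaTier` and `sla_tier` are assumptions for illustration, not values from the deck.

```python
from enum import Enum


class SlaTier(Enum):
    BATCH = "Batch"              # slide: Days/Hours
    PERIODIC = "Periodic"        # slide: Hours/Minutes
    INTERACTIVE = "Interactive"  # slide: Seconds
    REAL_TIME = "Real-time"      # slide: Split seconds


# Hypothetical cut-offs in seconds; the slide does not state exact boundaries.
REAL_TIME_MAX_S = 1.0
INTERACTIVE_MAX_S = 60.0
PERIODIC_MAX_S = 4 * 3600.0


def sla_tier(max_response_seconds: float) -> SlaTier:
    """Map a required response time to one of the four SLA bands."""
    if max_response_seconds <= 0:
        raise ValueError("response time must be positive")
    if max_response_seconds < REAL_TIME_MAX_S:
        return SlaTier.REAL_TIME
    if max_response_seconds < INTERACTIVE_MAX_S:
        return SlaTier.INTERACTIVE
    if max_response_seconds < PERIODIC_MAX_S:
        return SlaTier.PERIODIC
    return SlaTier.BATCH
```

For example, `sla_tier(5)` falls in the Interactive band under these assumed cut-offs, while a one-day requirement falls in Batch.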
Analytics problems cross silos: Operational examples
• Customer engagement
  ◦ Interaction: Customer 360 query in DW
  ◦ Behavior: Enrich with sentiment analysis on Hadoop
  ◦ Engagement: Manage real-time engagement on NoSQL database
• Risk mitigation
  ◦ Baseline: Model party & transactional risk on DW or Hadoop
  ◦ Enrich: Analyze, rank impact of externalities on Hadoop
  ◦ Ingest: Real-time market feeds via streaming in-memory
  ◦ Define: Decision processes offline via BPM
  ◦ Act: Allow/deny credit on system of record
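The two examples above can be written down as data to make the point concrete: a single business problem touches several platforms. This is only an illustrative sketch. The stage and platform strings are taken from the slide, while the `Step` type and `platforms_for` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    use_case: str
    stage: str
    platform: str  # platform named on the slide for this stage


# Stages and platforms as listed on the slide.
STEPS = [
    Step("Customer engagement", "Interaction", "DW"),
    Step("Customer engagement", "Behavior", "Hadoop"),
    Step("Customer engagement", "Engagement", "NoSQL database"),
    Step("Risk mitigation", "Baseline", "DW or Hadoop"),
    Step("Risk mitigation", "Enrich", "Hadoop"),
    Step("Risk mitigation", "Ingest", "Streaming in-memory"),
    Step("Risk mitigation", "Define", "BPM"),
    Step("Risk mitigation", "Act", "System of record"),
]


def platforms_for(use_case: str) -> list[str]:
    """Return the distinct platforms a use case touches, in stage order."""
    seen: list[str] = []
    for step in STEPS:
        if step.use_case == use_case and step.platform not in seen:
            seen.append(step.platform)
    return seen
```

Calling `platforms_for("Customer engagement")` yields three platforms for one problem, which is the cross-silo point the slide makes.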
Architecture: Common threads
• Aggressive tiering
• Multiple storage engines
• Multiple workload types
• On the horizon:
  ◦ Federated query
  ◦ Workload/query orchestration
• Loose ends:
  ◦ Common security?
SQL Databases adding multiple personas
• IBM DB2
  ◦ BLU architecture adds columnar, data skipping, advanced tiering
  ◦ New MongoDB-compliant JSON data store
• Oracle Database 12c
  ◦ “In-Memory” option adds DRAM-based columnar, extreme compression
• Microsoft SQL Server
  ◦ PDW adds columnar indexing
  ◦ PolyBase feature adds Hadoop integration
• Teradata
  ◦ Teradata 14.10 adds “Intelligent Memory” data tiering, columnar, Hadoop integration
  ◦ Aster 6 adds graph, file store, “SNAP” framework for choreographing SQL, MapReduce, graph & Hadoop processing
• SAP
  ◦ “Smart Data Access” federated query over HANA, Sybase IQ, Teradata & Hadoop
Hadoop growing beyond MapReduce
• Apache Hadoop 2.0’s new YARN resource allocation framework allows multiple workloads
  ◦ Interactive SQL: lots of flavors
  ◦ Spark: the new MapReduce & more…
  ◦ Search
  ◦ Streaming
• Loose ends:
  ◦ Graph ready for prime time?
Emerging NewSQL + NoSQL databases
• JSON data stores exploding
  ◦ Intuitive for representing Internet data
  ◦ MongoDB, Couchbase
  ◦ IBM, Teradata… potentially Oracle adding JSON
• New transaction stores … not full ACID
  ◦ Cassandra for NoSQL (integrated to Hadoop)
  ◦ NuoDB, Clustrix, MemSQL & others reinvent OLTP for distributed Internet apps
  ◦ HBase
  ◦ DynamoDB, Berkeley DB (Oracle NoSQL database) & other key-value stores
A variety of overlapping choices
[Figure: from separate silos (SQL RDBMS, NewSQL RDBMS, NoSQL Key-Value, NoSQL JSON, Hadoop) to overlapping platforms]
SQL RDBMS: OLTP; DW; JSON; Active Archiving
NewSQL: JSON; Graph; Distributed OLTP; Fast, deep analytics
Hadoop: SQL; Deep analytics; Stream; Graph
NoSQL: Account/user profiles; Interactive content; Graph; Machine data; JSON
A variety of overlapping choices, but… who owns the logical hub?
[Figure: four overlapping platforms around a central logical hub]
SQL RDBMS: OLTP; DW; Active Archiving; JSON
NewSQL: Distributed OLTP; Fast, deep analytics; JSON; Graph
Hadoop: SQL; Deep analytics; Stream; Graph
NoSQL: Account/user profiles; Interactive content; Graph; Machine data; JSON
Loose ends
• Ideally, policy-based federated query will be the solution
• Who owns federated query?
  ◦ Data platform?
  ◦ BI tool?
  ◦ Application?
• Who owns workload management?
• Who owns security?
Tug of war between data platforms likely
Takeaways
• Analytics no longer limited by platform constraints
• Data platforms are taking on multiple personas
  ◦ Platform choice is not either/or
• But
  ◦ Analytics are no longer siloed
  ◦ Execution remains siloed
• The brass ring will be a logical hub for
  ◦ Policy/SLA-based workload targeting & management
  ◦ Security & operations/performance management
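One way to picture policy/SLA-based workload targeting is a small registry that a logical hub consults before sending work to a platform. The sketch below is hypothetical throughout: the deck names no such API, the latency figures are invented, and the rule of preferring the slowest platform that still meets the SLA is just one possible policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    platform: str
    workloads: frozenset
    best_latency_s: float  # assumed fastest response this platform delivers


# Hypothetical registry: platform names echo the deck, numbers are invented.
TARGETS = [
    Target("NoSQL", frozenset({"lookup"}), 0.05),
    Target("SQL RDBMS", frozenset({"lookup", "report"}), 2.0),
    Target("Hadoop", frozenset({"report", "model"}), 600.0),
]


def route(workload: str, sla_seconds: float) -> Optional[str]:
    """Pick the slowest platform that supports the workload and meets the SLA."""
    candidates = [
        t for t in TARGETS
        if workload in t.workloads and t.best_latency_s <= sla_seconds
    ]
    if not candidates:
        return None  # no registered platform can satisfy this request
    return max(candidates, key=lambda t: t.best_latency_s).platform
```

Under these assumptions a report with a one-hour SLA goes to Hadoop, while the same report with a ten-second SLA goes to the SQL RDBMS. Who would own such a registry is the open question the deck raises.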
Thank you
Tony Baer
Ovum
(646) 546-5330
@TonyBaer
tony.baer@ovum.com

Hadoop, SQL & NoSQL: No Longer an Either-or Question

  • 1. Hadoop, SQL & NoSQL – No longer an either/or question. Tony Baer, Hadoop Summit 2014, June 4, 2014. © Copyright Ovum. All rights reserved. Ovum is a subsidiary of Informa plc.
  • 2. Agenda
      ◦ Where we’ve come – Twins separated at birth & joyous reunion
      ◦ Why/how the convergence?
      ◦ Loose ends
  • 3. [Figure: timeline across the 1970s, 1980s, 1990s, 2000s and 2010s. Labels: File systems; Hierarchical data stores; Network data stores; SQL RDBMS; OODBMS; SQL, NoSQL, Hadoop]
  • 4. Database timeline [Figure]
      ◦ Time axis: 1960s, 1970s, 1980s, 1990s, 2000s, 2014
      ◦ Phases: Early Development; Commercialization; Ecosystem Formation; Ecosystem Broadens
      ◦ Eras: “Prehistoric”; Mainframe era; Midranges & PCs emerge; Client/server & n-Tier; Big Data
      ◦ Milestones: CODASYL, IMS; E.F. Codd publishes seminal RDBMS model; IBM System R, Ingres; DB2, Oracle, Teradata, PC-based DBMSs; SQL becomes de facto enterprise standard data platform; Tooling emerges; J2EE, .NET; MySQL/LAMP stack emerges; SQL market consolidates (Oracle, DB2, SQL Server, Teradata); NewSQL analytic platforms emerge; DBMSs add multiple engines
  • 5. Big Data platform timeline [Figure]
      ◦ Time axis: 2003-2005, 2009, 2011, 2012, 2013, 2014
      ◦ Phases: Early Development; Commercialization; Ecosystem Formation
      ◦ Adoption: Internet firm early adopters; Enterprise early adopters (FS & Media); Mainstream adoption begins
      ◦ Milestones: First Advanced SQL platforms emerge; Hadoop emerges; Other NoSQL platforms emerge; MongoDB, Cassandra emerge; Cloudera introduces commercial Hadoop support; Hortonworks enters market; Major vendors enter Big Data market; Tooling emerges; 2nd wave NewSQL platforms emerge; Big Data Tools emerge; Big Data Apps emerge
  • 6. Platform proliferation = Data processing silos [Figure]
      ◦ Platforms: SQL RDBMS; NewSQL RDBMS; NoSQL Key-Value; NoSQL JSON; Hadoop
      ◦ Workload labels: OLTP (ACID); OLTP (Non-ACID), shown twice; BI Query & Report Analytics; Advanced Analytics; Operational Decision Support, shown twice; MapReduce-based Advanced Analytics
  • 7. Agenda
      ◦ Where we’ve come – Twins separated at birth & joyous reunion
      ◦ Why/how the convergence?
      ◦ Loose ends
  • 8. Analytic SLA requirements vary [Figure]
      ◦ SLA classes: Batch; Periodic; Interactive; Real-time
      ◦ Response times shown: Days/Hours; Hours/Minutes; Seconds; Split seconds
      ◦ Workload labels: Exploratory Analytics; Standard reporting; Modeling; Decision Support; Interactive query; Search; Operational Decision Support; Streaming
  • 9. Analytics problems cross silos – Operational examples
      ◦ Customer engagement
          ▪ Interaction – Customer 360 query in DW
          ▪ Behavior – Enrich with sentiment analysis on Hadoop
          ▪ Engagement – Manage real-time engagement on NoSQL database
      ◦ Risk mitigation
          ▪ Baseline – Model party & transactional risk on DW or Hadoop
          ▪ Enrich – Analyze, rank impact of externalities on Hadoop
          ▪ Ingest – Real-time market feeds via streaming in-memory
          ▪ Define – Decision processes offline via BPM
          ▪ Act – Allow/deny credit on system of record
  • 10. Architecture – Common threads
      ◦ Aggressive tiering
      ◦ Multiple storage engines
      ◦ Multiple workload types
      ◦ On the horizon:
          ▪ Federated query
          ▪ Workload/query orchestration
      ◦ Loose ends:
          ▪ Common security?
  • 11. SQL databases adding multiple personas
      ◦ IBM DB2
          ▪ BLU architecture adds columnar, data skipping, advanced tiering
          ▪ New MongoDB-compliant JSON data store
      ◦ Oracle Database 12c
          ▪ “In-Memory” option adds DRAM-based columnar, extreme compression
      ◦ Microsoft SQL Server
          ▪ PDW adds columnar indexing
          ▪ PolyBase feature adds Hadoop integration
      ◦ Teradata
          ▪ Teradata 14.10 adds “Intelligent Memory” data tiering, columnar, Hadoop integration
          ▪ Aster 6 adds graph, file store, “SNAP” framework for choreographing SQL, MapReduce, graph & Hadoop processing
      ◦ SAP
          ▪ “Smart Data Access” federated query over HANA, Sybase IQ, Teradata & Hadoop
  • 12. Hadoop growing beyond MapReduce
      ◦ Apache Hadoop 2.0’s new YARN resource allocation framework allows multiple workloads
          ▪ Interactive SQL – lots of flavors
          ▪ Spark – The new MapReduce & more…
          ▪ Search
          ▪ Streaming
      ◦ Loose ends:
          ▪ Graph ready for prime time?
  • 13. Emerging NewSQL + NoSQL databases
      ◦ JSON data stores exploding
          ▪ Intuitive for representing Internet data
          ▪ MongoDB, Couchbase
          ▪ IBM, Teradata… potentially Oracle adding JSON
      ◦ New transaction stores … not full ACID
          ▪ Cassandra for NoSQL (integrated to Hadoop)
          ▪ NuoDB, Clustrix, MemSQL & others reinvent OLTP for distributed Internet apps
          ▪ HBase
          ▪ DynamoDB, Berkeley DB (Oracle NoSQL database) & other key-value stores
  • 14. A variety of overlapping choices [Figure: From/To diagram]
      ◦ Platforms: SQL RDBMS; NewSQL RDBMS; NoSQL Key-Value; NoSQL JSON; Hadoop
      ◦ Capability labels: OLTP; DW; Active Archiving; Distributed OLTP; Fast, deep analytics; Deep analytics; SQL; JSON; Graph; Stream; Account/user profiles; Interactive content; Machine data
  • 15. A variety of overlapping choices – But… Who owns the logical hub? [Figure]
      ◦ Platforms: SQL RDBMS; NewSQL; Hadoop; NoSQL
      ◦ Capability labels: OLTP; DW; Active Archiving; Distributed OLTP; Fast, deep analytics; Deep analytics; SQL; JSON; Graph; Stream; Account/user profiles; Interactive content; Machine data
  • 16. Agenda
      ◦ Where we’ve come – Twins separated at birth & joyous reunion
      ◦ Why/how the convergence?
      ◦ Loose ends
  • 17. Loose ends
      ◦ Ideally, policy-based federated query will be the solution
      ◦ Who owns federated query?
          ▪ Data platform?
          ▪ BI tool?
          ▪ Application?
      ◦ Who owns workload management?
      ◦ Who owns security? Tug of war between data platforms likely
  • 18. Takeaways
      ◦ Analytics no longer limited by platform constraints
      ◦ Data platforms are taking multiple personas
          ▪ Platform choice is not either/or
      ◦ But
          ▪ Analytics are no longer siloed
          ▪ Execution remains siloed
      ◦ The brass ring will be a logical hub for
          ▪ Policy/SLA-based workload targeting & management
          ▪ Security & operations/performance management
  • 19. Thank you. Tony Baer, Ovum. (646) 546-5330, @TonyBaer, tony.baer@ovum.com
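
The takeaways name a logical hub for policy/SLA-based workload targeting, but the slides do not say how such a hub would work. As a rough illustration only, the Python sketch below routes a workload to a platform from its latency requirement, using the four SLA classes from the "Analytic SLA requirements vary" slide. The numeric thresholds, the class-to-platform policy table, and the names `Workload`, `classify` and `route` are all assumptions made for this example; none of them come from the deck.

```python
from dataclasses import dataclass

# Hypothetical latency ceilings in seconds for each SLA class.
# The slides name the classes (batch, periodic, interactive, real-time)
# and rough time scales, but give no numeric thresholds.
SLA_CLASSES = [
    ("real-time", 1.0),
    ("interactive", 60.0),
    ("periodic", 3600.0),
    ("batch", float("inf")),
]

# Hypothetical policy table: which platform serves which SLA class.
# A real hub would presumably also weigh data location, cost and security.
POLICY = {
    "real-time": "nosql",
    "interactive": "sql_rdbms",
    "periodic": "sql_rdbms",
    "batch": "hadoop",
}


@dataclass(frozen=True)
class Workload:
    name: str
    max_latency_seconds: float  # the slowest acceptable response time


def classify(workload: Workload) -> str:
    """Return the tightest SLA class whose ceiling covers the workload."""
    for sla_class, ceiling in SLA_CLASSES:
        if workload.max_latency_seconds <= ceiling:
            return sla_class
    return "batch"


def route(workload: Workload, policy: dict = POLICY) -> str:
    """Pick a target platform for the workload under the given policy."""
    return policy[classify(workload)]


if __name__ == "__main__":
    for w in (
        Workload("allow/deny credit", 0.2),
        Workload("customer 360 query", 30.0),
        Workload("sentiment enrichment", 6 * 3600.0),
    ):
        print(f"{w.name}: {classify(w)} -> {route(w)}")
```

Keeping the policy in a plain table separate from the classification logic mirrors the open question on the "Loose ends" slide: whoever owns the hub could change the table without touching the platforms themselves.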