David Menninger of Ventana Research Presents:
 Best Practices with Hadoop - Real World Data

Audio/Telephone: +1 (909) 259-0012
Access Code: 622-064-673
Audio PIN: Shown after joining the Webinar

Hosts: Rich Guth, CMO, Karmasphere
       Charles Zedlewski, VP Product, Cloudera
Housekeeping

 •   Ask questions at any time using the Questions panel
 •   Twitter: #HadoopTrends
 •   Problems? Use the Chat panel
 •   Slides and recording will be available




Speaker: David Menninger
Vice President, Ventana Research
                • Covers analytics, business intelligence and information
                management for Ventana Research. David brings over
                two decades of experience, through which he has
                marketed and brought to market some of the leading-edge
                technologies for helping organizations analyze data to
                support a range of action-taking and decision-making
                processes.

                • Prior to joining Ventana Research, David was VP of
                Marketing and Product Management at Vertica Systems,
                Oracle, Applix, InforSense and IRI Software. He helped
                create over half a billion dollars of shareholder value
                while serving in these roles.

                • Email: david.menninger@ventanaresearch.com
                • Twitter: @dmenningervr



Who We Are

     Mission: To help organizations to profit from all of their data


    How We Do It: We deliver relevant products and services.
     • A distribution of Apache Hadoop that is tested, certified and supported
     • Comprehensive support and professional service offerings
     • A suite of management software for Hadoop operations
     • Training and certification programs for developers, administrators,
       managers and data scientists

    Credentials: The Apache Hadoop experts.
     • Number 1 commercial, open source distribution of Apache Hadoop
     • Largest contributor to the open source Hadoop ecosystem
     • Breadth and depth in a team of open source committers and contributors
     • More than 100 customers across a wide variety of industries
     • Strong growth in revenue and new accounts

    Technical Team: Unmatched knowledge and experience.
     • Founders, committers and contributors to Hadoop
     • A wealth of experience in the design and delivery of production software

    Leadership: Strong executive team with proven abilities.
     • Mike Olson, CEO
     • Kirk Dunn, COO
     • Charles Zedlewski, VP, Product
     • Mary Rorabaugh, CFO
     • Jeff Hammerbacher, Chief Scientist
     • Amr Awadalla, VP Engineering
     • Doug Cutting, Chief Architect
     • Omer Trajman, VP, Customer Solutions

    ©2011 Cloudera, Inc. All Rights Reserved. Confidential.
    Reproduction or redistribution without written permission is prohibited.
What we do

    [Diagram; labels recovered from the slide:]
     • Services: Consulting Services, Cloudera University, Cloudera Partners
     • Audiences: Operators, Engineers, Analysts, Business Users
     • Cloudera Enterprise: Cloudera Management Suite, Cloudera Support
     • Tools: Management Tools, IDEs, BI / Analytics, Enterprise Reporting
     • Platform: Cloudera’s Distribution Including Apache Hadoop (CDH) + SCM Express
     • Connected systems: Customers, Web Application, Adapters, Enterprise Data Warehouse
     • Big data sources: Logs, Files, Web Data, Relational Databases
Karmasphere


    Opening Up the Data in Hadoop for the Enterprise




© Karmasphere 2011 All rights reserved
Karmasphere Big Data Intelligence Product Suite

                                                    For Data and Business Analysts

                 Graphical environment where Big Data on Hadoop – even unstructured data –
                 can be accessed, discovered, and analyzed via familiar SQL and visualized in
                 Excel and other visualization tools

                                                    FREE for Developers New to Hadoop

                 Graphical development environment that facilitates learning how to
                 prototype, develop and test MapReduce jobs for Hadoop


                                                    For Developers Going into Production

                 Graphical development environment for the complete Hadoop application
                 development lifecycle, adding debugging, packaging and profiling to the
                 capabilities of Community Edition

Hadoop and Information Management
Benchmark Research Project
Preliminary Findings
June 23, 2011




                               ©2011, Ventana Research, Inc.
Agenda


 • Why did we undertake this research?

 • What did our research examine?

 • What did we find?

 • How should you use this information?

 • Where do you get more information?




Ventana Research – Overview

   Ventana Research is the leading benchmark research and strategic advisory
   services firm. Our unparalleled analytic insights and best practices guidance
   are based on our rigorous research-based benchmarking, business,
   technology and best practices services.

                              Unique Combination of Capabilities
                               • Members (85,000) and Reach to Professionals (3 million)
                               • Research and Reach across all line-of-business functions
                                 and IT




• Expertise Across Business                                                         • Conduct and Deliver Benchmark
  and Technology                                                                      Research
• Understand Business                                                               • Develop Analytic and Best
  Domain and Processes                                                                Practice Assessments


                              • Formalized Research Coverage of Technology Vendors
                              • Deliver Research on Technology Impact to Business

Rising Popularity
Popularity Measured by Job Postings




Research Objectives

 • Gauge both the adoption rate and intentions to use Hadoop
 • Determine which elements of the Hadoop ecosystem are the
   most popular
    • Including which distributions, which components and which
      third-party products
 • Examine the infrastructures and strategies being used to
   deploy Hadoop
 • Clarify the role of the cloud in enterprise Hadoop deployments
 • Elucidate the components of the business case for Hadoop
 • Detail use of Hadoop across industries
 • Determine the barriers and obstacles to further adoption of
   Hadoop




Respondent Demographics


      Participation by Region
        North America                69%
        Asia Pacific                 16%
        Europe                        7%
        Middle East                   3%
        Central and South America     3%
        Africa                        2%

      Company Size by Employee Count
        Very Large                   35%
        Large                        27%
        Midsize                      24%
        Small                        14%

      Total qualified responses: 163
Touching Over Half The Big Data Audience

      Hadoop Usage
        Currently in production           22%
        Plan to use within 12 months      12%
        Plan to use in 12-24 months        3%
        Still evaluating                  17%
        No plans to use                   46%

      Using, planning to use, or evaluating Hadoop (combined): 54%
Hadoop Is Generally Additive


Is your Hadoop deployment replacing another technology?



      Yes    37%
      No     63%

 Hadoop is supplementing other established technologies, with RDBMSs
 still the dominant technology being used or planned to be used by
 more than nine out of ten organizations.
Hadoop Is Additive In More Than One Way


  Are there things you're able to do or plan to do with
  large-scale data technologies that you couldn't do
  before deployment?


      Hadoop    87%
      Other     52%
Hadoop Is Additive In More Than One Way


  What are you able to do or what do you plan to do with
  large-scale data technologies that you couldn't do
  before deployment?
                                                        Hadoop   Non-Hadoop
  Analyze data at a greater level of detail               94%       93%
  Perform types of analytics that couldn't be done
    on large volumes of data before                       88%       71%
  Keep more historical data (post-process)                88%       60%
  Capture all of the source data that we are
    collecting (pre-process)                              82%       47%
What Types of Data?

 Hadoop is much more likely to be used for log and event data;
 much less likely to be used for transaction data. It’s also more
 likely to be used for text and multimedia.

    Most Common - Hadoop                    Most Common - Others

 • Application logs                     • Customer/member data
 • Other types of event data            • Transaction data
 • Other log files                      • Application logs
 • Web logs                             • Online retail transactions
 • Transaction data                     • Network monitoring/traffic
 • Network monitoring/traffic           • Call detail records

                               What types of large-scale data does your organization analyze?


What Types of Data?
Q28 What types of large-scale data does your organization analyze?
                                                Hadoop   Non Hadoop
  Customer/member data                            59%       68%
  Transactional data from applications (for…      44%       68%
  Application logs                                69%       37%
  Other types of event data                       64%       23%
  Network monitoring/network traffic              41%       33%
  Online retail transactions                      33%       34%
  Other log files                                 51%       26%
  Call Detail Records                             28%       32%
  Web logs                                        46%       21%
  Text data from social media and online…         36%       15%
  Search logs                                     36%       11%
  Trade/quote data                                18%       15%
  Intelligence/defense data                       18%       11%
  Multimedia (audio/video/images)                 21%        9%
  Weather                                          8%        3%
  Smartmeter data                                  3%        6%
  Other (please specify)                           3%        5%
What Types of Applications?

        What types of large-scale data applications is your
        organization running?
                                                        Hadoop   Non-Hadoop
  Query and reporting                                     60%       89%
  Consolidation of multiple data sources for analysis     63%       71%
  Custom/production application                           65%       68%
  Data preparation                                        56%       60%
  Advanced analyses                                       69%       47%
  Analysis or indexing of unstructured data               46%       32%
  Data sandbox/Data experimentation                       44%       32%

 Hadoop is most often used for advanced analyses and is more likely to be
 used to analyze unstructured data and for data sandboxing than other
 technologies. It is less likely to be used for query and reporting.
Where Sourced?


 From which source(s) did you access Hadoop software?
   Apache                    63%
   Cloudera                  55%
   Amazon                    11%
   IBM                        8%
   Yahoo                      8%
   Facebook                   5%
   Other (please specify)     5%
   Don't know                 5%

 The Apache Hadoop distribution is the most prevalent, followed closely by
 Cloudera. Nearly half the organizations are using more than one distribution.
Which Components?


  Which Hadoop-related projects do you use or plan to use?

   Hadoop Distributed File System…    79%
   MapReduce                          76%
   HBase                              61%
   Hive                               53%
   ZooKeeper                          45%
   Pig                                45%
   Flume                              34%
   Sqoop                              26%
   Oozie                              18%
   Avro                               16%
   Don't know                         11%
Hadoop Organizations are More Confident


  How confident are you in your organization's ability to
  manage large-scale data?



                 Very confident   Confident   Somewhat confident   Not very confident
  Hadoop              43%            37%             18%                  2%
  Non Hadoop          23%            32%             35%                 11%
Report Higher Levels of Benefits
Q27 What are the primary benefits of using your current technologies for
analyzing large-scale data sets?
                                                              Hadoop   Non-Hadoop
  Allow us to retain and analyze more data                      79%       71%
  Increase the speed of analysis                                85%       63%
  Produce more accurate results                                 51%       66%
  Reduce or eliminate manual processes                          64%       56%
  Cost savings - reduced implementation time/fees               62%       53%
  Reduce the time required for data collection and
    preparation                                                 67%       49%
  Higher customer retention from better analysis of
    customer data                                               54%       54%
  Utilize computing resources more efficiently                  72%       46%
  Cost savings - license fees                                   82%       40%
  Reduce effort/staff required                                  49%       49%
  We are able to create new products/services                   67%       32%
  Improved margins resulting from better algorithms             41%       30%
  Improved clickthrough, cross-selling or upselling             26%       30%
Research Can Help Answer Your Questions


  • Is Hadoop a fad or here to stay?
  • Which distributions/components are being used?
     • Apache?
     • Cloudera?
     • Other?
  • Are your peers using Hadoop and for what purpose?
  • Identify and avoid some of the obstacles to successful
    deployments.
What Should You Do?


 • Already using Hadoop?
    • Compare your usage with others
    • Are you using all the components you should be?
    • Have you considered all application areas?
    • Is your usage tactical (cost saving) or strategic (new
      capabilities)?
 • Not Using or Evaluating Hadoop?
    • Consider whether you should be
    • Did your organization need some “proof”?
Where to Get More Information


 • Free webinar and report:
    • Ventana Research will host a webinar with the final
      results and analysis.
    • Report of our findings will be distributed by the
      sponsors and will be available on our website:
      www.VentanaResearch.com/HIM

 • Contact us with questions:
      Ventana Research
      925-474-0060
      info@ventanaresearch.com
      www.ventanaresearch.com
Q&A
Ask questions using the Questions panel

Tweet
• #HadoopTrends
• @dmenningervr
• @Cloudera
• @Karmasphere

Thank you for participating!

                                          29

More Related Content

What's hot

Hudson CIO Series: 6 Reasons for Cloud Computing
Hudson CIO Series: 6 Reasons for Cloud ComputingHudson CIO Series: 6 Reasons for Cloud Computing
Hudson CIO Series: 6 Reasons for Cloud Computingguest51aa87
 
The Cloud according to VMware
The Cloud according to VMwareThe Cloud according to VMware
The Cloud according to VMwareOpSource
 
All Clouds are Not Created Equal: A Logical Approach to Cloud Adoption in Y...
All Clouds are Not Created Equal:  A Logical Approach to Cloud Adoption in  Y...All Clouds are Not Created Equal:  A Logical Approach to Cloud Adoption in  Y...
All Clouds are Not Created Equal: A Logical Approach to Cloud Adoption in Y...IBM India Smarter Computing
 
"Преимущества облачных решений от Cisco" (Обзор облачной стратегии Cisco, Пр...
 "Преимущества облачных решений от Cisco" (Обзор облачной стратегии Cisco, Пр... "Преимущества облачных решений от Cisco" (Обзор облачной стратегии Cisco, Пр...
"Преимущества облачных решений от Cisco" (Обзор облачной стратегии Cisco, Пр...Cisco Russia
 
ClientSummit2010_CloudWorkshop
ClientSummit2010_CloudWorkshopClientSummit2010_CloudWorkshop
ClientSummit2010_CloudWorkshopRazorfish
 
Towards the extinction of mega data centres? To which extent should the Clou...
 Towards the extinction of mega data centres? To which extent should the Clou... Towards the extinction of mega data centres? To which extent should the Clou...
Towards the extinction of mega data centres? To which extent should the Clou...Thierry Coupaye
 
Cisco cloud strategy cisco
Cisco cloud strategy ciscoCisco cloud strategy cisco
Cisco cloud strategy ciscoOpenSourceCamp
 
Genesys Notifications to Azure
Genesys Notifications to AzureGenesys Notifications to Azure
Genesys Notifications to AzureNoralogix
 
Avner algom feb 7 2012
Avner algom feb 7 2012Avner algom feb 7 2012
Avner algom feb 7 2012Avner Algom
 
Managing Security and Delivering Performance in the Cloud
Managing Security and Delivering Performance in the Cloud Managing Security and Delivering Performance in the Cloud
Managing Security and Delivering Performance in the Cloud Software Park Thailand
 
Citrix and HPE Team to Make Sense of the Core-Cloud-Edge Architecture
Citrix and HPE Team to Make Sense of the Core-Cloud-Edge ArchitectureCitrix and HPE Team to Make Sense of the Core-Cloud-Edge Architecture
Citrix and HPE Team to Make Sense of the Core-Cloud-Edge ArchitectureDana Gardner
 
The marriage between Cloud and ITSM
The marriage between Cloud and ITSMThe marriage between Cloud and ITSM
The marriage between Cloud and ITSMITpreneurs
 
Understanding Cloud Computing & Its Relevance to Financial Software Solutions
Understanding Cloud Computing & Its Relevance to Financial Software SolutionsUnderstanding Cloud Computing & Its Relevance to Financial Software Solutions
Understanding Cloud Computing & Its Relevance to Financial Software SolutionsZannettos Zannettou
 
Achieving Cloud Enterprise Agility
Achieving Cloud Enterprise AgilityAchieving Cloud Enterprise Agility
Achieving Cloud Enterprise AgilitySteven_Jackson
 
Cloud Computing for Banking - Accenture
Cloud Computing for Banking - AccentureCloud Computing for Banking - Accenture
Cloud Computing for Banking - AccentureKim Jensen
 

What's hot (19)

Hudson CIO Series: 6 Reasons for Cloud Computing
Hudson CIO Series: 6 Reasons for Cloud ComputingHudson CIO Series: 6 Reasons for Cloud Computing
Hudson CIO Series: 6 Reasons for Cloud Computing
 
The Cloud according to VMware
The Cloud according to VMwareThe Cloud according to VMware
The Cloud according to VMware
 
All Clouds are Not Created Equal: A Logical Approach to Cloud Adoption in Y...
All Clouds are Not Created Equal:  A Logical Approach to Cloud Adoption in  Y...All Clouds are Not Created Equal:  A Logical Approach to Cloud Adoption in  Y...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

Best Practices with Hadoop - Ventana Research Presentation on Real World Data

  • 1. David Menninger of Ventana Research Presents: Best Practices with Hadoop - Real World Data Audio/Telephone: +1 (909) 259-0012 Access Code: 622-064-673 Audio PIN: Shown after joining the Webinar Hosts: Rich Guth, CMO, Karmasphere; Charles Zedlewski, VP Product, Cloudera
  • 2. Housekeeping • Ask questions at any time using the Questions panel • Twitter: #HadoopTrends • Problems? Use the Chat panel • Slides and recording will be available
  • 3. Speaker: David Menninger, Vice President, Ventana Research • Covers analytics, business intelligence and information management for Ventana Research. David brings over two decades of experience, through which he has marketed and brought to market some of the leading edge technologies for helping organizations analyze data to support a range of action-taking and decision-making processes. • Prior to joining Ventana Research, David was VP of Marketing and Product Management at Vertica Systems, Oracle, Applix, InforSense and IRI Software. He helped create over half a billion dollars of shareholder value while serving in these roles. • Email: david.menninger@ventanaresearch.com • Twitter: @dmenningervr
  • 4. Who We Are Mission: To help organizations to profit from all of their data. How We Do It: We deliver relevant products and services. • A distribution of Apache Hadoop that is tested, certified and supported • Comprehensive support and professional service offerings • A suite of management software for Hadoop operations • Training and certification programs for developers, administrators, managers and data scientists. Credentials: The Apache Hadoop experts. • Number 1 commercial, open source distribution of Apache Hadoop • Largest contributor to the open source Hadoop ecosystem • More than 100 customers across a wide variety of industries • Strong growth in revenue and new accounts. Technical Team: Unmatched knowledge and experience. • Founders, committers and contributors to Hadoop • A wealth of experience in the design and delivery of production software • Breadth and depth in a team of open source committers and contributors. Leadership: Strong executive team with proven abilities. • Mike Olson, CEO • Jeff Hammerbacher, Chief Scientist • Kirk Dunn, COO • Amr Awadalla, VP Engineering • Charles Zedlewski, VP, Product • Doug Cutting, Chief Architect • Mary Rorabaugh, CFO • Omer Trajman, VP, Customer Solutions ©2011 Cloudera, Inc. All Rights Reserved. Confidential. Reproduction or redistribution without written permission is prohibited.
  • 5. What we do • Consulting Services • Cloudera University Cloudera Partners OPERATORS ENGINEERS ANALYSTS BUSINESS USERS Cloudera Enterprise Management • Cloudera Management Suite Enterprise • Cloudera Support IDE’s BI / Analytics Tools Reporting CUSTOMERS Adapters Cloudera’s Distribution Enterprise Data Including Apache Hadoop (CDH) Warehouse Web + Application SCM Express Relational Logs Files Web Data Databases BIG DATA
  • 6. Karmasphere: Opening Up the Data in Hadoop for the Enterprise © Karmasphere 2011 All rights reserved
  • 7. Karmasphere Big Data Intelligence Product Suite • For Data and Business Analysts: Graphical environment where Big Data on Hadoop (even unstructured data) can be accessed, discovered, and analyzed via familiar SQL and visualized in Excel and other visualization tools • FREE for Developers New to Hadoop: Graphical development environment that facilitates learning how to prototype, develop and test MapReduce jobs for Hadoop • For Developers Going into Production: Graphical development environment for the complete Hadoop application development lifecycle, adding debugging, packaging and profiling to the capabilities of Community Edition
  • 8. Hadoop and Information Management Benchmark Research Project Preliminary Findings June 23, 2011 ©2011, Ventana Research, Inc.
  • 9. Agenda • Why did we undertake this research? • What did our research examine? • What did we find? • How should you use this information? • Where do you get more information?
  • 10. Ventana Research – Overview Ventana Research is the leading benchmark research and strategic advisory services firm. Our unparalleled analytic insights and best practices guidance are based on our rigorous research-based benchmarking, business, technology and best practices services. Unique Combination of Capabilities • Members (85,000) and Reach to Professionals (3 million) • Research and Reach across all line of business functions and IT • Expertise Across Business and Technology • Conduct and Deliver Benchmark Research • Understand Business Domain and Processes • Develop Analytic and Best Practice Assessments • Formalized Research Coverage of Technology Vendors • Deliver Research on Technology Impact to Business
  • 12. Popularity Measured by Job Postings
  • 13. Research Objectives • Gauge both the adoption rate and intentions to use Hadoop • Determine which elements of the Hadoop ecosystem are the most popular, including which distributions, which components and which third-party products • Examine the infrastructures and strategies being used to deploy Hadoop • Clarify the role of the cloud in enterprise Hadoop deployments • Elucidate the components of the business case for Hadoop • Detail use of Hadoop across industries • Determine the barriers and obstacles to further adoption of Hadoop
  • 14. Respondent Demographics Participation by Region • North America: 69% • Asia Pacific: 16% • Europe: 7% • Central and South America: 3% • Middle East: 3% • Africa: 2% Company Size by Employee Count • Very Large: 35% • Large: 27% • Midsize: 24% • Small: 14% Total qualified responses: 163
  • 15. Touching Over Half The Big Data Audience Hadoop Usage • Currently in production: 22% • Plan to use within 12 months: 12% • Plan to use in 12-24 months: 3% • Still evaluating: 17% • No plans to use: 46% The first four groups together account for 54% of respondents.
  • 16. Hadoop Is Generally Additive Is your Hadoop deployment replacing another technology? • Yes: 37% • No: 63% Hadoop is supplementing other established technologies, with RDBMSs still the dominant technology being used or planned to be used by more than nine out of ten organizations.
  • 17. Hadoop Is Additive In More Than One Way Are there things you're able to do or plan to do with large-scale data technologies that you couldn't do before deployment? • Hadoop: 87% • Other: 52%
  • 18. Hadoop Is Additive In More Than One Way What are you able to do or what do you plan to do with large-scale data technologies that you couldn't do before deployment? • Analyze data at a greater level of detail: Hadoop 94%, Non-Hadoop 93% • Perform types of analytics that couldn't be done on large volumes of data before: Hadoop 88%, Non-Hadoop 71% • Keep more historical data (post-process): Hadoop 88%, Non-Hadoop 60% • Capture all of the source data that we are collecting (pre-process): Hadoop 82%, Non-Hadoop 47%
  • 19. What Types of Data? What types of large-scale data does your organization analyze? Hadoop is much more likely to be used for log and event data; much less likely to be used for transaction data. It’s also more likely to be used for text and multimedia. Most Common - Hadoop: • Application logs • Other types of event data • Other log files • Web logs • Transaction data • Network monitoring/traffic. Most Common - Others: • Customer/member data • Transaction data • Application logs • Online retail transactions • Network monitoring/traffic • Call detail records
  • 20. What Types of Data? Q28 What types of large-scale data does your organization analyze? (Hadoop / Non-Hadoop) • Customer/member data: 59% / 68% • Transactional data from applications (for…: 44% / 68% • Application logs: 69% / 37% • Other types of event data: 64% / 23% • Network monitoring/network traffic: 41% / 33% • Online retail transactions: 33% / 34% • Other log files: 51% / 26% • Call Detail Records: 28% / 32% • Web logs: 46% / 21% • Text data from social media and online…: 36% / 15% • Search logs: 36% / 11% • Trade/quote data: 18% / 15% • Intelligence/defense data: 18% / 11% • Multimedia (audio/video/images): 21% / 9% • Weather: 8% / 3% • Smartmeter data: 3% / 6% • Other (please specify): 3% / 5%
  • 21. What Types of Applications? What types of large-scale data applications is your organization running? (Hadoop / Non-Hadoop) • Query and reporting: 60% / 89% • Consolidation of multiple data sources for analysis: 63% / 71% • Custom/production application: 65% / 68% • Data preparation: 56% / 60% • Advanced analyses: 69% / 47% • Analysis or indexing of unstructured data: 46% / 32% • Data sandbox/data experimentation: 44% / 32% Hadoop is most often used for advanced analyses and is more likely to be used to analyze unstructured data and for data sandboxing than other technologies. It is less likely to be used for query and reporting.
  • 22. Where Sourced? From which source(s) did you access Hadoop software? • Apache: 63% • Cloudera: 55% • Amazon: 11% • IBM: 8% • Yahoo: 8% • Facebook: 5% • Other (please specify): 5% • Don't know: 5% The Apache Hadoop distribution is the most prevalent, followed closely by Cloudera. Nearly half the organizations are using more than one distribution.
  • 23. Which Components? Which Hadoop-related projects do you use or plan to use? • Hadoop Distributed File System…: 79% • MapReduce: 76% • HBase: 61% • Hive: 53% • ZooKeeper: 45% • Pig: 45% • Flume: 34% • Sqoop: 26% • Oozie: 18% • Avro: 16% • Don't know: 11%
  • 24. Hadoop Organizations are More Confident How confident are you in your organization's ability to manage large-scale data? • Hadoop: Very confident 43%, Confident 37%, Somewhat confident 18%, Not very confident 2% • Non-Hadoop: Very confident 23%, Confident 32%, Somewhat confident 35%, Not very confident 11%
  • 25. Report Higher Levels of Benefits Q27 What are the primary benefits of using your current technologies for analyzing large-scale data sets? (Hadoop / Non-Hadoop) • Allow us to retain and analyze more data: 79% / 71% • Increase the speed of analysis: 85% / 63% • Produce more accurate results: 51% / 66% • Reduce or eliminate manual processes: 64% / 56% • Cost savings - reduced implementation time/fees: 62% / 53% • Reduce the time required for data collection and preparation: 67% / 49% • Higher customer retention from better analysis of customer data: 54% / 54% • Utilize computing resources more efficiently: 72% / 46% • Cost savings - license fees: 82% / 40% • Reduce effort/staff required: 49% / 49% • We are able to create new products/services: 67% / 32% • Improved margins resulting from better algorithms: 41% / 30% • Improved clickthrough, cross-selling or upselling: 26% / 30%
  • 26. Research Can Help Answer Your Questions Is Hadoop a fad or here to stay? Which distributions/components are being used? • Apache? • Cloudera? • Other? Are your peers using Hadoop and for what purpose? Identify and avoid some of the obstacles to successful deployments.
  • 27. What Should You Do? Already using Hadoop? • Compare your usage with others • Are you using all the components you should be? • Have you considered all application areas? • Is your usage tactical (cost saving) or strategic (new capabilities)? Not Using or Evaluating Hadoop? • Consider whether you should be • Did your organization need some “proof”?
  • 28. Where to Get More Information Free webinar and report: • Ventana Research will host a webinar with the final results and analysis. • Report of our findings will be distributed by the sponsors and will be available on our website: www.VentanaResearch.com/HIM Contact us with questions: Ventana Research, 925-474-0060, info@ventanaresearch.com, www.ventanaresearch.com
  • 29. Q&A Ask questions using the Questions panel Tweet • #HadoopTrends • @dmenningervr • @Cloudera • @Karmasphere Thank you for participating!