What is an RDD in Spark
What is an RDD?
• RDD stands for Resilient Distributed Dataset.
• Spark revolves around the concept of the RDD, a fault-tolerant
collection of elements that can be operated on in parallel.
• There are two ways to create an RDD: by parallelizing an existing
collection in your driver program, or by referencing a dataset in an
external storage system such as HDFS, HBase, or any data source
offering a Hadoop InputFormat.
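To make the first creation route concrete, here is a minimal sketch in plain Python, not Spark code. The helper parallelize below is a hypothetical stand-in that only mimics the idea of splitting a driver-side collection into partitions; in PySpark the corresponding calls would be sc.parallelize(data, numSlices) for a local collection and sc.textFile(path) for an external dataset.

```python
def parallelize(data, num_slices):
    """Toy model: split a local collection into contiguous partitions.

    Hypothetical helper for illustration only; it is not Spark's API.
    """
    items = list(data)
    n = len(items)
    # Partition i holds the items from index i*n/num_slices up to (i+1)*n/num_slices.
    return [
        items[i * n // num_slices:(i + 1) * n // num_slices]
        for i in range(num_slices)
    ]


partitions = parallelize(range(1, 11), 3)
print(partitions)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
```

Each inner list plays the role of one partition that a separate task could process in parallel.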
RDDs and Their Operations
• There are two types of RDD operations in Spark:
1. Transformations
2. Actions
Transformations
• RDD transformations are functions that take an RDD as input and
produce one or more new RDDs as output.
• Because all RDDs are immutable, the original RDD is never
changed.
• Transformations are lazy: they define new RDDs, but nothing is
computed until an action is called.
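The immutability and laziness described above can be sketched with a toy class in plain Python. LazyDataset is a hypothetical stand-in for an RDD and is not part of Spark: map and filter only record a step and return a new object, while collect plays the role of an action that finally runs the recorded steps.

```python
class LazyDataset:
    """Toy stand-in for an RDD: immutable and lazily evaluated."""

    def __init__(self, source, steps=()):
        self._source = source        # original data, never modified
        self._steps = tuple(steps)   # recorded transformations

    def map(self, fn):
        # Transformation: return a NEW dataset with one more recorded step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: again only recorded, not executed.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self):
        # Action: run every recorded step and return the result.
        result = list(self._source)
        for kind, fn in self._steps:
            if kind == "map":
                result = [fn(x) for x in result]
            else:
                result = [x for x in result if fn(x)]
        return result


calls = []  # records when square() actually runs


def square(x):
    calls.append(x)
    return x * x


base = LazyDataset([1, 2, 3, 4])
even_squares = base.map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)   # 0: no work has been done yet

result = even_squares.collect()    # the action triggers the computation
print(result)                      # [4, 16]
print(base.collect())              # [1, 2, 3, 4], the original is unchanged
```

In this sketch square runs only once collect is called, which mirrors how Spark defers work until an action such as collect() or count().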
Types of RDD Transformations
• To improve performance, Spark can pipeline some transformations,
running them together in a single pass over the data.
• There are two kinds of transformations:
1. Narrow transformations
2. Wide transformations
Narrow Transformation
• Narrow transformations result from operations such as map()
and filter().
• Each output partition is computed from a single partition of
the parent RDD, so only a limited subset of partitions is needed
to compute the result and no data moves between partitions.
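A narrow dependency can be sketched in plain Python by modelling an RDD as a list of partitions. The helper map_partitions is hypothetical and only illustrates the idea: each output partition is built from exactly one parent partition, so the partitions could be processed independently.

```python
def map_partitions(partitions, fn):
    """Toy narrow transformation: apply fn to each partition on its own."""
    return [fn(part) for part in partitions]


parent = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]

# A map-style step followed by a filter-style step; neither moves data
# from one partition to another.
doubled = map_partitions(parent, lambda part: [x * 2 for x in part])
large = map_partitions(doubled, lambda part: [x for x in part if x > 10])

print(doubled)  # [[2, 4, 6], [8, 10, 12], [14, 16, 18, 20]]
print(large)    # [[], [12], [14, 16, 18, 20]]
```

The number of partitions stays the same and every element remains in the partition it started in, which is why such steps can be pipelined.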
Wide Transformation
• Wide transformations result from operations such as
groupByKey() and reduceByKey().
• Here, the data needed to build a single output partition may
reside in more than one partition of the parent RDD.
• They are also known as shuffle transformations, because data
must be shuffled across partitions.
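The shuffle behind a wide transformation can be sketched in plain Python as well. Both partition_for and reduce_by_key are hypothetical helpers, not Spark's API; the partitioner is a simple deterministic stand-in for a hash partitioner. Real Spark's reduceByKey also combines values inside each partition before shuffling, which this sketch leaves out.

```python
from collections import defaultdict
from functools import reduce


def partition_for(key, num_partitions):
    """Hypothetical deterministic partitioner (stand-in for hashing)."""
    return sum(key.encode("utf-8")) % num_partitions


def reduce_by_key(partitions, fn, num_partitions):
    """Toy wide transformation over a list of (key, value) partitions."""
    # Shuffle: every pair is routed to the output partition chosen by its
    # key, so one output partition can receive data from many parents.
    buckets = [defaultdict(list) for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            buckets[partition_for(key, num_partitions)][key].append(value)
    # Reduce: merge all values that share a key within each output partition.
    return [
        {key: reduce(fn, values) for key, values in bucket.items()}
        for bucket in buckets
    ]


parent = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5), ("a", 6)],
]

totals = reduce_by_key(parent, lambda x, y: x + y, num_partitions=2)
print(totals)  # [{'b': 7}, {'a': 10, 'c': 4}]
```

The values for key "a" start in all three parent partitions and end up in one output partition, which is the data movement that makes this kind of transformation wide.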
Thank You