Bank Customer
Churn Prediction
Leveraging Machine Learning for
Enhanced Customer Retention
Presented by: Saurav Singh
Introduction
• The Banking sector is evolving rapidly, shaped by
technological advancements, changing consumer preferences,
and a competitive market.
• Customer churn, which is the phenomenon of customers
discontinuing their relationship with a bank, poses unique
challenges and opportunities. Losing customers can seriously
reduce a bank's revenue and weaken its market standing.
• Machine learning, with its predictive capabilities, offers a
transformative approach to understanding and mitigating the
challenges posed by customer churn.
Through data-driven insights and predictive modeling, this presentation aims to showcase my
Machine Learning Capstone Project focused on predicting customer churn in the Banking Sector.
Dataset Information
Here are the key details about the dataset used in this project:
• Number of records: The dataset consists of 10,000 records. Each record
is a unique entry.
• Features/Columns: The dataset has 14 columns in total: 13 columns describing
the customer (identifiers, demographics, and account details) plus the target
column, Churned. These columns form the basis of our predictive modeling.
Column Names
• Row Number
• Customer ID
• Surname
• Credit Score
• Geography
• Gender
• Age
• Tenure
• Balance
• Number of Products
• Has Credit Card
• Is Active Member
• Estimated Salary
• Churned
Exploratory Data Analysis (EDA)
• Exploring the data allowed us to gain a comprehensive overview of
the data's structure. It uncovered potential patterns, helped us
identify key trends and get essential insights from the dataset.
• Throughout the EDA process, we analyzed the distribution of
individual features, investigated correlations, and explored any
inherent relationships between variables.
• Visualizations also played a crucial role in providing a clear
representation of the data, offering insights into customer behavior
and identifying the factors that may contribute to customer churn.
• First, we checked the dataset for null values and duplicate rows. There were none,
so the dataset was clean to begin with.
• Then, we checked our columns to see if they were providing any useful information for us
to work with. We found out that columns like “RowNumber”, “CustomerID” and “Surname”
weren't contributing much to the predictions. Hence, we decided to drop them during
preprocessing.
• The "Geography" and "Gender" columns in our dataset were categorical variables. For
them to work with our model, it was necessary to convert these categorical features into a
numerical format.
• To ensure consistent scales for numerical features, we decided to employ Standard Scaler
during preprocessing.
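The null-value and duplicate checks described above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual notebook: the four rows are made-up sample data, and only the column names follow the deck.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has 10,000 records.
df = pd.DataFrame({
    "RowNumber": [1, 2, 3, 4],
    "CustomerID": [101, 102, 103, 104],
    "Surname": ["A", "B", "C", "D"],
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Age": [42, 41, 42, 39],
    "Churned": [1, 0, 1, 0],
})

# Count missing values per column and fully duplicated rows.
null_counts = df.isnull().sum()
duplicate_rows = int(df.duplicated().sum())
print(null_counts)
print("Duplicate rows:", duplicate_rows)

# Non-numeric columns must be dropped or encoded before modeling.
non_numeric_cols = df.select_dtypes(exclude="number").columns.tolist()
print("Non-numeric columns:", non_numeric_cols)
```

On the real data, the same two checks would be run once right after loading the file.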
Exploratory Data Analysis (EDA)
Visualizations
Our target variable 'Churned' exhibits class
imbalance, with one class dominating the other.
This issue of data imbalance needs to be addressed.
The plot of customers by Geography reveals a substantial
customer presence in France, surpassing
other regions by a significant margin.
• The dataset contains more Male entries than Female entries.
• Credit card owners significantly outnumber customers who don't own a credit card.
• Credit Card owners have a higher Churn Rate than Non-Credit Card owners.
• Active and Inactive members are almost equal in number.
• Inactive members have a higher Churn Rate than Active members.
• The 601 to 700 Credit Score band contains more customers than any other band.
• The 31 to 40 Age Group contains more customers than any other Age Group.
Upon inspecting the heatmap, we can see that there is no significant correlation observed
among the columns. As a result, no columns will be dropped solely based on correlation.
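A correlation check like the one behind the heatmap can be sketched as below. The data is random and purely illustrative, and the 0.8 cutoff for "significant" correlation is an assumed value, not one stated in the deck.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical numeric columns filled with random values for illustration.
df = pd.DataFrame({
    "CreditScore": rng.integers(350, 851, size=200),
    "Age": rng.integers(18, 93, size=200),
    "Balance": rng.uniform(0, 250_000, size=200),
    "Churned": rng.integers(0, 2, size=200),
})

# Pairwise Pearson correlation between the numeric columns.
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))

# List pairs of distinct columns whose absolute correlation exceeds a cutoff.
threshold = 0.8  # assumed cutoff, not taken from the deck
high_pairs = [
    (a, b)
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]
print("Highly correlated pairs:", high_pairs)
```

An empty list of pairs corresponds to the conclusion above: nothing is dropped on the basis of correlation alone.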
Preprocessing
• First, the “RowNumber”, “CustomerID” and “Surname” columns were dropped as they
didn’t provide any useful information for our predictions.
• Then, we converted the categorical columns into numerical data with the help of the
One-Hot Encoding technique. It creates one binary (0/1) column for each unique class
present in a categorical column.
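A minimal sketch of these two preprocessing steps, assuming pandas is used. The rows are made-up sample data, and `pd.get_dummies` is one common way to one-hot encode; the deck does not say which function the project used or whether one dummy column per feature was dropped.

```python
import pandas as pd

# Hypothetical sample rows using the deck's column names.
df = pd.DataFrame({
    "RowNumber": [1, 2, 3, 4],
    "CustomerID": [101, 102, 103, 104],
    "Surname": ["A", "B", "C", "D"],
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Age": [42, 41, 42, 39],
    "Churned": [1, 0, 1, 0],
})

# Step 1: drop identifier columns that carry no predictive signal.
df = df.drop(columns=["RowNumber", "CustomerID", "Surname"])

# Step 2: one-hot encode the categorical columns into 0/1 indicator columns.
encoded = pd.get_dummies(df, columns=["Geography", "Gender"], dtype=int)
print(encoded.columns.tolist())
print(encoded)
```

Each category becomes its own column, for example `Geography_France`, holding 1 for rows in that category and 0 otherwise.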
Splitting the data into X and y
• In this step, we partitioned the dataset into two components: X and y.
• The variable X encompasses all independent variables, representing the features
that contribute to our predictions.
• On the other hand, y encapsulates the dependent variable or target variable,
serving as the outcome we aim to predict.
Train-Test Split
• We then split the dataset into training data and testing data.
• We did an 80:20 split, meaning 80% of our data is Training Data and 20% of our data is
Testing Data. So, our test size was set to 0.2.
• We set Random State to 123. This guaranteed the reproducibility of our results across
different runs.
• We also used Stratify = y so that the training and testing sets keep the same proportion
of churned and non-churned customers as the full dataset.
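The X/y separation and the stratified 80:20 split can be sketched with scikit-learn's `train_test_split`, using the parameters named above (test size 0.2, random state 123, stratify on y). The DataFrame is random stand-in data, not the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
# Hypothetical stand-in data with roughly 20% churners.
df = pd.DataFrame({
    "CreditScore": rng.integers(350, 851, size=n),
    "Age": rng.integers(18, 93, size=n),
    "Balance": rng.uniform(0, 250_000, size=n),
    "Churned": (rng.random(n) < 0.2).astype(int),
})

# X holds the independent variables, y the target.
X = df.drop(columns=["Churned"])
y = df["Churned"]

# 80:20 split; stratify=y keeps the churn rate the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123, stratify=y
)

print(X_train.shape, X_test.shape)
print("Churn rate (train):", round(y_train.mean(), 3))
print("Churn rate (test): ", round(y_test.mean(), 3))
```

The two printed churn rates should be nearly identical, which is the effect of stratification.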
Standard Scaler
• We used Standard Scaler to standardize the features of the dataset.
• Standardization rescales each feature to zero mean and unit variance, so that all
features are on a consistent scale.
• This is crucial for certain machine learning algorithms, promoting optimal
model performance by mitigating the influence of varying magnitudes among features.
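A short sketch of standardization with scikit-learn's `StandardScaler`, which applies z = (x - mean) / std to each feature. The arrays are random placeholders. Fitting the scaler on the training data only and reusing it on the test data is a common practice assumed here; the deck does not state how the scaler was fitted.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two hypothetical features on very different scales
# (a credit-score-like column and a balance-like column).
X_train = np.column_stack([
    rng.uniform(350, 850, size=800),
    rng.uniform(0, 250_000, size=800),
])
X_test = np.column_stack([
    rng.uniform(350, 850, size=200),
    rng.uniform(0, 250_000, size=200),
])

scaler = StandardScaler()
# Learn mean and standard deviation from the training data, then scale it.
X_train_scaled = scaler.fit_transform(X_train)
# Reuse the training statistics on the test data.
X_test_scaled = scaler.transform(X_test)

print("Train means:", X_train_scaled.mean(axis=0).round(6))
print("Train stds: ", X_train_scaled.std(axis=0).round(6))
```

After scaling, both training features have mean 0 and standard deviation 1, so neither dominates distance-based models such as SVC.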
Over-Sampling with SMOTE
• We had data imbalance within our target variable. Initially, we evaluated our model's
accuracy in the presence of this imbalance.
• Then, to rectify the issue of imbalance, we implemented the Synthetic Minority Over-
Sampling Technique (SMOTE) as an oversampling method.
• We then compared the model accuracies before and after addressing the data imbalance using
SMOTE, providing valuable insights into the impact of this preprocessing technique.
• Distribution of our y_train before oversampling:
  Not Churned   Churned
  6370          1630
• Distribution of our y_train after oversampling:
  Not Churned   Churned
  6370          6370
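To show what SMOTE does under the hood, here is a simplified NumPy sketch of its core idea: each synthetic minority sample is placed at a random point on the line between a minority sample and one of its k nearest minority neighbours. The function name `smote_sketch`, the toy data, and k = 5 are illustrative assumptions, not the project's code.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances among the minority samples.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])

rng = np.random.default_rng(123)
X_majority = rng.normal(0.0, 1.0, size=(60, 2))  # toy "Not Churned" rows
X_minority = rng.normal(2.0, 1.0, size=(15, 2))  # toy "Churned" rows

# Generate enough synthetic rows to match the majority class size.
X_synth = smote_sketch(X_minority, n_new=len(X_majority) - len(X_minority))
X_minority_balanced = np.vstack([X_minority, X_synth])
print(len(X_majority), len(X_minority_balanced))
```

In practice, the imbalanced-learn library offers this technique as `SMOTE` in `imblearn.over_sampling`, applied with `fit_resample(X_train, y_train)`; whether the project used that library is an assumption. Oversampling is applied to the training data only, which matches the y_train counts shown above.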
Applying Machine
Learning Algorithms
This Bank Customer Churn problem we have here is a Binary Classification problem.
Models used:
• Logistic Regression: Logistic Regression is a powerful tool in binary classification. It is very good at modeling
the probability of an event occurring, making it suitable for scenarios where understanding the likelihood of
customers churning is essential.
• Support Vector Machine (SVC) : Support Vector Classification is a robust algorithm employed for classification
tasks, especially when there's a need for clear separation between classes. In the context of customer churn
prediction, it draws distinct decision boundaries between loyal and potential churned customers.
• Naive Bayes: Naive Bayes is a probabilistic classification algorithm known for its simplicity and efficiency. It
assumes that features are independent of each other given the class, making calculations easier. It is often used
when simplicity and speed are crucial.
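The three models can be trained and compared with scikit-learn roughly as follows. This is a sketch on synthetic data from `make_classification`, so the printed scores will not match the project's results. Default hyperparameters and `GaussianNB` as the Naive Bayes variant are assumptions, since the deck does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic imbalanced binary data standing in for the churn dataset.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=123
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123, stratify=y
)

# Standardize using statistics learned from the training split.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

models = {
    "LOGI": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "NB": GaussianNB(),  # assumed variant; the deck only says "Naive Bayes"
}

# Fit each model and record its accuracy on the held-out test split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {accuracies[name]:.3f}")
```

Keeping the models in a dictionary makes it easy to run the same loop again on the oversampled training data and compare the two sets of scores.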
Evaluation Metrics (all values in %)
Without Oversampling (SMOTE)
Model     Accuracy  Precision  Recall  F1-Score
LOGI      81.2      59.62      23.58   33.80
SVC       86.5      80.44      44.47   57.27
NB        82.1      59.53      37.59   46.08
With Oversampling (SMOTE)
Model     Accuracy  Precision  Recall  F1-Score
LOGI_OS   70.75     37.42      65.11   47.53
SVC_OS    80.75     51.88      74.44   61.15
NB_OS     71.70     38.91      68.55   49.64
• We can see that Oversampling makes a huge difference.
• After Oversampling, Accuracy dropped by roughly 6 to 10 percentage points and Precision
by roughly 21 to 29 points, but Recall rose by roughly 30 to 42 points and F1-Score
improved for every model.
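For reference, Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1-Score is their harmonic mean, 2PR / (P + R). The short check below recomputes F1 from the Precision and Recall values reported in the tables above; the helper name `f1_from` is illustrative. Small differences in the second decimal are expected because the inputs are already rounded.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %, reported F1 %) copied from the evaluation tables.
reported = {
    "LOGI":    (59.62, 23.58, 33.80),
    "SVC":     (80.44, 44.47, 57.27),
    "NB":      (59.53, 37.59, 46.08),
    "LOGI_OS": (37.42, 65.11, 47.53),
    "SVC_OS":  (51.88, 74.44, 61.15),
    "NB_OS":   (38.91, 68.55, 49.64),
}

for name, (p, r, f1) in reported.items():
    print(f"{name:8s} recomputed F1 = {f1_from(p, r):6.2f}   reported = {f1:6.2f}")
```

Because F1 is a harmonic mean, it stays low whenever either Precision or Recall is low, which is why the large Recall gains after oversampling raise F1 even though Precision falls.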
Model Selection and Considerations
• SVC outperforms Logistic Regression and Naive Bayes in all metrics, both with and
without oversampling, demonstrating higher Accuracy, Precision, Recall, and F1-Score.
It seems to be a promising model for our task.
• Based on the provided metrics, SVC stands out as the best-performing model overall. It
achieves a good balance between precision and recall, making it suitable for our
customer churn prediction task.
• While metrics like Accuracy and Precision are essential, Recall is particularly crucial in
Customer Churn Prediction, as it indicates the ability to identify customers who are
likely to Churn. Support Vector Classification gave us the best Recall value
(74.44% with SMOTE).
• Hence, we will go with Support Vector Classification as our final model as it is quite
evident that it performs best for our Bank Customer Churn problem.
Conclusion
• With the help of several insights, patterns and trends in our data, we’ve used Machine Learning to
address the intricate challenge of predicting Customer Churn.
• This project offers significant benefits to banks:
  - By predicting potential churners, banks can adopt proactive strategies to retain valuable
    customers. This involves personalized interventions, loyalty programs, and targeted
    communication to address customer concerns and enhance satisfaction.
  - By focusing efforts on customers at a higher risk of churn, banks can streamline operations,
    reduce marketing costs, and improve overall efficiency.
  - Anticipating and mitigating customer churn contributes directly to revenue optimization.
  - Understanding the factors influencing customer churn enables banks to tailor their services to
    meet individual needs. This level of personalization fosters stronger customer relationships,
    increases loyalty, and enhances the overall banking experience.
Thank You !

 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 

Bank Customer Churn Prediction- Saurav Singh.pptx

  • 1.
  • 2. Bank Customer Churn Prediction Leveraging Machine Learning for Enhanced Customer Retention Presented by : Saurav Singh
  • 3. Introduction
      • The banking sector is evolving rapidly, shaped by technological advancements, changing consumer preferences, and a competitive market.
      • Customer churn, the phenomenon of customers discontinuing their relationship with a bank, poses unique challenges and opportunities. Losing customers can seriously affect a bank's revenue and market standing.
      • Machine learning, with its predictive capabilities, offers a transformative approach to understanding and mitigating customer churn.
      Through data-driven insights and predictive modeling, this presentation showcases my Machine Learning Capstone Project focused on predicting customer churn in the banking sector.
  • 4. Dataset Information
      Here are the key details about the dataset used in this project:
      • Number of records: The dataset consists of 10,000 records. Each record represents a unique entry.
      • Features/Columns: The dataset has 14 columns in total, including the target column (Churned). They describe customer behavior, preferences, and interactions, and form the basis of our predictive modeling.
      Column Names
      • Row Number
      • Customer ID
      • Surname
      • Credit Score
      • Geography
      • Gender
      • Age
      • Tenure
      • Balance
      • Number of Products
      • Has Credit Card
      • Is Active Member
      • Estimated Salary
      • Churned
  • 5. Exploratory Data Analysis (EDA)
      • Exploring the data allowed us to gain a comprehensive overview of the data's structure. It uncovered potential patterns, helped us identify key trends and get essential insights from the dataset.
      • Throughout the EDA process, we analyzed the distribution of individual features, investigated correlations, and explored any inherent relationships between variables.
      • Visualizations also played a crucial role in providing a clear representation of the data, offering insights into customer behavior and identifying the factors that may contribute to customer churn.
  • 6. Exploratory Data Analysis (EDA)
      • First, we checked the dataset for null values and duplicate rows. Luckily, there weren't any, so our dataset was clean to begin with.
      • Then, we checked whether each column provided useful information to work with. We found that columns like “RowNumber”, “CustomerID” and “Surname” weren't contributing much to the predictions, so we decided to drop them during preprocessing.
      • The "Geography" and "Gender" columns in our dataset were categorical variables. For them to work with our model, it was necessary to convert these categorical features into a numerical format.
      • To ensure consistent scales for numerical features, we decided to employ Standard Scaler during preprocessing.
  • 7. Visualizations Our target variable 'Churned' exhibits class imbalance, with one class dominating the other. This issue of data imbalance needs to be addressed. The above plot reveals a substantial customer presence in France, surpassing other regions by a significant margin.
  • 8.
      • The dataset contains more Male entries than Female entries.
      • The number of credit card owners is significantly higher than those who don’t own a credit card.
      • Credit Card owners have a higher Churn Rate than Non-Credit Card owners.
      • The distribution of Active and Inactive members is almost the same.
      • Inactive members have a higher Churn Rate than Active members.
  • 9.
      • Customers with a Credit Score between 601 and 700 form the largest Credit Score group.
      • Customers aged 31 to 40 form the largest Age Group.
  • 10. Upon inspecting the heatmap, we can see that there is no significant correlation observed among the columns. As a result, no columns will be dropped solely based on correlation.
  • 11. Preprocessing
      • First, “RowNumber”, “CustomerID” and “Surname” columns were dropped as they didn’t provide any useful information for our predictions.
      • Then, we encoded the Categorical data into Numerical data with the help of the One-Hot Encoding technique. It assigns binary numeric values to each unique class present in columns with categorical data.
      Splitting the data into X and y
      • In this step, we partitioned the dataset into two components: X and y.
      • The variable X encompasses all independent variables, representing the features that contribute to our predictions.
      • On the other hand, y encapsulates the dependent variable or target variable, serving as the outcome we aim to predict.
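The preprocessing steps on slide 11 can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the three sample rows are made up, the spellings "CreditScore" and "Age" are assumed, and in practice a library routine such as pandas `get_dummies` would typically do the encoding (the slides do not say which tool was used).

```python
# Toy rows; column names follow the slides, values are invented for illustration.
rows = [
    {"RowNumber": 1, "CustomerID": 101, "Surname": "A", "CreditScore": 619,
     "Geography": "France", "Gender": "Female", "Age": 42, "Churned": 1},
    {"RowNumber": 2, "CustomerID": 102, "Surname": "B", "CreditScore": 608,
     "Geography": "Spain", "Gender": "Female", "Age": 41, "Churned": 0},
    {"RowNumber": 3, "CustomerID": 103, "Surname": "C", "CreditScore": 502,
     "Geography": "Germany", "Gender": "Male", "Age": 35, "Churned": 1},
]

DROP = ("RowNumber", "CustomerID", "Surname")   # identifier columns
CATEGORICAL = ("Geography", "Gender")           # columns to one-hot encode
TARGET = "Churned"


def one_hot(rows, columns):
    """Replace each categorical column with one 0/1 column per category."""
    categories = {c: sorted({r[c] for r in rows}) for c in columns}
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k not in columns}
        for c in columns:
            for cat in categories[c]:
                new[f"{c}_{cat}"] = 1 if r[c] == cat else 0
        encoded.append(new)
    return encoded


# 1. Drop the identifier columns.
cleaned = [{k: v for k, v in r.items() if k not in DROP} for r in rows]
# 2. One-hot encode the categorical columns.
encoded = one_hot(cleaned, CATEGORICAL)
# 3. Split into features X and target y.
X = [{k: v for k, v in r.items() if k != TARGET} for r in encoded]
y = [r[TARGET] for r in encoded]
```

Each row of `X` now holds only numeric values, with columns such as `Geography_France` and `Gender_Male` in place of the original text columns.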
  • 12. Train-Test Split
      • We then split the dataset into training data and testing data.
      • We did an 80:20 split, meaning 80% of our data is Training Data and 20% of our data is Testing Data. So, our test size was set to 0.2.
      • We set Random State to 123. This guaranteed the reproducibility of our results across different runs.
      • We also used Stratify = y so that the class proportions of our Target Variable (y) are preserved in both the training and testing data.
      Standard Scaler
      • We used Standard Scaler to standardize the features of the dataset, rescaling each feature to zero mean and unit variance.
      • This put all features on a comparable scale.
      • Standardization is crucial for certain machine learning algorithms, promoting optimal model performance by mitigating the influence of varying magnitudes among features.
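The split and scaling steps on slide 12 can be sketched with the standard library alone. The function names and toy data below are hypothetical, and fitting the scaler on the training data only is an assumption (a common practice; the slides do not state it).

```python
import random
from statistics import mean, pstdev


def stratified_split(X, y, test_size=0.2, seed=123):
    """Split so that each class keeps the same share in train and test."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_size)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


def fit_scaler(X):
    """Per-column mean and population standard deviation."""
    cols = list(zip(*X))
    means = [mean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]   # guard against constant columns
    return means, stds


def transform(X, means, stds):
    """Standardize: subtract the mean, divide by the standard deviation."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


# Toy data: two features on very different scales, 40 vs 10 class counts.
X = [[float(i), i * 1000.0] for i in range(50)]
y = [0] * 40 + [1] * 10

X_train, X_test, y_train, y_test = stratified_split(X, y)
means, stds = fit_scaler(X_train)             # statistics from training data only
X_train_scaled = transform(X_train, means, stds)
X_test_scaled = transform(X_test, means, stds)
```

With these toy counts, the training set keeps 32 and 8 samples of the two classes and the test set keeps 8 and 2, so the 80:20 class ratio is the same on both sides.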
  • 13. Over-Sampling with SMOTE
      • We had data imbalance within our target variable. Initially, we evaluated our model's accuracy in the presence of this imbalance.
      • Then, to rectify the issue of imbalance, we implemented the Synthetic Minority Over-Sampling Technique (SMOTE) as an oversampling method.
      • We then compared the model accuracies before and after addressing the data imbalance using SMOTE, providing valuable insights into the impact of this preprocessing technique.
      • Distribution of our y_train before oversampling:
            Not Churned    Churned
            6370           1630
      • Distribution of our y_train after oversampling:
            Not Churned    Churned
            6370           6370
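To make the idea behind SMOTE on slide 13 concrete, here is a simplified standard-library sketch. It is not the library implementation the project presumably used; the function name, the neighbour count `k`, and the toy points are all assumptions. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. For the counts on the slide, balancing the classes would require 6370 - 1630 = 4740 synthetic samples.

```python
import math
import random


def smote(X_min, n_new, k=5, seed=123):
    """Create n_new synthetic minority samples by interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        # k nearest minority neighbours of the chosen sample
        neighbours = sorted(
            (p for p in X_min if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()                    # position along the segment
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy training data: 10 majority samples and 4 minority samples.
X_major = [[float(i), float(i % 3)] for i in range(10)]
X_minor = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5]]

n_needed = len(X_major) - len(X_minor)        # samples needed to balance
X_new = smote(X_minor, n_needed, k=2)
X_minor_balanced = X_minor + X_new
```

Only the training data is oversampled here, matching the slide, which reports the distribution of y_train; the test data keeps its original class balance.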
  • 14. Applying Machine Learning Algorithms
      This Bank Customer Churn problem is a Binary Classification problem.
      Models used:
      • Logistic Regression: Logistic Regression is a powerful tool in binary classification. It's very good at modeling the probability of an event occurring, making it suitable for scenarios where understanding the likelihood of customers churning is essential.
      • Support Vector Classification (SVC): Support Vector Classification is a robust algorithm employed for classification tasks, especially when there's a need for clear separation between classes. In the context of customer churn prediction, it draws distinct decision boundaries between loyal customers and potential churners.
      • Naive Bayes: Naive Bayes is a probabilistic classification algorithm known for its simplicity and efficiency. It assumes that features are independent, making calculations easier. It's often used when simplicity and speed are crucial.
  • 15. Evaluation Metrics
      All values are percentages.
      Without Oversampling (SMOTE)
            Model      Accuracy   Precision   Recall   F1-Score
            LOGI       81.2       59.62       23.58    33.80
            SVC        86.5       80.44       44.47    57.27
            NB         82.1       59.53       37.59    46.08
      With Oversampling (SMOTE)
            Model      Accuracy   Precision   Recall   F1-Score
            LOGI_OS    70.75      37.42       65.11    47.53
            SVC_OS     80.75      51.88       74.44    61.15
            NB_OS      71.70      38.91       68.55    49.64
      • We can see that Oversampling makes a huge difference.
      • After Oversampling, the Accuracy and Precision of every model decreased (for SVC, Precision fell from 80.44 to 51.88), while Recall and F1-Score increased (for SVC, Recall rose from 44.47 to 74.44).
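As a reminder of how the metrics on slide 15 relate to each other, the sketch below defines Precision, Recall and F1-Score from confusion-matrix counts and recomputes F1 from the Precision and Recall values reported on the slide. The helper names and the toy counts are hypothetical; only the tabulated percentages come from the slide.

```python
def precision_recall_f1(tp, fp, fn):
    """Metrics for the positive (Churned) class from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted churners who really churned
    recall = tp / (tp + fn)      # share of actual churners that were caught
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as percentages, taken from the slide.
reported = {
    "LOGI": (59.62, 23.58, 33.80),
    "SVC": (80.44, 44.47, 57.27),
    "NB": (59.53, 37.59, 46.08),
    "LOGI_OS": (37.42, 65.11, 47.53),
    "SVC_OS": (51.88, 74.44, 61.15),
    "NB_OS": (38.91, 68.55, 49.64),
}

for name, (p, r, f1) in reported.items():
    print(f"{name:8s} recomputed F1 = {f1_from(p, r):.2f}  (slide: {f1:.2f})")

# Made-up counts, for illustration only: 3 true positives,
# 1 false positive, 2 false negatives.
toy_precision, toy_recall, toy_f1 = precision_recall_f1(tp=3, fp=1, fn=2)
```

The recomputed values agree with the slide to within rounding, which confirms that the reported F1-Scores are the harmonic means of the reported Precision and Recall.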
  • 16. Model Selection and Considerations
      • SVC outperforms Logistic Regression and Naive Bayes in all metrics, both with and without oversampling, demonstrating higher Accuracy, Precision, Recall, and F1-Score. It seems to be a promising model for our task.
      • Based on the provided metrics, SVC stands out as the best-performing model overall. It achieves a good balance between precision and recall, making it suitable for our customer churn prediction task.
      • While metrics like Accuracy and Precision are essential, Recall is particularly crucial in Customer Churn Prediction, as it indicates the ability to identify customers who are likely to Churn. Support Vector Classification gave us the best Recall value.
      • Hence, we will go with Support Vector Classification as our final model, as it performs best for our Bank Customer Churn problem.
  • 17. Conclusion
      • With the help of several insights, patterns and trends in our data, we’ve used Machine Learning to address the intricate challenge of predicting Customer Churn.
      • This project offers significant benefits to banks:
          • By predicting potential churners, banks can adopt proactive strategies to retain valuable customers. This involves personalized interventions, loyalty programs, and targeted communication to address customer concerns and enhance satisfaction.
          • By focusing efforts on customers at a higher risk of churn, banks can streamline operations, reduce marketing costs, and improve overall efficiency.
          • Anticipating and mitigating customer churn contributes directly to revenue optimization.
          • Understanding the factors influencing customer churn enables banks to tailor their services to meet individual needs. This level of personalization fosters stronger customer relationships, increases loyalty, and enhances the overall banking experience.