DA 5230 – Statistical & Machine Learning
Lecture 2 – Introduction to Statistical Learning
Maninda Edirisooriya
manindaw@uom.lk
Machine Learning Overview
Source: https://en.wikipedia.org/wiki/Deep_learning#/media/File:AI-ML-DL.svg
Machine Learning Overview
• Intelligence: Understanding nature in order to generate useful information
• Artificial Intelligence (AI): Mimicking the intelligence of
animals/humans with man-made machines
• Machine Learning (ML): Having machines consume data to achieve
Artificial Intelligence
• Deep Learning (DL): Machine Learning using multiple layers of
nature-inspired neurons (in Deep Neural Networks)
AI vs ML
• AI may consist of theory-based and rule-based intelligence
• Expert Systems
• Control Systems
• Algorithms
• And Machine Learning Systems
• ML is developed mainly from available data, whereas AI can also be
developed without any data by using a fixed set of rules
• ML systems are almost free from fixed rules added by experts; the
data designs the system
• Less domain knowledge is required
• An ML model is not a set of hand-written if-else statements (a common misconception)
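The contrast between a fixed expert rule and a system designed by data can be sketched as follows. This is only an illustration: the temperature readings are made up, and the 38.0 threshold and both function names are assumptions, not part of the lecture.

```python
# Made-up readings for illustration: (body temperature in Celsius, has_fever label).
readings = [(36.5, False), (36.9, False), (37.2, False),
            (38.1, True), (38.6, True), (39.4, True)]


def rule_based_fever(temp_c):
    """Rule-based system: an expert fixes the threshold by hand (assumed 38.0)."""
    return temp_c >= 38.0


def learn_threshold(data):
    """Data-driven system: pick the observed temperature that, used as a
    threshold, misclassifies the fewest labeled examples."""
    best_threshold, best_errors = None, len(data) + 1
    for threshold in sorted(t for t, _ in data):
        errors = sum((t >= threshold) != label for t, label in data)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


learned = learn_threshold(readings)
print("expert rule threshold: 38.0, learned threshold:", learned)
```

In the first function the expert supplies the rule. In the second, the rule comes out of the labeled data, so different data would give a different threshold.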
What is Statistical Learning (SL)?
• Uses statistics to understand nature through data
• Has well-established, mathematically proven methods, while ML can
sometimes be a form of alchemy with data, where the focus is more on
results
• Is the basis of ML, although the statistics behind some ML models
may not have been well studied yet
• Has higher interpretability, as it is backed by mathematics
• Has a blurred boundary with ML
SL vs ML
Focus
• Statistical Learning: Primarily focuses on understanding and modeling the
relationships between variables in data using statistical methods. It aims to
make inferences and predictions based on these relationships.
• Machine Learning: A broader field that encompasses various techniques for
building predictive models and making decisions without being overly concerned
with the underlying statistical assumptions. It is often used for tasks such as
classification, regression, clustering, and more.
Foundation
• Statistical Learning: Rooted in statistical theory and often uses classical
statistical techniques like linear regression, logistic regression, and
analysis of variance.
• Machine Learning: Draws from a wider range of techniques, including
traditional statistics but also incorporates methods like decision trees,
support vector machines, neural networks, and more. It is less reliant on
statistical theory and more focused on empirical performance.
Assumptions
• Statistical Learning: Methods often make explicit assumptions about the
underlying data distribution, such as normality or linearity. These assumptions
help in making inferences about population parameters.
• Machine Learning: Models are often designed to be more flexible and adaptive,
which can make them less reliant on strict data distribution assumptions.
Interpretability
• Statistical Learning: Models tend to be more interpretable, meaning it is
easier to understand how the model arrives at its predictions. This
interpretability is important in fields where understanding the underlying
relationships is crucial.
• Machine Learning: While interpretability can be a concern in some machine
learning models (e.g., deep neural networks), many machine learning models are
designed with a primary focus on predictive accuracy rather than
interpretability.
Course Structure
• Machine Learning will be the main focus
• You should be able to do ML stuff yourself from the available data
• You should be familiar with every phase of the ML lifecycle
• Statistical background will be explained depending on your progress
with the above requirements
• ML will be first taught with simpler mathematics and intuition and
then will be explained with statistical fundamentals
• You will first be able to work on ML projects and then the theory
behind it will be learned with statistics
For Your Reference
• Machine Learning can be self-learned with the free course
https://www.coursera.org/specializations/machine-learning-introduction
• You can learn more about Statistical Learning from the free book about
Python based SL at https://www.statlearning.com
• Learn Python, NumPy, Pandas and scikit-learn from online tutorials and
YouTube videos
• You can also clarify tricky ML/SL problems with ChatGPT
• However, note that some online tutorials, videos and ChatGPT may provide
incorrect information, so be careful when learning from
these resources
• Never use ChatGPT for answering Quizzes or Exams! (at least until the AI
takes over the world)
What do we want from Machine Learning?
• Say we have some collected data
• We want a computer/machine to learn from that data and capture its insights
in a model
• We expect to use that model to predict/make inferences on newly provided data
• This is like teaching a kid a certain pattern from example pictures and later asking
them to draw/classify similar pictures
• After the model is built (known as “trained”), we want to make sure it has
learned the insights with sufficient accuracy
• For that, we train the model with only a part of the given data and use the
remaining data to check (known as “test”) the accuracy of the model
• The model will be used for our needs (to predict/make inferences) only if the tests
pass. Otherwise, we have to revisit the problem and may have to start again from
data collection
What do we do in Machine Learning?
• We find a dataset
• In Supervised ML we have labeled data (i.e., the data has both X values and Y values)
• In Un-supervised ML we have un-labeled data (i.e., the data has only X values and no Y
values)
• We select a suitable ML algorithm for modeling (e.g., Linear Regression)
• We train a model with most of the data (say 80% of the total data) using
that algorithm
• We test (check the accuracy of) the trained model with the remaining data
(say 20% of the total data)
• If the tests pass (i.e., the trained model is accurate enough), we can
use the model to label more un-labeled data (in supervised ML) or to make
inferences on more data (in unsupervised ML)
• Otherwise, we have to iterate the above process until the tests pass
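The steps above can be sketched in plain Python as follows. The dataset is synthetic and generated only for illustration, the `fit_line` helper and the `ACCEPTABLE_MAE` pass criterion are assumptions, and in practice a library such as scikit-learn would normally do the splitting and fitting.

```python
import random

# Synthetic labeled data, generated only for illustration: y is roughly 2*x + 1.
random.seed(0)
data = [(x, 2.0 * x + 1.0 + random.uniform(-0.5, 0.5)) for x in range(50)]

# Shuffle, then keep 80% for training and hold out 20% for testing.
random.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept (simple linear regression)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Train on the 80% part only.
slope, intercept = fit_line(train)

# Test: mean absolute error of the predictions on the held-out 20%.
test_mae = sum(abs((slope * x + intercept) - y) for x, y in test) / len(test)

ACCEPTABLE_MAE = 1.0  # hypothetical pass criterion, chosen for this sketch
tests_passed = test_mae < ACCEPTABLE_MAE
print(f"slope={slope:.3f} intercept={intercept:.3f} "
      f"test MAE={test_mae:.3f} passed={tests_passed}")
```

If `tests_passed` were false, the process would be repeated, for example with another algorithm or more data.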
Supervised Machine Learning
• Now, let’s look in more detail at Supervised Machine Learning
• There are two types of fields/variables/parameters in a Supervised
ML dataset
1. Independent variables/features/predictors/X values
2. Dependent variable/target variable/response/Y value
• A dataset contains a set of records, where each record holds
a certain set of X values and one Y value
• E.g.:
X1 - GPA   X2 - income   X3 - IQ   Y - life_expectancy
3.41       3000          105       72    (given for training/testing)
2.32       1800          86        65    (given for training/testing)
3.82       6000          130       86    (given for training/testing)
3.56       4800          112       ?     (need to predict)
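A minimal sketch of how the example records could be held in Python and separated into X values and Y values. The variable names are assumptions for illustration, and `None` stands for the unknown Y value.

```python
# The four example records; None marks the Y value that still has to be predicted.
records = [
    {"GPA": 3.41, "income": 3000, "IQ": 105, "life_expectancy": 72},
    {"GPA": 2.32, "income": 1800, "IQ": 86, "life_expectancy": 65},
    {"GPA": 3.82, "income": 6000, "IQ": 130, "life_expectancy": 86},
    {"GPA": 3.56, "income": 4800, "IQ": 112, "life_expectancy": None},
]

FEATURES = ["GPA", "income", "IQ"]  # independent variables (X values)
TARGET = "life_expectancy"          # dependent variable (Y value)

# Labeled records are available for training/testing; the rest need a prediction.
labeled = [r for r in records if r[TARGET] is not None]
unlabeled = [r for r in records if r[TARGET] is None]

X = [[r[f] for f in FEATURES] for r in labeled]
y = [r[TARGET] for r in labeled]
X_new = [[r[f] for f in FEATURES] for r in unlabeled]

print(X, y, X_new)
```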
Supervised Machine Learning
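[Diagram: the supervised learning workflow applied to the example records]
1. Training: the labeled records (GPA 3.41, income 3000, IQ 105, life_expectancy 72)
and (GPA 2.32, income 1800, IQ 86, life_expectancy 65) are used to train the ML model
2. Testing: the labeled record (GPA 3.82, income 6000, IQ 130, life_expectancy 86)
is used to test the trained ML model (Accuracy = 80%)
3. Predicting: the trained ML model takes the new X values (GPA 3.56, income 4800,
IQ 112) and predicts Y (life_expectancy) = 76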
Supervised Machine Learning
• You are asked to train a model that identifies how $X_1$, $X_2$, $X_3$ relate to $Y$,
as defined by a function $f$
• Where $Y = f(X_1, X_2, X_3)$, or simply $Y = f(X)$
• Once the model is trained, it provides an estimator of $f$, named $\hat{f}$, which
is not the exact $f$, as the model is only an approximation of the true $f$
• When predicting Y values for new X data, it will generate $\hat{Y}$, an estimator
of $Y$, due to $\hat{f}$
• Because of this mismatch (i.e., $\hat{Y} \neq Y$) there will be an error $\varepsilon$
• Now the trained model will be $\hat{f}(X)$ where
$\hat{f}(X) = \hat{Y} = f(X) + \varepsilon$
• $\hat{f}(X)$: approximated model function
• $\hat{Y}$: predicted values from the model
• $f(X)$: true function to be approximated
• $\varepsilon$: model’s error
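A small numeric sketch of this relation. Both functions here are assumptions chosen only for illustration: in a real problem the true f is unknown, and the estimator would come from training rather than being written by hand.

```python
def f(x):
    """True function (normally unknown; a hypothetical choice for this sketch)."""
    return x ** 2


def f_hat(x):
    """Approximated model function (a hypothetical straight-line estimator)."""
    return 2 * x


xs = [0, 1, 2, 3]
Y = [f(x) for x in xs]            # true values
Y_hat = [f_hat(x) for x in xs]    # predicted values from the model

# Model's error, following the slide's convention: f_hat(X) = f(X) + epsilon.
errors = [y_hat - y for y_hat, y in zip(Y_hat, Y)]

for x, y, y_hat, eps in zip(xs, Y, Y_hat, errors):
    print(f"x={x}  f(x)={y}  f_hat(x)={y_hat}  epsilon={eps}")
```

The error is zero where the approximation happens to coincide with the true function and grows where the two diverge.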
Supervised Machine Learning
• There are mainly two types of Supervised Machine Learning problems
• Regression problems
• Classification problems
• The difference comes from the data type of the value we are going to predict (Y)
• If Y is a continuous number, such as temperature or length, it is a
regression problem
• If Y is a discrete value from a finite set of categories, such as gender or country,
it is a classification problem
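The distinction can be sketched as a rough check on the Y values. `problem_type` is a hypothetical helper, not a library function, and the heuristic is deliberately crude: classes coded as integers (e.g., 0/1/2) would be misread as regression targets.

```python
def problem_type(y_values):
    """Guess the supervised problem type from the Python type of the Y values."""
    # Text labels and Booleans are treated as categories.
    if all(isinstance(v, (str, bool)) for v in y_values):
        return "classification"
    # Plain numbers are treated as continuous quantities (bool is excluded
    # explicitly because it is a subclass of int in Python).
    if all(isinstance(v, (int, float)) and not isinstance(v, bool)
           for v in y_values):
        return "regression"
    raise ValueError("mixed or unsupported target types")


print(problem_type([21.5, 30.2, 18.9]))   # temperatures
print(problem_type(["LK", "IN", "JP"]))   # countries
```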
Supervised Machine Learning – Example 1
• Problem: A real estate company wants to estimate the sale price of a house,
given the following details of the last 100 houses sold, together with their
sale prices,
• Area of the house
• Area of the land
• Number of rooms
• Number of floors
• Distance to the main road
• Solution: This is a supervised learning regression problem where the sale price is
the Y parameter and the other parameters of the given dataset are the X parameters
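A minimal sketch of this regression setup with NumPy, using least squares as one possible algorithm. Everything numeric here is an assumption: the eight houses are made up, the units are guessed, and the prices are generated from a hypothetical pricing rule with no noise, so the fit is exact only because of how the data was constructed.

```python
import numpy as np

# Made-up houses. Columns: house area (m^2), land area (m^2), rooms, floors,
# distance to the main road (km). Units are assumptions for this sketch.
X = np.array([
    [120.0, 300.0, 4, 1, 0.5],
    [200.0, 500.0, 6, 2, 1.2],
    [90.0, 250.0, 3, 1, 2.0],
    [150.0, 400.0, 5, 2, 0.3],
    [180.0, 350.0, 5, 1, 0.8],
    [110.0, 600.0, 4, 2, 3.0],
    [220.0, 450.0, 7, 2, 0.1],
    [130.0, 280.0, 4, 1, 1.5],
])

# Hypothetical pricing rule used only to generate the Y values (sale prices).
true_weights = np.array([500.0, 100.0, 2000.0, 5000.0, -3000.0])
true_bias = 10000.0
y = X @ true_weights + true_bias

# Append a column of ones so the fit can also learn an intercept.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the sale price of a new house (last entry 1.0 is the intercept term).
new_house = np.array([160.0, 420.0, 5, 2, 0.7, 1.0])
predicted_price = float(new_house @ coef)
print("learned weights:", np.round(coef, 2))
print("predicted price:", round(predicted_price, 2))
```

With real sale data the prices would be noisy, and the fitted model would be checked on held-out houses before use.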
Supervised Machine Learning – Example 2
• Problem: A doctor wants to diagnose a tumor as malignant or benign using
labeled data from 500 tumors, with the following parameters,
• Length of the tumor
• Age of the patient
• Whether there is a cancer patient in the family
• Solution: This is a supervised learning classification problem where the malignant
or benign nature is the Boolean Y parameter and the other parameters of the
given dataset are the X parameters. Here, the length of the tumor and the age of the
patient are float-type X variables, while having a cancer patient in the family is a
Boolean X variable.
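The variable types in this example can be sketched as a typed record that is then encoded into numeric X and Y values, since most ML algorithms expect numbers. The `TumorRecord` class, the millimetre unit and the three records are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TumorRecord:
    length_mm: float       # X: float (unit assumed for this sketch)
    patient_age: float     # X: float
    family_history: bool   # X: Boolean
    malignant: bool        # Y: Boolean (True = malignant, False = benign)


# Made-up records, for illustration only.
records = [
    TumorRecord(12.5, 45.0, False, False),
    TumorRecord(30.2, 61.0, True, True),
    TumorRecord(8.0, 38.0, True, False),
]

# Encode Booleans as 0/1 so every X and Y value is numeric.
X = [[r.length_mm, r.patient_age, float(r.family_history)] for r in records]
y = [int(r.malignant) for r in records]

print(X)
print(y)
```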
Un-supervised Machine Learning
• Now, let’s look in more detail at Un-supervised Machine Learning
• There is only one type of field/variable/parameter in an Un-supervised
ML dataset
• Independent variables/features/X values
• No dependent variables
• There are several types of Un-supervised Machine Learning problems
• Clustering
• Dimensionality reduction
• Anomaly detection
• …
Un-supervised Machine Learning – Example 1
• Problem: A web site owner wants to categorize its past 1000 visitors into 10
types based on the following data,
• Visited hour of the day
• Visit time
• Most preferred product
• Web browser used
• Country of the IP address
• Solution: As there is no labelled data (no Y parameter), this is an unsupervised
learning clustering problem where the parameters of the given dataset
are the X parameters. We can use K-means clustering to group the visitors
into 10 clusters
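A minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is simplified on purpose: only two numeric features and k = 2 are used, the six visitors are made up, the unit of visit time is assumed, and the naive "first k points" initialisation is an assumption of this sketch. Categorical features such as browser or country would first need a numeric encoding.

```python
def kmeans(points, k, iterations=10):
    """Lloyd's algorithm with a naive initialisation: the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Made-up visitors: (visited hour of the day, visit time in minutes).
visitors = [(9, 2.0), (10, 3.0), (9, 2.5), (21, 15.0), (22, 14.0), (20, 16.0)]

labels, centers = kmeans(visitors, k=2)
print("cluster of each visitor:", labels)
print("cluster centers:", centers)
```

For the problem in the slide, the same procedure would be run with k = 10 on all 1000 visitors, typically through a library implementation rather than hand-written code.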
Questions?

MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 

Lecture 2 - Introduction to Machine Learning, a lecture in subject module Statistical & Machine Learning

  • 1. DA 5230 – Statistical & Machine Learning Lecture 2 – Introduction to Statistical Learning Maninda Edirisooriya manindaw@uom.lk
  • 2. Machine Learning Overview Source: https://en.wikipedia.org/wiki/Deep_learning#/media/File:AI-ML-DL.svg
  • 3. Machine Learning Overview • Intelligence: Understanding nature to generate useful information • Artificial Intelligence (AI): Mimicking the intelligence of animals/humans with man-made machines • Machine Learning (ML): Machines consuming data to achieve Artificial Intelligence • Deep Learning (DL): Machine Learning using multiple layers of nature-inspired neurons (in Deep Neural Networks)
  • 4. AI vs ML • AI may consist of theory and rule based intelligence • Expert Systems • Control Systems • Algorithms • And Machine Learning Systems • ML is developed mainly from available data, whereas AI can also be developed with any data by using a fixed set of rules • ML systems are almost free from fixed rules added by experts; the data designs the system • Less domain knowledge is required • ML is not a collection of hand-written if-else statements (a common misconception)
  • 5. What is Statistical Learning (SL)? • Using statistics to understand nature with data • Has well established, proven mathematical methods, while ML can sometimes be a form of alchemy with data where the focus is more on results • Is the base of ML, although the statistics used in some ML models may not have been well studied yet • Has higher interpretability, as it is backed by mathematics • Has a blurred boundary with ML
  • 6. SL vs ML
      Focus
        Statistical Learning: Primarily focuses on understanding and modeling the relationships between variables in data using statistical methods. It aims to make inferences and predictions based on these relationships.
        Machine Learning: A broader field that encompasses various techniques for building predictive models and making decisions without being overly concerned with the underlying statistical assumptions. It is often used for tasks such as classification, regression, clustering, and more.
      Foundation
        Statistical Learning: Rooted in statistical theory and often uses classical statistical techniques like linear regression, logistic regression, and analysis of variance.
        Machine Learning: Draws from a wider range of techniques, including traditional statistics but also incorporates methods like decision trees, support vector machines, neural networks, and more. It is less reliant on statistical theory and more focused on empirical performance.
      Assumptions
        Statistical Learning: Methods often make explicit assumptions about the underlying data distribution, such as normality or linearity. These assumptions help in making inferences about population parameters.
        Machine Learning: Models are often designed to be more flexible and adaptive, which can make them less reliant on strict data distribution assumptions.
      Interpretability
        Statistical Learning: Models tend to be more interpretable, meaning it is easier to understand how the model arrives at its predictions. This interpretability is important in fields where understanding the underlying relationships is crucial.
        Machine Learning: While interpretability can be a concern in some machine learning models (e.g., deep neural networks), many machine learning models are designed with a primary focus on predictive accuracy rather than interpretability.
  • 7. Course Structure • Machine Learning will be the main focus • You should be able to do ML stuff yourself from the available data • You should be familiar with every phase of the ML lifecycle • Statistical background will be explained depending on your progress of the above requirement • ML will be first taught with simpler mathematics and intuition and then will be explained with statistical fundamentals • You will first be able to work on ML projects and then the theory behind it will be learned with statistics
  • 8. For Your Reference • Machine Learning can be self-learned with the free course https://www.coursera.org/specializations/machine-learning-introduction • You can learn more about Statistical Learning from the free book about Python based SL at https://www.statlearning.com • Learn Python, Numpy, Pandas and scikit-learn from online tutorials and Youtube videos • You can also clarify tricky ML/SL problems with ChatGPT • Anyway, note that some online tutorials, videos and ChatGPT may provide incorrect information where you should be careful when learning from these resources • Never use ChatGPT for answering Quizzes or Exams! (at least until the AI takes over the world)
  • 9. What we want from Machine Learning? • Say we have some collected data • We want a computer/machine to learn from that data and capture its insights in a model • Our expectation is to use that model to predict/make inferences on newly provided data • This is like teaching a kid a certain pattern from example pictures and later asking the kid to draw/classify similar pictures • After the model is made (known as “trained”) you want to make sure the model has learned the insights with sufficient accuracy • For that, you train the model with only a part of the given data and use the remaining data to check (known as “test”) the accuracy of the model • The model will be used for our needs (to predict/make inferences) only if the tests are passed. Otherwise, we have to revisit the problem and may have to start again from data collection
  • 10. What we do in Machine Learning? • We find a dataset • In Supervised ML we have labeled data (i.e. data has both X values and Y values) • In Un-supervised ML we have un-labeled data (i.e. data has only X values but no Y values) • We select a suitable ML algorithm for modeling (e.g. Linear Regression) • We train a model with most of the data (say 80% of the total data) using that algorithm • We test (check the accuracy of) the trained model with the remaining data (say 20% of the total data) • If the tests pass (i.e. the trained model is accurate enough) we can use the model to label more un-labeled data (in supervised ML) or to make inferences on more data (in unsupervised ML) • Otherwise, we have to iterate the above process until the tests pass
  • 11. Supervised Machine Learning • Now, let’s look in more detail at Supervised Machine Learning • There are two types of fields/variables/parameters in a Supervised ML dataset 1. Independent variables/features/predictors/X values 2. Dependent variable/target variable/response/Y value • A dataset contains a set of records where each record holds a certain set of X values and one Y value • E.g.:
      X1 - GPA | X2 - income | X3 - IQ | Y - life_expectancy | Note
      3.41     | 3000        | 105     | 72                  | Given (for training/testing)
      2.32     | 1800        | 86      | 65                  | Given (for training/testing)
      3.82     | 6000        | 130     | 86                  | Given (for training/testing)
      3.56     | 4800        | 112     | ?                   | Need to predict
  • 12. Supervised Machine Learning (workflow diagram) 1. Training: an ML model is trained on the records (GPA 3.41, income 3000, IQ 105, life_expectancy 72) and (2.32, 1800, 86, 65) 2. Testing: the trained ML model is tested on the record (3.82, 6000, 130, 86); Accuracy = 80% 3. Predicting: for the X values (3.56, 4800, 112) the trained ML model predicts Y (life_expectancy) = 76
  • 13. Supervised Machine Learning • You are asked to train a model that identifies how X1, X2, X3 relate to Y through a function $f$ • Where $Y = f(X_1, X_2, X_3)$ or simply $Y = f(X)$ • Once trained, the model provides an estimator for $f$, named $\hat{f}$, which is not the exact $f$ because the model is only an approximation of the true $f$ • When predicting Y values for new X data, it generates $\hat{Y}$, an estimator for $Y$ produced by $\hat{f}$ • Because $\hat{Y} \neq Y$, there is an error $\varepsilon$ • The trained model is then $\hat{f}(X)$, where $\hat{f}(X) = \hat{Y} = f(X) + \varepsilon$ • Here $\hat{f}(X)$ is the approximated model function, $\hat{Y}$ the predicted values from the model, $f(X)$ the true function to be approximated, and $\varepsilon$ the model’s error
  • 14. Supervised Machine Learning • There are mainly 2 types of Supervised Machine Learning problems • Regression problems • Classification problems • The difference comes from the data type of the value we are going to predict (Y) • If Y is a continuous number such as temperature or length, it is a regression problem • If Y is a discrete value from a finite set of categories such as gender or country, it is a classification problem
  • 15. Supervised Machine Learning – Example 1 • Problem: A real estate company wants to estimate the sale price of a house given data on the last 100 houses sold, which includes the sale price and the following parameters, • Area of the house • Area of the land • Number of rooms • Number of floors • Distance to the main road • Solution: This is a supervised learning regression problem where the sale price is the Y parameter and the other parameters of the given dataset are the X parameters
  • 16. Supervised Machine Learning – Example 2 • Problem: A doctor wants to diagnose a tumor as malignant or benign using labeled data on 500 tumors, • Length of the tumor • Age of the patient • Having a cancer patient in family • Solution: This is a supervised learning classification problem where the malignant or benign nature is the Boolean Y parameter and the other parameters of the given dataset are the X parameters. Here, length of the tumor and age of the patient are float-type X variables while having a cancer patient in family is a Boolean X variable.
  • 17. Un-supervised Machine Learning • Now, let’s look in more detail at Un-supervised Machine Learning • There is only one type of fields/variables/parameters in an Un-supervised ML dataset • Independent variables/features/X values • No dependent variables • There are several types of Un-supervised Machine Learning problems • Clustering • Dimensionality reduction • Anomaly detection • …
  • 18. Un-supervised Machine Learning – Example 1 • Problem: A web site owner wants to categorize its past 1000 visitors into 10 types based on the following data, • Visited hour of the day • Visit time • Most preferred product • Web browser used • Country of the IP address • Solution: As there are no labelled data (Y parameters) this is an unsupervised learning clustering problem where the given parameters of the dataset are the X parameters. We can use K-means clustering to group the visitors into 10 clusters based on their X values
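The train/test workflow described in slides 9, 10 and 12 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the dataset is made up, there is a single X feature, the helper names `fit_line` and `predict` are hypothetical, and the pass threshold on the test error is an arbitrary choice. In practice a library such as scikit-learn would normally be used for the split and the regression.

```python
import random

# Hypothetical toy dataset: (X, Y) pairs that roughly follow Y = 3*X + 5.
# The numbers are made up for illustration only.
rng = random.Random(0)  # fixed seed so the shuffle is reproducible
data = [(x, 3.0 * x + 5.0 + rng.uniform(-1.0, 1.0)) for x in range(50)]

# 1. Split: shuffle, then keep 80% for training and 20% for testing.
rng.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]


def fit_line(rows):
    """Least-squares fit of Y = slope * X + intercept for one feature."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply the fitted line (the estimate of f) to a new X value."""
    slope, intercept = model
    return slope * x + intercept


# 2. Train on the training rows only.
model = fit_line(train)

# 3. Test: mean absolute error on the held-out rows.
test_mae = sum(abs(predict(model, x) - y) for x, y in test) / len(test)

# 4. Use the model only if the test passes (the threshold is arbitrary).
if test_mae < 1.5:
    print("prediction for X=60:", predict(model, 60))
else:
    print("test failed; revisit the data or the algorithm")
```

The test rows play no part in fitting, so the test error is an estimate of how the model behaves on data it has not seen.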
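The K-means clustering mentioned in slide 18 can be sketched as follows. This is a minimal pure-Python version of the standard assign-then-update loop (Lloyd's algorithm), not the lecture's own code. The visitor records are made up, only two numeric features are used, and `k=3` replaces the slide's 10 groups to keep the toy data small. Categorical features such as browser or country would need to be encoded as numbers first, and the result depends on the random starting centroids.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means: returns (centroids, cluster index for each point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Hypothetical visitors: (hour of day visited, minutes spent on the site).
visitors = [
    (9, 2), (10, 3), (9, 4), (11, 2),          # morning, short visits
    (14, 30), (15, 28), (13, 33), (14, 31),    # afternoon, long visits
    (22, 10), (23, 12), (21, 11), (22, 9),     # late evening, medium visits
]

centroids, labels = kmeans(visitors, k=3)
for c in range(3):
    group = [v for v, lab in zip(visitors, labels) if lab == c]
    print("cluster", c, group)
```

No Y values are involved at any point: the algorithm groups the records using only their X values, which is what makes this an unsupervised problem.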