3. Logistic Regression and the Perceptron Algorithm: A friendly
introduction
Luis Serrano
https://www.youtube.com/watch?v=jbluHIgBmBo
Mostafa A. Elhosseini https://youtube.com/c/mostafaelhosseini 3
4. Agenda
ꙮ Classifier
ꙮ Applications
ꙮ How to get a dataset
ꙮ How to find the equation of the decision boundary
ꙮ How to train the classifier
ꙮ Why move the decision boundary
ꙮ Application
ꙮ Perceptron Coding
5. Introduction
ꙮ How do we teach a computer to determine if a sentence is happy or
sad?
▪ This is a branch of machine learning called classification
▪ Classification means predicting labels according to the features of our data
▪ The labels are categories, such as sad/happy, yes/no, or dog/cat/bird
ꙮ There are many classification models
ꙮ The perceptron is a type of model, or classifier, which takes as input
the features, and returns a 1 or a 0, which can be interpreted as the
answer to a yes/no question.
ꙮ One way to build a good perceptron that fits our data is the
perceptron algorithm
6. Applications – Sentiment Analysis
ꙮ Sentiment analysis is the branch of machine learning dedicated to
predicting the sentiment of a certain piece of text
▪ I feel wonderful today – Happy sentence
▪ I am so sad, this is terrible – Sad sentence
ꙮ Sentiment analysis is used in many practical applications
▪ Companies analyzing the conversations between customers and technical support, to
see if the service is good.
▪ Companies analyzing their online reviews and tweets, to see if their products or
marketing strategies are well received.
▪ Twitter analyzing the overall mood of the population after a certain event.
▪ Stock brokers analyzing the overall sentiment of users towards a company, to decide
whether to buy or sell its stock.
7. Sentiment analysis
ꙮ How would we build a classifier for
sentiment analysis?
ꙮ One way to do it is to attach a score to every
word, in a way that the happy words have
positive scores, and the sad words have
negative scores.
ꙮ Furthermore, let’s make it so that the
happier (or sadder) the word, the higher (or
lower) its score is, so that the word
‘exhilarating’ has a higher score than ‘good’,
and ‘awful’ has a lower score than ‘bad’
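The word-scoring idea above can be sketched as follows. This is only an illustration: the word list, the score values, and the names `WORD_SCORES`, `sentence_score`, and `classify` are assumptions made for the example, not part of the slides.

```python
# Hypothetical hand-picked word scores: positive = happy, negative = sad.
# Stronger words get scores further from zero.
WORD_SCORES = {
    "exhilarating": 4.0,
    "good": 2.0,
    "bad": -2.0,
    "awful": -4.0,
}

def sentence_score(sentence):
    """Sum the scores of the known words; unknown words count as 0."""
    words = sentence.lower().split()
    return sum(WORD_SCORES.get(word.strip(".,!?"), 0.0) for word in words)

def classify(sentence):
    """Label a sentence 'happy' if its total score is non-negative, else 'sad'."""
    return "happy" if sentence_score(sentence) >= 0 else "sad"
```

With these assumed scores, a sentence containing ‘exhilarating’ scores higher than one containing only ‘good’, and ‘awful’ pulls the total lower than ‘bad’ does.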
9. Sentiment analysis
ꙮ How do we know we did a good job?
▪ How do we know the score of every single word?
▪ The scores we came up with are based on our perception. How do we know
these are good scores?
▪ What if we have to build a sentiment analysis classifier in a language that we
don’t speak?
▪ The sentences “I am not sad, I am happy.”, and “I am not happy, I am sad.”
have completely different sentiments, yet they have the exact same words, so
they would score the same, no matter the classifier. How do we account for
this?
▪ What happens with punctuation? What about sarcastic comments? What
about the tone of voice, time of the day, or the overall situation? Too many
things to take into account
10. Sentiment analysis
ꙮ To do an acceptable job…
▪ Run the perceptron algorithm to train the classifier over a set of data
▪ Since we are running an algorithm on a dataset, we don’t need to speak the
language
▪ All we need is to know what words tend to appear more in happy or sad sentences
▪ Dataset has many sentences, and each sentence has a label attached which
says ‘happy’ or ‘sad’
▪ Our classifier will not take into account the order of words, punctuation, or
any other external features, and thus, it will make some mistakes
▪ Goal is to have a classifier that is correct most of the time
▪ Beyond the scope: recurrent neural networks, long short-term memory
networks (LSTM), and hidden Markov models
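A minimal sketch of why such a classifier cannot tell apart sentences that use the same words in a different order: if each sentence is reduced to word counts (a bag-of-words representation, which is an assumption about how the features are built, as is the helper name `bag_of_words`), the two example sentences from the earlier slide become identical.

```python
from collections import Counter

def bag_of_words(sentence):
    """Count word occurrences, ignoring case, punctuation, and word order."""
    cleaned = "".join(
        ch if ch.isalpha() or ch.isspace() else " " for ch in sentence.lower()
    )
    return Counter(cleaned.split())

a = bag_of_words("I am not sad, I am happy.")
b = bag_of_words("I am not happy, I am sad.")

# Both sentences contain exactly the same words the same number of times,
# so any classifier that only sees these counts must give them the same label.
same_features = (a == b)
```

This is the price of ignoring word order: the classifier stays simple, but it will make mistakes on sentences like these.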
11. Simple Dataset - Alien planet
ꙮ We do not know their language
▪ Observe their mood
▪ Listen to what they say
12. Simple Dataset - Alien planet
When ‘aack’ appears more often than ‘beep’: happy
13. Perceptron
ꙮ A perceptron is a classification model which consists of a set of weights, or scores, one
for every feature, and a threshold.
ꙮ The perceptron multiplies each weight by its corresponding feature value, and adds
the results, obtaining a score.
ꙮ If this score is greater than or equal to the threshold, then the perceptron returns a
‘yes’, or a ‘true’, or the value 1.
ꙮ If it is smaller, then it returns a ‘no’, or a ‘false’, or the value 0. In our case, the features
are the words ‘aack’ and ‘beep’, the weights are the scores for each word (+1 and -1),
and the threshold is 0
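The definition above can be sketched in a few lines. The weights (+1 for ‘aack’, -1 for ‘beep’) and the threshold 0 come from the slide; treating each feature as the number of times the word is said, and the function name `perceptron_predict`, are assumptions made for the example.

```python
def perceptron_predict(weights, threshold, features):
    """Return 1 if the weighted sum of the features reaches the threshold, else 0."""
    score = sum(weights[name] * value for name, value in features.items())
    return 1 if score >= threshold else 0

weights = {"aack": 1, "beep": -1}
threshold = 0

# Assumed feature encoding: how many times the alien said each word.
happy = perceptron_predict(weights, threshold, {"aack": 3, "beep": 1})  # score = 2
sad = perceptron_predict(weights, threshold, {"aack": 1, "beep": 2})    # score = -1
```

Here `happy` is 1 because the score 2 is at least the threshold, and `sad` is 0 because the score -1 is below it.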
14. Another Dataset - Alien planet
15. Another Dataset - Alien planet
16. Another Dataset - Alien planet
ꙮ What happens when an alien says absolutely nothing? Is that alien
happy or sad?
ꙮ With no words, the score is just the bias, so we can see the bias as the
inherent mood of the alien
ꙮ What if a model had a positive bias? Then an alien who says nothing
would score positively, and thus, the alien would be predicted as
happy
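A small sketch of the role of the bias, assuming the common formulation in which the score is the weighted sum of the features plus a bias and the prediction is ‘happy’ when the score is at least 0. The weight values and the name `predict_with_bias` are illustrative assumptions.

```python
def predict_with_bias(weights, bias, features):
    """Return 1 (happy) if weighted sum + bias is non-negative, else 0 (sad)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else 0

weights = [1.0, -1.0]   # hypothetical weights for two words
silent_alien = [0, 0]   # the alien says nothing: every word count is 0

# With no words spoken, the score equals the bias alone.
positive_mood = predict_with_bias(weights, 0.5, silent_alien)   # bias > 0
negative_mood = predict_with_bias(weights, -0.5, silent_alien)  # bias < 0
```

With a positive bias the silent alien is predicted happy (1); with a negative bias the same silent alien is predicted sad (0).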
17. Another Dataset - Alien planet
ꙮ What if the dataset is impossible to
split using a line?
18. Alphabet with 3 words
ꙮ More words: more
dimensions
▪ What if it has thousands of
words?
▪ Words: features
ꙮ Decision boundary: it
would not be a line any
more (a plane for 3 words, a
hyperplane in general)
19. How to compare classifiers? The error
function
▪ The error of a classifier is the number of points it classifies incorrectly
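This error function can be sketched as a simple count. The tiny dataset, the two candidate classifiers, and the name `count_errors` are made up for illustration; a classifier is assumed to predict 1 when its weighted sum plus bias is at least 0.

```python
def count_errors(weights, bias, points, labels):
    """Number of points whose predicted label differs from the true label."""
    errors = 0
    for features, label in zip(points, labels):
        score = sum(w * x for w, x in zip(weights, features)) + bias
        prediction = 1 if score >= 0 else 0
        if prediction != label:
            errors += 1
    return errors

# Hypothetical dataset: (count of 'aack', count of 'beep') and a happy=1 / sad=0 label.
points = [(3, 1), (2, 0), (1, 2), (0, 3)]
labels = [1, 1, 0, 0]

errors_a = count_errors([1.0, -1.0], 0.0, points, labels)  # classifier A
errors_b = count_errors([0.0, -1.0], 0.5, points, labels)  # classifier B
```

Classifier A misclassifies no points and classifier B misclassifies one, so by this measure A is the better classifier.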
22. How to find a good classifier?
▪ Begin with a random
classifier
▪ Loop many times:
▪ Improve the classifier a
small amount
▪ Output a good classifier.
▪ What do we mean by
making the classifier a little
better?
▪ How do we know this
algorithm gives us a good
classifier?
▪ How many times do we
run the loop?
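One possible shape for this loop is sketched below. It is a sketch under assumptions, not the definitive algorithm: the improvement step is taken to be the standard perceptron update, the loop stops early once a full pass makes no mistakes, and the names and defaults (`train_perceptron`, `max_epochs`, `learning_rate`, `seed`) are choices made for the example.

```python
import random

def predict(weights, bias, features):
    """Return 1 if weighted sum + bias is non-negative, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else 0

def train_perceptron(points, labels, max_epochs=1000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Begin with a random classifier.
    weights = [rng.uniform(-1, 1) for _ in points[0]]
    bias = rng.uniform(-1, 1)
    # Loop many times, improving the classifier a small amount each time.
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in zip(points, labels):
            error = label - predict(weights, bias, features)  # 0 if correct, else +1 or -1
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every point is classified correctly: stop early
            break
    # Output the resulting classifier.
    return weights, bias

# Hypothetical, linearly separable dataset.
points = [(3, 1), (2, 0), (1, 2), (0, 3)]
labels = [1, 1, 0, 0]
weights, bias = train_perceptron(points, labels)
```

The cap `max_epochs` is one simple answer to "how many times do we run the loop": run until there are no mistakes, or give up after a fixed budget if the data cannot be separated.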
23. Perceptron Trick
ꙮ The perceptron trick is a tiny step that helps us go from a classifier, to
a slightly better classifier
ꙮ Focus on one point, and ask the question: “How can I make this
classifier a little bit better for this one point?”
▪ If the point is correctly classified, we won’t touch the classifier
▪ If the point is misclassified, we will move the line a slight amount towards the
point.
▪ Why? Because if the point is on the wrong side of the line, then we’d actually
like the line to move over the point, putting it on the correct side
▪ When you plug the values of a misclassified point into the equation, you will get an
error, and by moving the line toward the point, we reduce the error
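A single application of the trick can be sketched as below, assuming the usual update in which a small multiple of the point's features is added to (or subtracted from) the weights and the bias. The name `perceptron_trick`, the learning rate 0.1, and the example numbers are illustrative assumptions.

```python
def score(weights, bias, features):
    """Weighted sum of the features plus the bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def perceptron_trick(weights, bias, features, label, learning_rate=0.1):
    """Return (weights, bias) after looking at one labelled point."""
    prediction = 1 if score(weights, bias, features) >= 0 else 0
    if prediction == label:
        return list(weights), bias  # correctly classified: don't touch the classifier
    direction = 1 if label == 1 else -1  # push the score up for label 1, down for label 0
    new_weights = [w + learning_rate * direction * x
                   for w, x in zip(weights, features)]
    return new_weights, bias + learning_rate * direction

# A point with label 1 that the current classifier scores negatively (misclassified).
weights, bias = [1.0, -1.0], 0.0
point, label = (1, 2), 1

old_score = score(weights, bias, point)
new_weights, new_bias = perceptron_trick(weights, bias, point, label)
new_score = score(new_weights, new_bias, point)
```

After the step, `new_score` is closer to zero than `old_score`: the line has moved a little toward the misclassified point without necessarily crossing it yet.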
24. How do you move a line?
ꙮ The general form of the
equation of a line is: 𝑎𝑥 + 𝑏𝑦 = 𝑐
▪ If you increase 𝑐, the line is
translated upward (for 𝑏 > 0)
▪ If you decrease 𝑐, the
line is translated downward (for 𝑏 > 0)
▪ 2x + 3y - 6 = 0 (c = 6): black line
▪ 2x + 3y - 8 = 0 (c = 8): red line
▪ 2x + 3y - 4 = 0 (c = 4): blue line
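To check the direction of the shift for these three lines, one can compare their y-intercepts. For a line written as ax + by = c with b ≠ 0, the y-intercept is c / b and the slope is -a / b. The helper names below are assumptions made for the example.

```python
from fractions import Fraction

def y_intercept(a, b, c):
    """y-intercept of the line a*x + b*y = c (requires b != 0)."""
    return Fraction(c, b)

def slope(a, b):
    """Slope of the line a*x + b*y = c (requires b != 0)."""
    return Fraction(-a, b)

black = y_intercept(2, 3, 6)  # 2x + 3y = 6
red = y_intercept(2, 3, 8)    # 2x + 3y = 8
blue = y_intercept(2, 3, 4)   # 2x + 3y = 4
```

All three lines share the slope -2/3, so they are parallel; the red line (larger c) crosses the y-axis above the black line, and the blue line (smaller c) crosses below it.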
25. Translation : Up – Down
32. Applications
ꙮ Spam email filter, with features such as:
▪ the words in the email,
▪ the length of the email and the size of its attachments,
▪ the number of senders,
▪ whether any of our contacts is a sender (categorical variable),
▪ and many others.
ꙮ Computer Vision
▪ Image recognition (dog / cat)
▪ The features of the model are simply the pixels of the image, and the label is
precisely a variable that tells if the image is a dog or not
33. Applications
ꙮ Recommendation systems
▪ Recommending a video/movie/song/product to a user boils down to answering a yes/no question
▪ Will the user click on the video/movie we’re recommending?
▪ Will the user finish the video/movie we’re recommending?
▪ Will the user listen to the song we’re recommending?
▪ Will the user buy the product we’re recommending?
▪ Netflix – YouTube – Amazon
ꙮ Healthcare
▪ Does the patient suffer from a particular illness?
▪ Will a certain treatment work for a patient?
▪ The features for these models will normally be the medical history of the patient, as well as
their family medical history
▪ One needs a great deal of accuracy! Recommending a video that a user
won’t watch is very different from recommending the wrong treatment for a patient