2. *Introduction to Naïve Bayes*
Naïve Bayes is a simple yet powerful probabilistic
machine learning algorithm that is often used for
classification tasks.
It is based on Bayes' theorem, which is a
fundamental concept in probability theory. Naïve
Bayes is particularly popular for text classification
tasks, such as spam filtering and sentiment analysis,
but it can be applied to a wide range of problems.
3. Bayes' Theorem
At the core of Naïve Bayes is Bayes' theorem, which describes
the probability of an event based on prior knowledge of
conditions that might be related to the event. The formula is as
follows:
4. Bayes' Theorem
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
where:
P(A∣B) is the probability of event A given that event B has
occurred.
P(B∣A) is the probability of event B given that event A has
occurred.
P(A) is the prior probability of event A.
P(B) is the marginal probability of event B (the evidence).
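As a worked instance of the terms defined above, the following minimal sketch computes P(A∣B) as P(B∣A) times P(A), divided by P(B). It expands P(B) with the law of total probability. The function name `posterior` and the spam numbers are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def posterior(prior_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) using Bayes' theorem.

    P(B) is expanded as P(B|A)P(A) + P(B|not A)P(not A).
    """
    evidence = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return p_b_given_a * prior_a / evidence


# Hypothetical numbers: 20% of emails are spam (A); the word "free" (B)
# appears in 60% of spam emails and in 5% of non-spam emails.
p_spam_given_free = posterior(prior_a=0.2, p_b_given_a=0.6, p_b_given_not_a=0.05)
print(round(p_spam_given_free, 3))  # 0.12 / (0.12 + 0.04) = 0.75
```

Even though only 20% of emails are spam in this made-up setting, seeing the word "free" raises the probability of spam to 75%, because the word is far more likely under spam than under non-spam.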
6. Historical Context
The Naïve Bayes algorithm has its roots in the work
of Reverend Thomas Bayes, an 18th-century
statistician and theologian. The algorithm itself,
and its application to machine learning, gained
prominence much later, in the 20th century.
7. Applications of Naïve Bayes
Text classification.
Sentiment analysis.
Recommendation systems.
Spam filtering.
Face recognition.
Weather prediction.
Medical diagnosis.
10. Naïve Assumption:
Naïve Bayes assumes that the features are conditionally
independent of one another given the class label. Under
this assumption, the probability of observing a set of
features for a class is the product of the individual
feature probabilities for that class. The assumption is
called "naïve" because it rarely holds exactly in
real-world data.
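To illustrate the naïve (conditional independence) assumption named in this slide's heading, here is a minimal from-scratch sketch of a word-count Naïve Bayes text classifier. Each word contributes its own P(word∣class) term, and the terms are multiplied together (added in log space) as if the words were independent given the class. The function names `train` and `predict`, the toy "spam"/"ham" data, and the use of add-one (Laplace) smoothing are assumptions made for this example, not something the slides specify.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and words per class.

    docs: list of (list_of_words, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict(words, class_counts, word_counts, vocab):
    """Return the label with the highest log P(class) + sum of log P(word | class)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for w in words:
            # Naive assumption: each word is scored independently given the class.
            # Add-one smoothing (an assumption here) avoids log(0) for unseen words.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data for illustration only.
docs = [
    ("win free prize now".split(), "spam"),
    ("free money win".split(), "spam"),
    ("meeting schedule tomorrow".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]
model = train(docs)
print(predict("free prize".split(), *model))        # spam
print(predict("meeting tomorrow".split(), *model))  # ham
```

Working in log space is a common design choice: multiplying many small probabilities can underflow, whereas summing their logarithms does not.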
12. **Advantages:**
1. **Simplicity:** Easy to implement and understand.
2. **Efficiency:** Fast training and prediction times.
3. **Robustness:** Often works well with small datasets
and is relatively insensitive to irrelevant features.
13. **Disadvantages:**
1. **Assumption of Independence:** Features are assumed
to be independent given the class, which often does not
hold in real-world data.
2. **Limited Expressiveness:** Cannot model interactions
between features, so it may miss complex relationships
in data.