What happens if a category has a zero frequency within a class, and how is this issue commonly addressed (Naive Bayes)?

If a category of a feature never occurs within a particular class in the training data, its estimated conditional probability for that class is zero. Because Naive Bayes multiplies the conditional probabilities of all features, the likelihood score for that class then becomes zero as well, even when every other feature strongly suggests that the observation belongs to that class. A common remedy is additive (Laplace) smoothing: a small number α, often 1, is added to the frequency of each category within each class, so that no class is assigned a zero likelihood score.
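The effect of the added constant can be seen in a small sketch. It assumes the usual additive-smoothing estimate, (count + α) / (N + α·K), where N is the number of training rows in the class and K is the number of possible categories. The function name `category_likelihoods`, the "colour" feature, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def category_likelihoods(values, categories, alpha=0.0):
    """Estimate P(category | class) from the feature values seen in one class.

    alpha is the additive-smoothing constant; alpha=0 gives the raw
    relative frequencies, alpha=1 is classic Laplace smoothing.
    """
    counts = Counter(values)
    # Adding alpha to each of the K categories adds alpha * K to the total,
    # so the smoothed estimates still sum to 1.
    total = len(values) + alpha * len(categories)
    return {c: (counts[c] + alpha) / total for c in categories}


# Hypothetical toy data: the "colour" feature for rows labelled with one class.
categories = ["red", "green", "blue"]
colours_in_class = ["red", "red", "green", "red"]  # "blue" never occurs

unsmoothed = category_likelihoods(colours_in_class, categories, alpha=0.0)
smoothed = category_likelihoods(colours_in_class, categories, alpha=1.0)

print(unsmoothed["blue"])  # 0.0, which would zero out the whole product
print(smoothed["blue"])    # (0 + 1) / (4 + 3) = 1/7, small but non-zero
```

With α = 0 the unseen category "blue" gets probability 0, so any observation with that value could never be assigned to this class. With α = 1 it gets 1/7, and the other features can still influence the result.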
