What is KL Divergence?

The Kullback-Leibler (KL) Divergence is a way of quantifying how much one probability distribution differs from another. It has applications in supervised learning, where it can be used as a loss function comparing the distribution of predicted labels against the distribution of actual labels, and in unsupervised learning, for example in Gaussian Mixture Models fitted with the Expectation-Maximization algorithm. For two discrete distributions p and q, the KL Divergence is given by

KL(p || q) = Σ_x p(x) log( p(x) / q(x) )

where the sum runs over all outcomes x.
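
As a concrete illustration, here is a minimal Python sketch that implements the formula for discrete distributions represented as probability vectors, and checks the result against scipy.stats.entropy, which returns the KL divergence (in nats) when given two distributions. The example vectors p and q are arbitrary.

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) returns the KL divergence in nats

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms with p(x) = 0 contribute nothing, so sum only where p(x) > 0.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = [0.10, 0.40, 0.50]   # arbitrary example distributions
q = [0.80, 0.15, 0.05]

print(kl_divergence(p, q))  # ~1.34, computed directly from the formula
print(entropy(p, q))        # same value via SciPy (natural log)
```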

The KL Divergence is 0 when the two distributions p and q are identical, and it grows as the distributions become more different. It is also not symmetric: KL(p || q) is in general not equal to KL(q || p), so it is not a true distance metric.
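
Both properties are easy to verify numerically. The short sketch below (again using scipy.stats.entropy with arbitrary example distributions) shows that the divergence is zero for identical distributions and that KL(p || q) and KL(q || p) generally differ.

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q) in nats

p = np.array([0.10, 0.40, 0.50])  # arbitrary example distributions
q = np.array([0.80, 0.15, 0.05])

print(entropy(p, p))  # 0.0  -> identical distributions have zero divergence
print(entropy(p, q))  # ~1.34
print(entropy(q, p))  # ~1.40 -> differs from KL(p || q), so KL is not symmetric
```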
