How does the EM algorithm (in the context of GMM) compare to K-Means?

Both EM and K-Means are iterative unsupervised-learning algorithms, but they differ in what they optimize and how.

K-Means alternates between assigning each point to its nearest cluster centroid and recomputing the centroids, which minimizes the Within-Cluster Sum of Squares (WCSS). EM instead alternates between computing each point's responsibility under each mixture component (E-step) and re-estimating the component parameters (M-step), which maximizes the likelihood of the data under the assumed probability distribution. In short, K-Means takes a distance-based approach to optimization, while EM takes a probabilistic one.
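The two update rules can be sketched side by side. Below is a minimal, illustrative NumPy comparison on toy 1-D data (the data, initial values, and two-component setup are assumptions for the example, not from the post): one K-Means iteration with hard assignments and the WCSS objective, versus one EM iteration for a two-component Gaussian mixture with soft responsibilities and the log-likelihood objective.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data: two well-separated groups (illustrative only).
X = np.concatenate([rng.normal(-2, 0.5, 100), rng.normal(2, 0.5, 100)])

# --- One K-Means iteration: hard assignments, minimizes WCSS ---
centroids = np.array([-1.0, 1.0])
assign = np.argmin(np.abs(X[:, None] - centroids[None, :]), axis=1)  # nearest centroid
centroids = np.array([X[assign == k].mean() for k in range(2)])      # recompute centroids
wcss = sum(((X[assign == k] - centroids[k]) ** 2).sum() for k in range(2))

# --- One EM iteration for a 2-component GMM: soft assignments, maximizes likelihood ---
means, stds, weights = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# E-step: responsibility of each component for each point (rows sum to 1).
dens = weights * np.stack([gauss_pdf(X, means[k], stds[k]) for k in range(2)], axis=1)
resp = dens / dens.sum(axis=1, keepdims=True)
# M-step: responsibility-weighted parameter updates.
Nk = resp.sum(axis=0)
means = (resp * X[:, None]).sum(axis=0) / Nk
stds = np.sqrt((resp * (X[:, None] - means) ** 2).sum(axis=0) / Nk)
weights = Nk / len(X)
# Log-likelihood under the pre-update parameters (the quantity EM increases).
log_lik = np.log(dens.sum(axis=1)).sum()
```

Note the structural parallel: the E-step generalizes K-Means' assignment step (soft probabilities instead of a hard nearest-centroid choice), and the M-step generalizes the centroid update (responsibility-weighted means instead of plain means).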

Also, K-Means is a hard clustering algorithm: each observation is assigned to exactly one cluster. EM performs soft clustering, giving each observation a probability of belonging to each cluster, from which hard labels can still be derived by taking the most probable component. In fact, K-Means can be viewed as a limiting case of EM for a GMM with equal, spherical covariances as their variance shrinks to zero.
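The hard-versus-soft distinction is easy to see with scikit-learn (assumed available here; the toy 2-D data and two-cluster setup are assumptions for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
# Toy 2-D data: two Gaussian blobs (illustrative only).
X = np.vstack([rng.normal(-2, 0.6, (100, 2)), rng.normal(2, 0.6, (100, 2))])

# K-Means: each point gets exactly one hard label.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
hard_labels = km.labels_

# GMM fit by EM: each point gets a posterior probability per component.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
soft_probs = gmm.predict_proba(X)        # rows sum to 1
hard_from_gmm = soft_probs.argmax(axis=1)  # hard labels recovered by argmax
```

Points near a cluster boundary illustrate the practical difference: K-Means forces a single label, while the GMM's `predict_proba` row will be close to uniform, signalling genuine ambiguity.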
