The Rand Index measures how closely two clusterings of the same data agree, for example the outputs of two different clustering algorithms. It is the ratio of the number of agreeing pairs of observations to the total number of pairs, nC2 = n(n - 1)/2. A pair agrees if its two observations are assigned to the same cluster in both clusterings, or to different clusters in both clusterings. The index ranges from 0 to 1. A value of 1 indicates perfect agreement of cluster assignments, which is preferred. A value of 0 means the two clusterings disagree on every pair. Because the index is not corrected for chance, even random assignments usually score above 0, so a low value signals discordance between the clusterings rather than randomness as such.
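The pair-counting definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rand_index` and the toy label lists are assumptions made for this example, and the loop checks every pair directly, so it is only practical for small datasets.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of observation pairs on which two clusterings agree.

    Hypothetical helper for illustration. Each argument is a sequence of
    cluster labels, one per observation, in the same observation order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same observations")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two observations are needed to form a pair")

    agreeing = 0
    total = 0
    for i, j in combinations(range(n), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair agrees when it is together in both clusterings
        # or apart in both clusterings.
        if same_in_a == same_in_b:
            agreeing += 1
        total += 1
    return agreeing / total  # total equals n * (n - 1) / 2


# Toy label lists (made up for this example).
first = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]   # same grouping, different label names
crossed = [0, 1, 0, 1]     # a different grouping of the same points

print(rand_index(first, relabeled))  # 1.0
print(rand_index(first, crossed))    # 2 of 6 pairs agree
```

Only the grouping matters, not the label names, which is why `first` and `relabeled` score 1.0. For real work, scikit-learn provides `sklearn.metrics.rand_score` for this calculation.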
What is the Rand Index?