What is Term Frequency (TF)? 

A Term Frequency (TF) matrix has one row for each document in the corpus and one column for each word in the vocabulary. The entry in row d and column w is the number of times word w occurs in document d; a value of 0 means that word does not appear in document d. A large corpus usually has a large vocabulary, and any single document uses only a small fraction of it, so most entries are 0 and the matrix is typically large and sparse.
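As a rough illustration of the layout described above, here is a minimal sketch in plain Python. The three-document corpus, the document IDs, and the whitespace tokenizer are assumptions made up for this example; they are not from the original post, and a real pipeline would normally use a proper tokenizer and a sparse matrix type.

```python
from collections import Counter

# Hypothetical toy corpus (illustrative only): document ID -> text.
corpus = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "birds sing",
}

# Naive tokenization (an assumption): lowercase, then split on whitespace.
tokens = {doc_id: text.lower().split() for doc_id, text in corpus.items()}

# Vocabulary: every distinct word in the corpus, sorted for a stable column order.
vocabulary = sorted({word for words in tokens.values() for word in words})

# TF matrix: one row per document, one column per vocabulary word.
doc_ids = list(corpus)
tf_matrix = []
for doc_id in doc_ids:
    counts = Counter(tokens[doc_id])  # a Counter returns 0 for absent words
    tf_matrix.append([counts[word] for word in vocabulary])

# Count the zero entries to see how sparse even this tiny matrix is.
total = len(doc_ids) * len(vocabulary)
zeros = sum(row.count(0) for row in tf_matrix)

print(vocabulary)
for doc_id, row in zip(doc_ids, tf_matrix):
    print(doc_id, row)
print(f"{zeros} of {total} entries are zero")
```

Each row is a document and each column is a vocabulary word, so the entry for "the" in the row for d1 is 2. Even with only three short documents, more than half of the entries are 0, which is why dense lists like these are usually replaced by a sparse representation at realistic corpus sizes.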
