Word Co-occurrence Augmented Topic Model in Short Text
Authors Guan-Bin Chen, Hung-Yu Kao
Topic models learn topics based on the amount of word co-occurrence in documents. Word co-occurrence measures how often two words appear together. The biterm topic model (BTM) discovers topics from biterms across the whole corpus to overcome the lack of local word co-occurrence information in short texts. However, BTM over-represents common words, because it identifies word co-occurrence information by corpus-level biterm frequency. We therefore propose a PMI-β priors method for BTM, which adjusts the co-occurrence score to mitigate the common-word problem. Next, we describe the PMI-β priors method in detail.
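To make the contrast between raw biterm frequency and a PMI-based score concrete, here is a minimal sketch on a toy corpus. It assumes that PMI means standard pointwise mutual information and that co-occurrence is counted once per document; the function names and the toy documents are hypothetical. The abstract does not say how the PMI scores are turned into β priors, so that step is not shown.

```python
import math
from collections import Counter
from itertools import combinations

def biterm_counts(docs):
    """Count, per document, which words and unordered word pairs (biterms) occur."""
    word_df = Counter()
    pair_df = Counter()
    for doc in docs:
        vocab = sorted(set(doc))                # each word counted once per document
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs come out in sorted order
    return word_df, pair_df

def pmi(pair, word_df, pair_df, n_docs):
    """Pointwise mutual information of a word pair from document frequencies."""
    w1, w2 = sorted(pair)
    p_pair = pair_df[(w1, w2)] / n_docs
    if p_pair == 0:
        return float("-inf")                    # the pair never co-occurs
    p_w1 = word_df[w1] / n_docs
    p_w2 = word_df[w2] / n_docs
    return math.log(p_pair / (p_w1 * p_w2))

# Toy short-text corpus (illustrative only, not data from the paper).
docs = [
    ["the", "apple", "iphone"],
    ["the", "apple", "iphone"],
    ["the", "soccer", "match"],
    ["the", "soccer", "goal"],
]
word_df, pair_df = biterm_counts(docs)
n_docs = len(docs)

for pair in [("apple", "iphone"), ("apple", "the")]:
    print(pair, pair_df[pair], round(pmi(pair, word_df, pair_df, n_docs), 3))
```

In this toy corpus both biterms have the same raw frequency, but the pair containing the common word "the" gets a PMI of zero while ("apple", "iphone") gets a positive score. This is the kind of adjustment the abstract attributes to a PMI-based score.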
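Pages 164-166
Keywords Short Text; Topic Model; Document Clustering; Document Classification
Journal ROCLING Proceedings (ROCLING論文集)
Issue 2015
Publisher The Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)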
