English Abstract |
Food safety problems have alarmed people in Taiwan over the past few years, and as a result the public has paid more attention to health news. This study aims to identify the critical terms in online health news and to predict the "Like" votes that news items receive, using text mining and business intelligence algorithms. In addition, to handle the potentially large volume of data from online news, this study proposes a big data framework based on parallel processing across multiple data nodes. The results show that the support vector machine with 50 concept dimensions achieves the best prediction accuracy. When the amount of data becomes very large, the performance of the distributed computing structure improves significantly. The proposed approach can help managers of online news sites choose, or invest in, more popular health news and thereby attract more potential readers. The proposed structure and analytic results can also provide a better understanding of big data for future studies. |