Text Summarization for News Articles by Machine Learning Techniques

Authors

  • Hew Zi Jian, School of Computer Sciences, 11800, Universiti Sains Malaysia, Pulau Pinang, Malaysia.

Abstract

Text summarization plays an important role in natural language comprehension systems, where it constructs a summary of a text using more abstract, condensed knowledge structures. Extractive text summarization builds on language processing to extract the essential sentences of a long article and assemble them into a summary. Manual summarization has a long record of success, and several machine learning models for extractive text summarization have recently been proposed. However, there is a lack of research that benchmarks the comparative performance of these machine learning models. This paper therefore aims to identify the best-performing machine learning model for summarizing news articles and the best text preprocessing method for machine-learning-based summarization. The CNN/Daily Mail dataset is used for the comparative study with the chosen classifiers. The Random Forest (RF) classifier gives the best performance, with ROUGE-1, ROUGE-2, and ROUGE-L scores of 8.2845, 2.884, and 7.9694, respectively.
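
For readers unfamiliar with the ROUGE metrics named above, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed for a single candidate summary against a single reference. It is an illustration only, not the evaluation code used in the paper: the function names, the whitespace tokenization, the lowercasing, and the choice of F1 (rather than recall alone) are all assumptions, and the scale of the scores reported in the abstract is not specified here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 using clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical sentences, chosen only to exercise the functions.
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(candidate, reference, 1))
    print(rouge_n(candidate, reference, 2))
    print(rouge_l(candidate, reference))
```

In a benchmark such as the one described, scores of this kind would typically be averaged over every article and reference summary in the test set; established ROUGE packages also apply stemming and other normalization that this sketch omits.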

Keywords:

Classifier, CNN/Daily Mail, Machine Learning, News Article, Text Summarization

Published

2022-12-31

How to Cite

Hew Zi Jian. (2022). Text Summarization for News Articles by Machine Learning Techniques. Applied Mathematics and Computational Intelligence (AMCI), 11(1), 174–196. Retrieved from https://ejournal.unimap.edu.my/index.php/amci/article/view/134