Cleansing, organizing and training: Two guidelines for generating attractive news headlines for social media


In this paper, we explore the challenge of automatically generating attractive news headlines for social media, which can serve as good introductions that help news articles go viral. To this end, we propose a novel method for identifying the key sentences in a given news article that are useful for generating viral news headlines. This problem can be formulated as supervised sequence labelling that uses the most popular microblog post mentioning the news article as supervised information, and a recurrent neural network (RNN) can be used for this purpose. However, we show that a naive implementation of this approach does not work well because of noise in both the news articles and the microblog messages used as ground-truth headlines, and that cleansing and organizing the news articles and ground-truth headlines is critical for improving the performance of sequence labelling. We then propose a method for organizing a dataset for training accurate models for supervised sequence labelling. The experimental results demonstrate that our proposed method greatly improves the accuracy of key sentence identification.

Computational + Journalism Symposium (C+J)