Symposium

BlogSum: A Blog Summarization Approach based on Discourse Relations


Shamima Mithun

Symposium abstract

The large volume of opinions available online has created a need for systems that automatically summarize opinionated texts, such as blogs. However, as the 2008 Text Analysis Conference (TAC) has shown, automatic opinion summarizers still need improvement in both summary content and linguistic quality (e.g., coherence). To improve the state of the art, we propose a new approach that uses intra-sentential discourse relations, with the help of schemata, to improve both the content and the coherence of extractive summaries. Given a query and a set of related documents, sentences from the documents are first ranked by their content relatedness to the query and then analyzed to identify which discourse relations they contain. Discourse schemata are then used to select the most relevant sentences and organize them into a summary that answers the original query in the most coherent manner. The approach has been implemented in a system called BlogSum. So far, we have evaluated its results based on summary content. Using the TAC 2008 data, we have compared BlogSum-generated summaries with and without discourse structuring. In addition, we have compared BlogSum-generated summaries with the system-generated summaries of the TAC 2008 participants. In both cases, the evaluation results show that our approach improves summary content. The evaluation of the linguistic quality of the summaries remains to be done.
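The three stages described in the abstract (query-based ranking, discourse relation tagging, schema-driven selection and ordering) can be illustrated with a toy sketch. This is not BlogSum's implementation: the abstract does not specify its ranking measure, relation inventory, or schemata, so the bag-of-words cosine ranking, the cue-phrase lists in CUE_PHRASES, the SCHEMA slot order, and all function names below are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical cue phrases; the abstract does not list BlogSum's relations.
CUE_PHRASES = {
    "contrast": ("but", "however", "although"),
    "cause": ("because", "since", "due to"),
}

# Hypothetical schema: an ordered sequence of relation slots to fill.
SCHEMA = ("topic", "cause", "contrast")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(query, sentences):
    """Return (score, sentence) pairs, most query-related first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s))), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def tag_relations(sentence):
    """Tag a sentence with relations via cue phrases; default to 'topic'."""
    padded = " " + " ".join(tokenize(sentence)) + " "
    found = {
        label
        for label, cues in CUE_PHRASES.items()
        if any(f" {cue} " in padded for cue in cues)
    }
    return found or {"topic"}


def build_summary(query, sentences, schema=SCHEMA):
    """Fill each schema slot with the best-ranked unused matching sentence."""
    ranked = rank_sentences(query, sentences)
    used, summary = set(), []
    for slot in schema:
        for score, sentence in ranked:
            if score <= 0 or sentence in used:
                continue  # skip unrelated or already selected sentences
            if slot in tag_relations(sentence):
                summary.append(sentence)
                used.add(sentence)
                break
    return summary
```

In this sketch the schema, rather than the raw ranking, decides the final sentence order, which is the sense in which discourse structure is meant to help coherence. A real system would need a proper discourse relation tagger in place of the cue-phrase lookup.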

Context

Host: Université de Sherbrooke, Université Bishop’s
