Presentation of the research paper "Assessing the Quality of Human-Generated Summaries with Weakly Supervised Learning"
This paper explores how to automatically measure the quality of human-generated summaries, based on a Norwegian corpus of real estate condition reports and their corresponding summaries. The proposed approach proceeds in two steps. First, the real estate reports and their associated summaries are automatically labelled using a set of heuristic rules gathered from human experts and aggregated using weak supervision. The aggregated labels are then employed to train a neural model that takes a document and its summary as inputs and outputs a score reflecting the predicted quality of the summary. The neural model maps the document and its summary to a shared "summary content space" and computes the cosine similarity between the two embeddings to predict the final summary quality score. The best performance is achieved by a CNN-based model with an accuracy (measured against the aggregated labels obtained via weak supervision) of 89.5%, compared to 72.6% for the best unsupervised model. Manual inspection of examples indicates that the weak supervision labels do capture important indicators of summary quality, but the correlation of those labels with human judgements remains to be validated. The models of summary quality predict that approximately 30% of the real estate reports in the corpus have a summary of poor quality.
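The core scoring idea described above can be illustrated with a minimal sketch. The `bow_embedding` encoder below is a hypothetical bag-of-words stand-in for the paper's CNN encoder (whose details are not given here); only the overall shape of the method is taken from the abstract: embed document and summary into a shared space, then score the summary by cosine similarity.

```python
import numpy as np

def bow_embedding(text, vocab):
    # Toy encoder: map a text to a bag-of-words count vector over a
    # fixed vocabulary. A stand-in for the paper's learned CNN encoder.
    tokens = text.lower().split()
    return np.array([tokens.count(w) for w in vocab], dtype=float)

def summary_quality_score(document, summary, vocab):
    # Embed both texts into the same "summary content space" and score
    # the summary by the cosine similarity of the two embeddings.
    d = bow_embedding(document, vocab)
    s = bow_embedding(summary, vocab)
    denom = np.linalg.norm(d) * np.linalg.norm(s)
    return float(d @ s / denom) if denom else 0.0

# Hypothetical usage with an invented mini-vocabulary:
vocab = ["roof", "damage", "kitchen", "moisture", "garden"]
doc = "roof damage and moisture in the kitchen"
good = "roof damage moisture kitchen"
bad = "garden garden garden"
good_score = summary_quality_score(doc, good, vocab)
bad_score = summary_quality_score(doc, bad, vocab)
```

In the real system the encoder is trained so that high-quality summaries land close to their documents in the shared space; here the summary covering the document's content scores higher than the unrelated one.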
Arild Brandrud Næss is an associate professor of statistics at NTNU Business School. His research focuses mainly on natural language processing (NLP), with particular emphasis on applications within finance and economics. Næss earned his MSc in Industrial Mathematics at NTNU's Department of Mathematical Sciences and his PhD in speech technology from NTNU's Department of Electronic Systems. He has also spent two years as a visiting researcher at the Toyota Technological Institute at Chicago.