Natural Language Processing

Summarizing News by Abstractive Approach

Author(s): Edward Ma

Abstractive Summarization

In NLP, there are two approaches to text summarization. The first, the extractive approach, is simple: it extracts keywords or sentences from an article. It has limitations, its performance has proved to be not very good, and it suffers from irrelevance and redundancy. The second, the abstractive approach, generates new sentences based on a given article. It needs more advanced techniques but achieves better results.
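To make the extractive approach concrete, here is a minimal sketch that scores each sentence by the average frequency of its words and keeps the top ones. It is an illustration only, not the model described in this article: the stopword list, the function names, and the scoring rule are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# A tiny stopword list for illustration only; real systems use larger ones.
STOPWORDS = {
    "the", "a", "an", "of", "to", "in", "and", "is", "are", "it",
    "that", "this", "for", "on", "with", "as", "by", "was", "were",
}


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def content_words(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Because the output can only reuse sentences that already exist in the article, it cannot merge or rephrase ideas, which is where the redundancy and irrelevance mentioned above come from.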

Photo by Kelly Sikkema on Unsplash

Abstractive summarization has been applied mainly to text. Abstractive methods build an internal semantic representation of the original content, and then use this representation to create a summary that is closer to what a human might express. Abstraction may transform the extracted content by paraphrasing sections of the source document, to condense a text more strongly than extraction. Such transformation, however, is computationally much more challenging than extraction, involving both natural language processing and often a deep understanding of the domain of the original text in cases where the original document relates to a special field of knowledge.

Daily Usage

Text summarization has many use cases in daily life. One of them is news summarization. A detailed news article may run to several paragraphs and more than 1,000 words, and it takes several minutes to read in full. It is hard for people to digest a huge amount of local and international news covering many topics, such as finance and sports. News summarization therefore helps people get a high-level understanding quickly. Instead of spending 5 minutes reading news that may not be relevant to us, we can spend just 30 seconds getting a rough idea from the news summary.

Photo by Obi Onyeador on Unsplash

Another use is finding relevant research papers. The abstract section gives us a rough idea of what problem the practitioners want to solve and what their solution is. Otherwise, we might need to read through 10 pages to decide whether a paper is relevant.

Can we summarize news through a machine learning model?

Thanks to recent advances in NLP, summarizing news is definitely possible. We can leverage state-of-the-art NLP architectures such as sequence-to-sequence models and transformers. You also need a dataset that includes both detailed news articles and their abstracts. Finally, you need a powerful machine to train a news summarization model.
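As a rough sketch of what such a dataset looks like to a sequence-to-sequence model, each training example pairs the tokens of the full article (the input) with the tokens of its abstract (the output). Everything below is an assumption made for illustration: the whitespace tokenization, the boundary tokens, the length limits, and the names Example and make_example are not taken from the author's pipeline.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical boundary tokens that mark where a summary starts and ends.
START, END = "<s>", "</s>"


@dataclass
class Example:
    source: List[str]  # tokens of the detailed article (model input)
    target: List[str]  # tokens of the abstract, wrapped in START/END (model output)


def make_example(article, abstract, max_source_len=400, max_target_len=100):
    """Turn one (article, abstract) pair into a truncated training example."""
    # Naive whitespace tokenization; real systems usually use subword tokenizers.
    source = article.lower().split()[:max_source_len]
    # Reserve two positions in the target for the START and END tokens.
    target = abstract.lower().split()[: max_target_len - 2]
    return Example(source=source, target=[START] + target + [END])
```

Truncating the article keeps training memory bounded, at the cost of dropping whatever appears late in a long story.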

Another way is to use an API to get the summarized news. You only need to provide the content of the news, and you get the summary back without writing any machine learning code.
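Calling such a service could look roughly like the sketch below. The API itself is not documented in this article, so the endpoint URL, the "text" request field, and the "summary" response field are hypothetical placeholders; only the standard-library calls are real.

```python
import json
import urllib.request

# Hypothetical placeholder, not the author's actual endpoint.
API_URL = "https://example.com/summarize"


def build_summary_request(article_text, url=API_URL):
    """Build a JSON POST request carrying the article text."""
    payload = json.dumps({"text": article_text}).encode("utf-8")  # assumed request field
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(article_text, url=API_URL, timeout=10):
    """Send the article to the service and return the summary string."""
    request = build_summary_request(article_text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["summary"]  # assumed response field
```

Keeping request construction separate from the network call makes it easy to inspect or test the payload without contacting any server.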

Working hours during the pandemic

Here is the summary generated by my API for this news:

API Showcase

in the UK, Austria, Canada and the United States has seen a rise in working hours since the pandemic hit Europe last week. Home-working employees are now more likely to put in more hours than before, according to new research.

Latino-owned businesses growth

Here is another summary generated by my API for this news:

API Showcase

the challenges facing Latinos to secure capital from national banks, according to a new study. Latino-owned businesses are growing faster than the national average across several industries, growing 34 percent over the last 10 years compared to just 1 percent for all other small businesses.

Take Away

I trained a news summarization deep learning model and established a web server to provide this service. Drop me an email or message if you want to try this API service.

Like to learn?

I am a Data Scientist in the Bay Area, focusing on the state of the art in Data Science and Artificial Intelligence, especially NLP and platform-related topics. Feel free to connect with me on LinkedIn or Github.

Summarizing News by Abstractive Approach was originally published in Towards AI on Medium.
