Summarizing News by Abstractive Approach
Last Updated on February 8, 2021 by Editorial Team
Author(s): Edward Ma
Natural Language Processing
Abstractive Summarization
In NLP, there are two approaches to text summarization. The first, the extractive approach, is the simpler one: it extracts keywords or sentences from an article. It has known limitations, and in practice its output often suffers from irrelevance and redundancy. The second, the abstractive approach, generates new sentences based on a given article. It needs more advanced techniques but achieves better results.
Abstractive summarization has been applied mainly to text. Abstractive methods build an internal semantic representation of the original content, and then use this representation to create a summary that is closer to what a human might express. Abstraction may transform the extracted content by paraphrasing sections of the source document, condensing a text more strongly than extraction can. Such transformation, however, is computationally much more challenging than extraction, involving both natural language processing and often a deep understanding of the domain of the original text in cases where the original document relates to a special field of knowledge.
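To make the contrast concrete, here is a minimal sketch of the extractive approach, assuming a simple word-frequency score per sentence. The function names, the tiny stopword list, and the scoring rule are illustrative assumptions, not the method used in this article. Note that every sentence in the output is copied verbatim from the input, which is exactly what an abstractive model does not do.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are", "it",
             "that", "on", "for", "as", "by", "with", "this", "was"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    word_freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(word_freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

A baseline like this can only select and reorder existing sentences, so it cannot merge or paraphrase them, which is one reason its summaries tend to be redundant.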
Daily Usage
We have lots of use cases for text summarization in daily life. One valid use is news summarization. A detailed news article may include several paragraphs and over 1000 words, and it takes several minutes to read in full. It is hard for people to digest a huge amount of local and international news covering many topics such as finance, sports, etc. News summarization therefore helps people get a high-level understanding quickly. Instead of spending 5 minutes reading news that may not be relevant to us, we can spend just 30 seconds getting the rough idea from the news summary.
Another use is finding relevant research papers. The abstract section gives us a rough idea of what problem the authors want to solve and what the solution is. Otherwise, we may need to read through 10 pages to decide whether the paper is relevant.
Can we summarize news through a machine learning model?
Thanks to new technology in NLP, summarizing news is definitely possible. We can leverage state-of-the-art NLP architectures such as sequence-to-sequence models and the transformer. You also need a dataset that includes both abstracts and detailed news articles. Finally, you need a powerful machine to train a news summarization model.
Another way is to use an API to get the summarized news. You only need to provide the content of the news, and you get the summary back without writing any machine learning code.
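As a rough illustration of the dataset requirement, the sketch below turns raw records into (article, abstract) training pairs for a sequence-to-sequence model. The field names `article` and `summary`, the word limits, and the split ratio are all assumptions chosen for illustration; they do not describe the dataset used for the model in this article.

```python
import random
from dataclasses import dataclass


@dataclass
class NewsExample:
    article: str  # full news text, used as the model input
    summary: str  # human-written abstract, used as the training target


def truncate_words(text, max_words):
    """Keep at most max_words whitespace-separated words."""
    return " ".join(text.split()[:max_words])


def build_examples(records, max_article_words=400, max_summary_words=60):
    """Drop incomplete records and truncate both sides to a word budget."""
    examples = []
    for record in records:
        article = record.get("article", "").strip()
        summary = record.get("summary", "").strip()
        if not article or not summary:
            continue  # a pair is only useful if both sides are present
        examples.append(NewsExample(
            truncate_words(article, max_article_words),
            truncate_words(summary, max_summary_words),
        ))
    return examples


def train_validation_split(examples, validation_ratio=0.1, seed=13):
    """Shuffle deterministically and hold out a validation slice."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one example whenever there are two or more.
    n_val = max(1, int(len(shuffled) * validation_ratio)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]
```

Truncating by words is a simplification; a real training pipeline would usually truncate by the tokens of the model's own tokenizer.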
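Calling such a service could look roughly like the sketch below, which uses only the Python standard library. The URL, the `content` request field, and the `summary` response field are placeholders I made up for illustration; they are not the actual interface of the API described here, so adjust them to whatever service you use.

```python
import json
import urllib.request

# Placeholder endpoint (an assumption, not a real service).
API_URL = "https://example.com/summarize"


def build_summary_request(content, api_url=API_URL):
    """Build a JSON POST request that carries the news content."""
    payload = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_summary(content, api_url=API_URL, timeout=30):
    """Send the request and return the hypothetical 'summary' field."""
    request = build_summary_request(content, api_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["summary"]
```

The client only ships text and reads text back, so no model, dataset, or GPU is needed on the caller's side.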
Working hours during the pandemic
Here is the generated summary from my API of this news
in the UK, Austria, Canada and the United States has seen a rise in working hours since the pandemic hit Europe last week. Home-working employees are now more likely to put in more hours than before, according to new research.
Latino-owned businesses growth
Here is another generated summary from my API of this news
the challenges facing Latinos to secure capital from national banks, according to a new study. Latino-owned businesses are growing faster than the national average across several industries, growing 34 percent over the last 10 years compared to just 1 percent for all other small businesses.
Take Away
I trained a news summarization deep learning model and established a web server to provide this service. Drop me an email or message if you want to try this API service.
Like to learn?
I am a Data Scientist in the Bay Area, focusing on the state of the art in Data Science and Artificial Intelligence, especially NLP and platform-related topics. Feel free to connect with me on LinkedIn or Github.
Extension Reading
- Summarize document by combining extractive and abstractive steps
- Explanation of extractive way of summarization
Summarizing News by Abstractive Approach was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.