Extractive Text Summarization using TextRank Algorithm (Basic Logic)

Overview

  • Understanding Text Summarization Methods
  • Understanding the TextRank and PageRank Algorithm Process
  • Work Principle of TextRank

Introduction

Analyzing a text and drawing a conclusion from it can feel like a hassle to many people. A good reader gets to the main points by focusing on the parts that are essential to understanding the text. If you belong to the first group, then when you are asked to summarize a text or find the main idea behind it, you may simply not want to do it, or may not be able to. Whether the obstacle is time or mental effort, it is understandable that you would rather avoid the task. This is where NLP and its text summarization techniques can help. With these techniques, you can summarize articles, news, and many other text documents, or extract the main idea behind them.

Text Summarization Methods

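Text summarization methods fall into two categories:

  1. Extractive Text Summarization: This method builds the summary by selecting key phrases and sentences from the original text and reproducing them as they are.
  2. Abstractive Text Summarization: This method generates new sentences that convey the main ideas of the original text, rather than copying them from it.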

TextRank Algorithm

TextRank is a graph-based ranking model for text processing that can be used to identify the most important sentences in a text. Its basic concept is to give each sentence a score for its importance, then sort the sentences by that score. The top-ranked sentence is taken to carry the main idea of the text and can also be read as its summary.
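The scoring idea can be sketched as a PageRank-style iteration over a weighted sentence graph. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `textrank_scores`, the damping factor of 0.85, the fixed iteration count, and the toy similarity values are all chosen here for illustration and do not come from the article.

```python
def textrank_scores(similarity, damping=0.85, iterations=50):
    """Score the nodes of a weighted graph with a PageRank-style update.

    similarity[i][j] is the (assumed symmetric) edge weight between
    sentence i and sentence j. The diagonal is ignored.
    """
    n = len(similarity)
    # Total edge weight leaving each node, excluding self-loops.
    out_weight = [
        sum(similarity[j][k] for k in range(n) if k != j) for j in range(n)
    ]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = 0.0
            for j in range(n):
                if j != i and out_weight[j] > 0:
                    # Node j passes on a share of its score proportional
                    # to the weight of its edge to node i.
                    incoming += similarity[j][i] / out_weight[j] * scores[j]
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


# Hypothetical similarity values for three sentences (assumption).
toy_similarity = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]
scores = textrank_scores(toy_similarity)
# Sentence indices sorted from highest to lowest score.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy graph the sentence with the strongest links to the others ends up first, which matches the idea of sorting sentences by their importance score.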

Work Principle of TextRank

An Example of TextRank Text Summarization

As with most NLP projects, extracting the data from its original form is the first step. The data should also be cleaned to get the most accurate result possible. The texts are then split into sentences so that each sentence can be ranked on its own. After that, the words used in those texts are counted so that importance can be ranked. The sentence containing the most frequently used words becomes the front runner.
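The steps above can be sketched with the standard library alone. The sketch below makes several assumptions that are not in the article: the helper names, the naive regex-based sentence splitter, cosine similarity as the sentence similarity measure, and the sample text are all illustrative. The final ranking simply sums each sentence's similarity to the others, which is a simplification; a full TextRank implementation would run a PageRank-style iteration over the same matrix.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bag_of_words(sentence):
    """Basic cleaning (lowercase, drop punctuation), then count the words."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(sentences):
    """Pairwise sentence similarities, with zeros on the diagonal."""
    bags = [bag_of_words(s) for s in sentences]
    n = len(bags)
    return [
        [0.0 if i == j else cosine_similarity(bags[i], bags[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical sample text (assumption, not from the article).
text = "Cats sleep in the sun. Cats and dogs sleep a lot. Dogs bark a lot at night."
sentences = split_sentences(text)
matrix = similarity_matrix(sentences)

# Simplified score: total similarity of each sentence to all the others.
totals = [sum(row) for row in matrix]
ranking = sorted(range(len(sentences)), key=lambda i: totals[i], reverse=True)
print(sentences[ranking[0]])
```

The sentence that shares the most words with the rest of the text comes out on top, which is the "front runner" behavior described above.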

[Figure: bag_of_words]
[Figure: similarity of sentences]
[Figure: result]

References

https://www.researchgate.net/publication/257947528_Text_SummarizationAn_Overview

Bilgi University / Computer Engineering
