Title
A Study of Using Metaheuristic Algorithms in Automated Extractive Text Summarization
Author
Eshak, Marina Esam.
Preparation committee
Researcher / Marina Esam Eshak (مارينا عصام اسحق)
Supervisor / Moheb Ramzy Girgis (محب رمزي جرجس)
Supervisor / Mamdouh Mohamed Gomaa (ممدوح محمد جمعه)
Supervisor / Ahmed Swilem Ahmed (أحمد سويلم أحمد)
Subject
Algorithms. Computational intelligence. Speech processing systems.
Publication date
2023.
Number of pages
93 p.
Language
English
Degree
Master's
Specialization
Computer Science
Date of approval
30/11/2023
Place of approval
Minia University - Faculty of Science - Computer Science

Abstract

The rise of online publishing and easy access to the Internet have produced an overwhelming volume of electronic documents. This growth makes it hard for users to find relevant content: reading extensive texts is tiring, and interesting or important documents are easily missed. To overcome these difficulties, automatic text summarization (ATS) has gained significant attention as a means of helping users and computer systems process large amounts of text efficiently and extract relevant knowledge. ATS systems generate summaries that condense the main information of a document into a concise text.
Text summarization (TS) can be categorized according to the number of input documents, distinguishing between single-document summarization (SDS) and multi-document summarization (MDS). SDS involves extracting information from a single document, while MDS involves extracting information from multiple documents on the same topic.
Furthermore, TS can be categorized based on the type of summary output, differentiating between extractive and abstractive text summarization. In extractive summarization, the most relevant information is directly extracted from the given text. In contrast, abstractive summarization involves rephrasing or generating new sentences based on a set of relevant concepts from the given text.
The process of generating a summary using ATS can be formulated as an optimization problem: choose the subset of sentences that maximizes a measure of summary quality. Metaheuristic search algorithms, such as genetic algorithms (GAs), and heuristic search algorithms, such as simulated annealing (SA), have been successfully employed to solve optimization problems.
Therefore, the objective of this study is to assess the performance of ATS systems that utilize metaheuristic and heuristic search algorithms in automated extractive text summarization. To achieve this objective, this thesis proposes several methods for solving the SDS problem, including GA-based, SA-based, hybrid GA-SA-based, and GA-GLS (GA with Guided Local Search) based methods. These methods aim to generate high-quality summaries that contain the essential information from a given document.
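As a rough illustration of what a GA-based search over candidate summaries might look like, the following minimal sketch evolves bit lists (one bit per sentence, 1 meaning the sentence is selected). It is not the thesis's implementation: the function name `genetic_search`, the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and all parameter values are assumptions chosen for brevity.

```python
import random
from typing import Callable, List


def genetic_search(
    n: int,
    fitness: Callable[[List[int]], float],
    pop_size: int = 20,        # hypothetical default
    generations: int = 50,     # hypothetical default
    mutation_rate: float = 0.05,  # hypothetical default
    seed: int = 0,
) -> List[int]:
    """Maximize `fitness` over bit lists of length n (requires n >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament() -> List[int]:
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n - 1)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Toy fitness (an assumption): prefer summaries with exactly three sentences.
def toy_fitness(vector: List[int]) -> float:
    return -abs(sum(vector) - 3)


best_summary = genetic_search(10, toy_fitness)
```

In a real system the toy fitness would be replaced by a summary-quality objective, and a local search step (such as SA or guided local search) could be applied to the children to obtain a hybrid method.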
One crucial component in the proposed methods is the utilization of an objective function (to be maximized) to evaluate the quality of the generated solutions (summaries). This objective function is defined as a weighted sum that combines five distinct features: similarity with the title, sentence position, sentence length, coverage, and cohesion. Furthermore, the proposed methods utilize a binary vector to represent a solution (summary). Consequently, if a document consists of n sentences, the binary vector comprises n elements, with each element corresponding to a sentence in the document. A value of 1 indicates that the sentence belongs to the summary represented by the vector, while a value of 0 indicates otherwise.
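The binary-vector encoding and the weighted-sum objective can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the feature formulas or the weights, so the two toy features below (`position_score`, `length_score`), the sentence lengths, and the weights are hypothetical stand-ins for the five features named above.

```python
import math
from typing import Callable, Dict, List

# Hypothetical sentence lengths (in words) for a 4-sentence document.
SENTENCE_LENGTHS = [10, 20, 30, 40]


def selected(vector: List[int]) -> List[int]:
    """Indices of the sentences included in the summary (bits set to 1)."""
    return [i for i, bit in enumerate(vector) if bit == 1]


def position_score(vector: List[int]) -> float:
    """Toy feature: earlier sentences score higher (assumed formula)."""
    idx = selected(vector)
    if not idx:
        return 0.0
    return sum(1 - i / len(vector) for i in idx) / len(idx)


def length_score(vector: List[int]) -> float:
    """Toy feature: mean selected length relative to the longest sentence."""
    idx = selected(vector)
    if not idx:
        return 0.0
    longest = max(SENTENCE_LENGTHS)
    return sum(SENTENCE_LENGTHS[i] / longest for i in idx) / len(idx)


def objective(
    vector: List[int],
    features: Dict[str, Callable[[List[int]], float]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of feature scores; higher means a better summary."""
    return sum(weights[name] * feature(vector) for name, feature in features.items())


features = {"position": position_score, "length": length_score}
weights = {"position": 0.6, "length": 0.4}  # hypothetical weights

# Summary containing sentences 0 and 2 of the 4-sentence document.
score = objective([1, 0, 1, 0], features, weights)
```

Here the summary `[1, 0, 1, 0]` has a position score of 0.75 and a length score of 0.5, so the weighted sum is 0.6 × 0.75 + 0.4 × 0.5 = 0.65.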
The efficacy of the proposed SDS methods has been evaluated through experiments, including a comparative analysis. These experiments involved applying the methods to sample articles from the CNN corpus. The evaluation process employed co-occurrence statistical metrics, specifically the ROUGE metrics, which measure Recall, Precision, and F-measure scores. Additionally, three content-based metrics, namely Fitness, Readability, and Cohesion, were used to assess the summaries.
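For readers unfamiliar with ROUGE, the following minimal sketch computes ROUGE-N recall, precision, and F-measure from n-gram overlap between a candidate summary and a reference summary. It is a simplified illustration, not the evaluation code used in the thesis: tokenization is plain lowercase whitespace splitting, and the function name `rouge_n` is an assumption.

```python
import math
from collections import Counter
from typing import Tuple


def rouge_n(candidate: str, reference: str, n: int = 1) -> Tuple[float, float, float]:
    """Return (recall, precision, f_measure) based on clipped n-gram overlap."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return recall, precision, f_measure


recall, precision, f_measure = rouge_n("the cat sat", "the cat sat on the mat")
```

In this example all three candidate unigrams appear in the six-unigram reference, so recall is 3/6 = 0.5, precision is 3/3 = 1.0, and the F-measure is 2/3.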