Author: Moustafa, Mahmoud Mohamed Yusuf. / Title: Arabic Dialect Identification Using Deep Learning Techniques


Title

Arabic Dialect Identification Using Deep Learning Techniques

Author

Moustafa, Mahmoud Mohamed Yusuf.

Preparation committee

Researcher / محمود محمد يوسف مصطفى

hodakingdom2006@gmail.com

Supervisor / نجوى مصطفى المكي

nagwamakky@gmail.com

Supervisor / مروان عبد الحميد تركي

marwantorki@gmail.com

Examiner / محمد عبد الحميد إسماعيل أحمد

drmaismail@gmail.com

Examiner / صالح عبد الشكور الشهابي

Subject

Computer Engineering.

Publication date

2023.

Number of pages

54 p.

Language

English

Degree

Master's

Specialization

Systems and Control Engineering

Date of approval

1/1/2023

Place of approval

Alexandria University - Faculty of Engineering - Computer and Systems Engineering

Index

Only 14 pages are available for public view


Abstract

Given the challenges and complexities of dealing with Dialectal Arabic (DA) variations, Transformer-based models, e.g., BERT, have outperformed other models on the DA identification task. However, fine-tuning these models requires a large corpus, and obtaining a large number of high-quality labeled examples for some Dialectal Arabic classes is challenging and time-consuming. This thesis addresses the Dialectal Arabic identification task. Semi-Supervised Generative Adversarial Networks (SS-GAN) are used to extend the Transformer-based models ARBERT and MARBERT with unlabeled data in a generative adversarial setting. The proposed model produced high-quality embeddings for the Dialectal Arabic examples and helped the model generalize better on the downstream classification task given only a few labeled examples. The proposed model was evaluated in two setups: (1) GANBERT, in which BERT is extended with the Semi-Supervised GAN component; and (2) a two-stage setup, in which the GAN-extended model is trained for some epochs, followed by a second stage of BERT-based model training. Experimental results showed that the proposed model reached better performance and faster convergence when only a few labeled examples are available.
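As a rough illustration of the semi-supervised GAN idea mentioned in the abstract, the sketch below computes a discriminator loss over k dialect classes plus one extra "generated" class. It is not taken from the thesis: the function names, the convention that the last logit is the "generated" class, and the exact loss form (a common SS-GAN formulation) are all assumptions made for illustration, and the thesis may use a different objective. In a real setup the logits would come from a classifier head on top of ARBERT or MARBERT embeddings rather than being written by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(labeled, unlabeled, generated, eps=1e-8):
    """Hypothetical SS-GAN discriminator loss (illustrative, not the thesis code).

    labeled:   list of (logits, label) pairs for real, labeled examples
    unlabeled: list of logits for real, unlabeled examples
    generated: list of logits for examples produced by the generator
    Each logits list has k + 1 entries; the LAST entry is assumed to be
    the "generated" class.
    """
    # Supervised term: cross-entropy over the k real classes only.
    sup = 0.0
    for logits, label in labeled:
        real_class_probs = softmax(logits[:-1])
        sup += -math.log(real_class_probs[label] + eps)
    sup /= max(len(labeled), 1)

    # Unsupervised term 1: real examples (labeled or not) should NOT be
    # assigned to the "generated" class.
    real_logits = [logits for logits, _ in labeled] + list(unlabeled)
    unsup_real = sum(
        -math.log(1.0 - softmax(l)[-1] + eps) for l in real_logits
    ) / max(len(real_logits), 1)

    # Unsupervised term 2: generated examples SHOULD be assigned to the
    # "generated" class.
    unsup_fake = sum(
        -math.log(softmax(l)[-1] + eps) for l in generated
    ) / max(len(generated), 1)

    return sup + unsup_real + unsup_fake


# Toy example with k = 2 dialect classes plus the "generated" class.
good = discriminator_loss(
    labeled=[([5.0, 0.0, -5.0], 0)],
    unlabeled=[[0.0, 5.0, -5.0]],
    generated=[[-5.0, -5.0, 5.0]],
)
bad = discriminator_loss(
    labeled=[([0.0, 5.0, 5.0], 0)],
    unlabeled=[[0.0, 0.0, 5.0]],
    generated=[[5.0, 0.0, -5.0]],
)
print(good < bad)
```

The unlabeled examples contribute only through the real-versus-generated terms, which is how this kind of objective can make use of data without dialect labels.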