Title
Arabic Dialect Identification Using Deep Learning Techniques
Author
Moustafa, Mahmoud Mohamed Yusuf.
Committee
Researcher / محمود محمد يوسف مصطفى
hodakingdom2006@gmail.com
Supervisor / نجوى مصطفى المكي
nagwamakky@gmail.com
Supervisor / مروان عبد الحميد تركي
marwantorki@gmail.com
Examiner / محمد عبد الحميد إسماعيل أحمد
drmaismail@gmail.com
Examiner / صالح عبد الشكور الشهابي
Subject
Computer Engineering.
Publication date
2023.
Number of pages
54 p.
Language
English
Degree
Master's
Specialization
Systems and Control Engineering
Date of approval
1/1/2023
Place of approval
Alexandria University - Faculty of Engineering - Computer and Systems Engineering
Index
Only 14 pages are available for public view

Abstract

Given the challenges and complexities of handling Dialectal Arabic (DA) variations, Transformer-based models such as BERT have outperformed other models on the DA identification task. However, fine-tuning these models requires a large corpus, and collecting a large number of high-quality labeled examples for some DA classes is challenging and time-consuming. This thesis addresses the Dialectal Arabic identification task. Semi-Supervised Generative Adversarial Networks (SS-GAN) are used to extend the Transformer-based models ARBERT and MARBERT with unlabeled data in a generative adversarial setting. The proposed model produced high-quality embeddings for the DA examples and helped the model generalize better on the downstream classification task when given only a few labeled examples. The proposed model was evaluated in two setups: (1) GANBERT, in which we extended BERT with the Semi-Supervised GAN component; and (2) a two-stage setup, in which we trained the GAN-extended model for some epochs and then ran a second stage of BERT-based model training. Experimental results showed that the proposed model reached better performance and faster convergence when only a few labeled examples are available.
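
The semi-supervised adversarial setting mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common SS-GAN formulation, in which the discriminator predicts k dialect classes plus one extra "fake" class, and it is not taken from the thesis itself. The function names, the toy logits, and the choice of k = 2 are hypothetical and serve only as illustration; in a real system the logits and features would come from a BERT-based encoder and a generator network.

```python
import math

EPS = 1e-8  # small constant to keep log() finite


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def discriminator_loss(real_logits, real_labels, fake_logits):
    """Assumed SS-GAN discriminator loss over k + 1 classes.

    real_logits: logits for real sentences; the last entry is the 'fake' class.
    real_labels: dialect index for labeled examples, None for unlabeled ones.
    fake_logits: logits for generator-produced representations.
    """
    supervised, unsup_real = [], []
    for logits, label in zip(real_logits, real_labels):
        # Every real example should NOT be assigned to the fake class.
        p_fake = softmax(logits)[-1]
        unsup_real.append(-math.log(1.0 - p_fake + EPS))
        # Only labeled examples contribute cross-entropy over the k real classes.
        if label is not None:
            p_classes = softmax(logits[:-1])
            supervised.append(-math.log(p_classes[label] + EPS))
    # Generated examples should be assigned to the fake class.
    unsup_fake = [-math.log(softmax(l)[-1] + EPS) for l in fake_logits]
    return _mean(supervised) + _mean(unsup_real) + _mean(unsup_fake)


def generator_loss(real_features, fake_features, fake_logits):
    """Assumed generator loss: fool the discriminator plus feature matching."""
    adversarial = _mean(
        [-math.log(1.0 - softmax(l)[-1] + EPS) for l in fake_logits]
    )
    dims = len(real_features[0])
    mean_real = [_mean([f[d] for f in real_features]) for d in range(dims)]
    mean_fake = [_mean([f[d] for f in fake_features]) for d in range(dims)]
    matching = _mean([(r - f) ** 2 for r, f in zip(mean_real, mean_fake)])
    return adversarial + matching


# Toy usage with k = 2 dialect classes (hypothetical values).
real = [[5.0, 0.0, -5.0], [0.0, 5.0, -5.0]]
labels = [0, None]  # the second example is unlabeled
fake = [[-5.0, -5.0, 5.0]]
d_loss = discriminator_loss(real, labels, fake)
g_loss = generator_loss([[1.0, 2.0]], [[1.0, 2.0]], fake)
```

In this sketch the unlabeled example contributes only to the real-versus-fake term, which is how unlabeled data can influence training when few labeled examples exist.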