 

Language Engineering:
  
 

[9001819.] Paper number: 9001819 -
Modern Standard Arabic Grammar Automatic Extraction from Penn Arabic Tree bank Using Natural Language Toolkit /
Research area: Language Analysis and Comprehension
  Language Engineering: / Issue (1) - Volume (5) - April 2018
  Submission date: 11/04/2018
  Acceptance date: 11/04/2018
  Page count: 10 pages
  Amira Abdelhalim (Amira.Abdelhalim@yahoo.com) - Lead author
  Sameh Alansary (s.alansary@alexu.edu.eg)
  Keywords: Observational Based Grammar - Automatic Grammar Extraction - Rule Based Grammar - Enhancing Arabic Grammar Parsing - Statistically Directed Symbolic Parsing
  Abstract: This paper presents a methodology for a rule-based, bottom-up parsing technique for Modern Standard Arabic (MSA) in the Context Free Grammar (CFG) formalism with a Phrase Structure Grammar (PSG) representation, where the grammar is automatically extracted from a syntactically annotated corpus. The extracted grammar is used to build an automatic lexicon and grammar rules module. The extracted CFG is then transformed into a Probabilistic Context Free Grammar (PCFG), which is also computed automatically and could be used in a hybrid approach. The corpus used is the Penn Arabic Treebank (PATB), and the algorithm is implemented with the Natural Language Toolkit (NLTK). The parser showed that automatic grammar extraction improved the grammar-building phase in both coverage of structures and time needed, but the grammar still needs manual constraints to be added. Automatic grammar extraction can enhance rule-based grammar parsers and will enable a new paradigm of statistically directed symbolic parsing.
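The extraction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python instead of NLTK, the function names (`parse_bracketed`, `productions`, `estimate_pcfg`) and the toy trees are hypothetical, and it assumes clean bracketed trees, whereas real PATB files would need extra cleanup. Rule probabilities are estimated by relative frequency, one common way to turn treebank rule counts into a PCFG.

```python
from collections import Counter


def parse_bracketed(text):
    """Parse one bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # terminal word
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


def productions(tree):
    """Yield one (lhs, rhs) CFG rule per internal node; terminals are quoted."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else f'"{c}"' for c in children)
    yield (label, rhs)
    for child in children:
        if isinstance(child, tuple):
            yield from productions(child)


def estimate_pcfg(trees):
    """Relative-frequency estimate: P(lhs -> rhs) = count(rule) / count(lhs)."""
    counts = Counter(rule for tree in trees for rule in productions(tree))
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


# Hypothetical toy trees with transliterated words, not taken from the PATB.
TOY_TREES = [
    "(S (NP (NN walad)) (VP (VBD nama)))",
    "(S (NP (DT al) (NN walad)) (VP (VBD nama)))",
]
pcfg = estimate_pcfg(parse_bracketed(t) for t in TOY_TREES)
for (lhs, rhs), p in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p:.2f}]")
```

In this toy grammar the two NP expansions each occur once, so each receives probability 0.5, while every rule with a unique left-hand side receives 1.0. NLTK itself provides `Tree.productions()` and `induce_pcfg` for the same kind of estimate.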

[9001820.] Paper number: 9001820 -
Query Expansion for Arabic Information Retrieval Model: Performance Analysis and Modification /
Research area: NLP for Information Retrieval
  Language Engineering: / Issue (1) - Volume (5) - April 2018
  Submission date: 11/04/2018
  Acceptance date: 11/04/2018
  Page count: 14 pages
  Ayat Elnahaas (eng_ayatelnahas@yahoo.com) - Lead author
  Nawal Alfishawy (nelfishawy@hotmail.com)
  Mohamed Nour (mnour@eri.sci.eg)
  Gamal Attiya (gamal.attiya@yahoo.com)
  Maha Tolba (maha_saad_tolba@yahoo.com)
  Keywords: Arabic Documents, Indexing, Vector Space Model, Query Expansion, Semantics, and Relevance Feedback.
  Abstract: Information retrieval aims to find all relevant documents responding to a query from textual data. A good information retrieval system should retrieve only those documents that satisfy the user query. Although several models have been developed, most Arabic information retrieval models do not satisfy user needs. This is because the Arabic language is more powerful and has complex morphology as well as high polysemy. This paper first investigates the most recent Arabic information retrieval model and then presents two different approaches to enhance the effectiveness of the adopted model. The main idea of the proposed approaches is to modify and/or expand the user query. The first approach expands the user query by using the semantics of words according to an Arabic dictionary. The second approach modifies and/or expands the user query by adding useful information from pseudo relevance feedback. In other words, the query is modified by selecting relevant textual keywords for expanding the query and weeding out unrelated words. The adopted retrieval model and the two proposed approaches are implemented, tested, compared, and evaluated on an Arabic document collection. The obtained results show that the proposed approaches enhance the effectiveness of the Arabic information retrieval model by about 15% to 35%.
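The second approach, query expansion from pseudo relevance feedback over a vector space model, can be sketched roughly as follows. This is a generic illustration, not the model evaluated in the paper: the TF-IDF weighting, the parameters `top_docs` and `new_terms`, and the English placeholder tokens are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(query, docs, top_docs=2, new_terms=2):
    """Assume the top-ranked documents are relevant and borrow their best terms."""
    vectors, idf = tfidf_vectors(docs)
    qvec = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(qvec, vectors[i]), reverse=True)
    pooled = Counter()
    for i in ranked[:top_docs]:
        pooled.update(vectors[i])  # sum term weights over the feedback documents
    # Highest pooled weight first; ties broken alphabetically for determinism.
    candidates = sorted((t for t in pooled if t not in query),
                        key=lambda t: (-pooled[t], t))
    return list(query) + candidates[:new_terms]


# Hypothetical tokenized collection; a real system would index Arabic text.
DOCS = [
    ["arabic", "retrieval", "query", "expansion"],
    ["arabic", "query", "feedback", "relevance"],
    ["speech", "dialect", "motif"],
    ["speech", "signal", "energy"],
]
expanded = expand_query(["arabic", "query"], DOCS)
print(expanded)
```

The original query terms are kept and the added terms come only from the documents ranked highest for that query, which is the core assumption behind pseudo relevance feedback.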

[9001821.] Paper number: 9001821 -
Detecting Veracity in selected Speeches of Egyptian Presidents (1956-2015) and American Presidents (1981-2015): A Psycholinguistic Corpus-based Study /
Research area: Language Analysis and Comprehension
  Language Engineering: / Issue (1) - Volume (5) - April 2018
  Submission date: 11/04/2018
  Acceptance date: 11/04/2018
  Page count: 12 pages
  Marina S. Badawy (marinasameh1990@gmail.com) - Lead author
  Khaled A. ElGhamry (elghamryk@gmail.com)
  Radwa M. Kotait (radwa_kotait@alsun.asu.edu.eg)
  Keywords: Psychometrics; Deception; Egyptian presidential speeches; American presidential speeches
  Abstract: Language and psychology have much in common in the sense that each of these two disciplines can be used to study the features of the other. The words that people use every day carry many explicit linguistic features that signal implicit psychological traits, personal characteristics, social relations and cognitive processes. Since more attention has recently been paid to this area of research, this study investigates linguistic and psychological features used as indicators of veracity or deception. The study examines selected speeches of the last five Egyptian and American presidents using a computerized content analysis tool called LIWC (Linguistic Inquiry and Word Count). Drawing on previous research that examines the psychometrics of language, this study follows an eclectic approach, adopting Newman et al.'s model combined with other cues to deception identified by other studies. The analysis measures the frequency of pronouns, negation, exclusive words, details, conjunctions and big words in the speeches in order to reveal instances of deception.
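The kind of frequency analysis mentioned in this abstract can be illustrated with a small sketch. It does not use LIWC or its dictionaries: the `CUE_WORDS` mini-lexicon is hypothetical, only a few of the listed cues are covered, and the "big words" threshold (more than six letters) is an assumption made for the example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration; NOT the LIWC dictionary.
CUE_WORDS = {
    "first_person": {"i", "me", "my", "we", "our", "us"},
    "negation": {"no", "not", "never"},
    "exclusive": {"but", "except", "without"},
}


def cue_rates(text, big_word_len=7):
    """Return each cue category's frequency per 100 words of the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter()
    for word in words:
        for category, lexicon in CUE_WORDS.items():
            if word in lexicon:
                counts[category] += 1
        if len(word) >= big_word_len:  # assumed "big word" threshold
            counts["big_words"] += 1
    categories = [*CUE_WORDS, "big_words"]
    return {c: 100.0 * counts[c] / len(words) for c in categories}


rates = cue_rates("We did not promise anything but we never stopped working")
print(rates)
```

Rates are normalized per 100 words so that speeches of different lengths can be compared on the same scale.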

[9001824.] Paper number: 9001824 -
Spoken Arabic Dialect Identification Using Motif Discovery /
Research area: Speech Processing, Recognition and Synthesis
  Language Engineering: / Issue (1) - Volume (5) - April 2018
  Submission date: 14/04/2018
  Acceptance date: 14/04/2018
  Page count: 12 pages
  Mohsen Mofta (mohsen.moftah@barmagyat.com) - Lead author
  Mohamed Waleed Fakhr (waleedf@aast.edu)
  Salwa El Ramly (salwahelramly@gmail.com)
  Keywords: motif discovery, dialect identification, language identification, GMM-UBM, time series
  Abstract: Traditional Dialect Identification (DID) approaches, regardless of the level and type of features used for identification, either use predefined references such as phones, phonemes, or even acoustic sounds that characterize a language/dialect, or involve some sort of transcription of the input data. The transcription may be manual or automatic, using tools such as ASRs, tokenizers, or phone recognizers. In this paper, we introduce a new approach based on analyzing the speech signal directly and extracting the features that characterize the dialect without any predefined references and without any sort of transcription. The main idea is to find the repeated sequences (motifs) of the dialect by treating the speech signal as a time series, so that motif discovery techniques can be applied to extract the repeated sequences directly from the speech signal. For motif extraction, we adopted an extremely fast, parameter-free self-join motif discovery algorithm called Scalable Time series Ordered-search Matrix Profile (STOMP). We implemented the new approach in two stages: in the first, we built a baseline system in which we extracted 12 Mel Frequency Cepstral Coefficients (MFCC) from each motif; in the second, we built an improved system using 39 coefficients by adding 13 Delta coefficients, 13 Delta-Delta coefficients, and 1 Log Energy coefficient. In both systems, we used a Gaussian Mixture Model-Universal Background Model (GMM-UBM) as the classifier. We applied our new approach to three different motif lengths (500 ms, 1000 ms, and 1500 ms), using from 1 up to 2048 GMM components. We downloaded the data set from the Qatar Computing Research Institute domain. We carried out our experiments on different Arabic dialects: Egyptian (EGY), Gulf (GLF), Levantine (LEV), and North African (NOR). The baseline results were very competitive with the traditional, more sophisticated approaches, while the improved system showed very good results. The improvement was so significant that we can consider the new approach a competitive, simple, and dialect-independent approach.
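The idea of finding repeated sequences in a time series can be sketched with a brute-force matrix profile. This is not STOMP, which computes the same profile far more efficiently, and it is not the authors' code: the function names, the exclusion-zone size, and the toy series are assumptions made for the example.

```python
import math


def znorm(window):
    """Z-normalize a window so matches depend on shape, not offset or scale."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """For each length-m subsequence, find its nearest non-trivial neighbour."""
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    exclusion = max(1, m // 2)  # assumed zone that skips overlapping matches
    profile = [math.inf] * n
    index = [-1] * n
    for i in range(n):
        for j in range(n):
            if abs(i - j) < exclusion:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


def top_motif(series, m):
    """Return (start, match_start, distance) for the closest pair of windows."""
    profile, index = matrix_profile(series, m)
    best = min(range(len(profile)), key=profile.__getitem__)
    return best, index[best], profile[best]


# Hypothetical toy series: the same 4-sample pattern embedded at 3 and 12.
PATTERN = [1.0, 4.0, 2.0, 5.0]
SERIES = ([0.3, 0.9, 0.1] + PATTERN + [0.7, 0.2, 0.8, 0.4, 0.6]
          + PATTERN + [0.5, 0.0])
start, match, distance = top_motif(SERIES, m=4)
print(start, match, distance)
```

The brute-force version costs on the order of n^2 * m operations, which is why a scalable algorithm such as STOMP matters for real speech signals with many thousands of samples.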

 


Powered by Future Library Software. All rights reserved © CITC - Mansoura University. Sponsored by Mansoura University.