Title
Outlier Data Management Using Clustering Techniques /
Author
Risk, Hamada Mohamed Mohamed Elsayed.
Preparation committee
Researcher / Hamada Mohamed Mohamed Elsayed Risk
Supervisor / Amany Mahmoud Sarhan
Supervisor / Sherin Mostafa El-Gokhy
Subject
Computers and Control Engineering.
Publication date
2016.
Number of pages
128 p.
Language
English
Degree
Master's
Specialization
Systems and Control Engineering
Approval date
1/1/2016
Place of approval
Tanta University - Faculty of Engineering - Computers and Automatic Control Engineering

Abstract

Outlier detection algorithms hold a central place in the data mining field. Several applications rely on outlier detection, such as intrusion detection, fraud detection, medical and public health data, and image processing. Clustering-based outlier detection methods are considered among the most challenging outlier detection approaches. Clustering methods were not originally developed for outlier detection; nevertheless, they can be adapted to do so. These algorithms perform well when clustering outlier-free datasets, whereas the algorithms that are immune to outliers require expensive calculations. Many clustering-based outlier detection approaches have been developed; however, they suffer from a high and increasing false positive rate even when the detection rate is high.

In this thesis, we propose a hybrid clustering-based outlier detection algorithm based on a modified K-Medoids clustering and density measures. This algorithm avoids repeated distance calculations and minimizes the outlierness factor calculations. It supports searching for outliers not only in small clusters but also in large clusters, using a reduced calculation methodology. The experimental results demonstrate the good performance of the algorithm in terms of detection sensitivity: it increases the detection rate, decreases the false positive rate until it reaches a non-increasing saturation point, and minimizes outlierness factor calculations.

Most outlier detection algorithms developed so far are one-at-a-time algorithms that run from the beginning each time, which makes them infeasible for real-time applications. An on-the-fly clustering-based outlier detection framework, called OFCOD, is also proposed to enable analysts to find outliers effectively, on request and in time, even within huge datasets. The experimental results showed that the framework is effective in detecting new outliers without re-performing the clustering process, which enables its use in real-time applications.
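
The abstract does not give the details of the modified K-Medoids algorithm, the density measures, or OFCOD, so the following is only a generic sketch of the two ideas it names: scoring points against medoids found by clustering, and scoring newly arriving points without re-clustering. It uses a plain alternating k-medoids, not the thesis's variant. The function names (`fit`, `score_new`), the median-based score, and the `THRESHOLD` value are all assumptions made for illustration.

```python
import math
from statistics import median


def pairwise_distances(points):
    """Compute the full distance matrix once so later steps only do lookups."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    return dist


def assign(dist, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=50):
    """Plain alternating k-medoids (NOT the thesis's modified variant)."""
    n = len(dist)
    medoids = [0]  # deterministic farthest-first seeding
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members))
                           if members else medoids[c])
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


def fit(points, k):
    """Cluster once; return a small model plus an illustrative score per point."""
    dist = pairwise_distances(points)
    medoids, labels = k_medoids(dist, k)
    scales = []
    for c, m in enumerate(medoids):
        radii = [dist[i][m] for i in range(len(points)) if labels[i] == c]
        # Typical member-to-medoid distance; the floor avoids division by zero.
        scales.append(max(median(radii), 1e-9) if radii else 1e-9)
    scores = [dist[i][medoids[labels[i]]] / scales[labels[i]]
              for i in range(len(points))]
    model = {"medoids": [points[m] for m in medoids], "scales": scales}
    return model, scores


def score_new(point, model):
    """Score a newly arriving point against the stored medoids, without re-clustering."""
    d, c = min((math.dist(point, m), c) for c, m in enumerate(model["medoids"]))
    return d / model["scales"][c]


# Toy data: two tight groups plus one isolated point at index 8.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 20)]
model, scores = fit(points, k=2)
THRESHOLD = 3.0  # hypothetical cut-off, chosen only for this toy data
flagged = [i for i, s in enumerate(scores) if s > THRESHOLD]
```

In this sketch the distance matrix is built once and reused, and `score_new` touches only the stored medoids, so its cost grows with the number of clusters rather than the dataset size. It does not treat small clusters as outlier candidates, which the abstract says the proposed algorithm does.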