Author: Mohammed, Hanan Ahmed Abd El-Aal./ Title: AI-Based Personalized Cancer Treatment /

Title

AI-Based Personalized Cancer Treatment /

Author

Mohammed, Hanan Ahmed Abd El-Aal.

Preparation committee

Researcher / Hanan Ahmed Abd El-Aal Mohammed

Supervisor / هويدا عبد الفتاح صابر شديد

Examiner / تيسير حسن عبد الحميد

Examiner / مصطفى محمود عارف

Publication date

2023.

Number of pages

113 p.

Language

English

Degree

Doctorate

Specialization

Computer Science (miscellaneous)

Date of approval

1/1/2023

Place of approval

Ain Shams University - Faculty of Computer and Information Sciences - Department of Scientific Computing

Abstract

Cancer is a group of more than 100 different diseases that can develop almost anywhere in the body. It is the second leading cause of death worldwide; according to the WHO, it caused 9.6 million deaths in 2018. Cells are the basic units that make up the human body. They grow and divide to make new cells as the body needs them. Usually, cells die when they become too old or damaged, and new cells take their place. Cancer begins when genetic changes interfere with this orderly process and cells start to grow uncontrollably. These cells may form a mass called a tumor, which can be cancerous or benign. A cancerous tumor is malignant, meaning it can grow and spread to other parts of the body. A benign tumor can grow but will not spread. Some types of cancer do not form a tumor; these include leukemias, most types of lymphoma, and myeloma.
Cancer has traditionally been treated with surgery, chemotherapy, or radiation. Surgery is limited to solid tumors and must be performed before the tumor spreads. Both radiation and chemotherapy damage normal cells.
In 1999, a new era in medicine began: personalized treatment (personalized medicine), also called targeted therapy. Personalized medicine refers to procedures tailored to the unique variables that make each patient an individual. These variables are influenced by genes, lifestyle and behavior, and environmental factors. Thus, a medication that succeeds in one individual may not work in another. Because cancer is a genetic disease, personalized medicine uses an individual's genetic profile to guide treatment decisions. Knowledge of a patient's genetic profile can help doctors select the proper medication or therapy and administer it at the proper dose or regimen. Personalized medicine is being advanced through data from the Human Genome Project. Personalized treatment has fewer side effects, reduces damage to normal cells, and reduces the chance of the cancer recurring, all of which make it a promising approach for treating cancer.
Studying each patient's genetic profile requires a medical team to spend several working days deciding on a single patient, which is impractical given the large number of patients. Recent efforts have therefore focused on replacing this human effort with artificial intelligence-based algorithms that study the genetic profile, extract its mutations, and study the effects of various treatments on DNA.
In this thesis, we introduce efficient techniques for predicting the cancer treatment response to 265 different drugs using 19,702 genes, in order to personalize cancer treatment. These are Deep Learning (DL) techniques based on artificial neural networks. The DL techniques were selected on the basis of our review study of recent drug response prediction techniques [1]. The proposed DL techniques are trained on the genetic profile (gene expression and mutations) of cancer cell lines. The genetic data were provided by The Cancer Genome Atlas (TCGA) dataset [2] and the Cancer Cell Line Encyclopedia (CCLE) dataset [3] for 33 different types of cancer. For drug response data, the Genomics of Drug Sensitivity in Cancer (GDSC) dataset is used. GDSC is based on real clinical data. Because of the huge size of the genetic data, several preprocessing stages are applied. The first step, introduced in [4], calculates the Transcripts Per Million (TPM) of gene expression; the second, introduced in [5], applies data federation to enhance the data quality. In data federation, different datasets are consolidated to act as one dataset with a new format. We propose two DL-based techniques: the first is based on a feedforward Neural Network (NN) [6], and the second on Convolutional Neural Networks (CNNs) and their different models. We also applied classical Machine Learning (ML) algorithms, namely Support Vector Machine (SVM) and Linear Regression (LR), to demonstrate the improvement of the prediction model after the proposed data federation process.
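The TPM preprocessing step mentioned above can be illustrated with a minimal sketch. The abstract does not say how the thesis computes TPM, so the code below assumes the standard definition (length-normalize the read counts, then scale each sample to sum to one million). The function name `counts_to_tpm` and the toy inputs are hypothetical and are not taken from the thesis.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM for a single sample.

    Assumption: standard TPM definition, not necessarily the thesis pipeline.
    counts and lengths_bp are parallel lists (hypothetical inputs): the read
    count and the transcript length in base pairs for each gene.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: normalize each count by the gene length in kb.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        # No reads at all: return zeros instead of dividing by zero.
        return [0.0 for _ in rpk]
    # Scale so that the values of this sample sum to one million.
    return [r / total * 1_000_000 for r in rpk]


# Toy example with three genes (made-up numbers).
tpm = counts_to_tpm([100, 200, 300], [1000, 2000, 1500])
```

Because every sample is scaled to the same total, TPM values can be compared across samples more readily than raw counts, which is presumably why such a normalization precedes the federation step.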
For our experiments, we split the datasets (CCLE and GDSC) into 80% for training, 10% for validation, and 10% for testing. The TCGA dataset is used to train the auto-encoders used for dimension reduction. Our experimental results show that the proposed data federation improves the data quality, which improves the results of both SVM and LR compared with the results without data federation. The proposed feedforward networks achieved the best accuracy and better convergence. The proposed CNN models achieved better accuracy and convergence than the first proposed approach (Enhanced Deep-DR).
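An 80/10/10 split like the one described above might look as follows. This is only a sketch: the abstract does not state whether the split is random, stratified, or grouped by cell line, so a simple seeded random shuffle is assumed here. The function name `split_80_10_10` and the sample identifiers are hypothetical.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train/validation/test parts.

    Assumption: a plain random split; the thesis may use another scheme.
    """
    shuffled = list(items)
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10   # 80% for training
    n_val = n // 10         # 10% for validation
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder (about 10%) for testing
    return train, val, test


# Toy example: 1000 hypothetical sample identifiers.
train, val, test = split_80_10_10(range(1000))
```

Integer arithmetic is used for the split sizes so that rounding never drops or duplicates a sample; any remainder goes to the test part.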
In conclusion, the contributions of our work are an integration and data federation method that enhances the data quality, and three prediction approaches that generalize the prediction of IC50 for 265 drugs. The three approaches show better generalization by achieving better accuracy. The first approach, the Enhanced Deep-DR, reduces the MSE by 12% compared with state-of-the-art approaches. The second, the proposed CNN architectures, reduces the MSE by 24% compared with the state-of-the-art approach, and the last, the proposed feedforward-based architectures, reduces the MSE by 52%.
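The percentages above are relative reductions in mean squared error (MSE). As a hedged illustration of how such a figure can be computed, the sketch below defines MSE and the relative reduction against a baseline. The helper names `mse` and `relative_reduction` are hypothetical, and the numbers in the example are arbitrary illustration values, not results from the thesis.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_mse, new_mse):
    """Percentage by which new_mse is lower than baseline_mse."""
    if baseline_mse <= 0:
        raise ValueError("baseline_mse must be positive")
    return (baseline_mse - new_mse) / baseline_mse * 100.0


# Arbitrary illustration values (not thesis results): hypothetical IC50
# targets and one hypothetical set of predictions.
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# A hypothetical baseline MSE of 1.0 and a new MSE of 0.48 correspond to a
# relative reduction of roughly 52 percent.
reduction = relative_reduction(1.0, 0.48)
```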