An approximate functional dependencies (AFD) based approach to improve skyline queries computation and missing values estimation of skylines on crowdsourced-enabled incomplete database

Swidan, Marwa Behjat

Please use this identifier to cite or link to this item: http://studentrepo.iium.edu.my/handle/123456789/10725

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Ali A. Alwan Al-Juboori, Ph.D	en_US
dc.contributor.advisor	Sherzod Turaev, Ph.D	en_US
dc.contributor.advisor	Yonis Gulzar, Ph.D	en_US
dc.contributor.author	Swidan, Marwa Behjat	en_US
dc.date.accessioned	2021-12-03T04:17:49Z	-
dc.date.available	2021-12-03T04:17:49Z	-
dc.date.issued	2021	-
dc.identifier.uri	http://studentrepo.iium.edu.my/handle/123456789/10725	-
dc.description.abstract	Data incompleteness has become a frequent phenomenon in contemporary non-trivial database applications such as web autonomous databases, incomplete databases, big data and crowd-sourced mobile databases. Processing queries over these incomplete databases poses several challenges that degrade query processing. Most importantly, the query results derived from incomplete databases are also incomplete, as certain values of the query result are not present. Result incompleteness may mislead the user in multi-criteria decision-making and decision support systems. Skyline queries are among the most prominent queries applied in these recommendation and decision-making systems. Most recently, several studies have suggested exploiting crowd-sourced databases to estimate the missing values by generating plausible substitute values using crowd resources. Crowd-sourced databases have proved to be a powerful solution for performing user-given tasks by integrating human intelligence and experience into task processing. However, task processing on a crowd-sourced platform incurs additional monetary cost and increases time latency. Also, it is not always possible to produce a satisfactory result according to the user’s preferences. Thus, an efficient and cost-effective approach for estimating the missing values of the skylines on crowd-sourced enabled incomplete databases is needed, one that exploits the available data and the implicit relationships in the database before referring to the crowd. This thesis proposes a new approach for estimating the missing values of the skylines over incomplete databases. The approach first eliminates unwanted tuples from the initial incomplete database using data filtration to simplify the value estimation process. 
Furthermore, the approach utilizes the remaining data and exploits the implicit relationships between the attributes to impute the missing values of the skylines. It mines attribute correlations to generate a set of approximate functional dependencies (AFDs) that assist in generating the estimated values. The proposed approach also aims to reduce the number of values that must be estimated using the crowd, which is consulted only when local estimation is inappropriate. Factors that influence data processing, such as monetary cost, time latency and accuracy, are considered when working on the crowd-sourced platform to estimate the missing values of the skylines. Extensive experiments on both synthetic and real datasets have been conducted. The experimental results show that the proposed approach for estimating the missing values of the skylines over crowd-sourced enabled incomplete databases is scalable and outperforms the other existing approaches. The proposed approach simplifies missing value estimation for the skylines, reducing the number of values to be considered for estimation in the initial incomplete database by up to 80%. The experimental results also show that the proposed solution achieves the lowest relative error rate between the real missing values and the estimated values compared with the other recent approach. Most importantly, the proposed strategy is capable of estimating up to 40% of the total missing values with accuracy of up to 90% by exploiting the available data in the initial incomplete database. Lastly, the experimental results demonstrate that the approach significantly decreases the monetary cost and the time latency involved in estimating the missing values of the skylines using crowd-sourced databases.	en_US
dc.language.iso	en	en_US
dc.publisher	Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2021	en_US
dc.subject.lcsh	Crowdsourcing	en_US
dc.subject.lcsh	Application software	en_US
dc.title	An approximate functional dependencies (AFD) based approach to improve skyline queries computation and missing values estimation of skylines on crowdsourced-enabled incomplete database	en_US
dc.type	Doctoral Thesis	en_US
dc.description.identity	t11100393479MarwaBehjatSwidan	en_US
dc.description.identifier	Thesis : An approximate functional dependencies (AFD) based approach to improve skyline queries computation and missing values estimation of skylines on crowdsourced-enabled incomplete database /by Marwa Behjat Swidan	en_US
dc.description.kulliyah	Kulliyyah of Information and Communication Technology	en_US
dc.description.programme	Doctor of Philosophy in Computer Science	en_US
dc.description.abstractarabic	Data incompleteness has become a recurring phenomenon in a considerable number of contemporary database applications, such as incomplete databases, big data and mobile crowd databases. Query results derived from an incomplete database are also incomplete, that is, certain values of the query results cannot be obtained, and this may mislead the user, particularly in decision-making and decision support systems. Recently, a number of studies have proposed using crowd-sourcing databases to estimate the missing values in the database by generating plausible values using crowd resources. Studies have shown that crowd-sourced databases are a powerful solution because they integrate human intelligence, abilities and experience into task processing. However, task processing that relies on the crowd imposes a monetary cost and a waiting time on the user and does not always deliver an accurate result that satisfies the user. An efficient method is therefore needed for estimating the missing values of skylines on incomplete crowd-sourcing databases. This thesis proposes a method for estimating the missing values of skylines on incomplete crowd-sourcing databases by eliminating unwanted tuples from the initial incomplete database using data filtration, in order to simplify the process of estimating the missing values. Furthermore, it makes use of the remaining data and exploits the implicit relationships between the attributes to estimate the missing values in the skylines. The idea of mining attribute relationships is employed to generate a set of approximate functional dependencies (AFDs) that help estimate the missing values. In addition, this thesis focuses on reducing the number of values to be estimated using the crowd, so that estimation is performed using the database available in the crowd only when estimation using the implicit relationships is not appropriate. 
In addition, when working on the crowd-sourcing platform to estimate the missing values of the skylines, the factors that affect data processing on the platform, such as monetary cost, waiting time and accuracy of results, are taken into account. Extensive experiments were conducted on synthetic and real datasets, and the results show that the proposed method for estimating the missing values of skylines on incomplete crowd-sourcing databases is scalable and outperforms the existing method. It also greatly simplifies the estimation of missing values in the skylines by reducing the number of missing values to be estimated in the initial database by more than 80%. The results also show that the proposed method produces estimated values with a lower relative error rate than the most recent approaches. About 40% of the missing values can be estimated locally with accuracy of up to 90%, while the local estimation rate for missing values rises to 95% in the correlated database. Finally, the experimental results show that the proposed method leads to a significant reduction in monetary cost and waiting time when estimating the missing values of skylines using crowd-sourcing databases.	en_US
dc.description.callnumber	t QA 76.9 H84 S976A 2021	en_US
dc.description.notes	Thesis (Ph.D)--International Islamic University Malaysia, 2021.	en_US
dc.description.physicaldescription	xv, 163 leaves : colour illustrations ; 30cm.	en_US
item.openairetype	Doctoral Thesis	-
item.grantfulltext	open	-
item.fulltext	With Fulltext	-
item.languageiso639-1	en	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.cerifentitytype	Publications	-
Appears in Collections:	KICT Thesis

Files in This Item:

File	Description	Size	Format
t11100393479MarwaBehjatSwidan_24.pdf	24 pages file	477.36 kB	Adobe PDF
t11100393479MarwaBehjatSwidan_SEC.pdf (Restricted Access)	Full text secured file	1.75 MB	Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated. Please give due acknowledgement and credits to the original authors and IIUM where applicable. No items shall be used for commercialization purposes except with written consent from the author.
