Financial numeric and textual data based stock prediction using machine learning techniques

Islam, Mohammad Rabiul

Please use this identifier to cite or link to this item: http://studentrepo.iium.edu.my/handle/123456789/10425

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Imad Fakhri Al-Shaikhli, Ph.D	en_US
dc.contributor.advisor	Rizal Mohd Nor, Ph.D	en_US
dc.contributor.advisor	Afidalina Tumian, Ph.D	en_US
dc.contributor.author	Islam, Mohammad Rabiul	en_US
dc.date.accessioned	2021-03-05T01:11:38Z	-
dc.date.available	2021-03-05T01:11:38Z	-
dc.date.issued	2020	-
dc.identifier.uri	http://studentrepo.iium.edu.my/handle/123456789/10425	-
dc.description.abstract	Analyzing web text and numerical data for stock prediction is a challenging task in today’s stock market engineering. Day traders face a common decision-making problem: their decisions depend mostly on analysis of daily or weekly data. To address this problem, this research applies web text mining and data mining techniques to stock market closing values. For stock-market textual data, the soft-computing approach uses two kinds of classification, binary and multinomial, together with clustering algorithms. The new prediction models overcome drawbacks of previous research and show the need for classification by building a prediction algorithm on a token-based or polarity-based weighting scheme for financial text, the intensive scale value (ISV) system. Binary classification sharpens the distinction between positive and negative sentiment, with an intensity value, so that large volumes of financial text can be evaluated for trading decisions. The research addresses the problem of categorical financial textual and numerical data through various soft-computing techniques. The targeted numerical and textual data are processed with a neural network and with binary and multinomial classification, and feature engineering is used to improve prediction. For textual data, a novel financial data dictionary is prepared with values based on the Harvard reference weighting scheme, which are treated as likely results in the new Stock Prediction Model. Financial-news-based text analysis improves classification through Naïve Bayes binary classification with the financial data dictionary. 
Besides the text analysis, the feed-forward neural network architecture is also improved, based on the backpropagation neural network structure, by defining the correlation between the actual and predicted daily trend. Daily stock price prediction is the main objective of this research, and generating accurate predictions from daily online financial news data is essential to it. The new neural network architecture works with sequential data hidden in the dataset, to which a multi-objective optimization algorithm is applied. During feature engineering, value scaling determines the weight factors of the neural network, which is used to identify a more precise trend within the model. The model can calculate the most frequent value occurring in a large dataset, and clustering indicates the predicted stock trend. Based on numerical financial data, a new Stock Prediction Model (SPM) has been developed to analyze market movement on two benchmark numerical stock market datasets: the S&P 500 index and an OHLCV dataset. The integrated classification techniques used for prediction analysis achieve a classification accuracy of 82%, which is better than that reported in previous research. With feature engineering, text classification reaches 93%, where multilevel and binary classification are combined to obtain the best accuracy. The performance of the proposed approach was estimated by evaluating various parameters from the information retrieval field, as shown in the experimental results. However, the developed model contributes to academic research on financial data classification and is not recommended for use in real trading analysis.	en_US
dc.language.iso	en	en_US
dc.publisher	Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2020	en_US
dc.subject.lcsh	Stock price forecasting	en_US
dc.subject.lcsh	Stock exchanges	en_US
dc.title	Financial numeric and textual data based stock prediction using machine learning techniques	en_US
dc.type	Doctoral Thesis	en_US
dc.description.identity	t11100424833MohammadRabiulIslam	en_US
dc.description.identifier	Thesis : Financial numeric and textual data based stock prediction using machine learning techniques / by Mohammad Rabiul Islam	en_US
dc.description.kulliyah	Kulliyyah of Information and Communication Technology	en_US
dc.description.programme	Doctor of Philosophy in Computer Science	en_US
dc.description.abstractarabic	Electronic sources of text and the numerical analysis of data for stock prediction are a difficult task in today's stock market engineering. Day traders in the stock market face common decision-making issues, which depend mostly on the analysis of daily or weekly stock market data. To overcome this problem, data mining and analysis techniques are applied to the closing values of these markets, which yields the best technical approach. For the classification of stock-market textual data, the applicable computing methods comprise two classes, one binary and the other multinomial, with clustering algorithms used for the analysis. The new prediction models overcome the shortcomings of previous research and indicate the necessity of classification by creating a prediction algorithm with a token or a financial text weighting scheme based on the intensive scale value (ISV) system. Binary classification helps to improve the sense of positivity and negativity, with an intensive value, for evaluating the vast amount of financial textual data for trading decisions. This study is improved through the technical correlation that addressed the problem of numerical and textual financial data across various computing techniques. The targeted numerical and textual data rely on neural networks and on binary and multinomial classification to improve prediction techniques through feature engineering. In terms of textual data, the new financial data dictionary was prepared based on the Harvard reference weighting scheme, which was identified as a likely result of the new stock prediction model. Analysis of financial text data based on financial news improves the classification scenario by using binary Naïve Bayes classification through the financial data dictionary. Besides the text analysis, the feed-forward neural network architecture was also improved based on the backpropagation neural network structure, by determining the correlation between the actual and the predicted trend on a daily basis. Daily stock price prediction is the main objective of this research and is essential for providing accurate prediction from daily online financial news data. 
The new neural network model works with sequential data hidden in the dataset, to which the multi-objective optimization algorithm is applied. During feature engineering, the settings are determined by scaling the values of the weight factors in order to develop a neural network that is used to identify a more valuable trend within this model. The model is able to calculate the most frequent value occurring in the dataset, and clustering indicates the stock trend of the prediction. Based on numerical financial data, a new Stock Prediction Model (SPM) was developed to analyze market movement using two numerical stock market datasets, the S&P 500 index and OHLCV. The integrated classification techniques developed with the prediction analysis reach a classification accuracy of 82% in this research, which is clearly better than the classification accuracy in previous research. Performance with feature engineering in text classification reaches 93%, while binary and multilevel classification were combined to obtain the best level of accuracy. The performance of the proposed approach was estimated by evaluating various parameters as part of the information retrieval field, as shown in the experimental results. However, we do not recommend using the developed model, which is built on an academic research philosophy of financial data classification, in real trading analysis.	en_US
dc.description.callnumber	t HG 4637 I82F 2020	en_US
dc.description.notes	Thesis (Ph.D)--International Islamic University Malaysia, 2020.	en_US
dc.description.physicaldescription	xviii, 226 leaves : colour illustrations ; 30cm.	en_US
item.openairetype	Doctoral Thesis	-
item.grantfulltext	open	-
item.fulltext	With Fulltext	-
item.languageiso639-1	en	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.cerifentitytype	Publications	-
Appears in Collections:	KICT Thesis

Files in This Item:

File	Description	Size	Format
t11100424833MohammadRabiulIslam_24.pdf	24 pages file	317.43 kB	Adobe PDF
t11100424833MohammadRabiulIslam_SEC.pdf	Full text secured file (restricted access)	4.53 MB	Adobe PDF

Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated. Please give due acknowledgement and credits to the original authors and IIUM where applicable. No items shall be used for commercialization purposes except with written consent from the author.