Reputation Scoring Fake News Using Text Mining

Firdaus, Ahmad and Galinium, Maulahikmah and Erwin, Alva (2017) Reputation Scoring Fake News Using Text Mining. Masters thesis, Swiss German University.

Abstract

Classifying hoax news, that is, news containing incorrect information, is an application of text categorization. Like machine-learning text categorization systems in general, this system consists of a pre-processing stage and the execution of classification models. In this study, experiments were conducted to select the best technique for each sub-process, using 1200 hoax articles and 600 non-hoax articles collected manually. The research experimented with stop-word removal and stemming to determine the best pre-processing stages. The Decision Tree algorithm reached an accuracy of 100%, while Naive Bayes showed a more stable level of accuracy across the dataset sizes used for all candidates. Information Gain, TF-IDF and GGA were evaluated with the Naive Bayes, Support Vector Machine and Decision Tree algorithms; no significant percentage change occurred for any candidate, but accuracy increased after GGA (Optimize Generation) feature selection was applied. In the comparison of the Naive Bayes, Decision Tree and Support Vector Machine classifiers combined with feature selection, the best result came from GGA + Decision Tree on candidate 2 (Paslon2), at 100%, and the lowest accuracy came from Information Gain + Decision Tree on candidate 3, at 36.67%. Overall, accuracy improved for all algorithms after feature selection. Naive Bayes had the shortest processing time and Decision Tree the longest, and Naive Bayes showed a more stable level of accuracy across the dataset sizes used for all candidates.
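
As an illustration of one of the feature selection methods named in the abstract, the following is a minimal sketch of scoring terms by Information Gain. It is not taken from the thesis: the toy corpus, the label names and the function names are all assumptions made for this example, and each document is assumed to be a set of tokens that has already been through stop-word removal and stemming.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    remainder = 0.0
    for subset in (with_term, without_term):
        if subset:  # skip an empty side of the split
            remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy corpus (not the thesis dataset): pre-processed token sets.
docs = [
    {"shocking", "secret", "share"},
    {"shocking", "viral", "share"},
    {"minister", "budget", "report"},
    {"election", "report", "share"},
]
labels = ["hoax", "hoax", "not_hoax", "not_hoax"]

# Score every term in the vocabulary and rank from most to least informative.
vocabulary = sorted(set().union(*docs))
scores = {term: information_gain(docs, labels, term) for term in vocabulary}
ranked = sorted(vocabulary, key=lambda term: scores[term], reverse=True)
```

In this toy corpus, terms that appear only in one class score highest, and a classifier such as Naive Bayes or a decision tree would then be trained on the top-ranked terms only.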

Item Type: Thesis (Masters)
Uncontrolled Keywords: Classification; Pre-processing; Feature Selection; Accuracy
Subjects: P Language and Literature > P Philology. Linguistics > P96 Media literacy
Q Science > QA Mathematics > QA76 Computer software >
Divisions: Faculty of Engineering and Information Technology > Department of Information Technology
Depositing User: Astuti Kusumaningrum
Date Deposited: 12 May 2020 08:15
Last Modified: 12 May 2020 08:15
URI: http://repository.sgu.ac.id/id/eprint/282
