Developing a Scalable and Accurate Job Recommendation System with Distributed Cluster System using Machine Learning Algorithm

Dicky, Timothy and Erwin, Alva and Ipung, Heru Purnomo (2019) Developing a Scalable and Accurate Job Recommendation System with Distributed Cluster System using Machine Learning Algorithm. Bachelor thesis, Swiss German University.


Abstract

The purpose of this research is to develop a job recommender system based on the Hadoop MapReduce framework so that the system remains scalable when it processes big data. In addition, a machine learning algorithm is implemented inside the job recommender to produce accurate job recommendations. The project begins by collecting sample data to build an accurate job recommender system with a centralized program architecture. A job recommender with a distributed program architecture is then implemented using Hadoop MapReduce and deployed to a Hadoop cluster. After implementation, both systems are tested on a large volume of applicant and job data, and the time each program requires to process the data is recorded for analysis. Based on the experiments, we conclude that the recommender produces the most accurate results when the cosine similarity measure is used inside the algorithm. The experiments also show that the centralized job recommender system processes the data faster than the distributed cluster job recommender system. However, as the data grows, the centralized system will eventually lack the capacity to process it, whereas the distributed cluster job recommender can scale with the size of the data.
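
As an illustration of the cosine similarity measure named in the abstract, the following is a minimal Python sketch, not the thesis implementation. It assumes that applicants and jobs are each described by a list of keywords turned into term-count vectors; the abstract does not state which features the thesis uses. The function names `cosine_similarity` and `recommend` and the sample data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile matches nothing
    return dot / (norm_a * norm_b)


def recommend(applicant_terms, jobs, top_n=3):
    """Rank jobs by similarity to one applicant (hypothetical helper)."""
    applicant_vec = Counter(applicant_terms)
    scored = [
        (cosine_similarity(applicant_vec, Counter(terms)), job_id)
        for job_id, terms in jobs.items()
    ]
    # Highest score first; ties are broken by job id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(job_id, score) for score, job_id in scored[:top_n]]


# Made-up sample data, for illustration only.
applicant = ["python", "hadoop", "sql"]
jobs = {
    "data_engineer": ["hadoop", "sql", "python"],
    "analyst": ["sql", "excel"],
    "designer": ["photoshop", "css"],
}
ranking = recommend(applicant, jobs)
```

In a distributed setting, scoring each applicant and job pair is independent of every other pair, which is one reason this kind of computation lends itself to a map step; how the thesis actually partitions the work across the Hadoop cluster is not described in the abstract.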

Item Type: Thesis (Bachelor)
Uncontrolled Keywords: Machine Learning ; Distributed Cluster System ; Hadoop MapReduce
Subjects: T Technology > T Technology (General) > T58.6 Management information systems
Divisions: Faculty of Engineering and Information Technology > Department of Information Technology
Depositing User: Adityatama Ratangga
Date Deposited: 19 May 2020 13:59
Last Modified: 19 May 2020 13:59
URI: http://repository.sgu.ac.id/id/eprint/623
