Dicky, Timothy and Erwin, Alva and Ipung, Heru Purnomo (2019) Developing a Scalable and Accurate Job Recommendation System with Distributed Cluster System using Machine Learning Algorithm. Bachelor thesis, Swiss German University.
Abstract
The purpose of this research is to develop a job recommender system based on the Hadoop MapReduce framework so that the system remains scalable when it processes big data. A machine learning algorithm is also implemented inside the job recommender to produce accurate job recommendations. The project begins by collecting sample data to build an accurate job recommender system with a centralized program architecture. A job recommender with a distributed program architecture is then implemented using Hadoop MapReduce and deployed to a Hadoop cluster. After implementation, both systems are tested on a large number of applicant and job records, and the time each program requires to process the data is recorded for analysis. Based on the experiments, we conclude that the recommender produces the most accurate results when the cosine similarity measure is used inside the algorithm. The centralized job recommender system also processes the data faster than the distributed cluster job recommender system. However, as the size of the data grows, the centralized system will eventually lack the capacity to process it, while the distributed cluster job recommender is able to scale with the size of the data.
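As a rough illustration of the approach the abstract describes, the following minimal sketch scores applicant and job pairs with cosine similarity and arranges the work as a map step followed by a reduce step. It is plain Python run locally, not the Hadoop API, and it is not taken from the thesis: the bag-of-words features, the function names (`vectorize`, `map_pairs`, `reduce_top_n`), and the sample records are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    # Assumed feature choice: simple bag-of-words term counts.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Dot product over shared terms, divided by the product of the norms.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_pairs(applicants, jobs):
    # Map step: emit (applicant_id, (job_id, score)) for every pair.
    for applicant_id, profile in applicants.items():
        profile_vec = vectorize(profile)
        for job_id, description in jobs.items():
            score = cosine_similarity(profile_vec, vectorize(description))
            yield applicant_id, (job_id, score)


def reduce_top_n(pairs, top_n=2):
    # Reduce step: group by applicant and keep the highest-scoring jobs.
    grouped = defaultdict(list)
    for applicant_id, scored_job in pairs:
        grouped[applicant_id].append(scored_job)
    return {
        applicant_id: sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]
        for applicant_id, scored in grouped.items()
    }


# Hypothetical sample data, for illustration only.
applicants = {"a1": "python hadoop data engineer"}
jobs = {
    "j1": "data engineer hadoop mapreduce",
    "j2": "graphic designer photoshop",
    "j3": "python developer",
}
recommendations = reduce_top_n(map_pairs(applicants, jobs))
```

In a cluster setting the map step could run in parallel over partitions of the applicant data, which is the property that lets a distributed version scale as the data grows, at the cost of coordination overhead on small inputs.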
Item Type: | Thesis (Bachelor) |
---|---|
Uncontrolled Keywords: | Machine Learning ; Distributed Cluster System ; Hadoop MapReduce |
Subjects: | T Technology > T Technology (General) > T58.6 Management information systems |
Divisions: | Faculty of Engineering and Information Technology > Department of Information Technology |
Depositing User: | Adityatama Ratangga |
Date Deposited: | 19 May 2020 13:59 |
Last Modified: | 19 May 2020 13:59 |
URI: | http://repository.sgu.ac.id/id/eprint/623 |