Social Media User Personality Classification Using Computational Linguistic

Lukito, Louis Christy and Erwin, Alva and Purnama, James (2016) Social Media User Personality Classification Using Computational Linguistic. Bachelor thesis, Swiss German University.

Abstract

Personality is what differentiates one individual from another. Knowing and understanding an individual’s personality brings many advantages, and with the rapid growth of technology it can now be determined automatically. Psychology research suggests that certain personality traits correlate with linguistic behavior. Given the popularity of social media, predicting people’s personality from their posts has become possible. Most existing studies take a similar approach to predicting personality from social media, but they focus on closed-vocabulary investigation of English text and are mostly based on the Big Five personality model. In this thesis, we explore Twitter as a data source for open-vocabulary personality prediction in Indonesia. We analyze and compare three statistical models and identify which personality traits correlate with linguistic behavior. The Naïve Bayes classifier outperforms the other statistical models, achieving the highest accuracy (80% for I/E and 60% for S/N, T/F, and J/P personality traits) and the fastest classification of users. In addition, a simple application was developed on top of the best-performing model to classify an individual’s personality, taking their Twitter username and gender as input. The experiment also yields the 30 most frequently used words per personality trait. All of the experiments were validated with a psychology expert.
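
The abstract does not describe the implementation, so the following is only a minimal sketch of what open-vocabulary Naïve Bayes classification of one trait (I/E) could look like. It assumes a multinomial model with add-one smoothing and whitespace tokenization. The function names (`train`, `predict`, `top_words`) and the four toy posts are hypothetical and are not taken from the thesis or its data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy posts labeled E (extravert) or I (introvert); not thesis data.
TOY = [
    ("ayo kumpul bareng teman malam ini", "E"),
    ("seru banget pesta tadi malam", "E"),
    ("lebih suka baca buku sendiri di rumah", "I"),
    ("butuh waktu sendiri yang tenang", "I"),
]

def tokenize(text):
    # Open vocabulary: every token seen in the data becomes a feature.
    return text.lower().split()

def train(samples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def top_words(model, label, n=30):
    """Most frequent words for one trait label, most frequent first."""
    return [word for word, _ in model[1][label].most_common(n)]

model = train(TOY)
```

Under these assumptions, `predict(model, "pesta bareng teman")` selects the label whose training posts share the most vocabulary with the input, and `top_words(model, "E")` gives a per-trait frequency list of the kind the abstract mentions. One such binary classifier would be needed per trait pair (I/E, S/N, T/F, J/P).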

Item Type: Thesis (Bachelor)
Uncontrolled Keywords: Personality; Twitter; Lexical Approach; Grammatic Rule; Naïve Bayes
Subjects: H Social Sciences > HM Sociology > HM742 Online social networks > HM742.1 Social Media
T Technology > T Technology (General) > T58.5 Information technology
Divisions: Faculty of Engineering and Information Technology > Department of Information Technology
Depositing User: Astuti Kusumaningrum
Date Deposited: 19 Oct 2020 15:08
Last Modified: 19 Oct 2020 15:08
URI: http://repository.sgu.ac.id/id/eprint/919
