Inssi


Author:	Choudhury, Najeefa
Title:	Sentiment Analysis of Twitter Data for a Tourism Recommender System in Bangladesh
Publication type:	Master's thesis
Publication year:	2016
Pages:	(8) + 55 pp. + appendices 10 pp.
Language:	eng
Department/School:	School of Science (Perustieteiden korkeakoulu)
Main subject:	Cloud Computing and Services (SCI3081)
Supervisor:	Heljanko, Keijo
Instructor:	Heljanko, Keijo
Electronic version URL:	http://urn.fi/URN:NBN:fi:aalto-201612226260
Location:	P1 Ark Aalto 5928 | Archive
Keywords:	big data, sentiment analysis, Twitter, tourism, scala, spark
Abstract (eng):	The exponentially expanding Digital Universe is generating huge amounts of data containing valuable information. The tourism industry, one of the fastest-growing economic sectors, can benefit from the myriad of digital data that travelers generate in every phase of their travel: planning, booking, traveling, feedback, etc. One application of tourism-related data is to provide personalized destination recommendations. The primary objective of this research is to facilitate the business development of a tourism recommendation system for Bangladesh called "JatraLog". Sentiment-based recommendation is one of the features that will be employed in the recommendation system. This thesis addresses two research goals: first, to study Sentiment Analysis as a tourism recommendation tool, and second, to investigate Twitter as a potential source of valuable tourism-related data for providing recommendations for different countries, specifically Bangladesh. Sentiment Analysis can be defined as a Text Classification problem in which a document or text is classified into two groups, positive or negative, and in some cases a third group, neutral. For this thesis, two sets of tourism-related English-language tweets were collected from Twitter using keywords. The first set contains only the tweets; the second set contains geo-location and timestamp along with the tweets. The collected tweets were then automatically labeled as positive or negative depending on whether they contained positive or negative emoticons, respectively. After labeling, 90% of the tweets from the first set were used to train a Naive Bayes Sentiment Classifier and the remaining 10% were used to test its accuracy. The Classifier accuracy was found to be approximately 86.5%. The second set was used to retrieve the statistical information required to address the second research goal, i.e. investigating Twitter as a potential source of sentiment data for a destination recommendation system.
ED:	2017-01-08

INSSI record number: 55298
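
Illustrative sketch (not part of the catalog record): the abstract describes labeling tweets by emoticon, splitting them 90/10, and training a Naive Bayes classifier. The Python below is a minimal sketch of that kind of pipeline under stated assumptions, not the thesis implementation, which the keywords suggest was built with Scala and Spark. The emoticon lists, whitespace tokenization, add-one smoothing, and all function names are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical emoticon lists; the record does not say which emoticons were used.
POSITIVE_EMOTICONS = {":)", ":-)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":-("}


def label_by_emoticon(tweet):
    """Return 'pos' or 'neg', or None when emoticons are absent or conflicting."""
    tokens = set(tweet.split())
    has_pos = bool(tokens & POSITIVE_EMOTICONS)
    has_neg = bool(tokens & NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def features(tweet):
    """Lowercased whitespace tokens, with emoticons removed so the label does not leak."""
    return [t.lower() for t in tweet.split()
            if t not in POSITIVE_EMOTICONS and t not in NEGATIVE_EMOTICONS]


def train_naive_bayes(examples):
    """Collect class and word counts from a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Pick the class with the highest log-probability (multinomial, add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokens:
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def split_train_test(examples, train_fraction=0.9, seed=0):
    """Shuffle a copy of the examples and split it, 90/10 by default."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, examples):
    """Fraction of (tokens, label) pairs that the model classifies correctly."""
    correct = sum(predict(model, tokens) == label for tokens, label in examples)
    return correct / len(examples)
```

In use, each collected tweet would be passed through `label_by_emoticon`, unlabeled tweets dropped, and the remaining `(features(tweet), label)` pairs split with `split_train_test` before calling `train_naive_bayes` on the larger part and `accuracy` on the smaller one.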
