Author:Choudhury, Najeefa
Title:Sentiment Analysis of Twitter Data for a Tourism Recommender System in Bangladesh
Publication type:Master's thesis
Publication year:2016
Pages:(8) + 55 pp. + app. 10      Language:   eng
School/Department/Unit:School of Science
Major:Cloud Computing and Services   (SCI3081)
Supervisor:Heljanko, Keijo
Advisor:Heljanko, Keijo
Electronic publication: http://urn.fi/URN:NBN:fi:aalto-201612226260
Location:P1 Ark Aalto  5928   | Archive
Keywords:big data
sentiment analysis
Twitter
tourism
scala
spark
Abstract (eng):The exponentially expanding Digital Universe is generating huge amounts of data containing valuable information.
The tourism industry, one of the fastest-growing economic sectors, can benefit from the wealth of digital data that travelers generate in every phase of their travel: planning, booking, traveling, feedback, etc.
One application of tourism-related data is to provide personalized destination recommendations.
The primary objective of this research is to facilitate the business development of a tourism recommendation system for Bangladesh called "JatraLog".
Sentiment-based recommendation is one of the features that will be employed in the recommendation system.
This thesis addresses two research goals: first, to study Sentiment Analysis as a tourism recommendation tool, and second, to investigate Twitter as a potential source of valuable tourism-related data for providing recommendations for different countries, specifically Bangladesh.

Sentiment Analysis can be defined as a Text Classification problem, where a document or text is classified into one of two groups, positive or negative, and in some cases a third group, neutral.
For this thesis, two sets of tourism-related English-language tweets were collected from Twitter using keywords.
The first set contains only the tweets and the second set contains geo-location and timestamp along with the tweets.
Then the collected tweets were automatically labeled as positive or negative depending on whether the tweets contained positive or negative emoticons respectively.
After they were labeled, 90% of the tweets from the first set were used to train a Naive Bayes Sentiment Classifier and the remaining 10% were used to test the accuracy of the Classifier.
The Classifier accuracy was found to be approximately 86.5%.
The second set was used to retrieve statistical information required to address the second research goal, i.e. investigating Twitter as a potential source of sentiment data for a destination recommendation system.
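
The labeling, training, and evaluation steps described in the abstract could be sketched roughly as follows. This is a minimal standard-library Python illustration, not the thesis's implementation (the keywords suggest the actual work used Scala and Spark). The emoticon lists, all function names, the add-one smoothing, the removal of emoticons before tokenizing, and the fixed shuffle seed are assumptions made for the example.

```python
import math
import random
from collections import Counter

# Hypothetical emoticon lists; the abstract does not say which ones were used.
POSITIVE = (":)", ":-)", ":D", "=)")
NEGATIVE = (":(", ":-(", ":'(", "=(")


def label_by_emoticon(tweet):
    """Return 'pos' or 'neg', or None if the tweet has no or mixed emoticons."""
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def tokenize(tweet):
    """Lowercase whitespace tokenizer that strips the labeling emoticons
    (an assumption: otherwise the classifier could simply read the label)."""
    for e in POSITIVE + NEGATIVE:
        tweet = tweet.replace(e, " ")
    return tweet.lower().split()


def split_train_test(examples, train_fraction=0.9, seed=0):
    """Shuffle a copy of the examples and split them, 90/10 by default."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_naive_bayes(examples):
    """Count classes and words from a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Multinomial Naive Bayes with add-one smoothing; unseen words are skipped."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("pos", "neg"):
        # Smoothed log prior, so an empty class never produces log(0).
        score = math.log((class_counts[label] + 1) / (total + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, test_set):
    """Fraction of (tokens, label) pairs in test_set that are predicted correctly."""
    correct = sum(predict(model, tokens) == label for tokens, label in test_set)
    return correct / len(test_set)
```

In use, each collected tweet would be passed through `label_by_emoticon`, unlabeled tweets dropped, the rest tokenized and split with `split_train_test`, and `accuracy` computed on the held-out 10%. The 86.5% figure reported above comes from the thesis, not from this sketch.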
ED:2017-01-08
INSSI record number: 55298