Author: | Heuer, Hendrik |
Title: | Semantic and stylistic text analysis and text summary evaluation |
Publication type: | Master's thesis |
Year of publication: | 2015 |
Pages: | [4] + 42 Language: eng |
School/Department: | School of Science |
Subject: | Human Computer Interaction and Design (SCI3020) |
Supervisor: | Kaski, Samuel ; Karlgren, Jussi |
Advisor: | Laaksonen, Jorma |
Electronic publication: | http://urn.fi/URN:NBN:fi:aalto-201509184348 |
Location: | P1 Ark Aalto 3105 | Archive |
Keywords: | text analysis; machine learning; distributional semantics; word representations; word2vec; dimensionality reduction |
Abstract (eng): | The main contribution of this Master's thesis is a novel way of doing text comparison using word vector representations (word2vec) and dimensionality reduction (t-SNE). This yields a bird's-eye view of different text sources, including text summaries and their source material, and enables users to explore a text source like a geographical map. The main goal of the thesis was to support the quality control and quality assurance efforts of a company. This goal was operationalized and subdivided into several modules. In this thesis, the Topic and Topic Comparison modules are described. For each module, the state of the art in natural language processing and machine learning research was investigated and applied. The implementation section of this thesis discusses what each module does, how it relates to theory, how the module is implemented, the motivation for the chosen approach and self-criticism. The thesis also describes how to derive a text quality gold standard using machine learning. |
ED: | 2015-09-27 |
INSSI record number: 52066