Author: | Perello Nieto, Miquel |
Title: | Merging chrominance and luminance in early, medium, and late fusion using Convolutional Neural Networks |
Publication type: | Master's thesis |
Year of publication: | 2015 |
Pages: | xxiv + 166 Language: eng |
School/Department/Division: | School of Science (Perustieteiden korkeakoulu) |
Major subject: | Machine Learning and Data Mining (SCI3015) |
Supervisor: | Raiko, Tapani |
Instructor: | Koskela, Markus ; Gavaldá Mestre, Ricard |
Electronic publication: | http://urn.fi/URN:NBN:fi:aalto-201506303564 |
Location: | P1 Ark Aalto 2889 | Archive |
Keywords: | machine learning; computer vision; image classification; artificial neural network; convolutional neural network; connectionism |
Abstract (eng): | The field of Machine Learning has received extensive attention in recent years. In particular, computer vision problems have received considerable attention as the use of images and pictures in our daily routines grows. Image classification is one of the most important tasks, and it can be used to organize, store, retrieve, and explain pictures. To that end, researchers have been designing algorithms that automatically detect objects in images. During the last decades, the common approach has been to create sets of manually designed features that could be exploited by image classification algorithms. More recently, researchers have designed algorithms that automatically learn these sets of features, surpassing the previous state-of-the-art performance. However, learning optimal sets of features is computationally expensive; this cost can be reduced by adding prior knowledge about the task, which improves and accelerates the learning phase. Furthermore, for problems with a large feature space, the complexity of the models needs to be reduced to make them computationally tractable (e.g. the recognition of human actions in videos). Consequently, we propose to use multimodal learning techniques to reduce the complexity of the learning phase in Artificial Neural Networks by incorporating prior knowledge about the connectivity of the network. Furthermore, we analyze state-of-the-art models for image classification and propose new architectures that can learn a locally optimal set of features more easily and quickly. In this thesis, we demonstrate that merging the luminance and chrominance parts of the images using multimodal learning techniques can improve the acquisition of a good set of visual features. We compare the validation accuracy of several models and demonstrate that our approach outperforms the basic model with statistically significant results. |
ED: | 2015-08-16 |
INSSI record number: 51980