Author: Mukherjee, Alapan
Title: Benchmarking Hadoop performance on different distributed storage systems
Publication type: Master's thesis
Publication year: 2015
Pages: 110   Language: eng
School/Department: School of Science
Subject: Mobile Computing (T-110)
Supervisor: Heljanko, Keijo
Advisor: Döngelci, Ridvan
Electronic publication: http://urn.fi/URN:NBN:fi:aalto-201509184328
Location: P1 Ark Aalto 3020 | Archive
Keywords: Tachyon
HDFS
Ceph
benchmarks
Abstract (eng): Distributed storage systems have been in place for years, and have undergone significant changes in architecture to ensure reliable storage of data in a cost-effective manner.
With the demand for data increasing, there has been a shift from disk-centric to memory-centric computing - the focus is on saving data in memory rather than on the disk.
The primary motivation for this is the increased speed of data processing.
This could, however, mean a change in the approach to providing the necessary fault-tolerance - instead of data replication, other techniques may be considered.

One example of an in-memory distributed storage system is Tachyon.
Instead of replicating data files in memory, Tachyon provides fault-tolerance by maintaining a record of the operations needed to generate the data files.
These operations are replayed if the files are lost.
This approach is termed lineage.
Tachyon is already deployed by many well-known companies.

This thesis work compares the storage performance of Tachyon with that of the on-disk storage systems HDFS and Ceph.
After studying the architectures of well-known distributed storage systems, the work makes its major contribution: integrating Tachyon with Ceph as an underlayer storage system, understanding how this affects Tachyon's performance, and determining how to tune Tachyon to extract maximum performance from it.
ED:2015-09-27
INSSI record number: 52047