Author:Honkanen, Rami
Title:Genomic data staging for parallel analysis
Genomisen tiedon välivarastointi rinnakkaista käsittelyä varten
Publication type:Master's thesis
Publication year:2014
Pages:86   Language:eng
Department/School:School of Science (Perustieteiden korkeakoulu)
Main subject:Software Technology (Ohjelmistotekniikka)   (T3001)
Supervisor:Heljanko, Keijo
Instructor:Sevon, Petteri
Electronic version URL: http://urn.fi/URN:NBN:fi:aalto-201406252191
OEVS:
Electronic archive copy is available via Aalto Thesis Database.
Location:P1 Ark Aalto  1725   | Archive
Keywords:distributed storage
distributed analysis
data staging
genomics
big data
hajautettu tallennus (distributed storage)
tiedon välivarastointi (data staging)
genomiikka (genomics)
massadata (big data)
Abstract (eng):This Master's Thesis describes a solution for storing large genomic data in a scalable, robust and secure way.
There are various constraints for the design, because the new solution is intended to replace an existing storage system that is already in production use by Biocomputing Platforms Ltd.

The primary demand for this solution arises from the growing size of data produced by genotyping devices and processes, and the growing practice of combining large genomic data sets for analysis.
In addition to scalability, security requirements and expectations are also tightening.

A new distributed storage system was designed to provide fast and location-transparent access to various storage back-ends, including some popular cloud storage services.

The solution scales up to hundreds of terabytes with conventional hardware, and much further when used in conjunction with other scalable storage systems.
Finally, other ways are presented for improving the design to reach petascale with conventional or virtualised hardware.
Abstract (fin, translated to English):This Master's thesis presents a solution for storing large volumes of genomic data in a scalable, reliable and secure way.
The system being designed is intended to replace the storage system currently used in the products of Biocomputing Platforms Oy, which imposes several requirements on the work.

The greatest need for a new solution arises from the growth in the volume of data produced by genotyping devices and from the increasingly common practice of combining large data sets obtained from different sources.
In addition to scalability, information security requirements are also tightening.

In this work, a new distributed data storage system was designed that provides fast and location-independent access to many kinds of storage back-ends, including some popular cloud storage services.

The solution scales to hundreds of terabytes on conventional hardware, and to considerably larger volumes when combined with certain external storage systems.
Finally, ways are presented to improve the design for storing petabyte-scale data without external systems.
ED:2014-08-03
INSSI record number: 49408