Author: | Khakipoor, Banafsheh |
Title: | Integrated data analysis pipeline for whole human genome transcription factor binding sites prediction |
Publication type: | Master's thesis |
Publication year: | 2015 |
Pages: | 36 p. + app. 1 |
Language: | eng |
Department/School: | Perustieteiden korkeakoulu (School of Science) |
Main subject: | Bioinformatics (T3012) |
Supervisor: | Lähdesmäki, Harri |
Instructor: | Lähdesmäki, Harri |
Electronic version URL: | http://urn.fi/URN:NBN:fi:aalto-201506303593 |
Location: | P1 Ark Aalto 2892 | Archive |
Keywords: | transcription factor PWM TRANSFAC JASPAR SELEX PBM |
Abstract (eng): | Transcription factors (TFs) have a central role in regulating gene expression by binding to regulatory regions in DNA. The position weight matrix (PWM) model is the most commonly used model for representing and predicting TF binding sites. Consequently, several studies have been done on predicting TF binding sites using PWMs, and many databases have been created containing large numbers of PWMs. However, these studies require the user to search for binding sites for each PWM separately, making it difficult to get a general view of binding predictions for many PWMs simultaneously. In response to this need, this thesis project evaluates both individual PWMs and groups of PWMs and creates an effortless method to analyze and visualize the desired set of PWMs together, making it easier for biologists to analyze large amounts of data in a short period of time. For this purpose, we used bioinformatics methods to detect putative TF binding sites in the human genome and make them available online via the UCSC genome browser. Still, the sheer amount of data in PWM databases required a more efficient method to summarize TF binding predictions. Hence, we used PWM similarity measures and clustering algorithms to group PWMs together and to create one integrated database from four popular PWM databases: SELEX, TRANSFAC, UniPROBE, and JASPAR. All results are made publicly available to the research community via the UCSC genome browser. |
ED: | 2015-08-16 |
INSSI record number: 52009