Author:Räsänen, Okko
Title:Speech Segmentation and Clustering Methods for a New Speech Recognition Architecture
Puheen segmentointi ja klusterointi uutta puheentunnistimen arkkitehtuuria varten
Publication type:Master's thesis
Publication year:2007
Pages:vii + 87 p. + 2 app.
Language:eng
Department/School:Sähkö- ja tietoliikennetekniikan osasto
Main subject:Akustiikka ja äänenkäsittelytekniikka   (S-89)
Supervisor:Laine, Unto K.
Instructor:
Electronic version URL: http://urn.fi/urn:nbn:fi:tkk-010123
OEVS:
Electronic archive copy is available via Aalto Thesis Database.
Location:P1 Ark S80     | Archive
Keywords:speech segmentation
speech clustering
data classification
feature extraction
speech perception
pattern recognition
bottom-up processing
top-down processing
puheen segmentointi
puheen klusterointi
äänimateriaalin luokittelu
piirteistys
hahmontunnistus
puheen havaitseminen
bottom-up prosessointi
top-down prosessointi
Abstract (eng):To reduce the gap between the performance of traditional speech recognition systems and human speech recognition skills, a new architecture is required.
A system that is capable of incremental learning offers one solution to this problem.

This thesis introduces a bottom-up approach for such a speech processing system, consisting of a novel blind speech segmentation algorithm, a segmental feature extraction methodology, and data classification by incremental clustering.
All methods were evaluated in extensive experiments with a broad range of test material, and the evaluation methodology itself was also scrutinized.
The segmentation algorithm achieved above-standard results compared to those reported in the current literature on blind segmentation.
Possibilities for follow-up research of memory structures and intelligent top-down feedback in speech processing are also outlined.
Abstract (fin):Perinteiset automaattiset puheentunnistusmenetelmät eivät pärjää suorituskyvyssä ihmisen puheenhavaintokyvylle.
Voidaksemme kuroa tämän eron umpeen, on kehitettävä täysin uudentyyppisiä arkkitehtuureja puheentunnistusta varten.
Puhetta ja kieltä itsestään ihmisen lailla oppiva järjestelmä on yksi tällainen vaihtoehto.

Tämä diplomityö esittelee erään lähtökohdan oppivalle järjestelmälle, koostuen uudenlaisesta sokeasta puheen segmentointialgoritmista, segmenttien piirteistyksestä, sekä menetelmistä vähittäiselle puhedatan luokittelulle klusteroinnin avulla.
Kaikki metodit arvioitiin kattavilla kokeilla, ja itse arviointimenetelmien luonteeseen kiinnitettiin huomiota.
Segmentoinnissa saavutettiin alan kirjallisuuteen nähden hyvät tulokset.
Järjestelmän mahdollisia jatkokehityssuuntauksia on hahmoteltu muun muassa mahdollisten muistiarkkitehtuurien ja älykkään top-down-palautteen osalta.
ED:2007-12-19
INSSI record number: 35025