FALA2010 Invited Keynotes

Heiga Zen. Toshiba Research Europe Ltd. (UK)

Title: “Fundamentals and recent advances in HMM-based speech synthesis”

PDF file with the presentation at FALA2010

Abstract: Statistical parametric speech synthesis based on HMMs has grown in popularity in recent years. In this talk, its system architecture is outlined, and then the basic techniques used in the system, including algorithms for speech parameter generation from HMMs, are described with simple examples. Its relation to the unit selection approach and recent improvements are summarized. Techniques developed for increasing flexibility and improving speech quality are also reviewed.

Speaker Bio: Heiga Zen received the Dr.Eng. degree in computer science and engineering from Nagoya Institute of Technology in 2006. He is currently a Research Engineer in the Speech Technology Group of Toshiba Research Europe Ltd., Cambridge Research Laboratory. He was an intern researcher at the ATR Spoken Language Translation Research Laboratories in 2003 and an intern/co-op researcher at the IBM T. J. Watson Research Center from 2004 to 2005. From April 2006 to July 2008, he was a postdoctoral research associate at the Nagoya Institute of Technology. He has been working on HMM-based speech synthesis for 9 years, since joining Prof. Tokuda’s research group in 2000. He was also the main developer and maintainer of HTS, one of the main developers of the Festival Speech Synthesis System, one of the main developers of SPTK, and one of the active contributors to the hidden Markov model toolkit (HTK). He has published over 10 journal papers and over 40 conference papers, and has received 5 paper awards.


Alex Acero. Microsoft Research (USA)

Title: “New Machine Learning approaches to Speech Recognition”

PDF file with the presentation at FALA2010

Abstract: In this talk I will describe some new approaches to speech recognition that leverage large amounts of data using techniques from information retrieval and machine learning.

Bio (picture and more at http://research.microsoft.com/en-us/people/alexac/)
Alex Acero received an M.S. degree from the Polytechnic University of Madrid, Madrid, Spain, in 1985, an M.S. degree from Rice University, Houston, TX, in 1987, and a Ph.D. degree from Carnegie Mellon University, Pittsburgh, PA, in 1990, all in Electrical Engineering. Dr. Acero worked in Apple Computer’s Advanced Technology Group in 1990-1991. In 1992, he joined Telefonica I+D, Madrid, Spain, as Manager of the speech technology group. Since 1994 he has been with Microsoft Research, Redmond, WA, where he is presently a Research Area Manager directing an organization of 70 engineers conducting research in audio, speech, multimedia, communication, natural language, and information retrieval. He is also an affiliate Professor of Electrical Engineering at the University of Washington, Seattle.

Dr. Acero is the author of the books "Acoustical and Environmental Robustness in Automatic Speech Recognition" (Kluwer, 1993) and "Spoken Language Processing" (Prentice Hall, 2001), and has written invited chapters in 4 edited books and 200 technical papers. He holds 78 US patents. Dr. Acero is a Fellow of IEEE. He has served the IEEE Signal Processing Society as Vice President Technical Directions (2007-2009), Director Industrial Relations (2009-2011), 2006 Distinguished Lecturer, member of the Board of Governors (2004-2005), Associate Editor for IEEE Signal Processing Letters (2003-2005) and IEEE Transactions on Audio, Speech, and Language Processing (2005-2007), and member of the editorial board of IEEE Journal of Selected Topics in Signal Processing (2006-2008) and IEEE Signal Processing Magazine (2008-2010). He also served as member (1996-2000) and Chair (2000-2002) of the Speech Technical Committee of the IEEE Signal Processing Society. He was Publications Chair of ICASSP98, Sponsorship Chair of the 1999 IEEE Workshop on Automatic Speech Recognition and Understanding, and General Co-Chair of the 2001 IEEE Workshop on Automatic Speech Recognition and Understanding. Since 2004, Dr. Acero, along with co-authors Drs. Huang and Hon, has been using proceeds from their textbook “Spoken Language Processing” to fund the “IEEE Spoken Language Processing Student Travel Grant” for the best ICASSP student papers in the speech area. Dr. Acero has also served as a member of the editorial board of Computer Speech and Language and as a member of the Carnegie Mellon University Dean’s Leadership Council for the College of Engineering.



Bill Byrne. University of Cambridge (UK)

Title: “Hierarchical phrase-based statistical machine translation with weighted finite state transducers”

PDF file with the presentation at FALA2010

Abstract: I will present an introduction to and review of recent developments in statistical machine translation that exploit weighted finite state transducers to implement a variety of search and estimation algorithms. The presentation will describe work done at the University of Cambridge and the University of Vigo by Adrià de Gispert, Gonzalo Iglesias, Graeme Blackwood, and Jamie Brunning. The focus will be mainly on translation, but the approaches described are general and are also applicable to other problems in speech and language processing.

Speaker Bio: Bill Byrne is a Reader in Information Engineering in the Department of Engineering, University of Cambridge. His research is in statistical modelling techniques for speech and language processing, and he has worked on a variety of search and estimation algorithms for speech recognition, speech synthesis, and statistical machine translation. Current research interests include cross-lingual acoustic modelling for speech synthesis, weighted finite state transducers for hierarchical and syntactic phrase-based translation, and the use of natural language generation in statistical machine translation. He has published more than 100 refereed journal articles and conference papers and has an extensive history of editorial and professional service. He has received research funding from NSF (USA), DARPA (USA), Microsoft, and Google, and he is currently the coordinator of the ICT-FP7 project FAUST (faust-fp7.eu) on interactive statistical machine translation. He came to Cambridge in 2004 as a Lecturer in Speech Processing, having been Research Associate Professor at the Johns Hopkins University Center for Language and Speech Processing (USA). He received his Ph.D. in Electrical Engineering from the University of Maryland, College Park, and he is a Fellow of Clare College, Cambridge.

