
The Decade of Machines that Understand Speech

Schrödinger-Saal
Plenary / Panel
German and English language

Alex Acero

THE DECADE OF MACHINES THAT UNDERSTAND SPEECH

Speech recognition has been an active area of research for the last 40 years. While it is starting to be used in some commercial applications, it is still far from the Star Trek computer we all want. Many of the predictions in science fiction movies like 2001: A Space Odyssey have proved correct, but not the prediction of an intelligent computer that talks. In this talk I will give a brief historical overview and then describe some of the challenges this technology faces. Demos that illustrate the state of the art will be provided. Finally, I'll describe opportunities for speech technology during this decade.

Artificial Intelligence is the set of disciplines that tackle problems that humans find easy to solve but machines find hard. Speech recognition is the holy grail of artificial intelligence. Many users are perplexed that computers can beat a chess grandmaster yet cannot do something as simple as recognize speech reliably. This mismatch in expectations has caused many problems in the field.

Humans do much better than machines at recognizing speech because they don't simply transcribe the words; they also understand the message, and can therefore use context to guess a missing word (lost, perhaps, to background noise or a lack of clarity on the speaker's part). Understanding and transcription often go hand in hand for humans, yet this is not the case with computers, which have a very limited capacity for understanding. Much of the work that will happen this decade to break this status quo will have to do with improving this context model by adding domain-independent knowledge as well as personalization.

The cocktail party effect refers to the ability of humans to follow one conversation when several are taking place simultaneously. This is not possible with today's speech recognition technology. Scene analysis should take the incoming signal and interpret it as the sum of the two signals that has the highest likelihood, and a more powerful spectral analysis is needed for this to happen. In addition, the context model will be needed to break up the signal into two or more independent signals. This poses tremendous computational and algorithmic challenges that will need to be resolved before we can successfully talk to our smartphones in the train station or cafeteria.

Speech recognition works reasonably well when a speaker trains the system and articulates carefully. When the user speaks in a more spontaneous manner, the error rate of recognition systems increases to the point of making them useless. A new paradigm is needed to better model this spontaneous style.


Alejandro ACERO

Senior Researcher and Manager, Speech Technology Group

Before joining Microsoft in 1994, I worked in the speech groups of Apple Computer and Telefonica Investigacion y Desarrollo. I received a Ph.D. from Carnegie Mellon University in 1990, a Master's from Rice University in 1987 and an engineering degree from the Universidad Politecnica de Madrid in 1985, all in Electrical Engineering. I'm also an affiliate Professor of Electrical Engineering at the University of Washington.
Research interests:
Speech Recognition: robustness to noise, rapid adaptation, acoustic modeling, signal processing.
Spoken Language Systems: rapid prototyping of speech understanding systems.
Speech Synthesis: automatically trained concatenative synthesis and distribution-based synthesis.

Dr. Peter F. KROGH

Dean emeritus and distinguished Professor of International Affairs, Georgetown University, Washington, D.C.

1958-1960 Trainee and Acting Assistant Branch Manager, The New England Merchants Bank, Boston
1961-1962 Instructor in Government, Tufts University
1962-1967 Assistant Dean, Fletcher School of Law and Diplomacy, Tufts University
1963-1967 Host, television interview program, "Backgrounds" - WGBH-TV, Boston
1965 Visiting Scholar, The Brookings Institute
1967-1968 White House Fellow, Special Assistant to the Secretary of State
1968-1970 Associate Dean, Fletcher School of Law and Diplomacy, Tufts University
1970-1995 Dean and Professor of International Affairs, School of Foreign Service
1982-1988 Moderator, weekly PBS television program on foreign affairs "American Interests"
1988-2005 Moderator, PBS television foreign affairs series: "Great Decisions"
Studied Arts in Law and Diplomacy and Philosophy at Tufts University

Technologiegespräche


21.08.2003

11:00 - 12:15 Opening Plenary
11:15 - 12:00 A Time of Change – Change as Opportunity Plenary
12:00 - 12:45 50 Years of Schrödinger's Reflections on Life and Living Plenary
13:00 - 14:15 Location Strategies for Know-how Intensive Industries Plenary
14:15 - 15:15 Medical Technology and Preventive Medicine Plenary
15:15 - 16:30 The Future of European Research – New Instruments and Resources Plenary
18:00 - 18:45 The Living Clock Plenary
18:45 - 19:30 The Devices of Wonder – the Science of Devices of Wonder Plenary

22.08.2003

07:00 - 15:00 Working Group 5: Micro- and Nanotechnologies as a Driver of Innovation Breakout
07:00 - 15:00 Working Group 4: Kyoto and CO2 – Technology Driver and/or Reason to Relocate? Breakout
07:00 - 15:00 Working Group 7: New Mobility – New Partnerships for the Western Balkan Countries Breakout
07:00 - 12:00 Off Alpbach Plenary
07:00 - 15:00 Working Group 8: Medical Technology and Preventive Medicine – Financing and Organisation Breakout
07:00 - 15:00 Working Group 6: Brain Gain, Brain Drain – Future Networks Austria – USA Breakout
07:00 - 15:00 Working Group 2: R&D Infrastructure – a Location Strategy for Big-City Enterprises Breakout
07:00 - 15:00 Working Group 9: Digitalisation of Communication – "Your Personal Radio and Television Programme" Breakout
07:00 - 15:00 Working Group 1: Risk Breakout
07:00 - 15:00 Working Group 3: Utilities and Infrastructure – Backbone of Industrialised Countries Breakout
18:00 - 18:45 The Decade of Machines that Understand Speech Plenary
18:45 - 19:30 Technology and Know-how Management in and for Intelligence Services Plenary

23.08.2003

07:00 - 08:00 The Location of Science Plenary
08:00 - 09:00 Reflections on the Technologiegespräche 2003 – Summary of "Junior Alpbach" Plenary
09:30 - 10:15 Cosmic Background Radiation Plenary
10:15 - 11:00 Architecture for Science – the New Architecture of Science Plenary