Crowdsourcing for Speech Processing: Applications to Data Collection, Transcription and Assessment - Hardcover

Eskenazi, Maxine; Levow, Gina-Anne; Meng, Helen; Parent, Gabriel; Suendermann, David

 
9781118358696: Crowdsourcing for Speech Processing: Applications to Data Collection, Transcription and Assessment

Synopsis

Provides an insightful and practical introduction to crowdsourcing as a means of rapidly processing speech data

Intended for those who want to get started in the domain and learn how to set up a task, what interfaces are available, and how to assess the work, as well as for those who have already used crowdsourcing and want to create better tasks and obtain better assessments of the crowd's work. It will include screenshots showing examples of good and poor interfaces, along with case studies in speech processing tasks that go through the task creation process, review the options in the interface and in the choice of medium (MTurk or other), and explain the choices made.

  • Addresses important aspects of this new technique that should be mastered before attempting a crowdsourcing application.
  • Offers speech researchers the hope that they can spend much less time dealing with the data gathering/annotation bottleneck, leaving them to focus on the scientific issues. 
  • Readers will directly benefit from the book’s successful examples of how crowdsourcing was implemented for speech processing, discussions of interface and processing choices that worked and choices that didn’t, and guidelines on how to play and record speech over the internet, how to design tasks, and how to assess workers.

Essential reading for researchers and practitioners in speech processing research groups.

"Synopsis" may belong to another edition of this book.

About the Authors

Maxine Eskenazi, Carnegie Mellon University, USA
Dr. Eskenazi is Principal Systems Scientist at the Language Technologies Institute, Carnegie Mellon University, USA. She has authored over 100 scientific papers in the areas of computer assisted language learning and speech and spoken dialog systems. Her work has produced such systems as the Let's Go spoken dialog system and the REAP vocabulary tutor. She is also the founder and CTO of the Carnegie Speech Company.

Gina-Anne Levow, University of Washington, USA
Dr. Levow is currently an Assistant Professor in the Department of Linguistics, University of Washington, USA. Prior to joining the faculty at the University of Washington, she served on the faculty at the University of Chicago in the Department of Computer Science and as a Research Fellow at the University of Manchester, UK. She served on the Editorial Board of Computational Linguistics and as Associate Editor of ACM Transactions on Asian Language Processing.

Helen Meng, The Chinese University of Hong Kong, Hong Kong
Dr. Meng is Founder and Director of the Human-Computer Communications Laboratory at The Chinese University of Hong Kong, and is also the Founder and Co-Director of the Microsoft-CUHK Joint Laboratory for Human-Centric Computing and Interface Technologies, which was conferred the national status of the Ministry of Education of China (MoE) Key Laboratory in 2008. Prof. Meng also served as an Associate Dean (Research) of the Faculty of Engineering from 2006 to 2010. She serves as Editor-in-Chief of the IEEE Transactions on Audio, Speech and Language Processing.

Gabriel Parent, Amazon.com, USA
Gabriel Parent is a Software Development Engineer at Amazon.com, working on natural language problems. His main research focuses were human-computer interaction through spoken dialog systems and crowdsourcing.

David Suendermann, Baden-Wuerttemberg Cooperative State University, Germany
Dr. Suendermann is currently a full Professor of Computer Science at the Baden-Wuerttemberg Cooperative State University, Stuttgart, Germany. He is also the Principal Speech Scientist of SpeechCycle, New York, USA, which has been recognized by Deloitte as a "Technology Fast 500" company based on revenue growth. He has authored more than 70 publications and patents, including a book and six book chapters.

From the Back Cover

The concept of crowdsourcing is based on the observation that if a crowd of non-experts is asked for an opinion, the aggregate of their individual opinions will be very close to the true value. Tasks such as collecting speech, labelling it, assessing systems and carrying out studies on the speech data are natural candidates for crowdsourcing. This book is a detailed, hands-on and comprehensive reference for those who want to use crowdsourcing for speech applications. For readers ranging from those who have already used crowdsourcing and want to refine their methods to novices who have never used the technique, this book will provide a practical introduction to crowdsourcing as a means of rapidly processing speech data, with contributions from leading researchers in the field.

  • Explains how to collect and label speech, assess speech applications, and run perception studies using crowdsourcing.
  • Explains how to choose a crowdsourcing platform.
  • Considers the ethical and legal implications of performing crowdsourcing for speech processing.
  • Includes numerous real-life examples of how to implement crowdsourcing for various types of speech processing.
  • Offers several options for each type of task, enabling readers to choose the option that best fits their individual needs.
  • Provides an extensive overview of the literature on crowdsourcing for speech processing.

"About this title" may belong to another edition of this book.