Visual Question Answering: From Theory to Application (Advances in Computer Vision and Pattern Recognition) - Paperback

 
9789811909665: Visual Question Answering: From Theory to Application (Advances in Computer Vision and Pattern Recognition)

Synopsis

Visual Question Answering (VQA) usually combines visual inputs like image and video with a natural language question concerning the input and generates a natural language answer as the output. This is by nature a multi-disciplinary research problem, involving computer vision (CV), natural language processing (NLP), knowledge representation and reasoning (KR), etc.

Further, VQA is an ambitious undertaking, as it must overcome the challenges of general image understanding and the question-answering task, as well as the difficulties entailed by using large-scale databases with mixed-quality inputs. However, with the advent of deep learning (DL) and driven by the existence of advanced techniques in both CV and NLP and the availability of relevant large-scale datasets, we have recently seen enormous strides in VQA, with more systems and promising results emerging.

This book provides a comprehensive overview of VQA, covering fundamental theories, models, datasets, and promising future directions. Given its scope, it can be used as a textbook on computer vision and natural language processing, especially for researchers and students in the area of visual question answering. It also highlights the key models used in VQA.

"Synopsis" may belong to another edition of this book.

About the author

Dr. Qi Wu is Senior Lecturer at the University of Adelaide and Chief Investigator at the ARC Centre of Excellence for Robotic Vision. He is also Director of Vision-and-Language Methods at the Australian Institute for Machine Learning. Dr. Wu has been in the Computer Vision field for 10 years and he has a strong track record, having pioneered the field of Vision-and-Language, one of the most interesting and technically challenging areas of Computer Vision. This area, which has emerged over the last 5 years, represents the application of computer vision technology to problems that are closer to Artificial Intelligence. Dr. Wu has made breakthroughs in methods and conceptual understanding to advance the field and is recognised as an international leader in the discipline. Beyond publishing some of the seminal papers in the area, he has organised a series of workshops at CVPR, ICCV and ACL, and authored key benchmarks that define the field. Recently, he led a team that won second place in the VATEX Video Captioning Challenge and first place in both the TextVQA Challenge and the MedicalVQA Challenge. His achievements have been recognised with the Australian Academy of Science J G Russell Award in 2019, one of four awards to early-career researchers (ECRs) across Australia, and an NVIDIA Pioneer Research Award.

Dr. Peng Wang is Professor at the School of Computer Science, Northwestern Polytechnical University, China. He previously served at the School of Computer Science, University of Adelaide, for four years. His research interests include computer vision, machine learning, and artificial intelligence. 

Dr. Xin Wang is currently Assistant Professor at the Department of Computer Science and Technology, Tsinghua University. His research interests include cross-modal multimedia intelligence and inferable recommendations in social media. He has published several high-quality research papers at top conferences including ICML, KDD, WWW, SIGIR, ACM Multimedia, etc. In addition to being selected for the 2017 China Postdoctoral Innovative Talents Supporting Program, he received the ACM China Rising Star Award in 2020.

Dr. Xiaodong He is Deputy Managing Director of JD AI Research; Head of the Deep Learning, NLP and Speech Lab; and Technical Vice President of JD.com. He is also Affiliate Professor at the University of Washington (Seattle), where he serves on doctoral supervisory committees. His research interests are mainly in artificial intelligence areas including deep learning, natural language, computer vision, speech, information retrieval, and knowledge representation. He has published more than 100 papers in ACL, EMNLP, NAACL, CVPR, SIGIR, WWW, CIKM, NIPS, ICLR, ICASSP, Proc. IEEE, IEEE TASLP, IEEE SPM, and other venues. He has received several awards including the Outstanding Paper Award at ACL 2015. He is co-inventor of the DSSM, which is now broadly applied to language, vision, IR, and knowledge representation tasks. He also led the development of CaptionBot, the world's first image captioning cloud service, deployed in 2016. He and colleagues have won major AI challenges including the 2008 NIST MT Eval, IWSLT 2011, COCO Captioning Challenge 2015, and VQA 2017. His work has been widely integrated into influential software and services including Microsoft Image Caption Services, Bing & Ads, Seeing AI, Word, and PowerPoint. He has held editorial positions with several IEEE journals, served as Area Chair for NAACL-HLT 2015, and served on the organizing committees/program committees of major speech and language processing conferences. He is IEEE Fellow and Member of the ACL.

Wenwu Zhu is currently Professor in the Department of Computer Science and Technology at Tsinghua University and Vice Dean of National Research Center for Information Science and Technology. Prior to his current post, he was Senior Researcher and Research Manager at Microsoft Research Asia. He was Chief Scientist and Director at Intel Research China from 2004 to 2008. He worked at Bell Labs New Jersey as Member of Technical Staff during 1996–1999. He received his Ph.D. degree from New York University in 1996.

His current research interests are in the area of data-driven multimedia networking and multimedia intelligence. He has published over 350 refereed papers and is Inventor or Co-inventor of over 50 patents. He received eight Best Paper Awards, including ACM Multimedia 2012 and IEEE Transactions on Circuits and Systems for Video Technology in 2001 and 2019.

He served as EiC for IEEE Transactions on Multimedia (2017–2019). He serves as Chair of the steering committee for IEEE Transactions on Multimedia and as Associate EiC for IEEE Transactions on Circuits and Systems for Video Technology. He served as General Co-Chair for ACM Multimedia 2018 and ACM CIKM 2019. He is AAAS Fellow, IEEE Fellow, SPIE Fellow, and Member of The Academy of Europe (Academia Europaea).

Other popular editions of the same title

9789811909634: Visual Question Answering: From Theory to Application (Advances in Computer Vision and Pattern Recognition)

Featured edition

ISBN 10: 9811909636 ISBN 13: 9789811909634
Publisher: Springer, 2022
Hardcover

Search results for Visual Question Answering: From Theory to Application...

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: moluna, Greven, Germany

Seller rating: 5 out of 5 stars

Condition: New. Item ref. no.: 859350374

Buy new

EUR 89,99
Shipping: EUR 19,49
From Germany to Spain

Quantity available: more than 20

Qi Wu
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback
Print on demand

Bookseller: BuchWeltWeit Ludwig Meier e.K., Bergisch Gladbach, Germany

Seller rating: 5 out of 5 stars

Paperback. Condition: New. This item is printed on demand and takes 3-4 days longer. New item. 252 pp. English. Item ref. no.: 9789811909665

Buy new

EUR 106,99
Shipping: EUR 11,00
From Germany to Spain

Quantity available: 2

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
Published by Springer, 2023
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: Ria Christie Collections, Uxbridge, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. In. Item ref. no.: ria9789811909665_new

Buy new

EUR 116,27
Shipping: EUR 5,19
From the United Kingdom to Spain

Quantity available: more than 20

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
Published by Springer, 2023
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: Best Price, Torrance, CA, United States

Seller rating: 5 out of 5 stars

Condition: New. SUPER FAST SHIPPING. Item ref. no.: 9789811909665

Buy new

EUR 96,21
Shipping: EUR 25,59
From the United States to Spain

Quantity available: 2

Qi Wu
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: AHA-BUCH GmbH, Einbeck, Germany

Seller rating: 5 out of 5 stars

Paperback. Condition: New. New item, printed after ordering. Item ref. no.: 9789811909665

Buy new

EUR 109,94
Shipping: EUR 11,99
From Germany to Spain

Quantity available: 1

Qi Wu
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: buchversandmimpf2000, Emtmannsberg, BAYE, Germany

Seller rating: 5 out of 5 stars

Paperback. Condition: New. New item. Springer Verlag GmbH, Tiergartenstr. 17, 69121 Heidelberg. 252 pp. English. Item ref. no.: 9789811909665

Buy new

EUR 106,99
Shipping: EUR 35,00
From Germany to Spain

Quantity available: 2

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
Published by Springer, 2023
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: Books Puddle, New York, NY, United States

Seller rating: 4 out of 5 stars

Condition: New. pp. 252. Item ref. no.: 26398549886

Buy new

EUR 142,60
Shipping: EUR 9,82
From the United States to Spain

Quantity available: 4

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
Published by Springer, 2023
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback
Print on demand

Bookseller: Majestic Books, Hounslow, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. Print on Demand pp. 252. Item ref. no.: 397860001

Buy new

EUR 148,66
Shipping: EUR 10,23
From the United Kingdom to Spain

Quantity available: 4

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
Published by Springer-Nature New York Inc, 2023
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback

Bookseller: Revaluation Books, Exeter, United Kingdom

Seller rating: 5 out of 5 stars

Paperback. Condition: Brand New. 251 pages. 9.25x6.10x0.53 inches. In Stock. Item ref. no.: x-9811909660

Buy new

EUR 150,36
Shipping: EUR 11,56
From the United Kingdom to Spain

Quantity available: 2

Wu, Qi; Wang, Peng; Wang, Xin; He, Xiaodong; Zhu, Wenwu
Published by Springer, 2023
ISBN 10: 9811909660 ISBN 13: 9789811909665
New, paperback
Print on demand

Bookseller: Biblios, Frankfurt am Main, HESSE, Germany

Seller rating: 5 out of 5 stars

Condition: New. PRINT ON DEMAND pp. 252. Item ref. no.: 18398549876

Buy new

EUR 153,17
Shipping: EUR 14,50
From Germany to Spain

Quantity available: 4