Action Recognition (动作识别): Step-by-step Recognizing Actions with Python and Recurrent Neural Network (一步一步地使用 Python 以及循环神经网络对动作进行识别) (Computer Vision and Machine Learning) - Softcover

Magic, Mark

9781095718124: Action Recognition (动作识别): Step-by-step Recognizing Actions with Python and Recurrent Neural Network (一步一步地使用 Python 以及循环神经网络对动作进行识别) (Computer Vision and Machine Learning)

Softcover

ISBN 10: 1095718126 ISBN 13: 9781095718124

Publisher: Independently published, 2019

Research fields: Computer Vision and Machine Learning.
Book topic: Action recognition from videos.
Recognition tool: Recurrent Neural Network (RNN) with an LSTM (Long Short-Term Memory) layer and a fully connected layer.
Programming language: Python, implemented step by step in Jupyter Notebook.
Major steps: Building a network, training the network, testing the network, and comparing the network with an SVM (Support Vector Machine) classifier.
Processing units used to run the code: CPU and GPU (on Google Colaboratory).
Image feature extraction tool: Pretrained VGG16 network.
Dataset: UCF101 (the first 15 actions, 2010 videos).
Main results: On the testing data, the highest prediction accuracy of the RNN is 86.97%, slightly higher than that of the SVM classifier (86.09%).

Detailed description: A Recurrent Neural Network (RNN) is a great tool for video action recognition. This book builds an RNN with an LSTM (Long Short-Term Memory) layer and a fully connected layer to recognize actions in videos. The RNN was trained and evaluated on VGG16 features saved in .mat files. The features were extracted from images with a modified pretrained VGG16 network, and the images were converted from videos in the UCF101 dataset, which contains 13,320 videos covering 101 different actions. Please note that only the first 15 actions in this dataset were used for recognition. The code was implemented step by step with Python in Jupyter Notebook and can be executed on both CPUs and GPUs; free GPUs on Google Colaboratory were used as the hardware accelerator for most of the calculations.

To reach a higher testing accuracy, the architecture of the network was adjusted, and the parameters of the network and its optimizer were fine-tuned. For comparison only, an SVM (Support Vector Machine) classifier was also trained and tested. For the first 15 actions in the UCF101 dataset, the highest prediction accuracy of the RNN on the testing data is 86.97%, slightly higher than that of the SVM classifier (86.09%). In conclusion, the RNN and the SVM classifier perform approximately the same on the task in this book, which is a little embarrassing. However, RNNs do have their own advantages in many other cases in Computer Vision and Machine Learning, and the implementation in this book can serve as an introduction to the topic, throwing out a minnow to catch a whale.
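For readers who want a feel for the kind of network the synopsis describes, the following is a minimal sketch of an LSTM layer followed by a fully connected softmax layer, written in plain NumPy. It is not the book's code: the listing does not say which deep learning framework the book uses, and the feature dimension (32), hidden size (16), number of frames (10), random weights, gate ordering, and the names `lstm_classify` and `init_params` are all assumptions made for illustration. Only the 15 output classes come from the synopsis. The sketch shows the forward pass only; training is omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(feat_dim, hidden, num_classes, rng):
    """Create small random weights (hypothetical initialization, untrained)."""
    scale = 0.1
    W = rng.normal(0.0, scale, (feat_dim, 4 * hidden))  # input-to-gates weights
    U = rng.normal(0.0, scale, (hidden, 4 * hidden))    # hidden-to-gates weights
    b = np.zeros(4 * hidden)                            # gate biases
    W_fc = rng.normal(0.0, scale, (hidden, num_classes))
    b_fc = np.zeros(num_classes)
    return W, U, b, W_fc, b_fc


def lstm_classify(frames, params):
    """Run one LSTM layer over per-frame features, then a fully connected layer.

    frames: array of shape (T, D), one feature vector per video frame.
    Returns class probabilities of shape (num_classes,).
    """
    W, U, b, W_fc, b_fc = params
    hidden = U.shape[0]
    h = np.zeros(hidden)  # hidden state
    c = np.zeros(hidden)  # cell state
    for x in frames:
        z = x @ W + h @ U + b                 # all four gates at once
        i, f, o, g = np.split(z, 4)           # assumed gate order: input, forget, output, candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                     # update the cell state
        h = o * np.tanh(c)                    # update the hidden state
    logits = h @ W_fc + b_fc                  # fully connected layer on the last hidden state
    logits = logits - logits.max()            # stabilize the softmax numerically
    probs = np.exp(logits)
    return probs / probs.sum()


rng = np.random.default_rng(0)
params = init_params(feat_dim=32, hidden=16, num_classes=15, rng=rng)
frames = rng.normal(size=(10, 32))            # stand-in for per-frame VGG16 features
probs = lstm_classify(frames, params)
predicted_class = int(np.argmax(probs))
```

In a real pipeline the random `frames` array would be replaced by feature vectors extracted from video frames, and the weights would be learned by a framework's optimizer rather than drawn at random.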

"Synopsis" may belong to another edition of this book.