Image-to-Video Retrieval by ResNet50

Authors

  • Chomtip Pornpanomchai, Faculty of Information and Communication Technology, Mahidol University, Phuthamonthon 4 Road, Salaya, Nakhorn Pathom, Thailand, 73170

DOI:

https://doi.org/10.15379/ijmst.v10i3.3411

Keywords:

Confusion Matrix, Convolutional Neural Networks, Image-to-Video Retrieval, Mean Average Precision (mAP), ResNet50

Abstract

The objective of this research is to create a computer system that can retrieve a video clip using only a single image. The developed system is called the “Image-to-Video Retrieval System (I2VRS)”. The system employs the convolutional neural network (CNN) “ResNet50”, which is available as a toolbox in MATLAB, to retrieve video clips from the dataset. ResNet50 is one of the more powerful CNNs for image recognition. The I2VRS uses its own dataset, called the I2VRS dataset, which consists of 101 video clips, each containing 1,000 video frames. Each video clip is about 60 s long and is stored in the .MP4 file format. The system is also tested on 100 images that are not part of the training dataset; these were taken directly with a mobile phone in the .JPEG file format. The mean average precision (mAP) of the system is 0.9825, with a training time of 5,668.7 s. The average access time to retrieve a video clip is 1.5726 s per image.
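As an illustration of the mAP figure reported above, the following is a minimal sketch of how mean average precision can be computed when every query image has exactly one matching video clip. The abstract does not state the exact evaluation protocol, so the single-relevant-clip assumption, the function names, and the toy rankings are all hypothetical; only the dataset sizes (101 clips, 1,000 frames each) come from the abstract.

```python
from typing import Sequence


def average_precision(ranked_ids: Sequence[int], relevant_id: int) -> float:
    """AP for a query with exactly one relevant clip (assumption).

    With a single relevant item, AP reduces to 1 / rank of that item,
    or 0.0 if the item does not appear in the ranking.
    """
    for rank, clip_id in enumerate(ranked_ids, start=1):
        if clip_id == relevant_id:
            return 1.0 / rank
    return 0.0


def mean_average_precision(
    rankings: Sequence[Sequence[int]], relevant_ids: Sequence[int]
) -> float:
    """Mean of the per-query AP values over all query images."""
    if len(rankings) != len(relevant_ids):
        raise ValueError("one relevant clip id is required per ranking")
    if not rankings:
        raise ValueError("at least one query is required")
    total = sum(average_precision(r, t) for r, t in zip(rankings, relevant_ids))
    return total / len(rankings)


# Dataset size stated in the abstract: 101 clips with 1,000 frames each.
NUM_CLIPS = 101
FRAMES_PER_CLIP = 1_000
TOTAL_FRAMES = NUM_CLIPS * FRAMES_PER_CLIP

# Toy data (hypothetical): three query images, each with a ranked list of clip ids.
rankings = [[3, 7, 1], [5, 2, 9], [4, 8, 6]]
relevant = [3, 2, 6]  # correct clip found at rank 1, 2, and 3 respectively
score = mean_average_precision(rankings, relevant)
```

In this toy case the per-query AP values are 1, 1/2, and 1/3, so the mean is 11/18. A system that ranks the correct clip first for nearly every query would score close to 1.0 under this definition.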

Published

2023-08-18

How to Cite

[1]
C. Pornpanomchai, “Image-to-Video Retrieval by ResNet50”, ijmst, vol. 10, no. 3, pp. 3604-3615, Aug. 2023.