Region-based Object Detection and Classification using Faster R-CNN
Abstract
With the advent of deep learning, machine learning systems are able to recognize and classify objects of interest in an image, and considerable progress has been made in the field of object recognition and classification. Our research work focuses on improving on the R-CNN, Fast R-CNN, and YOLO architectures. It uses a Region Proposal Network (RPN) to extract regions of interest from an image. The RPN outputs candidate regions, each with an objectness score. The proposed regions are then subjected to RoI pooling for classification. We train Faster R-CNN on a custom data set of images. Our trained network efficiently detects objects in images that contain multiple objects. Our network requires a GPU with compute capability 3.0 or higher.
Keywords: Convolutional neural network, deep learning, Faster R-CNN, region proposal network
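The two steps named in the abstract, ranking region proposals by objectness score and RoI pooling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `top_proposals` and `roi_max_pool` are hypothetical, the feature map is assumed to be a single-channel 2-D list, boxes are assumed to be already expressed in feature-map coordinates as half-open `(x1, y1, x2, y2)` ranges, and non-maximum suppression is omitted.

```python
from typing import List, Sequence, Tuple

# Assumed box convention: (x1, y1, x2, y2), half-open, in feature-map cells.
Box = Tuple[int, int, int, int]


def top_proposals(boxes: Sequence[Box], scores: Sequence[float], k: int) -> List[Box]:
    """Keep the k proposals with the highest objectness score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    return [boxes[i] for i in order[:k]]


def roi_max_pool(feature_map: Sequence[Sequence[float]], box: Box,
                 out_h: int, out_w: int) -> List[List[float]]:
    """Max-pool the region `box` into a fixed out_h x out_w grid.

    Assumes the box lies inside the feature map and is at least one
    cell high and wide. Bins may overlap when the region does not
    divide evenly into the output grid.
    """
    x1, y1, x2, y2 = box
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        r0 = y1 + (i * h) // out_h                      # floor of bin start
        r1 = y1 + ((i + 1) * h + out_h - 1) // out_h    # ceil of bin end
        row = []
        for j in range(out_w):
            c0 = x1 + (j * w) // out_w
            c1 = x1 + ((j + 1) * w + out_w - 1) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 4x4 feature map and three hypothetical proposals with objectness scores.
fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
boxes = [(0, 0, 4, 4), (1, 1, 4, 4), (0, 0, 2, 2)]
scores = [0.2, 0.9, 0.6]

kept = top_proposals(boxes, scores, k=2)
pooled = [roi_max_pool(fmap, b, 2, 2) for b in kept]
```

Because every region is pooled to the same 2 x 2 grid regardless of its original size, the classifier that follows can use fixed-size inputs, which is the purpose RoI pooling serves in Fast R-CNN and Faster R-CNN.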