
Region-based Object Detection and Classification using Faster R-CNN

Abhishek Mehta, Subhashchandra Desai, Ashish Chaturvedi

Abstract


With the advent of deep learning, machine learning systems are able to recognize and classify objects of interest in an image, and many advances have been made in object recognition and classification. Our research focuses on improving on the R-CNN, Fast R-CNN, and YOLO architectures. It uses a Region Proposal Network (RPN) to extract regions of interest from an image. The RPN outputs candidate regions, each with an objectness score. The proposed regions are then passed through RoI pooling for classification. We train Faster R-CNN on a custom data set of images. The trained network efficiently detects objects in images that contain multiple objects. The network requires a GPU with compute capability 3.0 or higher.
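The two stages named in the abstract, selecting region proposals by objectness score and pooling each region to a fixed size, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`select_proposals`, `roi_max_pool`), the thresholds, and the toy feature map are all assumptions made for illustration, and a real Faster R-CNN operates on multi-channel convolutional feature maps.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def select_proposals(boxes, scores, score_thresh=0.5, iou_thresh=0.7, top_n=300):
    """Keep high-objectness boxes, suppressing overlapping lower-scored ones.

    Hypothetical helper: thresholds are illustrative defaults, not values
    taken from the paper. Returns indices into `boxes`.
    """
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Greedy non-maximum suppression against boxes already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
        if len(keep) == top_n:
            break
    return keep


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one region of a 2D feature map to a fixed out_h x out_w grid.

    `roi` is (x1, y1, x2, y2) in feature-map cells, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for r in range(out_h):
        ys = y1 + (r * h) // out_h
        ye = y1 + -(-((r + 1) * h) // out_h)  # ceiling division
        row = []
        for c in range(out_w):
            xs = x1 + (c * w) // out_w
            xe = x1 + -(-((c + 1) * w) // out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye) for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# Toy data: four candidate boxes with objectness scores.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.2]
kept = select_proposals(boxes, scores, score_thresh=0.5, iou_thresh=0.5)

# Toy 4x4 single-channel feature map pooled to 2x2.
fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
pooled = roi_max_pool(fmap, (0, 0, 4, 4))
```

In this sketch the fixed-size output of `roi_max_pool` is what lets regions of different sizes feed the same classifier, which is the role RoI pooling plays in the pipeline the abstract describes.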


Keywords: Convolutional neural network, deep learning, Faster R-CNN, region proposal network




