Object detection

Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including image retrieval and video surveillance.

Uses

It is widely used in computer vision tasks such as face detection, face recognition, and video object co-segmentation. It is also used to track objects, for example a ball during a football match, the movement of a cricket bat, or a person in a video.

Concept

Every object class has its own special features that help in classifying it; for example, all circles are round. Object class detection uses these special features. For example, when looking for circles, shapes whose boundary points all lie at the same distance from a point (i.e. the center) are sought. Similarly, when looking for squares, shapes that have perpendicular sides at the corners and equal side lengths are needed. A similar approach is used for face identification, where eyes, nose, and lips can be found, along with features like skin color and the distance between the eyes.
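
The circle example can be made concrete with a minimal sketch. It assumes the outline of a candidate shape is already available as a list of (x, y) boundary points; the function name `looks_like_circle` and the 5% tolerance are illustrative choices, not part of any standard method.

```python
import math

def looks_like_circle(points, tolerance=0.05):
    """Return True if every point lies at roughly the same distance from the centroid."""
    if len(points) < 3:
        return False
    # Estimate the center as the centroid of the boundary points.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    mean_radius = sum(distances) / len(distances)
    if mean_radius == 0:
        return False
    # Accept only if each distance deviates from the mean radius by at most `tolerance`.
    return all(abs(d - mean_radius) <= tolerance * mean_radius for d in distances)

# 36 boundary points sampled from a circle of radius 5 centered at (2, 3).
circle = [
    (2 + 5 * math.cos(2 * math.pi * k / 36), 3 + 5 * math.sin(2 * math.pi * k / 36))
    for k in range(36)
]
# Corners and edge midpoints of a square with side length 2.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

print(looks_like_circle(circle))
print(looks_like_circle(square))
```

The square is rejected because its corners are farther from the centroid than its edge midpoints. A real detector would also need to find the boundary points in the image first, which this sketch leaves out.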

Methods

Methods for object detection generally fall into either machine learning-based approaches or deep learning-based approaches. For machine learning approaches, it is necessary to first define features, for example histograms of oriented gradients (HOG) [1], and then use a technique such as a support vector machine (SVM) to do the classification. Deep learning techniques, on the other hand, are able to do end-to-end object detection without specifically defined features, and are typically based on convolutional neural networks (CNN).
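
The two-stage machine learning pipeline (define features, then classify) can be illustrated with a much simplified sketch. The feature below is a single gradient-orientation histogram over a small patch, loosely in the spirit of the histogram features in [1] but without cells, blocks, or block normalization. The weights and bias are hypothetical stand-ins for what a trained linear SVM would supply; no training is performed here.

```python
import math

def orientation_histogram(patch, bins=4):
    """Hand-defined feature: gradient orientations weighted by magnitude, L1-normalized."""
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = patch[r][c + 1] - patch[r][c - 1]  # horizontal central difference
            gy = patch[r + 1][c] - patch[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi    # unsigned orientation in [0, pi)
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def linear_score(features, weights, bias):
    """Decision value of a linear classifier; a positive value means 'object present'."""
    return sum(w * f for w, f in zip(weights, features)) + bias

# Two 5x5 grayscale patches: one with a vertical edge, one with a horizontal edge.
vertical_edge = [[0, 0, 0, 9, 9] for _ in range(5)]
horizontal_edge = [[0] * 5, [0] * 5, [0] * 5, [9] * 5, [9] * 5]

# Hypothetical weights for a "vertical edge" class (assumed, not learned).
weights, bias = [1.0, 0.0, -1.0, 0.0], 0.0

for name, patch in [("vertical", vertical_edge), ("horizontal", horizontal_edge)]:
    score = linear_score(orientation_histogram(patch), weights, bias)
    print(name, "detected" if score > 0 else "rejected")
```

In a full detector the same feature and classifier would be applied to many windows across the image; a deep learning detector replaces both hand-written steps with learned network layers.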

  • Deep learning approaches:
    • Region Proposals (R-CNN [2], Fast R-CNN [3], Faster R-CNN [4])
    • Single Shot MultiBox Detector (SSD) [5]
    • You Only Look Once (YOLO) [6][7][8]
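
Detectors of the kinds listed above typically output many scored bounding boxes, several of which may cover the same object. A common post-processing step, not described in this article and differing in detail between detectors, is greedy non-maximum suppression based on intersection over union (IoU). The sketch below assumes boxes are given as (x1, y1, x2, y2) tuples; the function names and the 0.5 threshold are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps a kept one too much.

    `detections` is a list of (box, score) pairs.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept

# Two overlapping boxes around one object and a third box elsewhere.
detections = [
    ((10, 10, 50, 50), 0.9),
    ((12, 12, 52, 52), 0.8),
    ((100, 100, 140, 140), 0.7),
]
for box, score in non_max_suppression(detections):
    print(box, score)
```

Here the second box is suppressed because it overlaps the higher-scoring first box, while the third box is kept because it does not overlap either.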

References

  1. Dalal, Navneet (2005). "Histograms of oriented gradients for human detection" (PDF). Computer Vision and Pattern Recognition. 1.
  2. Girshick, Ross (2014). "Rich feature hierarchies for accurate object detection and semantic segmentation" (PDF). Proceedings of the IEEE conference on computer vision and pattern recognition.
  3. Girshick, Ross (2015). "Fast R-CNN" (PDF). Proceedings of the IEEE international conference on computer vision: 1440-1448.
  4. Ren, Shaoqing (2015). "Faster R-CNN" (PDF). Advances in neural information processing systems.
  5. Liu, Wei (October 2016). "SSD: Single shot multibox detector" (PDF). European conference on computer vision: 21-37.
  6. Redmon, Joseph (2016). "You only look once: Unified, real-time object detection". Proceedings of the IEEE conference on computer vision and pattern recognition.
  7. Redmon, Joseph (2017). "YOLO9000: better, faster, stronger" (PDF). arXiv.
  8. Redmon, Joseph (2018). "YOLOv3: An incremental improvement" (PDF). arXiv:1804.02767.
  • "Object Class Detection". Vision.eecs.ucf.edu. Retrieved 2013-10-09.
  • "ETHZ - Computer Vision Lab: Publications". Vision.ee.ethz.ch. Retrieved 2013-10-09.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.