MS-COCO

Microsoft COCO: Common Objects in Context. We present a new dataset with the goal of advancing the state of the art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.


References in zbMATH (referenced in 49 articles)

Showing results 1 to 20 of 49.

  1. Avazov, Kuldoshbay; Abdusalomov, Akmalbek; Mukhiddinov, Mukhriddin; Baratov, Nodirbek; Makhmudov, Fazliddin; Cho, Young Im: An improvement for the automatic classification method for ultrasound images used on CNN (2022)
  2. Chrupała, Grzegorz: Visually grounded models of spoken language: a survey of datasets, architectures and evaluation techniques (2022)
  3. Iliadis, Dimitrios; De Baets, Bernard; Waegeman, Willem: Multi-target prediction for dummies using two-branch neural networks (2022)
  4. Bowden, Adam; Sirakov, Nikolay Metodiev: Active contour directed by the Poisson gradient vector field and edge tracking (2021)
  5. Cauchois, Maxime; Gupta, Suyash; Duchi, John C.: Knowing what you know: valid and validated confidence sets in multiclass and multilabel prediction (2021)
  6. Chen, Zhe; Zhang, Jing; Tao, Dacheng: Recursive context routing for object detection (2021)
  7. Kortylewski, Adam; Liu, Qing; Wang, Angtian; Sun, Yihong; Yuille, Alan: Compositional convolutional neural networks: a robust and interpretable model for object recognition under occlusion (2021)
  8. Li, Lingfeng; Luo, Shousheng; Tai, Xue-Cheng; Yang, Jiang: A new variational approach based on level-set function for convex hull problem with outliers (2021)
  9. Marcos Nieto, Orti Senderos, Oihana Otaegui: Boosting AI applications: Labeling format for complex datasets (2021) not zbMATH
  10. Mark Weber, Huiyu Wang, Siyuan Qiao, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen: DeepLab2: A TensorFlow Library for Deep Labeling (2021) arXiv
  11. Nie, Yan; Zhang, Taiping; Zhao, Linchang; Ma, Xindi; Tang, Yuanyan; Liu, Xiaoyu: Siamese pyramid residual module with local binary convolution network for single object tracking (2021)
  12. Northcutt, Curtis G.; Jiang, Lu; Chuang, Isaac L.: Confident learning: estimating uncertainty in dataset labels (2021)
  13. Shu, Xin; Cheng, Xin; Xu, Shubin; Chen, Yunfang; Ma, Tinghuai; Zhang, Wei: How to construct low-altitude aerial image datasets for deep learning (2021)
  14. Sven Kreiss, Lorenzo Bertoni, Alexandre Alahi: OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association (2021) arXiv
  15. Wangni, Jianqiao; Lin, Dahua; Liu, Ji; Daniilidis, Kostas; Shi, Jianbo: Towards statistically provable geometric 3D human pose recovery (2021)
  16. Wang, Zhengyang; Ji, Shuiwang: Smoothed dilated convolutions for improved dense prediction (2021)
  17. Xin Chen, Anqi Pang, Wei Yang, Yuexin Ma, Lan Xu, Jingyi Yu: SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos (2021) arXiv
  18. Yuan, Yuhui; Huang, Lang; Guo, Jianyuan; Zhang, Chao; Chen, Xilin; Wang, Jingdong: OCNet: object context for semantic segmentation (2021)
  19. Zejiang Shen, Ruochen Zhang, Melissa Dell, Benjamin Charles Germain Lee, Jacob Carlson, Weining Li: LayoutParser Toolkit Document Image (2021) arXiv
  20. Zhang, Susu; Ni, Jiancheng; Hou, Lijun; Zhou, Zili; Hou, Jie; Gao, Feng: Global-affine and local-specific generative adversarial network for semantic-guided image generation (2021)