Learning object localization and 6D pose estimation from simulation and weakly labeled real images

Jean Philippe Mercier, Chaitanya Mitash, Philippe Giguere, Abdeslam Boularias

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Accurate pose estimation is often required for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf holding multiple objects. Deep learning approaches to this task typically require a large amount of training data, but obtaining precise ground-truth 6-degrees-of-freedom (6D) poses can be prohibitively expensive. This work therefore proposes an architecture and a training process that address this issue. More precisely, we present a weak object detector that enables localizing objects and estimating their 6D poses in cluttered and occluded scenes. To minimize the human labor required for annotations, the proposed detector is trained with a combination of synthetic images and a few weakly annotated real images (as few as 10 images per object), for which a human provides only a list of the objects present in each image (no time-consuming annotations such as bounding boxes, segmentation masks, or object poses). To close the gap between real and synthetic images, we use multiple domain classifiers trained adversarially. During the inference phase, the resulting class-specific heatmaps of the weak detector are used to guide the search for the 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluate our model on classification and localization in unsupervised and semi-supervised settings. The results indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
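
The two ideas in the abstract that are easiest to make concrete are the weak, image-level label (only a list of the objects present) and the use of class-specific heatmaps to decide where a pose search should start. The sketch below is a minimal illustration of those two ideas in plain Python and is not the authors' implementation: the function names, the `min_score` threshold, the toy heatmaps, and the class names ("mug", "box", "glue") are all assumptions made for this example. The network that would produce the heatmaps, the adversarial domain classifiers, and the pose search itself are not shown.

```python
from typing import Dict, List, Tuple

Heatmap = List[List[float]]  # one 2D activation map per object class


def image_level_label(objects_present: List[str], all_classes: List[str]) -> List[int]:
    """Encode a weak annotation as a multi-hot vector.

    The annotator only lists which objects appear in the image, so the
    label carries no boxes, masks, or poses (hypothetical helper).
    """
    present = set(objects_present)
    return [1 if name in present else 0 for name in all_classes]


def heatmap_peak(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (row, col, score) of the strongest activation in a heatmap."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (r, c, score)
    return best


def search_seeds(heatmaps: Dict[str, Heatmap], min_score: float = 0.5) -> Dict[str, Tuple[int, int]]:
    """Pick one seed pixel per class whose peak activation reaches min_score.

    A downstream pose search could be restricted to a region around each
    seed; min_score is an assumed threshold, not a value from the paper.
    """
    seeds: Dict[str, Tuple[int, int]] = {}
    for name, heatmap in heatmaps.items():
        r, c, score = heatmap_peak(heatmap)
        if score >= min_score:
            seeds[name] = (r, c)
    return seeds


if __name__ == "__main__":
    classes = ["mug", "box", "glue"]
    print(image_level_label(["glue", "mug"], classes))
    toy_heatmaps = {
        "mug": [[0.1, 0.2], [0.9, 0.3]],
        "box": [[0.0, 0.1], [0.2, 0.1]],
    }
    print(search_seeds(toy_heatmaps))
```

In this toy setup only the "mug" heatmap has a peak above the threshold, so it is the only class that yields a seed location; a class whose heatmap stays weak everywhere is simply skipped.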

Original language: English (US)
Title of host publication: 2019 International Conference on Robotics and Automation, ICRA 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3500-3506
Number of pages: 7
ISBN (Electronic): 9781538660263
DOI: https://doi.org/10.1109/ICRA.2019.8794112
State: Published - May 1 2019
Event: 2019 International Conference on Robotics and Automation, ICRA 2019 - Montreal, Canada
Duration: May 20 2019 - May 24 2019

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
Volume: 2019-May
ISSN (Print): 1050-4729

Conference

Conference: 2019 International Conference on Robotics and Automation, ICRA 2019
Country: Canada
City: Montreal
Period: 5/20/19 - 5/24/19

Fingerprint

Detectors
Robotics
Computer vision
Masks
Classifiers
Personnel
Deep learning

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering

Cite this

Mercier, J. P., Mitash, C., Giguere, P., & Boularias, A. (2019). Learning object localization and 6D pose estimation from simulation and weakly labeled real images. In 2019 International Conference on Robotics and Automation, ICRA 2019 (pp. 3500-3506). [8794112] (Proceedings - IEEE International Conference on Robotics and Automation; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICRA.2019.8794112
