TY - JOUR
T1 - Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
AU - Chang, Haonan
AU - Boyalakuntla, Kowndinya
AU - Lu, Shiyang
AU - Cai, Siwei
AU - Jing, Eric Pu
AU - Keskar, Shreesh
AU - Geng, Shijie
AU - Abbas, Adeeb
AU - Zhou, Lifeng
AU - Bekris, Kostas
AU - Boularias, Abdeslam
N1 - Publisher Copyright:
© 2023 Proceedings of Machine Learning Research. All Rights Reserved.
PY - 2023
Y1 - 2023
N2 - We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as “pick up a cup on a kitchen table” or “navigate to a sofa on which someone is sitting”. In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments using the ScanNet [1] dataset and a self-collected dataset, we demonstrate that our proposed approach significantly surpasses the performance of previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments. The code and dataset used for evaluation can be found at https://github.com/changhaonan/OVSG.
KW - Object Grounding
KW - Open-Vocabulary Semantics
KW - Scene Graph
UR - http://www.scopus.com/inward/record.url?scp=85174051892&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174051892&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85174051892
SN - 2640-3498
VL - 229
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 7th Conference on Robot Learning, CoRL 2023
Y2 - 6 November 2023 through 9 November 2023
ER -