TY - GEN
T1 - A comparative analysis and study of multiview CNN models for joint object categorization and pose estimation
AU - Elhoseiny, Mohamed
AU - El-Gaaly, Tarek
AU - Bakry, Amr
AU - Elgammal, Ahmed
PY - 2016
Y1 - 2016
N2 - In the Object Recognition task, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects. With the rise of deep architectures, the prime focus has been on object category recognition. Deep learning methods have achieved wide success in this task. In contrast, object pose estimation using these approaches has received relatively less attention. In this work, we study how Convolutional Neural Network (CNN) architectures can be adapted to the task of simultaneous object recognition and pose estimation. We investigate and analyze the layers of various CNN models and extensively compare them with the goal of discovering how the layers of distributed representations within CNNs represent object pose information and how this conflicts with object category representations. We experiment extensively on two recent large and challenging multi-view datasets and achieve better results than the state of the art.
AB - In the Object Recognition task, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects. With the rise of deep architectures, the prime focus has been on object category recognition. Deep learning methods have achieved wide success in this task. In contrast, object pose estimation using these approaches has received relatively less attention. In this work, we study how Convolutional Neural Network (CNN) architectures can be adapted to the task of simultaneous object recognition and pose estimation. We investigate and analyze the layers of various CNN models and extensively compare them with the goal of discovering how the layers of distributed representations within CNNs represent object pose information and how this conflicts with object category representations. We experiment extensively on two recent large and challenging multi-view datasets and achieve better results than the state of the art.
UR - http://www.scopus.com/inward/record.url?scp=84998953482&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84998953482&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84998953482
T3 - 33rd International Conference on Machine Learning, ICML 2016
SP - 1402
EP - 1422
BT - 33rd International Conference on Machine Learning, ICML 2016
A2 - Balcan, Maria Florina
A2 - Weinberger, Kilian Q.
PB - International Machine Learning Society (IMLS)
T2 - 33rd International Conference on Machine Learning, ICML 2016
Y2 - 19 June 2016 through 24 June 2016
ER -