A coupled encoder–decoder network for joint face detection and landmark localization

Lezi Wang, Xiang Yu, Thirimachos Bourlai, Dimitri Metaxas

Research output: Contribution to journal (Article)

Abstract

Face detection and landmark localization have been extensively investigated and are prerequisites for many face-related applications, such as face recognition and 3D face reconstruction. Most existing methods address only one of the two problems. In this paper, we propose a coupled encoder–decoder network that jointly detects faces and localizes facial key points. The encoder and decoder generate response maps for facial landmark localization. Moreover, we observe that the intermediate feature maps from the encoder and decoder represent facial regions, which motivates us to build a unified framework for multi-scale cascaded face detection by coupling these feature maps. Experiments on face detection using two public benchmarks show improved results compared with existing methods. They also demonstrate that face detection as a pre-processing step increases the robustness of face recognition. Finally, our experiments show that landmark localization accuracy is consistently better than the state of the art on three face-in-the-wild databases.
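
The abstract states that the network produces response maps for landmark localization. One common way to read coordinates from such maps is to take the peak location in each map. The following is a minimal sketch under that assumption; the function name `decode_landmarks` and the toy maps are hypothetical, and the paper's actual decoding step may differ.

```python
from typing import List, Tuple

# One response map: rows of per-pixel scores for a single landmark.
ResponseMap = List[List[float]]


def decode_landmarks(maps: List[ResponseMap]) -> List[Tuple[int, int, float]]:
    """Return (x, y, score) of the highest-scoring pixel in each map.

    Assumption: one map per landmark, with the peak marking its position.
    """
    landmarks = []
    for response in maps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(response):
            for x, score in enumerate(row):
                if score > best[2]:
                    best = (x, y, score)
        landmarks.append(best)
    return landmarks


# Toy 3x4 maps for two hypothetical landmarks (illustrative values only).
toy_maps = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.9, 0.2, 0.0],
     [0.0, 0.1, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.3],
     [0.0, 0.0, 0.1, 0.8],
     [0.0, 0.0, 0.0, 0.2]],
]
peaks = decode_landmarks(toy_maps)
```

Here `peaks` holds one `(x, y, score)` tuple per map, in pixel coordinates of the response map rather than of the input image.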

Original language: English (US)
Pages (from-to): 37-46
Number of pages: 10
Journal: Image and Vision Computing
Volume: 87
DOIs: 10.1016/j.imavis.2018.09.008
State: Published - Jul 1 2019

Fingerprint

Face recognition
Experiments
Processing

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Computer Vision and Pattern Recognition

Keywords

  • Convolutional neural network
  • Deep learning
  • Face detection
  • Landmark localization

Cite this

@article{2d0819eb44fc4d7f9d90a3806f53f8ef,
title = "A coupled encoder–decoder network for joint face detection and landmark localization",
abstract = "Face detection and landmark localization have been extensively investigated and are the prerequisite for many face related applications, such as face recognition and 3D face reconstruction. Most existing methods address only one of the two problems. In this paper, we propose a coupled encoder–decoder network to jointly detect faces and localize facial key points. The encoder and decoder generate response maps for facial landmark localization. Moreover, we observe that the intermediate feature maps from the encoder and decoder represent facial regions, which motivates us to build a unified framework for multi-scale cascaded face detection by coupling the feature maps. Experiments on face detection using two public benchmarks show improved results compared to the existing methods. They also demonstrate that face detection as a pre-processing step leads to increased robustness in face recognition. Finally, our experiments show that the landmark localization accuracy is consistently better than the state-of-the-art on three face-in-the-wild databases.",
keywords = "Convolutional neural network, Deep learning, Face detection, Landmark localization",
author = "Lezi Wang and Xiang Yu and Thirimachos Bourlai and Dimitri Metaxas",
year = "2019",
month = "7",
day = "1",
doi = "10.1016/j.imavis.2018.09.008",
language = "English (US)",
volume = "87",
pages = "37--46",
journal = "Image and Vision Computing",
issn = "0262-8856",
publisher = "Elsevier Limited",

}

A coupled encoder–decoder network for joint face detection and landmark localization. / Wang, Lezi; Yu, Xiang; Bourlai, Thirimachos; Metaxas, Dimitri.

In: Image and Vision Computing, Vol. 87, 01.07.2019, p. 37-46.

TY - JOUR

T1 - A coupled encoder–decoder network for joint face detection and landmark localization

AU - Wang, Lezi

AU - Yu, Xiang

AU - Bourlai, Thirimachos

AU - Metaxas, Dimitri

PY - 2019/7/1

Y1 - 2019/7/1

AB - Face detection and landmark localization have been extensively investigated and are the prerequisite for many face related applications, such as face recognition and 3D face reconstruction. Most existing methods address only one of the two problems. In this paper, we propose a coupled encoder–decoder network to jointly detect faces and localize facial key points. The encoder and decoder generate response maps for facial landmark localization. Moreover, we observe that the intermediate feature maps from the encoder and decoder represent facial regions, which motivates us to build a unified framework for multi-scale cascaded face detection by coupling the feature maps. Experiments on face detection using two public benchmarks show improved results compared to the existing methods. They also demonstrate that face detection as a pre-processing step leads to increased robustness in face recognition. Finally, our experiments show that the landmark localization accuracy is consistently better than the state-of-the-art on three face-in-the-wild databases.

KW - Convolutional neural network

KW - Deep learning

KW - Face detection

KW - Landmark localization

UR - http://www.scopus.com/inward/record.url?scp=85065568779&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85065568779&partnerID=8YFLogxK

U2 - 10.1016/j.imavis.2018.09.008

DO - 10.1016/j.imavis.2018.09.008

M3 - Article

VL - 87

SP - 37

EP - 46

JO - Image and Vision Computing

JF - Image and Vision Computing

SN - 0262-8856

ER -