Defeating hidden audio channel attacks on voice assistants via audio-induced surface vibrations

Chen Wang, S. Abhishek Anand, Jian Liu, Payton Walker, Yingying Chen, Nitesh Saxena

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Scopus citations

Abstract

Voice access technologies are widely adopted in mobile devices and voice assistant systems as a convenient way of user interaction. Recent studies have demonstrated a potentially serious vulnerability of the existing voice interfaces on these systems to “hidden voice commands”. This attack uses synthetically rendered adversarial sounds embedded within a voice command to trick the speech recognition process into executing malicious commands, without being noticed by legitimate users. In this paper, we employ low-cost motion sensors, in a novel way, to detect these hidden voice commands. In particular, our proposed system extracts and examines the unique audio signatures of the issued voice commands in the vibration domain. We show that such signatures of normal commands vs. synthetic hidden voice commands are distinctive, leading to the detection of the attacks. The proposed system, which benefits from a speaker-motion sensor setup, can be easily deployed on smartphones by reusing existing on-board motion sensors or utilizing a cloud service that provides the relevant setup environment. The system is based on the premise that while the crafted audio features of the hidden voice commands may fool an authentication system in the audio domain, their unique audio-induced surface vibrations captured by the motion sensor are hard to forge. Our proposed system creates a harder challenge for the attacker as now it has to forge the acoustic features in both the audio and vibration domains, simultaneously. We extract the time and frequency domain statistical features, and the acoustic features (e.g., chroma vectors and MFCCs) from the motion sensor data and use learning-based methods for uniquely determining both normal commands and hidden voice commands. The results show that our system can detect hidden voice commands vs. normal commands with 99.9% accuracy by simply using the low-cost motion sensors that have very low sampling frequencies.
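The feature-extraction and classification step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`extract_features`, `toy_window`, `build_dataset`), the 200 Hz sampling rate, the particular statistics chosen, the synthetic signals, and the nearest-centroid classifier, which merely stands in for the unspecified "learning-based methods". The chroma and MFCC features mentioned in the abstract are omitted for brevity.

```python
import numpy as np

def extract_features(signal, fs):
    """Return 6 time/frequency-domain statistics for one motion-sensor window."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the constant (gravity/DC) offset
    # Time-domain statistics
    std = x.std()
    peak = np.max(np.abs(x))
    zcr = np.mean(np.abs(np.diff(np.signbit(x).astype(int))))  # zero-crossing rate
    # Frequency-domain statistics from the power spectrum
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    total = power.sum() + 1e-12
    centroid = np.sum(freqs * power) / total
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * power) / total)
    dominant = freqs[np.argmax(power)]
    return np.array([std, peak, zcr, centroid, spread, dominant])

class NearestCentroid:
    """Stand-in classifier (assumption): assign the label of the closest class mean."""
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.mu_ = X.mean(axis=0)
        self.sigma_ = X.std(axis=0) + 1e-12
        Z = (X - self.mu_) / self.sigma_  # standardize each feature
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack([Z[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X):
        Z = (np.asarray(X, dtype=float) - self.mu_) / self.sigma_
        dist = np.linalg.norm(Z[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[np.argmin(dist, axis=1)]

def toy_window(kind, rng, fs=200, seconds=1.0):
    """Purely synthetic stand-in data: a tonal 'normal' window vs. a noise-like 'hidden' one."""
    t = np.arange(int(fs * seconds)) / fs
    if kind == "normal":
        phase = rng.uniform(0, 2 * np.pi)
        return np.sin(2 * np.pi * 20 * t + phase) + 0.05 * rng.standard_normal(t.size)
    return rng.standard_normal(t.size)

def build_dataset(n_per_class, rng, fs=200):
    """Build a feature matrix and label vector from synthetic windows."""
    X, y = [], []
    for kind in ("normal", "hidden"):
        for _ in range(n_per_class):
            X.append(extract_features(toy_window(kind, rng, fs), fs))
            y.append(kind)
    return np.array(X), np.array(y)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train, y_train = build_dataset(20, rng)
    X_test, y_test = build_dataset(10, rng)
    model = NearestCentroid().fit(X_train, y_train)
    print("toy accuracy:", np.mean(model.predict(X_test) == y_test))
```

The toy signals are deliberately easy to separate, so any accuracy this script prints says nothing about the 99.9% figure reported in the abstract. The sketch only shows the shape of the pipeline: window the motion-sensor samples, summarize each window as a feature vector, and train a classifier on labeled vectors.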

Original language: English (US)
Title of host publication: Proceedings - 35th Annual Computer Security Applications Conference, ACSAC 2019
Publisher: ICST
Pages: 42-56
Number of pages: 15
ISBN (Electronic): 9781450376280
State: Published - Dec 9 2019
Event: 35th Annual Computer Security Applications Conference, ACSAC 2019 - San Juan, United States
Duration: Dec 9 2019 to Dec 13 2019

Publication series

Name: PervasiveHealth: Pervasive Computing Technologies for Healthcare
ISSN (Print): 2153-1633

Conference

Conference: 35th Annual Computer Security Applications Conference, ACSAC 2019
Country/Territory: United States
City: San Juan
Period: 12/9/19 to 12/13/19

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Computer Science Applications
  • Health Informatics

Keywords

  • Hidden voice command detection
  • Motion sensor
  • Surface vibrations
  • Voice access
