Beyond User Embedding Matrix: Learning to Hash for Modeling Large-Scale Users in Recommendation

Shaoyun Shi, Weizhi Ma, Min Zhang, Yongfeng Zhang, Xinxing Yu, Houzhi Shan, Yiqun Liu, Shaoping Ma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Modeling large-scale user populations and modeling rare-interaction users are two major challenges in recommender systems, and they open a wide gap between research and applications. With millions or even billions of users, it is hard to store and leverage personalized preferences with a user embedding matrix in real scenarios. Moreover, much research focuses on users with rich histories, whereas users with only one or a few interactions make up the largest share of users in real systems. Previous studies address one of these issues but rarely tackle efficiency and cold-start problems together. In this work, a novel user preference representation called Preference Hash (PreHash) is proposed to model large-scale users, including rare-interaction ones. In PreHash, a series of buckets is generated based on users' historical interactions. Users with similar preferences, both warm and cold, are automatically assigned to the same buckets, and representations of the buckets are learned accordingly. Thanks to the designed hash buckets, only a limited number of parameters needs to be stored, which saves considerable memory and makes modeling more efficient. Furthermore, when a user makes new interactions, the user's buckets and representations are dynamically updated, which enables more effective understanding and modeling of that user. PreHash is flexible enough to work with various recommendation algorithms by taking the place of their user embedding matrices. We combine it with multiple state-of-the-art recommendation methods and conduct various experiments. Comparative results on public datasets show that it not only improves recommendation performance but also significantly reduces the number of model parameters. In summary, PreHash achieves significant improvements in both the efficiency and the effectiveness of recommender systems.
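The idea of replacing a per-user embedding matrix with a small set of shared buckets can be illustrated with a minimal sketch. This is not the authors' implementation: the class `BucketUserModel`, the mean-pooled history query, the dot-product scoring, the top-k softmax weighting, and all parameter values are assumptions chosen only for illustration, and the bucket vectors here are random rather than learned.

```python
import math
import random


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class BucketUserModel:
    """Hypothetical stand-in for a user embedding matrix.

    Stores K shared buckets instead of one row per user, so the
    parameter count does not grow with the number of users.
    """

    def __init__(self, num_buckets, dim, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Randomly initialised here; a real model would train these.
        self.bucket_keys = [[rng.gauss(0, 1) for _ in range(dim)]
                            for _ in range(num_buckets)]
        self.bucket_values = [[rng.gauss(0, 1) for _ in range(dim)]
                              for _ in range(num_buckets)]

    def assign(self, history):
        """Pick the top-k buckets for a history of item vectors."""
        query = mean_vector(history)
        scores = [dot(query, key) for key in self.bucket_keys]
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[:self.top_k]
        weights = softmax([scores[i] for i in top])
        return top, weights

    def user_vector(self, history):
        """Weighted sum of the selected bucket representations."""
        top, weights = self.assign(history)
        dim = len(self.bucket_values[0])
        return [sum(w * self.bucket_values[b][i] for b, w in zip(top, weights))
                for i in range(dim)]

    def num_parameters(self):
        return 2 * len(self.bucket_keys) * len(self.bucket_keys[0])


model = BucketUserModel(num_buckets=8, dim=4)

# A cold user with a single interaction still gets a representation.
history = [[0.1, 0.2, 0.3, 0.4]]
vec = model.user_vector(history)

# A new interaction changes the query, so buckets are re-selected on the fly.
history.append([0.9, -0.5, 0.0, 0.2])
vec2 = model.user_vector(history)

# Compare with one embedding row per user for a million users.
full_matrix_params = 1_000_000 * 4
bucket_params = model.num_parameters()  # 2 * 8 * 4 = 64
```

In this sketch the user representation is recomputed from the current history on every call, so no per-user state is stored; whether and how PreHash caches or trains these quantities is described in the paper itself, not here.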

Original language: English (US)
Title of host publication: SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 319-328
Number of pages: 10
ISBN (Electronic): 9781450380164
DOIs
State: Published - Jul 25 2020
Event: 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020 - Virtual, Online, China
Duration: Jul 25 2020 - Jul 30 2020

Publication series

Name: SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 43rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020
Country: China
City: Virtual, Online
Period: 7/25/20 - 7/30/20

All Science Journal Classification (ASJC) codes

  • Computer Graphics and Computer-Aided Design
  • Information Systems
  • Software

Keywords

  • cold start problem
  • efficiency and effectiveness
  • neural recommendation
  • recommender system
  • user preference modeling
