In recent years, sparse and low-rank modeling techniques have emerged as powerful tools for efficiently processing visual data in non-traditional ways. A particular area of promise for these theories is visual recognition, where object detection and image classification approaches need to be able to deal with the highly diverse appearance of real-world objects. However, existing visual recognition methods generally succeed only in the presence of sufficient amounts of homogeneous and balanced training data that are well matched to the actual test conditions. In practice, when the data are heterogeneous and imbalanced, the performance of existing methods can be much worse than expected.This project will develop a comprehensive framework for real-world visual recognition based on novel sparse and low-rank modeling techniques, which will be able to deal with imbalanced, heterogeneous and multi-modal data. Imbalanced data will be handled using convex optimization techniques that automatically divide a dataset into common and rare patterns, and select a small set of representatives for the common patterns that are then combined with the rare patterns to form a balanced dataset. Heterogeneous and multi-modal data will be handled using non-convex optimization techniques that learn a latent representation from multiple domains or modalities. Classification and clustering algorithms can be applied to the latent representation. Applications of these methods include image and video-based object recognition, activity recognition, video summarization, and surveillance.
|Effective start/end date||7/1/16 → 6/30/18|
- National Science Foundation (National Science Foundation (NSF))