[C28] M. S. Ryoo, Kiyoon Kim, Hyun Jong Yang*, “Extreme low resolution activity recognition with multi-Siamese embedding learning,” AAAI Conference on Artificial Intelligence (one of the top AI conferences, acceptance rate=24.6%), New Orleans, Louisiana, Feb. 2018. [arXiv]
This paper presents an approach for recognizing human activities from extreme low resolution (e.g., 16×12) videos. Extreme low resolution recognition is not only necessary for analyzing actions at a distance but is also crucial for enabling privacy-preserving recognition of human activities. We propose a new approach to learn an embedding (i.e., representation) optimized for low resolution (LR) videos by taking advantage of their inherent property: two images originating from the exact same scene often have totally different pixel (i.e., RGB) values depending on their LR transformations. We design a new two-stream multi-Siamese convolutional neural network that learns an embedding space shared by low resolution videos created with different LR transforms, thereby enabling the learning of transform-robust activity classifiers. We experimentally confirm that our approach of jointly learning the optimal LR video representation and the classifier outperforms the previous state-of-the-art low resolution recognition approaches on two public standard datasets by a meaningful margin.
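The property the abstract relies on, that one scene yields different LR pixel values under different LR transforms, can be illustrated with a minimal sketch. This is not the authors' implementation. The shift-then-average-pool transform, the random linear projection standing in for the two-stream multi-Siamese CNN, and the contrastive loss are all assumptions chosen for illustration, and the names `lr_transform`, `embed`, and `contrastive_loss` are hypothetical.

```python
import numpy as np


def lr_transform(frame, out_h, out_w, dx, dy):
    """Hypothetical LR transform: shift by (dx, dy) pixels, then average-pool."""
    shifted = np.roll(frame, shift=(dy, dx), axis=(0, 1))
    h, w = shifted.shape[:2]
    bh, bw = h // out_h, w // out_w
    cropped = shifted[:bh * out_h, :bw * out_w]
    # Group pixels into out_h x out_w blocks and average each block.
    blocks = cropped.reshape(out_h, bh, out_w, bw, -1)
    return blocks.mean(axis=(1, 3))


def embed(lr_frame, weights):
    """Placeholder embedding: a linear projection, NOT the paper's CNN."""
    return lr_frame.reshape(-1) @ weights


def contrastive_loss(e1, e2, same, margin=1.0):
    """Assumed pairwise loss: pull same-scene pairs together, push others apart."""
    d = np.linalg.norm(e1 - e2)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


rng = np.random.default_rng(0)
hr_frame = rng.random((120, 160, 3))          # synthetic high resolution frame
lr_a = lr_transform(hr_frame, 12, 16, 0, 0)   # 16x12 LR version, no shift
lr_b = lr_transform(hr_frame, 12, 16, 3, 5)   # same scene, different transform

weights = rng.normal(size=(12 * 16 * 3, 8))   # untrained projection to 8-D
loss = contrastive_loss(embed(lr_a, weights), embed(lr_b, weights), same=True)
print("max pixel difference:", np.abs(lr_a - lr_b).max())
print("same-scene loss before training:", loss)
```

In a trained model the embedding weights would be optimized so that the same-scene loss shrinks across transforms. Here the projection is random, so the printed loss only shows the gap that such training would need to close.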