Salient subsequence learning for time series clustering

Abstract

Time series analysis has been a popular research topic over the past decade. Salient subsequences of a time series that benefit a learning task, e.g., classification or clustering, are called shapelets. Shapelet-based time series learning extracts these highly informative subsequences from a time series. Most existing methods for shapelet discovery must scan a large pool of candidate subsequences, which is time-consuming. A recent work [1] uses regression learning to discover shapelets in a time series; however, it only considers learning shapelets from labeled time series data. This paper proposes an Unsupervised Salient Subsequence Learning (USSL) model that discovers shapelets without requiring labeled data. The learning objective integrates the strengths of shapelet learning, shapelet regularization, spectral analysis, and pseudo-labels to simultaneously and automatically learn shapelets that improve the clustering of unlabeled time series. The optimization model is solved iteratively via a coordinate descent algorithm. Experiments on real-world and synthetic data show that USSL learns meaningful shapelets and achieves promising results that surpass current state-of-the-art unsupervised time series learning methods.
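
To make the notion of a shapelet concrete, the following minimal sketch shows the shapelet-to-series distance commonly used in the shapelet literature: the smallest distance between a shapelet and any same-length window of a series. It is not the USSL objective described above. The function names `shapelet_distance` and `shapelet_transform` are hypothetical, and the use of mean squared distance is an assumption made for illustration.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Smallest mean squared distance between a shapelet and any
    same-length sliding window of the series (assumed distance measure)."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    # All contiguous windows of the series with the shapelet's length.
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    per_window = np.mean((windows - shapelet) ** 2, axis=1)
    return float(np.min(per_window))


def shapelet_transform(dataset, shapelets):
    """Map each series to a feature vector of distances to every shapelet."""
    return np.array(
        [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]
    )


# Toy data: the first series contains the bump [1, 2, 1]; the second is flat.
dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
shapelets = [[1, 2, 1]]
features = shapelet_transform(dataset, shapelets)
```

In this toy setting, a series containing the shapelet maps to a distance of zero while the flat series maps to a larger value, so the resulting feature vectors could be passed to any standard clustering routine.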

Publication
IEEE Transactions on Pattern Analysis and Machine Intelligence