Abstract

Labeled datasets are a key factor in training accurate and robust classifiers with supervised learning methods. However, labeling raw data is tedious and labor-intensive, and it is usually done manually. Many approaches have been proposed that use a small amount of labeled data to train a classifier robust enough to label additional training data or to make predictions on unlabeled data. Unlike previous studies, we propose an automatic labeling framework that requires no labeled data in advance and directly annotates unlabeled time series data for body-worn sensor-based human activity recognition (HAR) in laboratory settings. The framework transforms the collected time series activity data into its corresponding absolute wavelet energy entropy and detects activity endpoints based on constraints and information extracted from a predefined human activity sequence. The performance of the proposed framework was evaluated on a self-collected dataset and on the UCI HAR Dataset. In both cases, the average precision and recall scores are above 81.9% and the average F-measure scores are above 88.9%. The results show that the proposed framework can be adopted as a rapid and reliable way of generating labeled datasets from unlabeled data.
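To make the entropy transform mentioned above more concrete, the following is a minimal sketch of one common textbook form of wavelet energy entropy, computed over sliding windows of a single sensor channel. It is an illustration under stated assumptions, not the framework's actual implementation: the Haar wavelet, the three decomposition levels, the window width and step, and all function names (`haar_band_energies`, `wavelet_energy_entropy`, `sliding_entropy`) are hypothetical choices made here, and the abstract does not specify how the "absolute" variant or the endpoint constraints are defined.

```python
import math


def haar_band_energies(window, levels=3):
    """Energies of the Haar detail bands (one per level) plus the final approximation band.

    Assumption: a plain Haar transform stands in for whatever wavelet the framework uses.
    """
    approx = [float(x) for x in window]
    energies = []
    for _ in range(levels):
        if len(approx) % 2:                # pad odd lengths by repeating the last sample
            approx.append(approx[-1])
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2.0) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def wavelet_energy_entropy(window, levels=3):
    """Shannon entropy of the relative energy distribution across wavelet bands."""
    energies = haar_band_energies(window, levels)
    total = sum(energies)
    if total == 0.0:                       # an all-zero window carries no energy
        return 0.0
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in probs)


def sliding_entropy(signal, width=64, step=32, levels=3):
    """Entropy curve over overlapping windows (width and step are illustrative values)."""
    return [
        wavelet_energy_entropy(signal[start:start + width], levels)
        for start in range(0, len(signal) - width + 1, step)
    ]
```

In a sketch like this, windows whose energy is concentrated in one band give entropy near zero, while windows with energy spread across bands give higher values, so changes in the entropy curve are candidate locations for activity endpoints. How those candidates are matched against the predefined activity sequence is not described in the abstract and is left out here.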

Full Text