Recognizing human activities using light-weight and effective machine learning methodologies

Keerthi Varadhi; Chinta Someswara Rao; GNVG Sirisha; Butchi Raju katari

doi:10.12688/f1000research.124164.4

Method Article

Revised

[version 4; peer review: 1 approved, 3 not approved]

Keerthi Varadhi ¹, Chinta Someswara Rao², GNVG Sirisha², Butchi Raju katari¹

PUBLISHED 05 Nov 2024

¹ CSE Department, Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, TELANGANA, 500090, India
² CSE Department, SRKR Engineering College, Bhimavaram, ANDHRA PRADESH, 534204, India

Keerthi Varadhi
Roles: Conceptualization, Methodology

Chinta Someswara Rao
Roles: Methodology, Validation

GNVG Sirisha
Roles: Writing – Original Draft Preparation

Butchi Raju katari
Roles: Formal Analysis, Writing – Review & Editing

This article is included in the Computational Modelling and Numerical Aspects in Engineering collection.

Abstract

Background

Human activity recognition (HAR) is increasingly important in enhancing healthcare systems by enabling accurate monitoring of individuals' movements through sensor data. This paper is motivated by the need to improve the accuracy of HAR, particularly for applications in e-health systems, where reliable activity detection can lead to better health outcomes. The study explores six prominent machine learning techniques (decision tree, random forest, logistic regression, Naïve Bayes, k-nearest neighbor, and neural networks) to determine which methods can most effectively predict activities such as walking, sitting, standing, laying, walking upstairs, and walking downstairs.

Methods

We employed these six machine learning algorithms to analyze a comprehensive dataset derived from various sensors. Each model was rigorously trained and evaluated to compare its effectiveness in recognizing human activities. The experiments aimed to identify strengths and weaknesses in each approach, with particular emphasis on advanced techniques such as random forest, convolutional neural networks (CNNs), and gated recurrent networks (GRNs).

Results

The experimental evaluation revealed that the random forest classifier, CNN, GRN, and neural networks delivered promising results, achieving high accuracy levels. Notably, the neural network model excelled, attaining an impressive accuracy of 98%. In contrast, the Naïve Bayes model did not meet the performance expectations set by the other algorithms.

Conclusions

This research effectively classifies activities such as sitting, standing, laying, walking, walking downstairs, and walking upstairs, underscoring the potential of machine learning in HAR. The findings highlight the superior performance of neural networks in enhancing activity recognition, which could lead to advanced applications in e-health systems and improve overall healthcare monitoring strategies.

Keywords

Classification; Land Cover; Deep Learning; CNN

Corresponding authors: Keerthi Varadhi, Chinta Someswara Rao, Butchi Raju katari

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2024 Varadhi K et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.

How to cite: Varadhi K, Someswara Rao C, Sirisha G and katari BR. Recognizing human activities using light-weight and effective machine learning methodologies [version 4; peer review: 1 approved, 3 not approved]. F1000Research 2024, 12:247 (https://doi.org/10.12688/f1000research.124164.4) First published: 06 Mar 2023, 12:247 (https://doi.org/10.12688/f1000research.124164.1) Latest published: 05 Nov 2024, 12:247 (https://doi.org/10.12688/f1000research.124164.4)

Revised Amendments from Version 3

We have carefully addressed reviewer suggestions in the revised version. Firstly, we updated the abstract to include a clearer motivation for the research, a fundamental overview of the study, and detailed results from our experiments. In the introduction, we enhanced the content to highlight the significant contributions of our work more explicitly. We also incorporated the recommended references into the related work section to provide a broader context for our research. Additionally, we included a comprehensive description of the dataset in the results and discussion section to clarify our methodology. To improve comprehension, we revised the confusion matrix to present the data in percentage form. Lastly, we clarified the unique contributions of our research, emphasizing how it advances human activity recognition through a comparison of six machine learning algorithms, showcasing the exceptional performance of neural networks with an accuracy of 98.93%. These revisions collectively enhance the clarity, depth, and quality of our paper.

The revised paper was rewritten according to the suggestions given by the reviewers.

See the authors' detailed response to the review by Nurul Amin Choudhury

Introduction

Human activity recognition is the process of identifying physical activities performed by individuals based on observations captured while they engage in various actions within a specific environment. With the rise of wearable and pervasive computing technologies, this field has seen significant growth, leading to the development of numerous applications. These applications include assistive technology, health and fitness tracking, elder care, and automated surveillance.

The goal of human activity recognition is to enhance the quality of life by making devices more aware of and responsive to users' needs. This technology can monitor daily activities, provide assistance in real-time, and offer insights that can improve health and safety. The motivation behind this task lies in its potential to revolutionize various aspects of daily living, from supporting the elderly to ensuring security in sensitive environments.

Mobile phone sensors, such as accelerometers and gyroscopes, play a crucial role in this process. By capturing precise motion data, these sensors enable the prediction of human actions through machine learning algorithms. This ability to continuously monitor and interpret human behavior makes human activity recognition a powerful tool for understanding and enhancing the way we interact with the world around us.

Contributions

This research significantly advances the field of human activity recognition by providing a comprehensive exploration of machine learning techniques applied to a dataset derived from triaxial accelerometer and gyroscope sensors. The study focuses on six distinct classes of activities, namely Sitting, Standing, Laying, Walking, Walking_Downstairs, and Walking_Upstairs. The primary contributions can be summarized as follows:

Methodological Diversity: The study employs six prominent machine learning algorithms—naïve Bayes, decision tree, random forest, K-nearest neighbours, logistic regression, and neural network. This diverse approach allows for a thorough investigation into the strengths and limitations of each algorithm in predicting human activities.

Performance Analysis: Through rigorous experimentation, the research provides a detailed performance analysis of each machine learning model. Notably, the neural network emerges as the standout performer, achieving an impressive accuracy rate of 98.93%, surpassing other models such as naïve Bayes, decision tree, random forest, K-nearest neighbours, and logistic regression.

High Accuracy Achievements: The study successfully classifies activities like Sitting, Standing, Laying, Walking, Walking_Downstairs, and Walking_Upstairs with a remarkable accuracy of 98%. This achievement underscores the potential for accurate human activity recognition, a crucial aspect in applications such as e-health systems.

Insights into Algorithmic Strengths and Weaknesses: By presenting varied accuracies achieved by each model, the research sheds light on the strengths and weaknesses of different machine learning algorithms. This information is valuable for researchers and practitioners seeking to optimize activity recognition models for specific contexts.

Prominence of Neural Networks: The findings highlight the efficacy of neural networks in enhancing human activity recognition, positioning them as a promising approach for future endeavors in this domain. This insight can guide future research and development efforts towards leveraging neural networks for improved accuracy and reliability in activity prediction.

In conclusion, this research significantly contributes to the field of human activity recognition by offering a thorough examination of diverse machine learning methodologies. The emphasis on neural networks and their exceptional performance opens new avenues for advanced applications in e-health systems and beyond. The insights gained from this study not only deepen our understanding of machine learning applications in activity recognition but also pave the way for more effective methodologies in predicting and classifying human activities.

Literature review

This section reviews selected papers on human activity recognition, summarizing the technique used in each and the limitations its authors identified.

Schüldt et al.¹ employed Support Vector Machines (SVM) for recognizing human activities in video sequences. The technique was effective, but the authors recognized the need to enhance recognition rates further. To achieve this, future work may involve utilizing more advanced neural network techniques, which have shown promise in improving classification performance in similar tasks.

Laptev et al.² utilized an SVM with a multichannel Gaussian kernel for activity recognition. This approach allowed for a more nuanced analysis of video data by leveraging the multi-channel information. However, like other traditional machine learning methods, there is a need to improve the recognition rate. Future efforts should explore more sophisticated models or hybrid techniques that can capture the complex patterns in human activities more effectively.

Yamato et al.³ used Hidden Markov Models (HMM) in conjunction with feature vectors to recognize activities in video data. HMMs are particularly useful for modeling temporal sequences, making them suitable for this task. However, the recognition rates were not optimal, suggesting that there is room for improvement. Future research could focus on integrating HMMs with other models or exploring more advanced techniques like deep learning to boost performance.

Oliver et al.⁴ utilized Coupled Hidden Markov Models (CHMM) for activity recognition. CHMMs offer an enhanced capability over standard HMMs by modeling multiple interacting processes, which can be beneficial in complex activity recognition tasks. Despite these advantages, the authors noted that recognition rates could still be improved, indicating a potential area for future work, such as incorporating deep learning methods to handle more complex scenarios.

Natarajan et al.⁵ explored the use of Conditional Random Fields (CRF) for detecting human actions. CRFs are powerful for sequence modeling, especially in structured prediction tasks like activity recognition. However, the authors identified a need for further improvements in recognition accuracy, which could be addressed by integrating CRFs with deep learning frameworks or by refining the feature extraction process.

Ning et al.⁶ applied a standard method for human activity detection, though the specifics of the technique were not groundbreaking. The recognition rates achieved were satisfactory but not exemplary. To push the boundaries of what is currently possible, future research should consider incorporating more advanced or novel techniques, possibly exploring deep learning or ensemble methods to achieve higher accuracy.

Vali M. et al.⁷ combined Hidden Markov Models (HMM) and Conditional Random Fields (CRF) to enhance activity recognition. This hybrid approach leverages the strengths of both models—HMMs for temporal modeling and CRFs for sequence prediction. Despite this, the recognition rates were not optimal, suggesting that there is a need to further refine the model, possibly by integrating deep learning techniques or optimizing the existing hybrid approach.

Madharshahian R. et al.⁸ utilized Multinomial Logistic Regression (LR) for activity recognition. While LR is a robust statistical model, its application in complex tasks like activity recognition may fall short in terms of accuracy. The authors acknowledged this limitation and pointed out that future work should focus on enhancing activity identification, potentially by adopting more complex models like deep neural networks.

Kiros R. et al.⁹ employed consecutive Recurrent Neural Networks (RNNs) in their study. RNNs are well-suited for sequential data like human activities, but the authors noted that the network’s performance could be improved. They suggested that adding more layers to the network might enhance its ability to capture intricate temporal dependencies, leading to better identification rates.

Grushin A. et al.¹⁰ also utilized consecutive Recurrent Neural Networks (RNNs) for activity recognition, similar to Kiros et al. They observed that while RNNs perform well in handling sequential data, the recognition rates could still be better. To achieve this, they proposed that expanding the network by adding more layers could improve its capacity to model complex patterns in the data.

Veeriah et al.¹¹ implemented differential Recurrent Neural Networks (RNNs) for recognizing human activities. This variation of RNNs aims to improve the model’s ability to capture subtle changes in the data. However, the authors noted that the recognition rates were not as high as desired. They suggested that adding more layers to the network could potentially enhance the model’s ability to detect activities more accurately.

Du Y., Wang W., and Wang L.¹² explored hierarchical Recurrent Neural Networks (RNNs) for activity recognition. Hierarchical RNNs can model different levels of abstraction in the data, making them powerful tools for this task. Despite their advantages, the authors acknowledged that further improvements could be made by incorporating Deep Neural Networks (DNNs), particularly Long Short-Term Memory (LSTM) units, to boost recognition rates.

Anna Ferrari et al.¹³ developed automated methods to identify daily activities (ADLs) using sensor signals and deep learning techniques. While their approach was innovative, the authors noted that the deep learning model could benefit from additional layers to improve its accuracy in detecting activities. This suggests that there is still room to optimize the model for better performance.

Li et al.¹⁴ proposed the HAR_WCNN algorithm, which uses a wide time-domain Convolutional Neural Network (CNN) combined with multienvironment sensor data for daily behavior recognition. The method shows promise, but the authors pointed out that the activity identification process could still be enhanced. Future work might focus on refining the CNN architecture or incorporating additional data sources to achieve better recognition rates.

Sarkar A.¹⁵ applied a Convolutional Neural Network (CNN) augmented with Spatial Attention for accurate human activity recognition. The spatial attention mechanism helps the model focus on relevant parts of the input data, improving accuracy. However, the authors recognized the need to further enhance the efficiency of activity recognition, which could be achieved by optimizing the network architecture or increasing the complexity of the model.

Choudhury N.¹⁶ used an adaptive batch size-based CNN-LSTM model to discern various human activities in uncontrolled environments. The combination of CNN for feature extraction and LSTM for sequence modeling makes this approach effective. However, the authors suggested that increasing the number of layers in the network could further boost the recognition rate, indicating that there is still potential for improvement.

Gholamiangonabadi D. and Grolinger K.¹⁷ employed a personalized Human Activity Recognition (HAR) model that utilizes CNN and signal decomposition. This approach allows for customized activity recognition based on individual differences. Despite its personalized nature, the authors acknowledged that the recognition rates could be better and suggested that incorporating Deep Neural Networks (DNN), particularly LSTM units, could lead to improved performance.

Dua N.¹⁸ introduced “ICGNet,” a hybrid model combining Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU) for Human Activity Recognition (HAR). This hybrid approach leverages the strengths of both CNNs and GRUs, but the authors noted that adding more layers to the network might enhance recognition rates. This suggests that there is still room to optimize the model for better accuracy.

Wu, Hao¹⁹ proposed the Differential Spatio-Temporal LSTM (DST-LSTM) method for Human Activity Recognition (HAR). This method is designed to capture both spatial and temporal dependencies in the data. However, the authors pointed out that expanding the network’s layers could improve recognition rates, indicating that the model’s current performance might benefit from further refinement.

Liciotti, Daniele ²⁰ utilized Long Short-Term Memory (LSTM) networks to model spatio-temporal sequences obtained from smart home sensors and classify human activities. LSTMs are well-known for their ability to handle long-term dependencies in sequential data. Despite this, the authors acknowledged that there is still a need to improve recognition rates, suggesting that future work might focus on optimizing the LSTM architecture or incorporating additional data sources.

Priyadarshini I.²¹ explored machine learning techniques, including Random Forest (RF), Decision Trees (DT), K-Nearest Neighbors (k-NN), and deep learning algorithms like Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) for Human Activity Recognition (HAR). While these techniques are powerful, the authors noted that increasing the layers of the network could improve recognition rates, indicating that there is still potential to optimize the model for better performance.

Ronald M. et al.²² introduced iSPLInception, a deep learning model based on the Inception-ResNet architecture, designed for efficient human activity recognition (HAR) on devices with limited computational resources. The model outperforms existing approaches in accuracy, cross-entropy loss, and F1 score across four public HAR datasets, demonstrating its effectiveness for real-world applications. Future work needs to focus on optimizing the iSPLInception model for real-time applications and extending its use to more diverse datasets.

Poulose A. et al.²³ introduced HIT HAR, a human activity recognition system that uses smartphone image data and a mask R-CNN for body detection, combined with a deep learning model for activity classification. Achieving 98.53% accuracy with ResNet, this approach overcomes misclassification challenges seen in sensor-based systems, especially for complex activities. Future work needs to focus on enhancing the HIT HAR system for real-time performance and extending it to recognize more complex activities.

Methods

The human activity recognition methodology involves four key phases, namely input, data cleaning, data splitting, and classification and validation. Figure 1 illustrates the structure of the human activity recognition process.

Figure 1. Human activity recognition structure.

Dataset: We used a dataset obtained from Kaggle that contains triaxial acceleration from an accelerometer and triaxial angular velocity from a gyroscope. The dataset comprises 10,299 entries, each described by a set of 561 features, which provides a rich feature set for analysis and model training.

The foundation of our investigation lies in the Human Activity Recognition database, meticulously constructed from recordings capturing the daily activities of 30 participants. These individuals engaged in various activities of daily living (ADL), all while carrying a waist-mounted smartphone equipped with embedded inertial sensors. The primary objective was to classify these activities into one of six predefined categories: WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, and LAYING.

The experimental phase involved a cohort of 30 volunteers aged between 19 and 48 years. Each participant undertook the performance of six distinct activities while wearing a Samsung Galaxy S II smartphone on their waist. The smartphone, equipped with an accelerometer and gyroscope, captured 3-axial linear acceleration and 3-axial angular velocity at a consistent rate of 50Hz. To ensure the accuracy of the recorded data, the experiments were video-recorded, enabling manual labeling of the dataset. Subsequently, the dataset was randomly partitioned into two sets: 70% of the volunteers were allocated to generate the training data, and the remaining 30% constituted the test data.

The raw sensor signals, originating from both the accelerometer and gyroscope, underwent a pre-processing phase that included noise filtering. The processed signals were then sampled in fixed-width sliding windows, each lasting 2.56 seconds with a 50% overlap (128 readings per window). Further refinement involved the separation of the sensor acceleration signal, which encompasses both gravitational and body motion components. This separation was achieved using a Butterworth low-pass filter, isolating body acceleration and gravity. The gravitational force, primarily composed of low-frequency components, was effectively filtered using a cutoff frequency of 0.3 Hz.
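The windowing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dataset authors' code: the function name `sliding_windows` and the synthetic `samples` list are hypothetical, and the noise filtering and Butterworth gravity separation are not reproduced here.

```python
SAMPLING_RATE_HZ = 50  # sampling rate stated in the text


def sliding_windows(signal, width=128, overlap=0.5):
    """Split a 1-D sequence into fixed-width windows with fractional overlap."""
    step = int(width * (1 - overlap))  # 64 readings for 128-wide windows at 50% overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than 1")
    # Only full windows are kept; a trailing partial window is dropped (assumption).
    return [signal[start:start + width]
            for start in range(0, len(signal) - width + 1, step)]


# Synthetic stand-in for 10 seconds of readings from one sensor axis.
samples = list(range(10 * SAMPLING_RATE_HZ))
windows = sliding_windows(samples)
window_seconds = 128 / SAMPLING_RATE_HZ  # 2.56 s per window
```

With 128 readings per window at 50 Hz, each window covers 2.56 seconds, and consecutive windows start 64 readings apart.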

Each window in the dataset resulted in a feature vector, totaling 561 dimensions, comprising variables derived from both the time and frequency domains. The attributes associated with each record in the dataset include triaxial acceleration from the accelerometer (representing total acceleration) and estimated body acceleration, triaxial angular velocity from the gyroscope, the 561-feature vector, the corresponding activity label, and an identifier specifying the subject who conducted the experiment.

Data cleaning: Data cleaning is a crucial process in refining a dataset, involving the removal of redundant columns and the correction or replacement of inaccurate records. As part of this process, an encoding technique is often employed to transform categorical data into numerical format; however, in cases where all feature values are already numerical, such encoding may be unnecessary. The primary objective of data cleaning is to mitigate the risk of errors recurring in the dataset.

The initial step in data cleaning entails the elimination of unwanted observations, which encompass both irrelevant and duplicate data. Duplicate entries commonly emerge when data is amalgamated from various sources, while irrelevant data pertains to information that does not contribute to addressing the underlying problem. This meticulous process is imperative for enhancing the dataset’s integrity.

Following the removal of undesirable observations, the subsequent step involves addressing missing data, which can be approached through two methods: either removing the entire record containing missing values or filling in the gaps based on other observations. However, outright removal or manual filling may not be optimal, as it could result in information loss, such as replacing a range with an arbitrary value. To address missing numerical data effectively, a dual-pronged approach is employed:

▪ Flagging the observation with an indicator value denoting missingness.
▪ Temporarily filling the observation with a placeholder value, such as 0, to ensure that no gaps exist.

This technique of flagging and filling not only aids in preserving the overall dataset but also allows algorithms to estimate missing values more optimally, mitigating potential information loss. In essence, data cleaning is a meticulous process that goes beyond mere elimination, involving strategic measures to enhance the accuracy and completeness of the dataset.
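The flag-and-fill approach can be sketched as below. This is an illustrative sketch only: the helper name `flag_and_fill` is hypothetical, and it assumes missing values are represented as `None` in rows of numeric features.

```python
def flag_and_fill(rows, placeholder=0.0):
    """Return (filled_rows, missing_flags) for numeric rows where None marks a gap.

    missing_flags has the same shape as rows: 1 where a value was missing, else 0.
    """
    filled, flags = [], []
    for row in rows:
        flags.append([1 if value is None else 0 for value in row])
        filled.append([placeholder if value is None else value for value in row])
    return filled, flags


# Small made-up example with two missing readings.
raw = [[1.0, None], [None, 2.5]]
filled_rows, missing_flags = flag_and_fill(raw)
```

The indicator columns can be appended to the feature matrix so that a model can distinguish a genuine 0 from a filled-in placeholder.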

Feature fusion pipeline: In the realm of human activity recognition using sensor data, the feature fusion pipeline plays a pivotal role in amalgamating diverse sets of features extracted from various sensors to provide a comprehensive and nuanced understanding of human movements. The feature fusion pipeline is essentially a systematic process that integrates information from different sensors, such as accelerometers, gyroscopes, and magnetometers, to create a unified representation of human activities. This integration of features not only enhances the accuracy of activity recognition but also ensures a more robust and holistic analysis of complex human behaviors. For instance, by fusing data from accelerometers measuring linear motion with gyroscopes capturing rotational movements, the feature fusion pipeline enables a more nuanced differentiation between activities like walking, running, or even specific gestures, contributing to the refinement of human activity recognition systems.

The feature fusion pipeline involves multiple stages, beginning with the extraction of relevant features from individual sensors. These features may include acceleration patterns, orientation changes, or temporal characteristics. Subsequently, a fusion mechanism is applied to combine these diverse features into a cohesive representation, often using techniques such as concatenation, averaging, or weighted summation. The resultant fused feature set serves as input to machine learning models, allowing for more sophisticated and context-aware human activity recognition. By seamlessly integrating information from various sensors, the feature fusion pipeline significantly enhances the capability of systems to discern intricate human activities, making it a cornerstone in advancing the accuracy and applicability of human activity recognition using sensor data.
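Two of the fusion mechanisms named above, concatenation and weighted summation, can be sketched as follows. The function names and the example feature values are hypothetical, and weighted summation is assumed to operate on feature vectors of equal length.

```python
def fuse_concat(*feature_sets):
    """Concatenate per-sensor feature vectors into one fused vector."""
    return [value for features in feature_sets for value in features]


def fuse_weighted(feature_sets, weights):
    """Element-wise weighted sum of equally long feature vectors."""
    if len(feature_sets) != len(weights):
        raise ValueError("one weight per feature set is required")
    length = len(feature_sets[0])
    if any(len(features) != length for features in feature_sets):
        raise ValueError("weighted summation needs vectors of equal length")
    return [sum(w * features[i] for w, features in zip(weights, feature_sets))
            for i in range(length)]


# Made-up accelerometer and gyroscope features for one window.
accel_features = [1.0, 2.0]
gyro_features = [3.0, 4.0]
fused_concat = fuse_concat(accel_features, gyro_features)
fused_weighted = fuse_weighted([accel_features, gyro_features], [0.5, 0.5])
```

Concatenation keeps every sensor's features separate at the cost of a longer vector, whereas averaging or weighted summation keeps the dimensionality fixed.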

Data splitting: During the pivotal phase of data splitting, the dataset undergoes a discerning segmentation, effectively yielding two distinctive components with unique roles and significance:

Training Data: At the core of this phase lies the training data, a crucial subset meticulously employed to educate and refine the model. Specifically, we judiciously allocated 60% of the entire dataset to serve as the bedrock for training our model. This subset plays a pivotal role in shaping the model’s understanding, fostering its ability to generalize patterns, and ultimately enhancing its predictive capabilities.

Testing Data: The counterpart to the training data, the testing data serves a distinct purpose in evaluating the model’s proficiency and generalization beyond the training set. Carefully designated as the dataset reserved for assessing the model’s performance, we conscientiously earmarked 20% of the dataset for this purpose. This subset represents a critical measure of the model’s ability to extrapolate its learned knowledge to unseen instances, thereby validating its robustness and reliability.
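A shuffled index split using the stated proportions (60% training, 20% testing) might look like the sketch below. The function name, the fixed seed, and the handling of the remaining 20% are assumptions: the text does not say how the remainder is used, so it is simply returned as a third, unnamed portion.

```python
import random


def split_indices(n_rows, train_pct=60, test_pct=20, seed=0):
    """Shuffle row indices and cut them into train, test and leftover parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed only for reproducibility
    n_train = n_rows * train_pct // 100
    n_test = n_rows * test_pct // 100
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    leftover = indices[n_train + n_test:]  # use not specified in the text
    return train, test, leftover


train_idx, test_idx, leftover_idx = split_indices(10299)
```

Because the three index lists are disjoint, no record used for training is reused for testing.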

Classification and validation: The intricate process of classification and validation unfolded in two distinct phases, namely:

- Classification algorithms
- Evaluation metrics

Classification: Classification is the process of predicting the target class of new observations based on patterns learned from the training data. The dataset chosen for our analysis already includes the target class information. We trained the model using the following classification algorithms, each of which brings its own strengths:

▪ Naïve Bayes²⁴
▪ K-nearest neighbours²⁵
▪ Decision tree²⁶
▪ Random forest²⁷
▪ Simple logistic regression²⁸
▪ Neural network²⁹

Decision Tree: The decision tree classification utilizes a tree representation to address a problem, with each internal node representing features, branches denoting feature outcomes, and leaf nodes indicating target class labels. The algorithm for constructing the decision tree involves several stages with a critical consideration of hyperparameters.

Algorithm

Hyperparameter: Attribute Selection Measure

▪ Identify the best attribute to serve as the root of the tree.
▪ Subdivide the training set into sections based on the selected attribute.
▪ Repeat the above steps until all branches of the tree have leaf nodes.

Hyperparameter: Attribute Selection Measures

▪ The decision on the best attribute at each level involves employing specific measures.
▪ Two commonly used measures are:
   ○ Information Gain
   ○ Gini Index

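Information Gain is based on entropy, which is mathematically defined as Equation 1:

(1)

\mathrm{Entropy}(S) = - \sum_{i} P_{i} \log_{2} P_{i}

where P_i is the probability of class i in the set S.

Information Gain (Equation 2) is then calculated as:

(2)

\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_{v}|}{|S|} \, \mathrm{Entropy}(S_{v})

where S_v is the subset of S for which attribute A takes the value v.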

Gini Index, another measure, quantifies the likelihood of a randomly chosen element being incorrectly classified (Equation 3):

(3)

Gini Index = 1 - \sum_{j} P_{j}^{2}

Considering the unbalanced nature of the dataset with multiple classes, the decision tree primarily relies on Information Gain to select the most appropriate attribute.
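The three measures in Equations 1 to 3 can be computed directly from class labels, as in the sketch below. The function names and the toy labels are illustrative assumptions, and the attribute is assumed to be categorical.

```python
import math
from collections import Counter


def entropy(labels):
    """Equation 1: -sum(P_i * log2(P_i)) over the classes present in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini_index(labels):
    """Equation 3: 1 - sum(P_j ** 2) over the classes present in labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    """Equation 2: entropy of S minus the weighted entropy of each subset S_v."""
    n = len(labels)
    subsets = {}
    for label, value in zip(labels, attribute_values):
        subsets.setdefault(value, []).append(label)
    weighted = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted


# Toy example: a made-up attribute that separates the two classes perfectly.
toy_labels = ["SITTING", "SITTING", "WALKING", "WALKING"]
toy_attribute = ["low", "low", "high", "high"]
toy_gain = information_gain(toy_labels, toy_attribute)
```

In this toy case the parent set has an entropy of 1 bit and both subsets are pure, so the attribute yields the maximum possible gain of 1 bit.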

Logistic regression: Logistic regression is a statistical method used for classification, and it falls under supervised learning. It makes predictions based on probabilities: it takes the independent input values of a record and produces a single discrete output.

Logistic function: Logistic function is also known as the sigmoid function, which takes a real-value number as input and maps it into a value between 0 and 1, but not exactly at those limits. It maps predictions to probabilities.

By using the sigmoid function, if we give an input value on x, then it predicts the target value on the y axis.

(4)

\mathrm{Logistic\ function}(x) = \frac{L \, e^{k (x - x_{0})}}{1 + e^{k (x - x_{0})}}

(5)

\mathrm{Sigmoid\ function}(x) = \frac{e^{x}}{1 + e^{x}}

The sigmoid function is the special case of the logistic function with k = 1, x₀ = 0, and L = 1.
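Equations 4 and 5 can be evaluated with the short sketch below. The function name `logistic` is an assumption; the two branches are algebraically identical to Equation 4 and are used only to avoid overflow in `math.exp` for large inputs.

```python
import math


def logistic(x, L=1.0, k=1.0, x0=0.0):
    """Logistic function of Equation 4; the defaults give the sigmoid of Equation 5."""
    z = k * (x - x0)
    if z >= 0:
        # Same value as L * e^z / (1 + e^z), written to avoid a huge e^z.
        return L / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return L * ez / (1.0 + ez)


midpoint = logistic(0.0)        # value at the curve's midpoint
low, high = logistic(-5.0), logistic(5.0)
```

The output is 0.5 at x = x₀ and moves towards 0 or L as x decreases or increases, which is what allows the value to be read as a class probability when L = 1.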

Algorithm

1. Plot the labeled data
2. Draw the regression curve
3. Find the best-fitting curve using the maximum likelihood estimator (MLE)
   ▪ Convert the y-axis probability scale to log (odds)
   ▪ Assume a regression line and scale the data to that line
   ▪ Apply the sigmoid function

Naïve Bayes classifier: Naïve Bayes is a probabilistic classification algorithm based on Bayes' theorem (Equation 6). It assumes that each feature is independent of the other features given the class.

(6)

P (A| B) = \frac{P (B| A) P (A)}{P (B)}

The conditional probability that an object with feature vector x₁, x₂, …, x_n belongs to a particular class C_i is calculated with Equation 7.

(7)

P (C_{i}| x_{1}, x_{2}, \dots, x_{n}) = \frac{P (x_{1}, x_{2}, \dots, x_{n}| C_{i}) \cdot P (C_{i})}{P (x_{1}, x_{2}, \dots, x_{n})} for 1 \leq i \leq k

Algorithm

1. The data set is used to construct a frequency table.
2. By computing the probabilities of all the elements in the dataset, a likelihood table is constructed.
3. Posterior probability is calculated for each class by using Equation 7.
4. The class with the highest posterior probability is the output.

Gaussian naïve Bayes: As values of each feature of our data set are continuous, we use the Gaussian distribution, which is also called the normal distribution.

(8)

P (A | B) = \frac{1}{\sqrt{2 π σ_{B}^{2}}} \exp \left(- \frac{{(A - μ_{B})}^{2}}{2 σ_{B}^{2}}\right)

where μ_B is the mean and σ_B is the standard deviation of the values of a feature for class B.
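A minimal Gaussian naïve Bayes classifier can be sketched as below. It is an illustration under stated assumptions rather than the study's code: the function names and toy data are hypothetical, log-probabilities are used to avoid numerical underflow, and a small constant is added to each variance to avoid division by zero.

```python
from collections import defaultdict
from math import log, pi


def gaussian_log_pdf(a, mean, var):
    """Logarithm of the Gaussian density for value a with the given mean and variance."""
    return -0.5 * log(2.0 * pi * var) - (a - mean) ** 2 / (2.0 * var)


def fit(X, y):
    """Estimate the prior, per-feature means and variances for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest (log) posterior probability."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return log(prior) + sum(gaussian_log_pdf(a, m, v)
                                for a, m, v in zip(row, means, variances))
    return max(model, key=log_posterior)


# Hypothetical two-feature toy data.
X = [[-0.9, 0.1], [-0.8, 0.2], [0.3, 0.9], [0.4, 1.0]]
y = ["SITTING", "SITTING", "WALKING", "WALKING"]
model = fit(X, y)
```

The denominator of Bayes' theorem is the same for every class, so it can be dropped when only the most probable class is needed.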

K-nearest neighbours: K-nearest neighbours (KNN) is one of the essential classification algorithms and falls under the category of supervised learning. It assumes that similar data points lie close to each other, so an unknown point is assigned to the class that is most common among its nearest labelled neighbours. It is used in applications such as pattern recognition, intrusion detection and data mining.

Algorithm

Let N be the number of training samples, U the unknown point, and m the number of neighbours to consult.

The data set is stored in an array in which each element is a tuple (x, y) of features x and label y.

For i = 0 to N-1

Begin

The Euclidean distance from training point i to U is calculated.

end

S is the set of the m training points with the smallest distances to U.

Return (Mode (labels of the points in S))
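The procedure above translates almost directly into Python. The sketch below is illustrative; the function name, the default m = 3, and the toy data are assumptions rather than settings reported in this study.

```python
from collections import Counter
from math import dist


def knn_predict(train, u, m=3):
    """train is a list of (x, y) tuples: x is a feature tuple, y is a label.

    Returns the most common label among the m training points closest to u.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], u))[:m]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical two-feature toy data.
train = [
    ((0.0, 0.0), "SITTING"), ((0.1, 0.2), "SITTING"), ((0.2, 0.1), "SITTING"),
    ((1.0, 1.0), "WALKING"), ((0.9, 1.1), "WALKING"), ((1.1, 0.9), "WALKING"),
]
```

Sorting all distances costs O(N log N) per query, which is acceptable for a sketch but slow for large training sets.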

Random forest: A random forest is an ensemble, that is, a combination of two or more models that together give better results than any single one. A random forest builds a forest of decision trees on random samples of the data. Each decision tree outputs a target class for an input record, and the final target class is determined by a majority vote over the outputs of all trees.

Algorithm

1. Picks random samples from the provided dataset.
2. For each random sample, creates a decision tree.
3. Predicts the result from every decision tree.
4. Predicts target class of sample by simple voting as the final result.
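The four steps can be sketched with a deliberately simplified forest. As an assumption made to keep the example short, each "tree" here is a one-level decision stump on a single randomly chosen feature, not a full decision tree; the function names, number of trees, seed, and toy data are all hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that gives the fewest training errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # step 1: bootstrap sample
        feature = rng.randrange(n_features)          # random feature per stump
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))  # step 2
    return forest


def predict_forest(forest, row):
    # Step 3: one prediction per stump; step 4: simple majority vote.
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical two-feature toy data.
rows = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (0.05, 0.15),
        (0.90, 0.80), (0.80, 0.90), (0.85, 0.95), (0.95, 0.85)]
labels = ["SITTING"] * 4 + ["WALKING"] * 4
forest = fit_forest(rows, labels)
```

Individual stumps trained on an unlucky bootstrap sample can be wrong, but the vote over many stumps smooths out such errors, which is the motivation for the ensemble.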

Artificial neural network (ANN): An artificial neural network (ANN) is a computational model composed of interconnected nodes arranged in layers. The input nodes receive the feature values of each training example. Each connection between nodes carries a weight (W), and each node has a bias (b). A node computes the weighted sum of its inputs plus its bias, passes this total through a non-linear activation function (σ), and forwards the result to the next layer. This process repeats layer by layer until the network produces the final output, as illustrated in Figure 2.

Figure 2. Neural network model structure.

Figure 2 exemplifies the complex architecture of a neural network model, visually portraying the interplay of nodes, weighted values, and biases that characterize its underlying structure.

To improve the accuracy of the output predictions, the weights (W) and biases (b) of the ANN are adjusted during training. These parameters are optimized with an optimization algorithm, whose behaviour is controlled by hyperparameters such as the learning rate. Three common variants are gradient descent (GD), stochastic gradient descent (SGD), and mini-batch gradient descent.

Gradient Descent (GD): Update Rule

θ_{t + 1} = θ_{t} - α \nabla J (θ_{t})

θ_t represents the parameters (weights and biases) at iteration t.

α is the learning rate, controlling the size of the step taken during each iteration.

Stochastic Gradient Descent (SGD): Update Rule

θ_{t + 1} = θ_{t} - α \nabla J (θ_{t}, x_{i}, y_{i})

Similar to GD, but instead of using the gradient over the entire dataset, it uses the gradient of the cost function for a single data point (x_i, y_i) at each iteration.

Mini-Batch Gradient Descent: Update Rule

W_{t + 1} = W_{t} - α \nabla_{W} J (W_{t}, X_{mini-batch}, Y_{mini-batch})

b_{t + 1} = b_{t} - α \nabla_{b} J (b_{t}, X_{mini-batch}, Y_{mini-batch})

An intermediate approach between GD and SGD, where updates are computed using a small random subset (mini-batch) of the entire dataset.
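The three update rules differ only in how many data points contribute to each gradient. The sketch below makes this explicit on a one-parameter least-squares problem; the model, data, learning rate, and function names are hypothetical choices for illustration and are unrelated to the network trained in this study.

```python
import random


def grad(theta, batch):
    """Gradient of J = 1/(2m) * sum((theta * x - y)^2) over the given batch."""
    return sum((theta * x - y) * x for x, y in batch) / len(batch)


def train(data, batch_size, learning_rate=0.05, epochs=200, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            # Update rule: new theta = old theta - learning rate * gradient.
            theta = theta - learning_rate * grad(theta, batch)
    return theta


# Hypothetical noise-free data generated from y = 2x.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
theta_gd = train(data, batch_size=len(data))  # GD: the whole dataset per update
theta_sgd = train(data, batch_size=1)         # SGD: one data point per update
theta_mb = train(data, batch_size=2)          # mini-batch: a small subset per update
```

On this toy problem all three variants recover a slope close to 2; on real data they trade gradient noise against the cost of each update.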

During training, the weights (W) and biases (b) are updated to minimize the cost function (J), while the activation function (σ) introduces non-linearity, enabling the model to learn complex relationships within the data. The main components are as follows.

Number of Layers and Neurons: The neural network architecture is defined by the number of layers (L) and the number of neurons in each layer (N_l).

Initialization of Weights and Biases: The weights (W) and biases (b) are initialized before the training process begins. This initialization is typically done randomly or using specific strategies to break symmetry and aid convergence.

W^{l} and b^{l} for layer l

Forward Propagation: Forward propagation computes the activation (a^l) of each layer from the activation of the previous layer, starting from the input a^0 = x, using the following equations:

z^{l} = W^{l} . a^{l - 1} + b^{l}

a^{l} = σ (z^{l})
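A forward pass can be sketched with plain Python lists, where each layer applies its weights to the activations of the previous layer. The sigmoid activation, the layer sizes, and all numeric values below are assumptions chosen for illustration; the architecture used in this study is not reproduced here.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def forward(x, weights, biases):
    """weights[l] holds one row of weights per neuron; biases[l] one bias per neuron."""
    a = x  # activation of the input layer
    for W, b in zip(weights, biases):
        # z = W . a_previous + b, computed neuron by neuron.
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
             for row, b_i in zip(W, b)]
        a = [sigmoid(z_i) for z_i in z]
    return a


# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output neuron.
weights = [[[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [[0.6, -0.1, 0.8]]]
biases = [[0.0, 0.1, -0.2], [0.05]]
output = forward([1.0, 2.0], weights, biases)
```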

Cost Function: The cost function (J) measures the difference between the predicted output and the actual output. For regression problems, Mean Squared Error (MSE) is commonly used:

J = \frac{1}{2 m} \sum_{i = 1}^{m} {(a_{i}^{L} - y_{i})}^{2}

Backward Propagation: The backward propagation computes the gradients of the cost function with respect to the weights and biases, enabling updates in the direction that minimizes the cost. The gradients are calculated using the chain rule:

{dz}^{l} = \frac{\partial J}{\partial z^{l}} = \frac{\partial J}{\partial a^{l}} \frac{\partial a^{l}}{\partial z^{l}}
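The chain rule can be verified numerically for a single output neuron. The sketch below assumes a sigmoid activation and the squared-error cost J = (a - y)^2 / 2 for one example; these choices and the function names are illustrative assumptions.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def cost(z, y):
    """Squared-error cost for one example, as a function of the pre-activation z."""
    return 0.5 * (sigmoid(z) - y) ** 2


def dz_analytic(z, y):
    """Chain rule: dJ/dz = dJ/da * da/dz."""
    a = sigmoid(z)
    dJ_da = a - y            # derivative of the cost with respect to a
    da_dz = a * (1.0 - a)    # derivative of the sigmoid
    return dJ_da * da_dz


def dz_numeric(z, y, eps=1e-6):
    """Central finite-difference approximation of dJ/dz."""
    return (cost(z + eps, y) - cost(z - eps, y)) / (2.0 * eps)


gap = abs(dz_analytic(0.3, 1.0) - dz_numeric(0.3, 1.0))
```

A small gap between the two values is a common sanity check that a backward pass has been derived correctly.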

The iterative nature of the learning process in ANN, combined with the dynamic adjustments introduced by these parameters and hyperparameters, showcases the adaptability and efficiency in navigating the complex parameter space to converge towards optimal solutions.

Validation: Validation is a crucial step in assessing the performance of a machine learning model, where evaluation metrics are employed to gauge the model’s compatibility with the underlying dataset. The process involves employing various metrics designed to provide insights into different aspects of the model’s performance. Among these metrics, commonly utilized for assessing classification models, are accuracy, precision, recall, and the F1 score.

Precision: Precision, also known as positive predictive value, measures how many of the instances predicted as positive are actually positive. It is mathematically defined as true positives divided by the sum of true positives and false positives.

(9)

Precision = \frac{True positives}{True positives + False positives}

Recall: Recall, also referred to as sensitivity, measures how many of the actual positive instances are identified. It is mathematically defined as true positives divided by the sum of true positives and false negatives.

(10)

Recall = \frac{True positives}{True positives + False negatives}

F1 score: The F1 score is the harmonic mean of precision and recall, calculated with Equation 11.

(11)

F1 score = 2 \times \frac{1}{\frac{1}{precision} + \frac{1}{recall}}

Accuracy: Accuracy describes how well the model predicts the class overall. It is mathematically defined as the number of correctly classified instances divided by the total number of instances.

(12)

Accuracy = \frac{True positives + True negatives}{True positives + True negatives + False positives + False negatives}
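Equations 9 to 12 can be written as small helper functions. The counts in the example are hypothetical and serve only to show the arithmetic; they are not results from this study.

```python
def precision(tp, fp):
    """Equation 9."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Equation 10."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Equation 11: harmonic mean of precision p and recall r."""
    return 2.0 / (1.0 / p + 1.0 / r)


def accuracy(tp, tn, fp, fn):
    """Equation 12."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for one class treated as "positive".
tp, fp, fn, tn = 90, 10, 30, 870
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
acc = accuracy(tp, tn, fp, fn)
```

In a multi-class setting such as this one, precision, recall, and F1 are computed per class by treating that class as positive and all others as negative.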

Results and discussion

Dataset

In our research, we utilized a rich dataset from Kaggle, which contains detailed measurements from both an accelerometer and a gyroscope. The accelerometer captures triaxial acceleration, while the gyroscope records triaxial angular velocity, providing a comprehensive view of movement and rotation. This dataset consists of 10,299 individual entries, with each entry described by 561 unique features. These features offer a wide range of valuable data points, making the dataset highly suitable for in-depth machine learning analysis and effective model training, particularly in the field of human activity recognition.

The total number of rows and columns in the dataset is obtained from its ‘shape’ attribute.

The shape reveals that the dataset comprises 10,299 instances and 562 attributes. Of these attributes, 561 are independent, while one serves as the dependent variable. We explored the data with several graphs, including a bar graph, box plot, and heatmap. The bar graph was used to compare the sizes of the target classes. Figure 3 displays the bar graph, illustrating the distribution of instances across the activity classes. The observations extracted from Figure 3 are as follows: LAYING: 1944 instances, STANDING: 1906 instances, SITTING: 1777 instances, WALKING: 1722 instances, WALKING_UPSTAIRS: 1544 instances, and WALKING_DOWNSTAIRS: 1406 instances.
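The class counts listed above can be checked for consistency with the reported dataset size. The sketch below uses only the counts given in the text; the variable names are hypothetical.

```python
# Class counts as reported for Figure 3.
class_counts = {
    "LAYING": 1944,
    "STANDING": 1906,
    "SITTING": 1777,
    "WALKING": 1722,
    "WALKING_UPSTAIRS": 1544,
    "WALKING_DOWNSTAIRS": 1406,
}

total = sum(class_counts.values())
# Share of the dataset taken by each class.
shares = {name: count / total for name, count in class_counts.items()}
# Ratio between the largest and the smallest class.
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())
```

The counts sum to the 10,299 instances reported for the dataset, and the largest class is roughly 1.4 times the size of the smallest.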

Figure 3. Bar graph for the attribute ‘Activity’.

Box plot: In Figure 4, a box plot illustrates the data distribution for the ‘Activity’ target class and the ‘tBodyAcc-max()-X’ feature. This visualization incorporates five key measures: minimum, first quartile (Q1), median, third quartile (Q3), and maximum.

Figure 4. Box plot.

Analyzing the information presented in Figure 4, the following noteworthy observations were discerned:

▪ When the value of tBodyAcc-max()-X falls below -0.75, the corresponding activities are predominantly identified as Standing, Sitting, or Laying.
▪ In instances where tBodyAcc-max()-X exceeds -0.50, the categorization shifts towards activities characterized as Walking, Walking_Downstairs, or Walking_Upstairs.
▪ Notably, when tBodyAcc-max()-X surpasses 0.00, the specific activity discerned is attributed to Walking_Downstairs.
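These three observations amount to a coarse threshold rule on a single feature. The function below merely restates them in code as an illustration; it is not a classifier used in this study, the function name is hypothetical, and values between -0.75 and -0.50 are left undetermined because the observations do not cover them.

```python
def coarse_activity_group(t_body_acc_max_x):
    """Map a tBodyAcc-max()-X value to the activity group suggested by Figure 4."""
    if t_body_acc_max_x > 0.00:
        return "WALKING_DOWNSTAIRS"
    if t_body_acc_max_x > -0.50:
        return "WALKING, WALKING_DOWNSTAIRS or WALKING_UPSTAIRS"
    if t_body_acc_max_x < -0.75:
        return "STANDING, SITTING or LAYING"
    return "undetermined"
```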

The confusion matrix for the naïve Bayes classifier is shown in Figure 5. From Figure 5, the following observations were made:

• The actual number of Laying was 389, but the model correctly predicted 92.8% as Laying and incorrectly predicted 5.7% as Sitting and 2.3% as Walking_Upstairs.
• The actual number of Sitting was 372, but the model correctly predicted 90.3% as Sitting and incorrectly predicted 0.5% as Laying, 8.3% as Standing, and 0.8% as Walking_Upstairs.
• The actual number of Standing was 375, but the model correctly predicted 41.9% as Standing and incorrectly predicted 57.1% as Sitting and 1.1% as Walking_Upstairs.
• The actual number of Walking was 345, but the model correctly predicted 74.4% as Walking and incorrectly predicted 9% as Walking_Downstairs and 16.8% as Walking_Upstairs.
• The actual number of Walking_Downstairs was 282, but the model correctly predicted 70.6% as Walking_Downstairs and incorrectly predicted 8.8% as Walking and 20.6% as Walking_Upstairs.
• The actual number of Walking_Upstairs was 297, but the model correctly predicted 92.6% as Walking_Upstairs and incorrectly predicted 5.7% as Walking_Downstairs and 1.7% as Walking.

Figure 5. Confusion matrix for naïve Bayes classification.

Figure 6 illustrates the precision, recall, and F1 score values for each activity class, providing insight into the classifier’s performance across different activities. The interpretations are:

• Laying: This class had the highest precision, recall, and F1 score, indicating that the model most accurately identified instances of “Laying” with minimal errors. The high scores suggest that the model effectively distinguishes “Laying” from other activities, likely due to its distinct features.
• Standing: On the other hand, “Standing” exhibited lower precision, recall, and F1 scores compared to other classes. This indicates that the model struggled to correctly identify “Standing” instances, often confusing them with similar activities such as “Sitting.” The low performance here reflects the challenges in distinguishing between activities with similar postures.
• Other Classes: The performance metrics for activities like “Walking,” “Walking_Upstairs,” and “Walking_Downstairs” showed moderate values, reflecting a balance between accurate predictions and some misclassifications. These activities have overlapping features, making it harder for the model to achieve higher scores.

Figure 6. Precision, recall and F1-Score for naïve Bayes classifier.

The confusion matrix for the decision tree classifier is depicted in Figure 7. Based on Figure 7, the following observations were made:

• The actual number of instances classified as “Laying” was 389, and the model correctly predicted 100% as “Laying.”
• The actual number of instances classified as “Sitting” was 372. The model correctly predicted 92.9% as “Sitting” and incorrectly predicted 7.0% as “Standing.”
• The actual number of instances classified as “Standing” was 375. The model correctly predicted 92.5% as “Standing” and incorrectly predicted 7.5% as “Sitting.”
• The actual number of instances classified as “Walking” was 345. The model correctly predicted 94.1% as “Walking” and incorrectly predicted 1.7% as “Walking_Downstairs” and 4.3% as “Walking_Upstairs.”
• The actual number of instances classified as “Walking_Downstairs” was 282. The model correctly predicted 91.1% as “Walking_Downstairs” and incorrectly predicted 2.8% as “Walking” and 5.3% as “Walking_Upstairs.”
• The actual number of instances classified as “Walking_Upstairs” was 297. The model correctly predicted 89.6% as “Walking_Upstairs” and incorrectly predicted 4.4% as “Walking” and 6.1% as “Walking_Downstairs.”

Figure 7. Confusion matrix for decision tree.

Precision, recall and F1 score values of decision tree classifier are depicted in the bar plot shown in Figure 8.

Key observations for the decision tree classifier:

• Laying: The Decision Tree classifier achieved perfect precision, recall, and F1 scores for the “Laying” class, indicating that all instances of “Laying” were correctly predicted without any misclassifications. This suggests that the model effectively distinguishes “Laying” from other activities, highlighting its strength in accurately identifying this class.
• Standing: According to Figure 7, 92.5% of the “Standing” instances were classified correctly, while the remaining 7.5% were confused with “Sitting.” The precision, recall, and F1 scores for this class in Figure 8 are therefore lower than those for “Laying,” reflecting the difficulty of separating two stationary postures.
• Other Classes: For the walking activities, Figure 7 shows that 94.1% of “Walking,” 91.1% of “Walking_Downstairs,” and 89.6% of “Walking_Upstairs” instances were classified correctly, with the errors occurring among these three classes.

Figure 8. Precision, recall and F1 score for decision tree classifier.

The confusion matrix for the K-nearest neighbors classifier is shown in Figure 9. From Figure 9, the following observations were made:

• The actual number of “Laying” was 389, and the model correctly predicted 100% as “Laying.”
• The actual number of “Sitting” was 372, but the model correctly predicted 89.0% as “Sitting” and incorrectly predicted 64.2% as “Standing” and 0.3% as “Sitting.”
• The actual number of “Standing” was 375, but the model correctly predicted 94.4% as “Standing” and incorrectly predicted 5.6% as “Sitting.”
• The actual number of “Walking” was 345, but the model correctly predicted 99.7% as “Walking” and incorrectly predicted 0.3% as “Walking_Upstairs.”
• The actual number of “Walking_Downstairs” was 282, but the model correctly predicted 99.3% as “Walking_Downstairs” and incorrectly predicted 0.7% as “Walking.”
• The actual number of “Walking_Upstairs” was 297, and the model correctly predicted 100% as “Walking_Upstairs.”

Figure 9. Confusion matrix for K-nearest neighbour.

Precision, recall and F1 score values of the K-nearest neighbour classifier are depicted in the bar plot shown in Figure 10.

• Laying: The KNN classifier achieved perfect scores for the “Laying” class, with precision, recall, and F1 score values all at 100%. This indicates that the model correctly identified every instance of “Laying” without any misclassifications. The perfect metrics reflect the classifier’s exceptional performance in recognizing this activity.
• Other Classes: According to Figure 9, the KNN classifier also recognized the walking activities almost perfectly, with 99.7% of “Walking,” 99.3% of “Walking_Downstairs,” and 100% of “Walking_Upstairs” instances classified correctly. Performance was lower for “Standing” (94.4%) and “Sitting” (89.0%), which were mainly confused with each other. The corresponding precision, recall, and F1 scores are depicted in Figure 10.

Figure 10. Precision, recall and F1 score for K-nearest neighbour classifier.

The confusion matrix for the random forest classifier is depicted in Figure 11. From Figure 11, the following observations were made:

• The actual number of instances labeled as “Laying” was 389, and the model correctly predicted 100% as “Laying.”
• The actual number of instances labeled as “Sitting” was 372. However, the model correctly predicted 95.4% as “Sitting” and incorrectly predicted 4.6% as “Standing.”
• The actual number of instances labeled as “Standing” was 375. The model correctly predicted 94.7% as “Standing” and incorrectly predicted 5.3% as “Sitting.”
• The actual number of instances labeled as “Walking” was 345. The model correctly predicted 97.4% as “Walking” and incorrectly predicted 0.6% as “Walking_Downstairs” and 2.0% as “Walking_Upstairs.”
• The actual number of instances labeled as “Walking_Downstairs” was 282. The model correctly predicted 94.0% as “Walking_Downstairs” and incorrectly predicted 4.3% as “Walking” and 1.8% as “Walking_Upstairs.”
• The actual number of instances labeled as “Walking_Upstairs” was 297. The model correctly predicted 96.3% as “Walking_Upstairs” and incorrectly predicted 2.4% as “Walking” and 1.4% as “Walking_Downstairs.”

Figure 11. Confusion matrix for random forest.

Precision, recall and F1 score values of random forest are depicted as a bar plot shown in Figure 12.

• Laying: The Random Forest classifier achieved perfect precision, recall, and F1 score values for the “Laying” class. This means that every instance of “Laying” was correctly identified by the model, with no misclassifications. The 100% scores across these metrics indicate exceptional accuracy in predicting the “Laying” activity.
• Other Classes: The precision, recall, and F1 scores for the other activities are also shown in Figure 12. According to Figure 11, between 94.0% and 97.4% of the instances of each remaining class were classified correctly, with “Sitting” and “Standing” confused mainly with each other and the walking activities confused mainly among themselves.

Figure 12. Precision, recall and F1 score for random forest classifier.

The confusion matrix for logistic regression is shown in Figure 13. From Figure 13, the following observations were made:

• The actual number of Laying was 389, and the model correctly predicted 100% as Laying.
• The actual number of Sitting was 372, but the model correctly predicted 94.4% as Sitting and incorrectly predicted 5.6% as Standing.
• The actual number of Standing was 375, and the model correctly predicted 96.0% as Standing while incorrectly predicting 4.0% as Sitting.
• The actual number of Walking was 345, and the model correctly predicted 99.7% as Walking while incorrectly predicting 0.3% as Walking_Upstairs.
• The actual number of Walking_Downstairs was 282, and the model correctly predicted 99.6% as Walking_Downstairs while incorrectly predicting 0.4% as Walking_Upstairs.
• The actual number of Walking_Upstairs was 297, and the model correctly predicted 99.3% as Walking_Upstairs while incorrectly predicting 0.3% as Walking and 0.3% as Walking_Downstairs.
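The per-class percentages above can be combined into an overall accuracy by weighting each class by its number of test instances. The sketch below uses only the figures quoted in this list; the variable names are hypothetical, and the result is approximate because the percentages are rounded.

```python
# Number of test instances per class, as quoted above.
support = {
    "LAYING": 389, "SITTING": 372, "STANDING": 375,
    "WALKING": 345, "WALKING_DOWNSTAIRS": 282, "WALKING_UPSTAIRS": 297,
}
# Percentage of each class predicted correctly by logistic regression.
correct_pct = {
    "LAYING": 100.0, "SITTING": 94.4, "STANDING": 96.0,
    "WALKING": 99.7, "WALKING_DOWNSTAIRS": 99.6, "WALKING_UPSTAIRS": 99.3,
}

n_test = sum(support.values())
n_correct = sum(support[c] * correct_pct[c] / 100.0 for c in support)
overall_accuracy = n_correct / n_test
```

The weighted result is consistent with the 98.05% accuracy reported for logistic regression in Table 1.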

Figure 13. Confusion matrix for logistic regression.

The graphical representation in Figure 14 illustrates the precision, recall, and F1 score values associated with logistic regression.

• Laying, Walking, and Walking Downstairs: The classifier demonstrated particularly strong performance for these activities. The precision, recall, and F1 scores for “Laying,” “Walking,” and “Walking Downstairs” are notably high, reflecting the model’s proficiency in accurately identifying and classifying these activities. The clarity and distinctiveness of the bar plot for these instances indicate that the logistic regression model effectively distinguishes these activities from others.
• Other Classes: The precision, recall, and F1 scores for the other activities are also illustrated in Figure 14. According to Figure 13, “Walking_Upstairs” (99.3%) was recognized almost as reliably, while “Sitting” (94.4%) and “Standing” (96.0%) were occasionally confused with each other.

Figure 14. Precision, recall and F1 score for logistic regression classifier.

Neural network: Figure 15 plots the loss against the number of training epochs for the human activity dataset, giving a view of the model’s learning process over successive epochs. The testing loss decreases markedly as training proceeds. This reduction coincides with an observed increase in accuracy, suggesting that the model became better at making accurate predictions on the human activity dataset.

Figure 15. Loss graph for neural network.

Comparison of different algorithms: To compare the algorithms used to train the models, the algorithm is treated as the independent variable x and its accuracy as the dependent variable y. The proposed neural network model is also compared against two benchmark models, a CNN (convolutional neural network) and a GRN (gated recurrent network). The accuracy results for all the models are summarized in Table 1.

Table 1. Accuracy of naïve Bayes, decision tree, K-nearest neighbour, random forest, logistic regression, neural network.

Algorithm	Accuracy (%)
Naïve Bayes	76.89
K-nearest neighbours	96.40
Decision tree	93.39
Random forest	96.89
Logistic regression	98.05
Convolutional neural networks	98.22
Gated recurrent networks	98.63
Neural networks [proposed]	98.93

Table 1 shows that the proposed neural network model outperformed all other algorithms in terms of predictive accuracy, achieving 98.93%. This exceeds the accuracy of the benchmark models, the convolutional neural network (98.22%) and the gated recurrent network (98.63%), as well as that of logistic regression (98.05%), the best of the classical methods. These results indicate that the proposed neural network is a robust and reliable choice for the given task.

Figure 16 presents the same accuracy comparison graphically.

Figure 16. Comparison of algorithms.

Conclusions

In conclusion, this research addresses the vital task of human activity recognition, utilizing a dataset derived from triaxial accelerometer and gyroscope sensors. With 561 features and 10,299 records spanning six distinct classes, namely Sitting, Standing, Laying, Walking, Walking_Downstairs, and Walking_Upstairs, our study employed six machine learning algorithms—naïve Bayes, decision tree, random forest, K-nearest neighbours, logistic regression, and neural network.

Notably, our exploration contributes valuable insights to the field by presenting a comprehensive analysis of the performance of these algorithms in predicting human activities. The experimental results reveal the varied accuracies achieved by each model, with the naïve Bayes classifier demonstrating 76.89%, the decision tree classifier achieving 93.39%, the random forest classifier attaining 96.89%, K-nearest neighbours reaching 96.40%, and logistic regression classifier yielding 98.05%. However, the standout performer among these models is the neural network, boasting an impressive accuracy of 98.93%.

This study’s significant contributions lie in its thorough investigation of diverse machine learning methodologies for human activity recognition, shedding light on the strengths and limitations of each algorithm. Furthermore, our work underscores the efficacy of neural networks in achieving a remarkable accuracy rate, positioning it as a promising approach for future endeavors in human activity recognition. These findings not only enhance our understanding of machine learning applications in this domain but also pave the way for more nuanced and effective methodologies in predicting and classifying human activities.

Data availability

Underlying data

Kaggle: Human Activity Recognition with Smartphones, https://www.kaggle.com/datasets/uciml/human-activity-recognition-with-smartphones

Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).

Extended data

Analysis code

Source code available from: https://github.com/someshchinta/Human_Actiity_recognition

Archived source code at time of publication: https://doi.org/10.5281/zenodo.7108706

License: Apache-2.0


Competing interests

No competing interests were disclosed.

Grant information

The author(s) declared that no grants were involved in supporting this work.

Article Versions (4)

version 4 (Revised). Published: 05 Nov 2024, 12:247. https://doi.org/10.12688/f1000research.124164.4
version 3 (Revised). Published: 30 Sep 2024, 12:247. https://doi.org/10.12688/f1000research.124164.3
version 2 (Revised). Published: 06 Feb 2024, 12:247. https://doi.org/10.12688/f1000research.124164.2
version 1. Published: 06 Mar 2023, 12:247. https://doi.org/10.12688/f1000research.124164.1

© 2024 Varadhi K et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.

how to cite this article

Varadhi K, Someswara Rao C, Sirisha G and katari BR. Recognizing human activities using light-weight and effective machine learning methodologies [version 4; peer review: 1 approved, 3 not approved]. F1000Research 2024, 12:247 (https://doi.org/10.12688/f1000research.124164.4)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.

Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.

Version 4 (revised), published 05 Nov 2024

Reviewer Report 03 Jan 2025

Anna Ferrari, Università degli studi di Milano-Bicocca, Milano, Italy; University of Geneva, Geneva, Geneva, Switzerland

Not Approved

https://doi.org/10.5256/f1000research.172906.r337744

The authors improved the article. However, I don't see the comparison between their work and the works done in the literature. There are many studies that used the same algorithms on the same datasets. A comparison table would highlight the authors' contribution. Furthermore, more details must be given in the Literature review section.

Now there is a list of works.
What is the main point coming from the literature?
Which algorithms are the most suitable?
What are the (still) open questions?

About the Methods section: the authors mention several steps before using the data for the classification. Which steps were undertaken? How much data was there before and after these steps? Did you remove some data? Were there missing values?

Results: How do your results compare to the existing literature? What is the main learning from your study?

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: computer science, statistics, human activity recognition, time-series data, machine learning, data science

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Reviewer Report 22 Nov 2024

Alwin Poulose, Indian Institute of Science Education and Research Thiruvananthapuram, Thiruvananthapuram, Kerala, India

Approved

https://doi.org/10.5256/f1000research.172906.r337743

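The authors addressed all my comments. No further comments.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.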

Version 2 (revised), published 06 Feb 2024

Reviewer Report 16 Sep 2024

Alwin Poulose, Indian Institute of Science Education and Research Thiruvananthapuram, Thiruvananthapuram, Kerala, India

Approved with Reservations

https://doi.org/10.5256/f1000research.161538.r285921

Please find the following comments on this paper.

The abstract should be improved. In its current form it lacks the motivation for the research, the fundamental idea of the research, and details of the experimental results.
The introduction section lacks information. The significant contributions of the paper are not evident in the introduction section.
Please consider the following references in your related work section:
1. Ronald M, et al., 2021 [Ref 1]
2. Poulose A, et al., 2022 [Ref 2]
Please add the dataset description.
Please show the confusion matrix in percentage form for better understanding.
This research already exists; what is the main contribution of this work?

Is the rationale for developing the new method (or application) clearly explained? Partly
Is the description of the method technically sound? No
Are sufficient details provided to allow replication of the method development and its use by others? Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes

References

1. Ronald M, Poulose A, Han D: iSPLInception: An Inception-ResNet Deep Learning Architecture for Human Activity Recognition. IEEE Access. 2021; 9: 68985–69001.
2. Poulose A, Kim JH, Han DS: HIT HAR: Human Image Threshing Machine for Human Activity Recognition Using Deep Learning Models. Comput Intell Neurosci. 2022; 2022: 1808990.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 13 Aug 2024

Anna Ferrari, Università degli studi di Milano-Bicocca, Milano, Italy; University of Geneva, Geneva, Geneva, Switzerland

Not Approved

https://doi.org/10.5256/f1000research.161538.r310366

The authors discuss and compare different machine learning algorithms for Human Activity Recognition.

In general: the article must be better organized in sections and subsections to make it more readable.

1. Introduction:
The introduction lacks sufficient references where definitions or examples of applications are given. It must be further developed by providing extended context around HAR. Why do wearable technologies play a pivotal role in HAR? Why are machine learning algorithms preferred over traditional time series techniques? What has the research community already done? More references are needed.
Furthermore, the cited algorithms were already used in the research community. Do you compare your results with those of other studies? Are your results aligned with them? Are they better? What's the tangible contribution of this research?

2. Literature review
The literature review must be extended and better explained. The table can be used as a support, but it is insufficient to give a complete overview. In the table, more information is needed, such as the dataset used and results. The table must be corrected: what does the first row (First Name, Last, Name, Grade) refer to?

3. Method
In the method section, the authors describe the data flow. In the literature, this is described as the Activity Recognition Process (ARP), which comprises several phases, such as data acquisition, preprocessing, segmentation, and feature extraction and classification. The authors describe some phases missing from the HAR structure. Furthermore, you describe the data-cleaning phase. How was your dataset cleaned?
Feature fusion pipeline: how did the authors fuse the features?
Data splitting: what's the reason to split the dataset into 60% and 20%?

Classification: In this section, the authors describe a list of the algorithms used for the analysis. It is, however, incomplete in the context of the article. For instance, which hyperparameter did you use for each algorithm? Did you implement a fine-tuning procedure to search for the best hyperparameters? More information about the algorithms is needed.

4. Results:
In this section, the authors present the confusion matrix, the recall, precision, and F1 metrics of each algorithm's classification, and the overall comparison between the algorithms based on total accuracy. However, a comparison with the results in the literature is missing. Are your results better than those already achieved? If yes, why?

5. Conclusion:
The authors say: "This study’s significant contributions lie in its thorough investigation of diverse machine learning methodologies for human activity recognition, shedding light on the strengths and limitations of each algorithm." This claim should be further justified by explaining how each algorithm was set up (hyperparameters, training and test sets, etc.) and by showing how the authors overcame the algorithms' limitations (when possible).

Is the rationale for developing the new method (or application) clearly explained? Partly
Is the description of the method technically sound? Partly
Are sufficient details provided to allow replication of the method development and its use by others? Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Human Activity Recognition, wearable devices, digital health

Reviewer Report 25 Jun 2024

Kristina Host, University of Rijeka, Rijeka, Croatia

Not Approved

https://doi.org/10.5256/f1000research.161538.r267592

Introduction

The introduction is not easy to read; it is written with redundant phrases like “giving rise to a plethora of…” and words such as “burgened” and similar. It's okay to have one or two, but the whole introduction is full of them, which makes it difficult to read.
The introduction should describe the goal and the motivation behind it. Why this task? What will you accomplish with it?

Literature review

It's important to provide more context and explanation rather than just presenting a table.

Methods

Don’t call the subsection input; rename it to dataset. The dataset with the preprocessing is well described.

Fig. 3 is redundant. For Figs. 4 and 5 there is no need to include the code, but you should describe what is shown in the image.

People are familiar with reading a confusion matrix, so interpreting all the data in this way may not be necessary. All of this should be excluded from the paper; instead, you should emphasize some key misclassifications and draw conclusions about why they happen. Are there similarities between the activities in real life? You should also interpret Figures 7, 9, 11 and the others showing precision, recall, etc.

It might be interesting to put all the matrices together, as in a subplot, and then compare the results: which methods performed better on particular misclassifications, and so on. Also make a table for the other metrics, like the one for accuracy, to compare everything.
Also, Fig. 17 is irrelevant, as it shows the same as Table 2.

It is not clear what the difference is between the CNN, the GRN and the proposed NN. What is specific to the proposed one? Why are the confusion matrix and other metrics not given for the CNN and GRN?

Also, why is the model trained for more than 120 epochs? Is there early stopping? How did you decide how many epochs to train for?

Is the rationale for developing the new method (or application) clearly explained? No
Is the description of the method technically sound? Partly
Are sufficient details provided to allow replication of the method development and its use by others? Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Computer vision and human action recognition in sports

Version 1, published 06 Mar 2023

Reviewer Report 19 Dec 2023

Nurul Amin Choudhury, National Institute of Technology Silchar, Silchar, Assam, India

Not Approved

https://doi.org/10.5256/f1000research.136342.r212045

This paper uses wearable sensor data and applies multiple machine learning and deep learning models to recognise daily human activities. The overall idea of the paper is satisfactory, but it needs significant revisions, state-of-the-art benchmark comparisons, inclusion of methodologies, and rigorous testing.
1. The authors fail to describe the dataset adequately. Stating the source of the dataset is not sufficient. Please study the papers like - N. A. Choudhury and B. Soni, "An Adaptive Batch Size-Based-CNN-LSTM Framework for Human Activity Recognition in Uncontrolled Environment," in IEEE Transactions on Industrial Informatics, vol. 19, no. 10, pp. 10379-10387, Oct. 2023, doi: 10.1109/TII.2022.3229522 and many more for detailed information.
2. The paper starts with recognising human activities using raw sensor data. However, the UCI dataset contains features already extracted from raw sensor data; how did the authors incorporate raw sensor data from the UCI-HAR dataset?
3. The paper needs to be better written in the form of scientific information, grammar and overall representation. Please use a software tool and verify the draft.
4. The literature survey needs to be done appropriately. Include recent and good-quality journals and top-tier conference papers.
5. Data Cleaning or pre-processing needs to be explained more clearly, and a feature fusion pipeline must be used for enhanced activity recognition performance.
6. Both the abstract and conclusion could be better written. Please rewrite it by incorporating the paper's contributions and novelties.
7. The models' explanation needs to be revised in the paper by incorporating and highlighting the hyperparameters.
8. Achieved results are not compared with the benchmark models.

Is the rationale for developing the new method (or application) clearly explained? No
Is the description of the method technically sound? No
Are sufficient details provided to allow replication of the method development and its use by others? Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? No

References

1. Choudhury N, Soni B: An Adaptive Batch Size-Based-CNN-LSTM Framework for Human Activity Recognition in Uncontrolled Environment. IEEE Transactions on Industrial Informatics. 2023; 19(10): 10379–10387.

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Human Activity Recognition, AI-ML, Feature Engineering, Domain Adaptation, eHealth Applications, Neural Networks.

Author Response 13 Apr 2024

Keerthi Varadhi, CSE Department, Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, 500090, India

1. The authors fail to describe the dataset adequately. Stating the source of the dataset is not sufficient. Please study the papers like - N. A. Choudhury and B. Soni, "An Adaptive Batch Size-Based-CNN-LSTM Framework for Human Activity Recognition in Uncontrolled Environment," in IEEE Transactions on Industrial Informatics, vol. 19, no. 10, pp. 10379-10387, Oct. 2023, doi: 10.1109/TII.2022.3229522 and many more for detailed information.

Answer: Thank you for the suggestion; we described the dataset in detail in the updated paper.

2. The paper starts with recognizing human activities using raw sensor data. However, the UCI dataset has extracted features from raw sensor data, how the authors incorporated raw sensor data from the UCI-HAR dataset.

Answer: Thank you for your suggestion. We have provided a detailed description of the dataset in the updated paper.

3. The paper needs to be better written in the form of scientific information, grammar and overall representation. Please use a software tool and verify the draft.

Answer: Thank you for your suggestion. We have addressed all grammatical mistakes with the assistance of software tools in the updated paper.

4. The literature survey needs to be done appropriately. Include recent and good-quality journals and top-tier conference papers.

Answer: Thank you for your suggestion. We have incorporated additional recent papers from reputable journals such as Springer, Elsevier, and IEEE into the updated literature review.

5. Data Cleaning or pre-processing needs to be explained more clearly, and a feature fusion pipeline must be used for enhanced activity recognition performance.

Answer: Thank you for your suggestion. We have provided a detailed explanation of the pre-processing steps and included a paragraph on the fusion pipeline in the updated paper.

6. Both the abstract and conclusion could be better written. Please rewrite it by incorporating the paper's contributions and novelties.

Answer: Thank you for your suggestion. We have rewritten the abstract and conclusion, incorporating the contributions into the updated paper.

7. The models' explanation needs to be revised in the paper by incorporating and highlighting the hyperparameters.
Answer: Thank you for your suggestion. We have revised the model explanation in the updated paper, emphasizing the inclusion of hyperparameters.

8. Achieved results are not compared with the benchmark models.

Answer: Thank you for your suggestion. We have compared the results with benchmark models in the updated paper.

Competing Interests: No competing interests were disclosed.

Back to all reports

Reviewer Report

2 Views

03 Jan 2025 | for Version 4

Anna Ferrari, Università degli studi di Milano-Bicocca, Milano, Italy; University of Geneva, Geneva, Geneva, Switzerland

2 Views Cite this report Responses(0)

Not Approved

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

computer science, statistics, human activity recognition, time-series data, machine learning, data science

Respond to this report

Responses (0)

Back to all reports

Reviewer Report

7 Views

22 Nov 2024 | for Version 4

Alwin Poulose, Indian Institute of Science Education and Research Thiruvananthapuram, Thiruvananthapuram, Kerala, India

7 Views Cite this report Responses(0)

Approved

The authors addressed all my comments. No more further comments.

Competing Interests

No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Respond to this report

Responses (0)

Back to all reports

Reviewer Report

13 Views

16 Sep 2024 | for Version 2

Alwin Poulose, Indian Institute of Science Education and Research Thiruvananthapuram, Thiruvananthapuram, Kerala, India

13 Views Cite this report Responses(0)

Approved With Reservations

Please find the following comments on this paper.

The abstract should update with better quality. The current form lacks motivation for research, fundamental idea of research, and experiment result details.
The introduction section has a lack of information. The significant contributions of the paper are not evident in the introduction section.
Please consider the following references in your related work section:
1. Ronald M, et al, 2021 [Ref 1]
2. Poulose A, et al 2022 [Ref 2]
Please add the dataset description.
Please show the confusion matrix in percentage form for better understanding.
The research already exists, and what is the main contribution of this work?

Is the rationale for developing the new method (or application) clearly explained?

Partly
Is the description of the method technically sound?

No
Are sufficient details provided to allow replication of the method development and its use by others?

Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

Yes

References

Competing Interests

No competing interests were disclosed.

Respond to this report

Responses (0)

Back to all reports

Reviewer Report

8 Views

13 Aug 2024 | for Version 2

Anna Ferrari, Università degli studi di Milano-Bicocca, Milano, Italy; University of Geneva, Geneva, Geneva, Switzerland

8 Views Cite this report Responses(0)

Not Approved

Is the rationale for developing the new method (or application) clearly explained?

Partly
Is the description of the method technically sound?

Partly
Are sufficient details provided to allow replication of the method development and its use by others?

Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Human Activity Recognition, wearable devices, digital health

Reviewer Report

25 Jun 2024 | for Version 2

Kristina Host, University of Rijeka, Rijeka, Croatia

Not Approved

Introduction

Literature review

It's important to provide more context and explanation rather than just presenting a table.

Methods

Is the rationale for developing the new method (or application) clearly explained?

No
Is the description of the method technically sound?

Partly
Are sufficient details provided to allow replication of the method development and its use by others?

Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

Yes

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Computer vision and human action recognition in sports

Reviewer Report

19 Dec 2023 | for Version 1

Nurul Amin Choudhury, National Institute of Technology Silchar, Silchar, Assam, India

Not Approved

Is the rationale for developing the new method (or application) clearly explained?

No
Is the description of the method technically sound?

No
Are sufficient details provided to allow replication of the method development and its use by others?

Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?

No

References

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Human Activity Recognition, AI-ML, Feature Engineering, Domain Adaptation, eHealth Applications, Neural Networks.

Responses (1)

Author Response

13 Apr 2024

Keerthi Varadhi, CSE Department, Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, 500090, India

1. The authors fail to describe the dataset adequately. Stating the source of the dataset is not sufficient. For an example of the level of detail expected, please study papers such as N. A. Choudhury and B. Soni, "An Adaptive Batch Size-Based-CNN-LSTM Framework for Human Activity Recognition in Uncontrolled Environment," in IEEE Transactions on Industrial Informatics, vol. 19, no. 10, pp. 10379-10387, Oct. 2023, doi: 10.1109/TII.2022.3229522, and many others.

Answer: Thank you for the suggestion; we described the dataset in detail in the updated paper.

2. The paper starts with recognizing human activities using raw sensor data. However, the UCI dataset provides features already extracted from the raw sensor data. How did the authors incorporate raw sensor data from the UCI-HAR dataset?

Answer: Thank you for your suggestion. We have provided a detailed description of the dataset in the updated paper.

3. The scientific content, grammar, and overall presentation of the paper need to be improved. Please use a software tool to check the draft.

Answer: Thank you for your suggestion. We have addressed all grammatical mistakes with the assistance of software tools in the updated paper.

4. The literature survey needs to be done appropriately. Include recent and good-quality journals and top-tier conference papers.

Answer: Thank you for your suggestion. We have incorporated additional recent papers from reputable journals such as Springer, Elsevier, and IEEE into the updated literature review.

5. Data Cleaning or pre-processing needs to be explained more clearly, and a feature fusion pipeline must be used for enhanced activity recognition performance.

Answer: Thank you for your suggestion. We have provided a detailed explanation of the pre-processing steps and included a paragraph on the fusion pipeline in the updated paper.

6. Both the abstract and conclusion could be better written. Please rewrite it by incorporating the paper's contributions and novelties.

Answer: Thank you for your suggestion. We have rewritten the abstract and conclusion, incorporating the contributions into the updated paper.

7. The explanation of the models needs to be revised by incorporating and highlighting the hyperparameters.

Answer: Thank you for your suggestion. We have revised the model explanation in the updated paper, emphasizing the inclusion of hyperparameters.

8. The achieved results are not compared with benchmark models.

Answer: Thank you for your suggestion. We have compared the results with benchmark models in the updated paper.
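
Comment 2 above distinguishes raw sensor signals from the pre-extracted feature vectors distributed with UCI-HAR. The sketch below shows, under stated assumptions, how a raw one-axis signal could be turned into per-window features. The window length of 128 samples with 50% overlap follows the public UCI-HAR documentation (2.56 s windows at 50 Hz). The function names and the four statistics are illustrative only and are not the authors' pipeline or the dataset's full 561-feature set.

```python
from statistics import mean, pstdev


def sliding_windows(signal, size=128, overlap=0.5):
    """Split a 1-D signal into fixed-size windows with fractional overlap."""
    step = max(1, int(size * (1 - overlap)))
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def window_features(window):
    """Compute a few simple time-domain statistics for one window."""
    return {
        "mean": mean(window),
        "std": pstdev(window),  # population standard deviation
        "min": min(window),
        "max": max(window),
    }


# Hypothetical raw accelerometer axis: 256 samples of a simple ramp.
raw_axis = [float(i) for i in range(256)]
features = [window_features(w) for w in sliding_windows(raw_axis)]
print(len(features), features[0]["mean"])
```

A classical classifier such as random forest would then be trained on rows of such features, whereas a CNN or recurrent model could consume the windows themselves.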

Competing Interests

No competing interests were disclosed.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, and sometimes more significant revisions, are required to address specific details and improve the paper's academic merit

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions

[1] Schüldt IL, Caputo B: Recognizing human actions: a local SVM approach. Proceedings of the 17th International Conference on Pattern Recognition. 2004; vol. 23: pp. 32–36. Cambridge, UK.

[2] Laptev I, Marszalek M, Schmid C, et al.: Learning realistic human actions from movies. 2008 IEEE Conference on Computer Vision and Pattern Recognition. 2008; vol. 4: pp. 1–8. Anchorage, AK, USA.

[3] Yamato J, Ohya J, Ishii K: Recognizing human action in time-sequential images using hidden Markov model. Proceedings 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 1992; vol. 1992: pp. 379–385. Champaign, IL, USA.

[4] Oliver NM, Rosario B, Pentland AP: A Bayesian computer vision system for modeling human interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2000; vol. 22: pp. 831–843.

[5] Natarajan P, Nevatia R: View and scale invariant action recognition using multiview shape-flow models. 2008 IEEE Conference on Computer Vision and Pattern Recognition. 2008; pp. 1–8. Anchorage, AK, USA.

[6] Ning H, Xu W, Gong Y, Huang T: Latent pose estimator for continuous action recognition. Computer Vision: European Conference on Computer Vision 2008. Marseille, France: Springer; 2008; vol. 43: pp. 419–433.

[7] Vail DL, Veloso MM, Lafferty JD: Conditional random fields for activity recognition. Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems. 2007; vol. 1: pp. 235.

[8] Madarshahian R, Caicedo JM: Human Activity Recognition Using Multinomial Logistic Regression. Model Validation and Uncertainty Quantification. Vol. 3. Springer; 2015.

[9] Kiros R, Zhu Y, Salakhutdinov RR, et al.: Skip-thought vectors. Adv. Neural Inf. Proces. Syst. 2015; 1: 3294–3302.

[10] Grushin A, Monner DD, Reggia JA, et al.: Robust human action recognition via long short-term memory. The 2013 International Joint Conference on Neural Networks (IJCNN). 2013; vol. 25: pp. 1–8.

[11] Veeriah V, Zhuang N, Qi GJ: Differential recurrent neural networks for action recognition. 2015 IEEE International Conference on Computer Vision (ICCV). 2015; vol. 4: pp. 4041–4049.

[12] Du Y, Wang W, Wang L: Hierarchical recurrent neural network for skeleton based action recognition. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015; vol. 23: pp. 1110–1118.

[13] Ferrari A, Micucci D, Mobilio M, et al.: Deep learning and model personalization in sensor-based human activity recognition. J. Reliable Intell. Environ. 2023; 9: 27–39.

[14] Li Y, Yang G, Zhidong S, et al.: Human activity recognition based on multienvironment sensor data. Inf. Fusion. 2023; 91: 47–63.

[15] Sarkar A, Hossain SKS, Sarkar R: Human activity recognition from sensor data using spatial attention-aided CNN with genetic algorithm. Neural Comput. Appl. 2023; 35: 5165–5191.

[16] Choudhury N, Soni B: An Adaptive Batch Size-Based-CNN-LSTM Framework for Human Activity Recognition in Uncontrolled Environment. IEEE Trans. Industr. Inform. 2023; 19(10): 10379–10387.

[17] Gholamiangonabadi D, Grolinger K: Personalized models for human activity recognition with wearable sensors: deep neural networks and signal processing. Appl. Intell. 2023; 53: 6041–6061.

[18] Dua N, Singh SN, Semwal VB, et al.: Inception inspired CNN-GRU hybrid network for human activity recognition. Multimed. Tools Appl. 2023; 82: 5369–5403.

[19] Wu H, Zhang Z, Li X, et al.: A novel pedal musculoskeletal response based on differential spatio-temporal LSTM for human activity recognition. Knowl. Based Syst. 2023; 261: 110187.

[20] Liciotti D, Bernardini M, Romeo L, et al.: A sequential deep learning application for recognising human activities in smart homes. Neurocomputing. 2020; 396: 501–513.

[21] Priyadarshini I, Sharma R, Bhatt D, et al.: Human activity recognition in cyber-physical systems using optimized machine learning techniques. Cluster Comput. 2023; 26: 2199–2215.

[22] Ronald M, Poulose A, Han D: iSPLInception: An Inception-ResNet Deep Learning Architecture for Human Activity Recognition. IEEE Access. 2021; 9: 68985–69001.

[23] Poulose A, Kim JH, Han DS: HIT HAR: Human Image Threshing Machine for Human Activity Recognition Using Deep Learning Models. Comput. Intell. Neurosci. 2022; 2022: 1–21.

[24] Leung KM: Naive Bayesian classifier. Polytechnic University Department of Computer Science/Finance and Risk Engineering. 2007: pp. 123–156.

[25] Peterson LE: K-nearest neighbor. Scholarpedia. 2009; 4(2): 1883.

[26] Swain PH, Hauska H: The decision tree classifier: Design and potential. IEEE Trans. Geosci. Electron. 1977; 15(3): 142–147.

[27] Liaw A, Wiener M: Classification and regression by randomForest. R News. 2002; 2(3): 18–22.

[28] Domínguez-Almendros S, Benítez-Parejo N, Gonzalez-Ramirez AR: Logistic regression models. Allergol. Immunopathol. 2011; 39(5): 295–305.

[29] Féraud R, Clérot F: A methodology to explain neural network classification. Neural Netw. 2002; 15(2): 237–246.

Abstract

Background

Methods

Results

Conclusions

Keywords

Revised Amendments from Version 3

Introduction

Contributions

Literature review

Methods

Figure 1. Human activity recognition structure.

Figure 2. Neural network model structure.

Results and discussion

Dataset

Figure 3. Bar graph for the attribute ‘Activity’.

Figure 4. Box plot.

Figure 5. Confusion matrix for naïve Bayes classification.

Figure 6. Precision, recall and F1 score for naïve Bayes classifier.

Figure 7. Confusion matrix for decision tree.

Figure 8. Precision, recall and F1 score for decision tree classifier.

Figure 9. Confusion matrix for K-nearest neighbour.

Figure 10. Precision, recall and F1 score for K-nearest neighbour classifier.

Figure 11. Confusion matrix for random forest.

Figure 12. Precision, recall and F1 score for random forest classifier.

Figure 13. Confusion matrix for logistic regression.

Figure 14. Precision, recall and F1 score for logistic regression classifier.

Figure 15. Loss graph for neural network.

Table 1. Accuracy of naïve Bayes, decision tree, K-nearest neighbour, random forest, logistic regression, neural network.

Figure 16. Comparison of algorithms.

Conclusions

Data availability

Underlying data

Extended data

References
