<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.51368.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Deep supervised hashing for gait retrieval</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 2 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Sayeed</surname>
                        <given-names>Shohel</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-0052-4870</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Min</surname>
                        <given-names>Pa Pa</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Ong</surname>
                        <given-names>Thian Song</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka, 75450, Malaysia</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:shohel.sayeed@mmu.edu.my">shohel.sayeed@mmu.edu.my</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>22</day>
                <month>6</month>
                <year>2022</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2021</year>
            </pub-date>
            <volume>10</volume>
            <elocation-id>1038</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>20</day>
                    <month>6</month>
                    <year>2022</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2022 Sayeed S et al.</copyright-statement>
                <copyright-year>2022</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/10-1038/pdf"/>
            <abstract>
                <p>
                    <bold>Background:</bold> Gait recognition is perceived as the most promising biometric approach for the coming decades, especially because of its efficient applicability in surveillance systems. With the recent growth in the use of gait biometrics across surveillance systems, the ability to rapidly search for the required data has become an emerging need. We therefore address the gait retrieval problem, which is to retrieve people whose gait is similar to that of a query subject from a large-scale dataset.</p>
                <p>
                    <bold>Methods:</bold> This paper presents the deep gait retrieval hashing (DGRH) model to address the gait retrieval problem for large-scale datasets. Our proposed method is based on a supervised hashing method with a deep convolutional network. We use the ability of the convolutional neural network (CNN) to capture semantic gait features for feature representation and to learn compact hash codes with a compatible hash function. Our DGRH model therefore combines gait feature learning with binary hash code learning. In addition, the learning loss combines a classification loss function that learns to preserve similarity and a quantization loss function that controls the quality of the hash codes.</p>
                <p>
                    <bold>Results:</bold> The proposed method was evaluated on the CASIA-B, OUISIR-LP, and OUISIR-MVLP benchmark datasets and achieved promising results for gait retrieval tasks.</p>
                <p>
                    <bold>Conclusions: </bold>The end-to-end deep supervised hashing model is able to learn discriminative gait features and is efficient in terms of storage and retrieval speed.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Gait Retrieval</kwd>
                <kwd>Deep Supervised Hashing</kwd>
                <kwd>Convolutional Neural Network</kwd>
                <kwd>Binary codes</kwd>
            </kwd-group>
            <funding-group>
                <funding-statement>The author(s) declared that no grants were involved in supporting this work.</funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>This latest version has been modified according to the reviewer's comments. First, the results and discussion section has been expanded to clarify our research findings. We have also described the limitations of our method as directions for future enhancement. Up-to-date references have been added in response to the reviewer's comments.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>Gait recognition is perceived as the most promising biometric approach
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>-
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> among behavioural biometric approaches, especially because of its efficient applicability in surveillance systems. However, gait recognition and identification tasks are becoming increasingly difficult because of the large volume of images and videos generated by surveillance systems. To ease the burden of large real-world datasets, researchers have applied person re-identification and retrieval approaches to analyse surveillance video or to identify a person of interest. Re-identification finds the target person, given as a query image, across different cameras and times.
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>,
                    <xref ref-type="bibr" rid="ref5">5</xref>
                </sup> Like re-identification, gait retrieval searches a large-scale dataset for people whose gait is similar to that of a query subject. The two differ in scope: re-identification treats the task as a one-to-one problem and considers only the top-ranked item in the dataset, whereas retrieval normally considers all the top items in the ranked list (one-to-many). The retrieval problem is addressed in many biometric applications, such as face retrieval and other large-scale content searches, because it is efficient at tracking and locating similar content.
                <sup>
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup>
            </p>
            <p>To retrieve visually or semantically similar content, the traditional approach ranks the items in the database by their similarity to the query features and returns the nearest ones. However, this approach is costly in computation time and memory for large-scale databases. To address these speed and storage issues, hashing methods have been proposed for text, video, and image retrieval tasks.
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>
                </sup>
            </p>
            <p>This paper presents the deep gait retrieval hashing (DGRH) model to address the gait retrieval problem for large-scale datasets. With the recent growth in gait data across surveillance systems, the ability to rapidly search for the required data has become an emerging need. A linear search over real-valued features is costly in computation time and memory, so approximate nearest neighbour (ANN) search via hashing has attracted increasing attention. The goal of hashing is to represent the input images as binary hash codes that preserve the similarity between the inputs. Similarity searching can be implemented efficiently once the high-dimensional data are transformed into compact binary codes with hash functions. Our proposed method uses supervised hashing with a deep convolutional network. The supervised hashing approach uses the label information of each gait sample to generate binary codes. To extract discriminative features from the input gait, we use a deep convolutional neural network (CNN) instead of extracting the features manually. The CNN captures semantic gait features for feature representation and learns compact hash codes with a compatible hash function. Our DGRH model is therefore a supervised hashing model that combines gait feature extraction and binary hash code learning. The pipeline of the proposed DGRH model comprises:
                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>The sub-network of convolutional layers and pooling layers is used to extract the gait features</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>The binary hash codes are generated from the last fully connected layer</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>The classification and quantization losses are used to optimize the network and learn the hash function</p>
                    </list-item>
                </list>
            </p>
            <sec id="sec2">
                <title>Related work</title>
                <p>Similarity searching can be implemented efficiently with compact binary codes generated by hashing methods. Current hashing methods can be divided into unsupervised and supervised hashing. Unsupervised hashing methods use only unlabelled training data to learn the hash function and perform neighbourhood relation clustering in a Hamming space. For example, kernelized locality-sensitive hashing
                    <sup>
                        <xref ref-type="bibr" rid="ref8">8</xref>
                    </sup> uses the random projection of the hash function and constructs the kernel function to perform the similarity search. Liu 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref9">9</xref>
                    </sup> proposed anchor graph hashing to build a neighbourhood graph that learns binary hash codes to map similarities in a Hamming space. Neighbourhood discriminant hashing (NDH)
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> utilizes the local discriminant information from the neighbourhood structure so that the data labels can be predicted in a Hamming space.</p>
                <p>To reduce the complexity and perform efficient semantic similarity searching, supervised hashing approaches are used. Supervised hashing takes advantage of label information, pairwise similarity information, or data point similarity. Liu 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref11">11</xref>
                    </sup> proposed supervised hashing with kernels, which takes image pairs and converts them into binary codes so that the data can be compared by Hamming distance. The kernel formulation minimizes the distance between similar pairs and maximizes the distance between dissimilar pairs. Shen 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref12">12</xref>
                    </sup> also introduced supervised discrete hashing, which combines linear classification and hash code generation in a single model to address supervised hashing problems. Then, Lin 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref13">13</xref>
                    </sup> proposed supervised hashing with a two-step model: binary code learning in the first step and hash function learning in the second step.</p>
                <p>The above methods rely on manually designed features, which limits both the diversity of the datasets they can handle and their performance. Therefore, current researchers utilize the advantages of deep networks, which can extract visual features from raw data with minimal pre-processing. Xia 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref14">14</xref>
                    </sup> introduced a two-step paradigm for supervised hash learning with a convolutional neural network. They first used a pairwise similarity matrix to derive approximate hash codes for the training images and then used a convolutional neural network (AlexNet) to learn the image features as well as the hash codes. However, in this two-step design the learned image features cannot feed back into the first step to improve the approximate hash codes, so one-stage methods became the norm for later deep supervised hashing methods. Other researchers proposed deep supervised hashing techniques for different kinds of inputs, such as single samples, pairs, and triplets. Lai 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref15">15</xref>
                    </sup> proposed a deep hashing architecture that takes image triplets as the input and uses convolutional layers to extract effective image features. They used a divide-and-encode module to split the image features into branches, each corresponding to one hash bit, and then used a triplet ranking loss function to preserve the similarities. Thereafter, Zhang 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref16">16</xref>
                    </sup> were inspired to use this approach for their person re-identification problem because optimizing the triplet ranking captures the variation between intra-class and inter-class rankings. Supervised hashing based on the semantic similarity of data pairs has also received attention because it improves the quality of the hash codes. The deep hashing network (DHN)
                    <sup>
                        <xref ref-type="bibr" rid="ref17">17</xref>
                    </sup> uses paired data for image representation and a convolutional neural network for feature extraction. The last layer of the deep network, a fully connected layer, is used to generate binary hash codes. To preserve the similarity between the pairs of images, the pairwise cross-entropy loss is adopted, and the pairwise quantization loss is adopted to control the quality of the hash code. The improved version of pairwise deep hashing is presented in DCH (deep Cauchy hashing), which uses the Cauchy distribution to design the pairwise cross-entropy loss.
                    <sup>
                        <xref ref-type="bibr" rid="ref18">18</xref>
                    </sup> The Cauchy cross-entropy loss is derived within a Bayesian framework and is designed for Hamming space retrieval.</p>
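                <p>The role of a quantization loss can be illustrated with a small sketch. The function below is only an illustration under our own assumptions: the name <monospace>quantization_penalty</monospace> and the absolute-value form of the penalty are hypothetical choices, and the cited methods may define their quantization terms differently. The sketch shows the general idea that real-valued outputs are penalized according to how far each entry lies from the binary values &#x2212;1 and +1.</p>
                <code language="python"><![CDATA[
```python
def quantization_penalty(z):
    """Sum over entries of | |z_k| - 1 |.

    The penalty is zero only when every entry is exactly -1 or +1,
    and it grows as entries drift towards 0 (hypothetical form, for
    illustration only).
    """
    return sum(abs(abs(v) - 1.0) for v in z)


# Outputs that are already binary incur no penalty.
binary_like = quantization_penalty([1.0, -1.0, 1.0])
# Outputs far from -1/+1 incur a larger penalty.
ambiguous = quantization_penalty([0.5, -0.5, 0.0])
```
]]></code>
                <p>Adding such a term to a similarity-preserving loss encourages the network outputs to stay close to valid binary codes, so that little information is lost when they are finally binarized.</p>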
                <p>To address the gait retrieval problem, Zhou 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref6">6</xref>
                    </sup> presented a kernel-based semantic hashing method that uses a Gaussian kernel function to map the gait data into the hash function. The learned hash function is then optimized with a triplet ranking loss, and the binary codes of the gait data are stored in a database. To answer a query, a semantic ranking list is obtained from the Hamming distances between the query and the entries of the gait database. Rauf 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref19">19</xref>
                    </sup> also proposed deep supervised hashing for gait retrieval using gait triplets. They designed the hash model as a three-channel convolutional neural network with shared parameters. A hash layer is added after the fully connected layers to generate the hash code. The triplet ranking loss is used for optimization, and the associated ranking list is based on labels. Their method outperformed traditional methods because of the robustness of the CNN in visual feature learning.</p>
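                <p>Ranking a database of binary codes by Hamming distance, as the retrieval methods above do, can be sketched as follows. This is a minimal illustration and not the implementation of any cited method: the function names, the subject identifiers, and the 8-bit codes are all made up for the example.</p>
                <code language="python"><![CDATA[
```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query, database):
    """Return (subject_id, distance) pairs sorted by increasing distance."""
    scored = [(sid, hamming_distance(query, code)) for sid, code in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy database of 8-bit codes in {-1, +1}; IDs and codes are hypothetical.
database = {
    "subject_a": [1, -1, 1, 1, -1, -1, 1, -1],
    "subject_b": [1, -1, 1, -1, -1, -1, 1, -1],
    "subject_c": [-1, 1, -1, -1, 1, 1, -1, 1],
}
query = [1, -1, 1, 1, -1, -1, 1, 1]

# Nearest codes come first; a retrieval system returns the top items.
ranking = rank_by_hamming(query, database)
```
]]></code>
                <p>Because the codes are short and binary, each comparison is far cheaper than a distance computation between high-dimensional real-valued features, which is the motivation for hashing in large-scale retrieval.</p>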
            </sec>
            <sec id="sec3">
                <title>Proposed method</title>
                <p>
                    <italic toggle="yes">Gait representation</italic>
                </p>
                <p>As shown in 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>, we use the gait energy image (GEI) as the gait representation. The GEI is a spatiotemporal representation that summarizes gait features in a single image by converting a sequence of gait silhouettes into a two-dimensional image that preserves the human motion. The GEI was first introduced
                    <sup>
                        <xref ref-type="bibr" rid="ref20">20</xref>
                    </sup> to reduce the burden of limited gait training templates. Because the GEI captures both temporal and spatial information, it has become the most popular gait representation. It includes information on both the silhouette shape and the dynamic walking motion. A GEI is obtained by extracting the human silhouette from each frame and averaging the sequence of silhouettes. The details of computing the GEI can be found in the original work.
                    <sup>
                        <xref ref-type="bibr" rid="ref20">20</xref>
                    </sup>
                </p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Sample gait energy image from the OUISIR-Large Population dataset.</title>
                    </caption>
                    <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure1.gif"/>
                </fig>
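                <p>The averaging step that produces a GEI can be sketched as a pixel-wise mean over the silhouette frames. The sketch below assumes that the silhouettes have already been extracted, aligned, and resized to a common size, and that each frame is a nested list of 0/1 values. The function name <monospace>gait_energy_image</monospace> and the tiny 2 &#x00d7; 2 frames are hypothetical and chosen only for illustration.</p>
                <code language="python"><![CDATA[
```python
def gait_energy_image(silhouettes):
    """Pixel-wise mean of equally sized binary silhouette frames.

    Each frame is a nested list of 0/1 values (rows x columns).
    Frames are assumed to be aligned and size-normalized already.
    """
    if not silhouettes:
        raise ValueError("at least one silhouette frame is required")
    rows, cols = len(silhouettes[0]), len(silhouettes[0][0])
    n = len(silhouettes)
    return [
        [sum(frame[r][c] for frame in silhouettes) / n for c in range(cols)]
        for r in range(rows)
    ]


# Two toy 2 x 2 frames: pixels that are foreground in every frame keep
# intensity 1.0, while pixels that change between frames get lower values.
frames = [
    [[0, 1], [1, 1]],
    [[0, 0], [1, 1]],
]
gei = gait_energy_image(frames)
```
]]></code>
                <p>In the resulting image, bright pixels correspond to body parts that stay in place across the sequence, and intermediate values mark regions that move during walking.</p>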
                <p>
                    <italic toggle="yes">Proposed deep gait retrieval hashing (DGRH) model</italic>
                </p>
                <p>The idea of similarity preservation in a hashing method is that similar codes are generated from semantically similar data. Mathematically, the goal is to learn a hash function that maps the image space 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>X</mml:mi>
                        </mml:math>
                    </inline-formula> into the Hamming space, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>H</mml:mi>
                            <mml:mo>:</mml:mo>
                            <mml:mi>X</mml:mi>
                            <mml:mo>&#x2192;</mml:mo>
                            <mml:msup>
                                <mml:mfenced close="}" open="{" separators=",">
                                    <mml:mrow>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                    <mml:mn>1</mml:mn>
                                </mml:mfenced>
                                <mml:mi>K</mml:mi>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Here, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>K</mml:mi>
                        </mml:math>
                    </inline-formula> is the length in bits of the binary hash code generated for each input image.</p>
                <p>Suppose 
                    <italic toggle="yes">N</italic> training GEI images are given, denoted by 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mfenced close="}" open="{">
                                    <mml:msub>
                                        <mml:mi>x</mml:mi>
                                        <mml:mi>i</mml:mi>
                                    </mml:msub>
                                </mml:mfenced>
                                <mml:mrow>
                                    <mml:mi>i</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>N</mml:mi>
                            </mml:msubsup>
                        </mml:math>
                    </inline-formula>, and each 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>x</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is represented by a <italic toggle="yes">D</italic>-dimensional feature vector, so that 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>x</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mi mathvariant="normal">&#x211d;</mml:mi>
                                <mml:mi>D</mml:mi>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Therefore, X = [
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>x</mml:mi>
                                <mml:mn>1</mml:mn>
                            </mml:msub>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>x</mml:mi>
                                <mml:mn>2</mml:mn>
                            </mml:msub>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>x</mml:mi>
                                <mml:mn>3</mml:mn>
                            </mml:msub>
                            <mml:mo>,</mml:mo>
                            <mml:mo>&#x2026;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>x</mml:mi>
                                <mml:mi>N</mml:mi>
                            </mml:msub>
                            <mml:mo stretchy="true">]</mml:mo>
                        </mml:math>
                    </inline-formula> 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mi mathvariant="normal">&#x211d;</mml:mi>
                                <mml:mrow>
                                    <mml:mi>D</mml:mi>
                                    <mml:mo>&#x00d7;</mml:mo>
                                    <mml:mi>N</mml:mi>
                                </mml:mrow>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. The label matrix of the 
                    <italic toggle="yes">N</italic> training GEIs is denoted as Y = [
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>y</mml:mi>
                                <mml:mn>1</mml:mn>
                            </mml:msub>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>y</mml:mi>
                                <mml:mn>2</mml:mn>
                            </mml:msub>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>y</mml:mi>
                                <mml:mn>3</mml:mn>
                            </mml:msub>
                            <mml:mo>,</mml:mo>
                            <mml:mo>&#x2026;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>y</mml:mi>
                                <mml:mi>N</mml:mi>
                            </mml:msub>
                            <mml:mo stretchy="true">]</mml:mo>
                        </mml:math>
                    </inline-formula> 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mi mathvariant="normal">&#x211d;</mml:mi>
                                <mml:mrow>
                                    <mml:mi>C</mml:mi>
                                    <mml:mo>&#x00d7;</mml:mo>
                                    <mml:mi>N</mml:mi>
                                </mml:mrow>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, and the number of classes is denoted as 
                    <italic toggle="yes">C.</italic> If the 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>i</mml:mi>
                        </mml:math>
                    </inline-formula>th GEI belongs to the 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>j</mml:mi>
                        </mml:math>
                    </inline-formula>th class, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>y</mml:mi>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> = 1, and if not, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>y</mml:mi>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> = 0. Therefore, our proposed hashing method will learn the hash codes 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>h</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mfenced close="}" open="{" separators=",">
                                    <mml:mrow>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                    <mml:mn>1</mml:mn>
                                </mml:mfenced>
                                <mml:mi>K</mml:mi>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> from each input GEI, where 
                    <italic toggle="yes">K</italic> is the length of the binary codes.</p>
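                <p>As a minimal illustration of the label encoding described above, the following Python sketch builds the one-hot label rows from a list of class indices. The function name <monospace>one_hot_labels</monospace>, the 0-based class indices, and the toy labels are assumptions made for illustration and are not taken from the original implementation.</p>
                <code language="python">
```python
def one_hot_labels(class_ids, num_classes):
    """Return rows y_i with y_ij = 1 if sample i belongs to class j, else 0."""
    labels = []
    for class_id in class_ids:
        # Assumption: classes are indexed from 0 to num_classes - 1.
        if class_id not in range(num_classes):
            raise ValueError("class index out of range")
        labels.append([1 if j == class_id else 0 for j in range(num_classes)])
    return labels


# Toy example (assumed data): N = 3 GEIs and C = 4 subjects.
Y = one_hot_labels([2, 0, 3], num_classes=4)
print(Y)  # prints [[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
```
                </code>
                <p>Each row contains exactly one entry equal to 1, which marks the class of the corresponding GEI.</p>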
                <p>As shown in 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>, our proposed model takes a gait energy image (GEI) as input and extracts features using a previously presented deep convolutional neural network (CNN).
                    <sup>
                        <xref ref-type="bibr" rid="ref21">21</xref>
                    </sup> The CNN is composed of convolutional and pooling layers, a fully connected layer, and a hashing layer; the last layer generates the binary hash code. There are four convolutional layers with 16, 32, 64, and 128 filters, respectively, and the filter size for all of them is 3 &#x00d7; 3. Each convolutional layer is followed by a max-pooling layer with a stride of 2. The first fully connected layer has 1024 hidden neurons, and the last fully connected layer serves as the fully connected hash (FCH) layer for hash function learning. The detailed architecture of this CNN
                    <sup>
                        <xref ref-type="bibr" rid="ref21">21</xref>
                    </sup> is shown in 
                    <xref ref-type="table" rid="T1">Table 1</xref>. We used the leaky rectified linear unit (LeakyReLU) activation function for all layers except the hash layer. To learn the hash function, we used the hyperbolic tangent (tanh) activation function to compress the output of the last fully connected layer into the range [&#x2212;1, 1]. The tanh function is defined as:
                    <disp-formula id="e1">
                        <mml:math display="block">
                            <mml:mo>tanh</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>z</mml:mi>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>z</mml:mi>
                                    </mml:msup>
                                    <mml:mo>&#x2212;</mml:mo>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mrow>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:mi>z</mml:mi>
                                        </mml:mrow>
                                    </mml:msup>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>z</mml:mi>
                                    </mml:msup>
                                    <mml:mo>+</mml:mo>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mrow>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:mi>z</mml:mi>
                                        </mml:mrow>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfrac>
                        </mml:math>
                        <label>(1)</label>
                    </disp-formula>
                </p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>The architecture for the proposed deep gait retrieval hashing (DGRH) model.</title>
                    </caption>
                    <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure2.gif"/>
                </fig>
                <table-wrap id="T1" orientation="portrait" position="float">
                    <label>Table 1. </label>
                    <caption>
                        <title>The convolutional neural network (CNN) architecture for the proposed model.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top"/>
                                <th align="left" colspan="1" rowspan="1" valign="top">Network layer</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Filter size</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">No. of filters</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Stride</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Size of output volume</th>
                            </tr>
                        </thead>
                        <tbody>
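                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">1.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Input image</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">240 &#x00d7; 240 &#x00d7; 1</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">2.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Conv1</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3 &#x00d7; 3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">16</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">238 &#x00d7; 238 &#x00d7; 16</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">3.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Pool1 (Max)</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2 &#x00d7; 2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">119 &#x00d7; 119 &#x00d7; 16</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">4.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Conv2</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3 &#x00d7; 3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">32</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">117 &#x00d7; 117 &#x00d7; 32</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">5.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Pool2 (Max)</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2 &#x00d7; 2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">58 &#x00d7; 58 &#x00d7; 32</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">6.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Conv3</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3 &#x00d7; 3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">64</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">56 &#x00d7; 56 &#x00d7; 64</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">7.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Pool3 (Max)</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2 &#x00d7; 2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">28 &#x00d7; 28 &#x00d7; 64</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">8.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Conv4</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3 &#x00d7; 3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">124</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">26 &#x00d7; 26 &#x00d7; 124</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">9.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Pool4 (Max)</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2 &#x00d7; 2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">13 &#x00d7; 13 &#x00d7; 124</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">10.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Fully connected layer 1</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1 &#x00d7; 1 &#x00d7; 1048</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">11.</td>
                                <td align="left" colspan="1" rowspan="1" valign="top"><bold>Hashing layer (tanh)</bold></td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2013;</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1 &#x00d7; 1 &#x00d7; K</td>
                            </tr>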
                        </tbody>
                    </table>
                </table-wrap>
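                <p>The spatial sizes in the last column of Table 1 can be checked with a short calculation. The sketch below assumes that neither the convolutions nor the pooling layers use padding and that fractional sizes are rounded down; these assumptions are inferred from the listed output volumes rather than stated in the text. The helper names <monospace>layer_output_size</monospace> and <monospace>spatial_sizes</monospace> are hypothetical, and only the height and width are computed, not the channel counts.</p>
                <code language="python">
```python
def layer_output_size(size, kernel, stride):
    """Spatial output size of a convolution or pooling layer without padding."""
    return (size - kernel) // stride + 1


def spatial_sizes(input_size=240, num_blocks=4):
    """Track the feature-map width through conv (3 x 3, stride 1) + max-pool (2 x 2, stride 2) blocks."""
    sizes = [input_size]
    size = input_size
    for _ in range(num_blocks):
        size = layer_output_size(size, kernel=3, stride=1)  # convolutional layer
        sizes.append(size)
        size = layer_output_size(size, kernel=2, stride=2)  # max-pooling layer
        sizes.append(size)
    return sizes


print(spatial_sizes())  # prints [240, 238, 119, 117, 58, 56, 28, 26, 13]
```
                </code>
                <p>Under these assumptions, the computed widths agree with the spatial dimensions listed for the input image and the four convolution and pooling stages.</p>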
                <p>The tanh activation function maps the hash-layer outputs into the range [&#x2212;1, 1], and the number of neurons (
                    <italic toggle="yes">K</italic>) in the fully connected hash layer is the desired length of the binary hash codes (such as 16, 32, or 64 bits). The hash code learned by the deep convolutional network is 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>h</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mfenced close="}" open="{" separators=",">
                                    <mml:mrow>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                    <mml:mn>1</mml:mn>
                                </mml:mfenced>
                                <mml:mi>K</mml:mi>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> for each of the 
                    <italic toggle="yes">N</italic> GEI images. To calculate the binary hash code (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>b</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo stretchy="true">)</mml:mo>
                        </mml:math>
                    </inline-formula> from the output of the hash layer, we apply the 
                    <italic toggle="yes">sign</italic>(&#x00b7;) function: 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>b</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo mathvariant="italic">sign</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:msub>
                                    <mml:mi>h</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                            </mml:mfenced>
                        </mml:math>
                    </inline-formula>. Here, the 
                    <italic toggle="yes">sign</italic>(&#x00b7;) function, applied element-wise to a vector or a matrix, is defined as
                    <disp-formula id="e2">
                        <mml:math display="block">
                            <mml:mo mathvariant="italic">sign</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>x</mml:mi>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfenced close="" open="{">
                                <mml:mtable columnalign="left">
                                    <mml:mtr>
                                        <mml:mtd>
                                            <mml:mn>1</mml:mn>
                                            <mml:mo>,</mml:mo>
                                            <mml:mspace width="0.5em"/>
                                            <mml:mi>x</mml:mi>
                                            <mml:mo>&#x2265;</mml:mo>
                                            <mml:mn>0</mml:mn>
                                        </mml:mtd>
                                    </mml:mtr>
                                    <mml:mtr>
                                        <mml:mtd>
                                            <mml:mn>0</mml:mn>
                                            <mml:mo>,</mml:mo>
                                            <mml:mspace width="0.5em"/>
                                            <mml:mtext mathvariant="italic">otherwise</mml:mtext>
                                        </mml:mtd>
                                    </mml:mtr>
                                </mml:mtable>
                            </mml:mfenced>
                        </mml:math>
                        <label>(2)</label>
                    </disp-formula>
                </p>
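                <p>To make the binarization step concrete, the following sketch applies Equations (1) and (2) to a small vector of hash-layer pre-activations. The function names and the input values are hypothetical. The sketch follows Equation (2) literally, so negative outputs are mapped to 0.</p>
                <code language="python">
```python
import math


def tanh(z):
    """Equation (1): (e^z - e^-z) / (e^z + e^-z)."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))


def sign(x):
    """Equation (2): 1 if x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def binarize(pre_activations):
    """Squash each value with tanh, then threshold it element-wise with sign."""
    h = [tanh(z) for z in pre_activations]
    b = [sign(value) for value in h]
    return h, b


# Hypothetical pre-activations of a hash layer with K = 4 neurons.
h, b = binarize([2.0, -0.5, 0.0, -3.0])
print(b)  # prints [1, 0, 1, 0]
```
                </code>
                <p>For very large inputs the exponentials in this direct form of Equation (1) can overflow, so a library routine such as <monospace>math.tanh</monospace> is a safer choice in practice.</p>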
                <p>
                    <italic toggle="yes">Supervised loss function and optimization</italic>
                </p>
                <p>We designed a supervised loss function that measures the similarity-preserving ability of the hash codes. Because well-learned hash codes should have strong classification ability, we expected the supervised hash function to generate similar hash codes for gait inputs with the same label. To preserve the similarity of the hash codes, we used the softmax function, which generalizes logistic regression to multiple classes and is suitable for predicting class labels. The softmax function is defined as
                    <disp-formula id="e250">
                        <mml:math display="block">
                            <mml:mi>P</mml:mi>
                            <mml:mfenced close=")" open="(" separators="|">
                                <mml:mrow>
                                    <mml:mi>Y</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mi>k</mml:mi>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:mi>X</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:msub>
                                        <mml:mi>x</mml:mi>
                                        <mml:mi>i</mml:mi>
                                    </mml:msub>
                                </mml:mrow>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:msup>
                                    <mml:mi>e</mml:mi>
                                    <mml:msub>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>k</mml:mi>
                                    </mml:msub>
                                </mml:msup>
                                <mml:mrow>
                                    <mml:msub>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mi>j</mml:mi>
                                    </mml:msub>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:msub>
                                            <mml:mi>s</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:msub>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mspace width="0.75em"/>
                            <mml:mtext>where</mml:mtext>
                            <mml:mspace width="0.75em"/>
                            <mml:mi>s</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mi>f</mml:mi>
                            <mml:mfenced close=")" open="(" separators=";">
                                <mml:msub>
                                    <mml:mi>x</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                                <mml:mi>W</mml:mi>
                            </mml:mfenced>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>Here, we denote 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>W</mml:mi>
                                <mml:mi>h</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> as the linear weight vectors that connect the output of the hash layer to the softmax classifier, so 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>W</mml:mi>
                                <mml:mi>h</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mi mathvariant="normal">&#x211d;</mml:mi>
                                <mml:mrow>
                                    <mml:mi>K</mml:mi>
                                    <mml:mo>&#x00d7;</mml:mo>
                                    <mml:mi>C</mml:mi>
                                </mml:mrow>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. The softmax function for the learned representation 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>h</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> can be rewritten as
                    <disp-formula id="e3">
                        <mml:math display="block">
                            <mml:mi>P</mml:mi>
                            <mml:mfenced close=")" open="(" separators="|">
                                <mml:mrow>
                                    <mml:mi>y</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:msub>
                                        <mml:mi>C</mml:mi>
                                        <mml:mi>n</mml:mi>
                                    </mml:msub>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:msub>
                                        <mml:mi>h</mml:mi>
                                        <mml:mi>i</mml:mi>
                                    </mml:msub>
                                    <mml:mo>;</mml:mo>
                                    <mml:msub>
                                        <mml:mi>W</mml:mi>
                                        <mml:mi>h</mml:mi>
                                    </mml:msub>
                                </mml:mrow>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:msup>
                                    <mml:mi>e</mml:mi>
                                    <mml:mrow>
                                        <mml:msubsup>
                                            <mml:mi>w</mml:mi>
                                            <mml:mi>n</mml:mi>
                                            <mml:mi>T</mml:mi>
                                        </mml:msubsup>
                                        <mml:msub>
                                            <mml:mi>h</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mrow>
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>C</mml:mi>
                                    </mml:msubsup>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mrow>
                                            <mml:msubsup>
                                                <mml:mi>w</mml:mi>
                                                <mml:mi>j</mml:mi>
                                                <mml:mi>T</mml:mi>
                                            </mml:msubsup>
                                            <mml:msub>
                                                <mml:mi>h</mml:mi>
                                                <mml:mi>i</mml:mi>
                                            </mml:msub>
                                        </mml:mrow>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mspace width="0.25em"/>
                        </mml:math>
                        <label>(3)</label>
                    </disp-formula>
                </p>
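                <p>Equation (3) can be sketched in a few lines of Python. The example below is only an illustration: the hash-layer output, the weight vectors, and the function names are hypothetical, and subtracting the maximum score before exponentiation is a standard numerical safeguard that is not described in the text and does not change the resulting probabilities.</p>
                <code language="python">
```python
import math


def softmax(scores):
    """Softmax over a list of class scores, shifted by the maximum for stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(h, weights):
    """Equation (3): the score of class j is the dot product of w_j and h."""
    scores = [sum(w_k * h_k for w_k, h_k in zip(w_j, h)) for w_j in weights]
    return softmax(scores)


# Hypothetical example with K = 3 hash-layer outputs and C = 2 classes.
h_i = [0.9, -0.8, 0.7]
W_h = [
    [1.0, -1.0, 1.0],   # w_1
    [-1.0, 1.0, -1.0],  # w_2
]
probs = class_probabilities(h_i, W_h)
print(probs)
```
                </code>
                <p>In this toy case the class scores are 2.4 and &#x2212;2.4, so nearly all of the probability mass is assigned to the first class, and the two probabilities sum to 1.</p>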
                <p>To reduce the classification error over the training samples, we used the cross-entropy loss function, which measures the dissimilarity between the predicted label distribution and the true label distribution. Here, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mover accent="true">
                                    <mml:mi>Y</mml:mi>
                                    <mml:mo stretchy="true">&#x0302;</mml:mo>
                                </mml:mover>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is the predicted probability that sample 
                    <italic toggle="yes">i</italic> belongs to class 
                    <italic toggle="yes">j</italic>. The ground-truth label 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>Y</mml:mi>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is 1 if 
                    <italic toggle="yes">i</italic> belongs to class 
                    <italic toggle="yes">j</italic> and 0 otherwise. Therefore, the cross-entropy loss for 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mover accent="true">
                                    <mml:mi>Y</mml:mi>
                                    <mml:mo stretchy="true">&#x0302;</mml:mo>
                                </mml:mover>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>Y</mml:mi>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is
                    <disp-formula id="e4">
                        <mml:math display="block">
                            <mml:mrow>
                                <mml:mtable columnalign="left">
                                    <mml:mtr columnalign="left">
                                        <mml:mtd columnalign="left">
                                            <mml:mrow>
                                                <mml:mi>L</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo>(</mml:mo>
                                                    <mml:mrow>
                                                        <mml:msub>
                                                            <mml:mi>Y</mml:mi>
                                                            <mml:mrow>
                                                                <mml:mi>i</mml:mi>
                                                                <mml:mi>j</mml:mi>
                                                            </mml:mrow>
                                                        </mml:msub>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:msub>
                                                            <mml:mover accent="true">
                                                                <mml:mi>Y</mml:mi>
                                                                <mml:mo stretchy="true">&#x0302;</mml:mo>
                                                            </mml:mover>
                                                            <mml:mrow>
                                                                <mml:mi>i</mml:mi>
                                                                <mml:mi>j</mml:mi>
                                                            </mml:mrow>
                                                        </mml:msub>
                                                    </mml:mrow>
                                                    <mml:mo>)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:mtd>
                                        <mml:mtd columnalign="left">
                                            <mml:mrow>
                                                <mml:mo>=</mml:mo>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:mfrac>
                                                    <mml:mn>1</mml:mn>
                                                    <mml:mi>N</mml:mi>
                                                </mml:mfrac>
                                                <mml:mstyle displaystyle="true">
                                                    <mml:munderover>
                                                        <mml:mo>&#x2211;</mml:mo>
                                                        <mml:mrow>
                                                            <mml:mi>i</mml:mi>
                                                            <mml:mo>=</mml:mo>
                                                            <mml:mn>1</mml:mn>
                                                        </mml:mrow>
                                                        <mml:mi>N</mml:mi>
                                                    </mml:munderover>
                                                    <mml:mrow>
                                                        <mml:mstyle displaystyle="true">
                                                            <mml:munderover>
                                                                <mml:mo>&#x2211;</mml:mo>
                                                                <mml:mrow>
                                                                    <mml:mi>j</mml:mi>
                                                                    <mml:mo>=</mml:mo>
                                                                    <mml:mn>1</mml:mn>
                                                                </mml:mrow>
                                                                <mml:mi>C</mml:mi>
                                                            </mml:munderover>
                                                            <mml:mrow>
                                                                <mml:msub>
                                                                    <mml:mi>Y</mml:mi>
                                                                    <mml:mrow>
                                                                        <mml:mi>i</mml:mi>
                                                                        <mml:mi>j</mml:mi>
                                                                    </mml:mrow>
                                                                </mml:msub>
                                                            </mml:mrow>
                                                        </mml:mstyle>
                                                        <mml:mi>log</mml:mi>
                                                        <mml:msub>
                                                            <mml:mover accent="true">
                                                                <mml:mi>Y</mml:mi>
                                                                <mml:mo stretchy="true">&#x0302;</mml:mo>
                                                            </mml:mover>
                                                            <mml:mrow>
                                                                <mml:mi>i</mml:mi>
                                                                <mml:mi>j</mml:mi>
                                                            </mml:mrow>
                                                        </mml:msub>
                                                    </mml:mrow>
                                                </mml:mstyle>
                                            </mml:mrow>
                                        </mml:mtd>
                                    </mml:mtr>
                                    <mml:mtr columnalign="left">
                                        <mml:mtd columnalign="left">
                                            <mml:mrow/>
                                        </mml:mtd>
                                        <mml:mtd columnalign="left">
                                            <mml:mrow>
                                                <mml:mo>=</mml:mo>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:mfrac>
                                                    <mml:mn>1</mml:mn>
                                                    <mml:mi>N</mml:mi>
                                                </mml:mfrac>
                                                <mml:mstyle displaystyle="true">
                                                    <mml:munderover>
                                                        <mml:mo>&#x2211;</mml:mo>
                                                        <mml:mrow>
                                                            <mml:mi>i</mml:mi>
                                                            <mml:mo>=</mml:mo>
                                                            <mml:mn>1</mml:mn>
                                                        </mml:mrow>
                                                        <mml:mi>N</mml:mi>
                                                    </mml:munderover>
                                                    <mml:mrow>
                                                        <mml:mstyle displaystyle="true">
                                                            <mml:munderover>
                                                                <mml:mo>&#x2211;</mml:mo>
                                                                <mml:mrow>
                                                                    <mml:mi>j</mml:mi>
                                                                    <mml:mo>=</mml:mo>
                                                                    <mml:mn>1</mml:mn>
                                                                </mml:mrow>
                                                                <mml:mi>C</mml:mi>
                                                            </mml:munderover>
                                                            <mml:mrow>
                                                                <mml:msub>
                                                                    <mml:mi>Y</mml:mi>
                                                                    <mml:mrow>
                                                                        <mml:mi>i</mml:mi>
                                                                        <mml:mi>j</mml:mi>
                                                                    </mml:mrow>
                                                                </mml:msub>
                                                                <mml:mi>log</mml:mi>
                                                                <mml:mfrac>
                                                                    <mml:mrow>
                                                                        <mml:msup>
                                                                            <mml:mi>e</mml:mi>
                                                                            <mml:mrow>
                                                                                <mml:msubsup>
                                                                                    <mml:mi>w</mml:mi>
                                                                                    <mml:mi>j</mml:mi>
                                                                                    <mml:mi>T</mml:mi>
                                                                                </mml:msubsup>
                                                                                <mml:msub>
                                                                                    <mml:mi>h</mml:mi>
                                                                                    <mml:mi>i</mml:mi>
                                                                                </mml:msub>
                                                                            </mml:mrow>
                                                                        </mml:msup>
                                                                    </mml:mrow>
                                                                    <mml:mrow>
                                                                        <mml:mstyle displaystyle="true">
                                                                            <mml:msubsup>
                                                                                <mml:mo>&#x2211;</mml:mo>
                                                                                <mml:mrow>
                                                                                    <mml:mi>k</mml:mi>
                                                                                    <mml:mo>=</mml:mo>
                                                                                    <mml:mn>1</mml:mn>
                                                                                </mml:mrow>
                                                                                <mml:mi>C</mml:mi>
                                                                            </mml:msubsup>
                                                                            <mml:mrow>
                                                                                <mml:msup>
                                                                                    <mml:mi>e</mml:mi>
                                                                                    <mml:mrow>
                                                                                        <mml:msubsup>
                                                                                            <mml:mi>w</mml:mi>
                                                                                            <mml:mi>k</mml:mi>
                                                                                            <mml:mi>T</mml:mi>
                                                                                        </mml:msubsup>
                                                                                        <mml:msub>
                                                                                            <mml:mi>h</mml:mi>
                                                                                            <mml:mi>i</mml:mi>
                                                                                        </mml:msub>
                                                                                    </mml:mrow>
                                                                                </mml:msup>
                                                                            </mml:mrow>
                                                                        </mml:mstyle>
                                                                    </mml:mrow>
                                                                </mml:mfrac>
                                                            </mml:mrow>
                                                        </mml:mstyle>
                                                    </mml:mrow>
                                                </mml:mstyle>
                                            </mml:mrow>
                                        </mml:mtd>
                                    </mml:mtr>
                                </mml:mtable>
                            </mml:mrow>
                        </mml:math>
                        <label>(4)</label>
                    </disp-formula>
                </p>
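                <p>As a minimal sketch of the cross-entropy term in equation (4), the following pure-Python example evaluates the loss on toy data. The function names (<monospace>softmax</monospace>, <monospace>cross_entropy</monospace>), the list-of-lists layout, and the toy values are illustrative assumptions and are not taken from the DGRH implementation.</p>
                <code language="python"><![CDATA[
```python
import math


def softmax(scores):
    """Numerically stable softmax over the class scores w_k^T h_i."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(W, H, Y):
    """Mean cross-entropy as written in equation (4).

    W: C class weight vectors w_j, each of length K (assumed layout).
    H: N hash-layer outputs h_i, each of length K.
    Y: N one-hot label rows, each of length C.
    """
    n = len(H)
    loss = 0.0
    for h_i, y_i in zip(H, Y):
        # Class scores w_j^T h_i for every class j.
        scores = [sum(w * h for w, h in zip(w_j, h_i)) for w_j in W]
        probs = softmax(scores)
        # Only the true class contributes because Y is one-hot.
        loss -= sum(y * math.log(p) for y, p in zip(y_i, probs) if y > 0)
    return loss / n


# Toy data (illustrative only): C = 2 classes, K = 2 features, N = 2 samples.
W = [[1.0, 0.0], [0.0, 1.0]]
H = [[2.0, 0.0], [0.0, 2.0]]
Y = [[1, 0], [0, 1]]
print(round(cross_entropy(W, H, Y), 4))
```
]]></code>
                <p>For this toy input each sample scores 2 for its true class and 0 for the other, so the loss equals log(1 + e<sup>&#x2212;2</sup>), which is roughly 0.127.</p>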
                <p>To reduce overfitting and the variance of the network, we introduced 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mn>2</mml:mn>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> regularization so that the trained deep network generalizes better. The regularization term is
                    <disp-formula id="e5">
                        <mml:math display="block">
                            <mml:msub>
                                <mml:mi>R</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:mrow>
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>l</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>L</mml:mi>
                                    </mml:munderover>
                                    <mml:msubsup>
                                        <mml:mfenced close="&#x2016;" open="&#x2016;">
                                            <mml:msup>
                                                <mml:mi>W</mml:mi>
                                                <mml:mi>l</mml:mi>
                                            </mml:msup>
                                        </mml:mfenced>
                                        <mml:mi>F</mml:mi>
                                        <mml:mn>2</mml:mn>
                                    </mml:msubsup>
                                </mml:mrow>
                            </mml:mfenced>
                            <mml:mfrac>
                                <mml:mi>&#x03bb;</mml:mi>
                                <mml:mrow>
                                    <mml:mn>2</mml:mn>
                                    <mml:mi>m</mml:mi>
                                </mml:mrow>
                            </mml:mfrac>
                        </mml:math>
                        <label>(5)</label>
                    </disp-formula>
                </p>
                <p>where 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>&#x03bb;</mml:mi>
                        </mml:math>
                    </inline-formula> is the regularization parameter and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mfenced close="&#x2016;" open="&#x2016;">
                                    <mml:mo>.</mml:mo>
                                </mml:mfenced>
                                <mml:mi>F</mml:mi>
                                <mml:mn>2</mml:mn>
                            </mml:msubsup>
                        </mml:math>
                    </inline-formula> denotes the squared Frobenius norm, defined as 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mfenced close="&#x2016;" open="&#x2016;">
                                    <mml:msup>
                                        <mml:mi>W</mml:mi>
                                        <mml:mi>l</mml:mi>
                                    </mml:msup>
                                </mml:mfenced>
                                <mml:mi>F</mml:mi>
                                <mml:mn>2</mml:mn>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:msub>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
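                            <mml:msub>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mi>j</mml:mi>
                            </mml:msub>
                            <mml:msup>
                                <mml:mfenced close=")" open="(">
                                    <mml:msubsup>
                                        <mml:mi>W</mml:mi>
                                        <mml:mi mathvariant="italic">ij</mml:mi>
                                        <mml:mi>l</mml:mi>
                                    </mml:msubsup>
                                </mml:mfenced>
                                <mml:mn>2</mml:mn>
                            </mml:msup>
                            <mml:mo>=</mml:mo>
                            <mml:mi>tr</mml:mi>
                            <mml:mfenced close=")" open="(">
                                <mml:mrow>
                                    <mml:msup>
                                        <mml:mfenced close=")" open="(">
                                            <mml:msup>
                                                <mml:mi>W</mml:mi>
                                                <mml:mi>l</mml:mi>
                                            </mml:msup>
                                        </mml:mfenced>
                                        <mml:mi>T</mml:mi>
                                    </mml:msup>
                                    <mml:msup>
                                        <mml:mi>W</mml:mi>
                                        <mml:mi>l</mml:mi>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfenced>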
                            <mml:mspace width="0.25em"/>
                        </mml:math>
                    </inline-formula>. Therefore, the final classification loss for our proposed model is as follows:
                    <disp-formula id="e6">
                        <mml:math display="block">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>C</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mspace width="0.75em"/>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mfrac>
                                <mml:mn>1</mml:mn>
                                <mml:mi>N</mml:mi>
                            </mml:mfrac>
                            <mml:munderover>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>i</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>N</mml:mi>
                            </mml:munderover>
                            <mml:munderover>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>j</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>C</mml:mi>
                            </mml:munderover>
                            <mml:msub>
                                <mml:mi>Y</mml:mi>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                            <mml:mo>log</mml:mo>
                            <mml:mfrac>
                                <mml:msup>
                                    <mml:mi>e</mml:mi>
                                    <mml:mrow>
                                        <mml:msubsup>
                                            <mml:mi>w</mml:mi>
                                            <mml:mi>j</mml:mi>
                                            <mml:mi>T</mml:mi>
                                        </mml:msubsup>
                                        <mml:msub>
                                            <mml:mi>h</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mrow>
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>k</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>C</mml:mi>
                                    </mml:msubsup>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mrow>
                                            <mml:msubsup>
                                                <mml:mi>w</mml:mi>
                                                <mml:mi>k</mml:mi>
                                                <mml:mi>T</mml:mi>
                                            </mml:msubsup>
                                            <mml:msub>
                                                <mml:mi>h</mml:mi>
                                                <mml:mi>i</mml:mi>
                                            </mml:msub>
                                        </mml:mrow>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mo>+</mml:mo>
                            <mml:mfrac>
                                <mml:mi>&#x03bb;</mml:mi>
                                <mml:mrow>
                                    <mml:mn>2</mml:mn>
                                    <mml:mi>m</mml:mi>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mfenced close=")" open="(">
                                <mml:mrow>
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>l</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>L</mml:mi>
                                    </mml:munderover>
                                    <mml:msubsup>
                                        <mml:mfenced close="&#x2016;" open="&#x2016;">
                                            <mml:msup>
                                                <mml:mi>W</mml:mi>
                                                <mml:mi>l</mml:mi>
                                            </mml:msup>
                                        </mml:mfenced>
                                        <mml:mi>F</mml:mi>
                                        <mml:mn>2</mml:mn>
                                    </mml:msubsup>
                                </mml:mrow>
                            </mml:mfenced>
                        </mml:math>
                        <label>(6)</label>
                    </disp-formula>
                </p>
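                <p>The regularized classification loss of <xref ref-type="disp-formula" rid="e6">equation (6)</xref> can be sketched as the data term plus the penalty of <xref ref-type="disp-formula" rid="e5">equation (5)</xref>. In the example below the helper names and toy weights are hypothetical, and <monospace>m</monospace> is treated as a plain normalizing count because the text does not define it.</p>
                <code language="python"><![CDATA[
```python
def frobenius_sq(matrix):
    """Squared Frobenius norm: the sum of the squared entries of a matrix."""
    return sum(v * v for row in matrix for v in row)


def l2_penalty(layer_weights, lam, m):
    """Equation (5): (lam / (2 m)) times the sum over layers of ||W^l||_F^2.

    m is assumed to be a normalizing count (for example a sample count);
    the source text does not state what it stands for.
    """
    return lam / (2.0 * m) * sum(frobenius_sq(W_l) for W_l in layer_weights)


def classification_loss(cross_entropy_value, layer_weights, lam, m):
    """Equation (6): cross-entropy data term plus the L2 penalty."""
    return cross_entropy_value + l2_penalty(layer_weights, lam, m)


# Two toy layers (illustrative values only): a 2 x 2 matrix and a 1 x 1 matrix.
layers = [[[1.0, -2.0], [0.5, 0.0]], [[3.0]]]
penalty = l2_penalty(layers, lam=0.01, m=4)
print(penalty)
```
]]></code>
                <p>Here the squared norms are 5.25 and 9, so the penalty is 0.01 / 8 &#x00d7; 14.25. A larger regularization parameter shrinks the weights more strongly at the cost of a looser fit to the training labels.</p>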
                <p>
                    <italic toggle="yes">Quantization loss</italic>
                </p>
                <p>The real-valued features (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>h</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>) from the hash layer must be converted into binary codes to perform retrieval in the Hamming space. To control the quality of the learned hash codes, we introduced a quantization loss. Discrete optimization of the classification loss function 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>C</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is very challenging because of the binary constraints 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>h</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:msup>
                                <mml:mfenced close="}" open="{" separators=",">
                                    <mml:mrow>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                    <mml:mn>1</mml:mn>
                                </mml:mfenced>
                                <mml:mrow>
                                    <mml:mi>K</mml:mi>
                                    <mml:mo>&#x00d7;</mml:mo>
                                    <mml:mi>N</mml:mi>
                                </mml:mrow>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. To reduce the quantization errors, existing hashing methods apply discrete optimization to the similarity-preserving loss (classification loss) and continuous optimization to the quantization loss. The quantization error (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>Q</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:msub>
                                <mml:mfenced close="&#x2016;" open="&#x2016;">
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>h</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mo mathvariant="italic">sgn</mml:mo>
                                        <mml:mfenced close=")" open="(">
                                            <mml:msub>
                                                <mml:mi>h</mml:mi>
                                                <mml:mi>i</mml:mi>
                                            </mml:msub>
                                        </mml:mfenced>
                                    </mml:mrow>
                                </mml:mfenced>
                                <mml:mn>2</mml:mn>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>) is optimized by applying continuous relaxation to the binary constraint. Optimizing the quantization error directly is difficult because it is computationally intensive and incompatible with training the deep network by back-propagation, owing to the non-differentiable 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo mathvariant="italic">sgn</mml:mo>
                        </mml:math>
                    </inline-formula> function. Therefore, we employed a previously proposed quantization loss function,
                    <sup>
                        <xref ref-type="bibr" rid="ref22">22</xref>
                    </sup> which is suitable for controlling the quantization error. The quantization loss is defined as follows:
                    <disp-formula id="e7">
                        <mml:math display="block">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>Q</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:msub>
                                <mml:mfenced close="&#x2016;" open="&#x2016;">
                                    <mml:mrow>
                                        <mml:mfenced close="|" open="|">
                                            <mml:msub>
                                                <mml:mi>h</mml:mi>
                                                <mml:mi>i</mml:mi>
                                            </mml:msub>
                                        </mml:mfenced>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                </mml:mfenced>
                                <mml:mn>1</mml:mn>
                            </mml:msub>
                            <mml:mspace width="0.25em"/>
                        </mml:math>
                        <label>(7)</label>
                    </disp-formula>
                </p>
                <p>
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mfenced close="|" open="|">
                                <mml:mo>.</mml:mo>
                            </mml:mfenced>
                        </mml:math>
                    </inline-formula> denotes the element-wise absolute value, and we applied a smooth surrogate function to the 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mn>1</mml:mn>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> norm,
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mfenced close="&#x2016;" open="&#x2016;">
                                    <mml:mfenced close=")" open="(">
                                        <mml:mo>.</mml:mo>
                                    </mml:mfenced>
                                </mml:mfenced>
                                <mml:mn>1</mml:mn>
                            </mml:msub>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:mo>log</mml:mo>
                            <mml:mo>cosh</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:mo>.</mml:mo>
                            </mml:mfenced>
                        </mml:math>
                    </inline-formula> to ease differentiation during back-propagation. The resulting smooth quantization loss is
                    <disp-formula id="e8">
                        <mml:math display="block">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>Q</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:munderover>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>i</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>N</mml:mi>
                            </mml:munderover>
                            <mml:mo>log</mml:mo>
                            <mml:mo>cosh</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:mrow>
                                    <mml:mfenced close="|" open="|">
                                        <mml:msub>
                                            <mml:mi>h</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                    </mml:mfenced>
                                    <mml:mo>&#x2212;</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                            </mml:mfenced>
                        </mml:math>
                        <label>(8)</label>
                    </disp-formula>
                </p>
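                <p>A minimal sketch of the smoothed quantization loss in <xref ref-type="disp-formula" rid="e8">equation (8)</xref> is given below. It assumes that the loss is summed over the bits of each code as well as over samples, which the equation leaves implicit, and that a zero activation is binarized to +1. The function names and toy activations are hypothetical.</p>
                <code language="python"><![CDATA[
```python
import math


def log_cosh(x):
    """Stable log(cosh(x)), using |x| + log(1 + exp(-2|x|)) - log(2)."""
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def quantization_loss(H):
    """Equation (8): log cosh(|h_i| - 1), summed over samples and bits."""
    return sum(log_cosh(abs(v) - 1.0) for h_i in H for v in h_i)


def binarize(h_i):
    """Hash code sgn(h_i); mapping 0 to +1 is a convention assumed here."""
    return [1 if v >= 0 else -1 for v in h_i]


# Toy activations (illustrative only): the first code is close to +/-1,
# the second is close to 0 and is therefore penalized more heavily.
near_binary = [[0.9, -1.1]]
far_from_binary = [[0.1, -0.2]]
print(quantization_loss(near_binary), quantization_loss(far_from_binary))
print(binarize(near_binary[0]))
```
]]></code>
                <p>The loss vanishes when every activation is exactly +1 or &#x2212;1 and grows smoothly as activations drift away from those values, so the sign step discards less information at retrieval time.</p>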
                <p>Therefore, by combining 
                    <xref ref-type="disp-formula" rid="e6">equations (6)</xref> and 
                    <xref ref-type="disp-formula" rid="e8">(8)</xref>, we obtain the DGRH optimization loss:
                    <disp-formula id="e8a">
                        <mml:math display="block">
                            <mml:munder>
                                <mml:mo>min</mml:mo>
                                <mml:msub>
                                    <mml:mi>W</mml:mi>
                                    <mml:mi>h</mml:mi>
                                </mml:msub>
                            </mml:munder>
                            <mml:mi>L</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>C</mml:mi>
                            </mml:msub>
                            <mml:mo>+</mml:mo>
                            <mml:mi>&#x03b2;</mml:mi>
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>Q</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e9">
                        <mml:math display="block">
                            <mml:munder>
                                <mml:mo>min</mml:mo>
                                <mml:msub>
                                    <mml:mi>W</mml:mi>
                                    <mml:mi>h</mml:mi>
                                </mml:msub>
                            </mml:munder>
                            <mml:mi>L</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mfrac>
                                <mml:mn>1</mml:mn>
                                <mml:mi>N</mml:mi>
                            </mml:mfrac>
                            <mml:munderover>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>i</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>N</mml:mi>
                            </mml:munderover>
                            <mml:munderover>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>j</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>C</mml:mi>
                            </mml:munderover>
                            <mml:msub>
                                <mml:mi>Y</mml:mi>
                                <mml:mi mathvariant="italic">ij</mml:mi>
                            </mml:msub>
                            <mml:mo>log</mml:mo>
                            <mml:mfrac>
                                <mml:msup>
                                    <mml:mi>e</mml:mi>
                                    <mml:mrow>
                                        <mml:msubsup>
                                            <mml:mi>w</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>j</mml:mi>
                                                <mml:mspace width="0.5em"/>
                                            </mml:mrow>
                                            <mml:mrow>
                                                <mml:mi>T</mml:mi>
                                                <mml:mspace width="0.5em"/>
                                            </mml:mrow>
                                        </mml:msubsup>
                                        <mml:msub>
                                            <mml:mi>h</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mrow>
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>k</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>C</mml:mi>
                                    </mml:msubsup>
                                    <mml:msup>
                                        <mml:mi>e</mml:mi>
                                        <mml:mrow>
                                            <mml:msubsup>
                                                <mml:mi>w</mml:mi>
                                                <mml:mrow>
                                                    <mml:mi>k</mml:mi>
                                                    <mml:mspace width="0.5em"/>
                                                </mml:mrow>
                                                <mml:mrow>
                                                    <mml:mi>T</mml:mi>
                                                    <mml:mspace width="0.5em"/>
                                                </mml:mrow>
                                            </mml:msubsup>
                                            <mml:msub>
                                                <mml:mi>h</mml:mi>
                                                <mml:mi>i</mml:mi>
                                            </mml:msub>
                                        </mml:mrow>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mo>+</mml:mo>
                            <mml:mfrac>
                                <mml:mi>&#x03bb;</mml:mi>
                                <mml:mrow>
                                    <mml:mn>2</mml:mn>
                                    <mml:mi>m</mml:mi>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mfenced close=")" open="(">
                                <mml:mrow>
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>l</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>L</mml:mi>
                                    </mml:munderover>
                                    <mml:msubsup>
                                        <mml:mfenced close="&#x2016;" open="&#x2016;">
                                            <mml:msub>
                                                <mml:mi>W</mml:mi>
                                                <mml:mi>h</mml:mi>
                                            </mml:msub>
                                        </mml:mfenced>
                                        <mml:mi>F</mml:mi>
                                        <mml:mn>2</mml:mn>
                                    </mml:msubsup>
                                </mml:mrow>
                            </mml:mfenced>
                            <mml:mo>+</mml:mo>
                            <mml:mi>&#x03b2;</mml:mi>
                            <mml:munderover>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>i</mml:mi>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                                <mml:mi>N</mml:mi>
                            </mml:munderover>
                            <mml:mo>log</mml:mo>
                            <mml:mo>cosh</mml:mo>
                            <mml:mfenced close=")" open="(">
                                <mml:mrow>
                                    <mml:mfenced close="|" open="|">
                                        <mml:msub>
                                            <mml:mi>h</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                    </mml:mfenced>
                                    <mml:mo>&#x2212;</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                            </mml:mfenced>
                        </mml:math>
                        <label>(9)</label>
                    </disp-formula>
                </p>
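                <p>The structure of the loss in equation (9) can be illustrated with a minimal NumPy sketch. It is not the Keras implementation used for the experiments: the function names, the elementwise sum of log cosh over the bits of each code, and the passing of the regularized weight matrices as a list are assumptions made here for illustration only.</p>
                <p>
                    <code language="python" xml:space="preserve">
```python
import numpy as np


def log_cosh(x):
    # Numerically stable log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2)
    a = np.abs(x)
    return a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)


def quantization_loss(h):
    # L_Q: log cosh(|h| - 1), summed over samples and (by assumption) over bits
    return float(np.sum(log_cosh(np.abs(h) - 1.0)))


def classification_loss(h, y_onehot, w, weights, lam, m):
    # L_C: softmax cross-entropy on the logits h @ w plus a Frobenius-norm penalty
    logits = h @ w                                        # shape (N, C)
    logits = logits - logits.max(axis=1, keepdims=True)   # stabilise the exponentials
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    cross_entropy = -np.mean(np.sum(y_onehot * log_prob, axis=1))
    penalty = lam / (2.0 * m) * sum(np.sum(wl ** 2) for wl in weights)
    return float(cross_entropy + penalty)


def dgrh_loss(h, y_onehot, w, weights, lam, m, beta):
    # Total loss: L = L_C + beta * L_Q
    return classification_loss(h, y_onehot, w, weights, lam, m) + beta * quantization_loss(h)
```
                    </code>
                </p>
                <p>In this sketch, the quantization term vanishes when every network output already equals +1 or -1, so only the classification term contributes in that case.</p>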
                <p>By optimizing the classification loss (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>C</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>), we learn similarity-preserving hash codes, while the jointly optimized quantization loss (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>Q</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>) keeps the learned codes close to binary values. The final DGRH loss function therefore reduces the quantization error and improves the retrieval performance. Our proposed DGRH model is trained by back-propagation using adaptive moment estimation (Adam). In the testing stage, a new query image is converted into a hash code by the trained network, and the k-bit binary code is obtained using the 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo mathvariant="italic">sgn</mml:mo>
                        </mml:math>
                    </inline-formula> function. Once we have the k-bit binary codes of the gallery and query sets, we compute the Hamming distance between them to obtain the retrieval result.</p>
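                <p>The binarization and distance computation described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function names are hypothetical, codes take values in {-1, +1}, and an output of exactly zero is mapped to +1 (the text does not specify how ties are handled).</p>
                <p>
                    <code language="python" xml:space="preserve">
```python
import numpy as np


def binarize(h):
    # sgn function; exact zeros are mapped to +1 (an assumption made here)
    return np.where(np.asarray(h) >= 0, 1, -1).astype(np.int8)


def hamming_distances(query_codes, gallery_codes):
    # For codes in {-1, +1}^k the Hamming distance is (k - inner product) / 2
    k = query_codes.shape[1]
    inner = query_codes.astype(np.int32) @ gallery_codes.astype(np.int32).T
    return (k - inner) // 2


# One 4-bit query against a gallery of three codes
query = binarize([[0.9, -0.2, 0.1, -0.7]])
gallery = np.array([[1, -1, 1, -1], [1, 1, 1, 1], [-1, 1, -1, 1]], dtype=np.int8)
distances = hamming_distances(query, gallery)   # one row per query
```
                    </code>
                </p>
                <p>Each row of the resulting matrix holds the distances from one query to every gallery code, so sorting a row in ascending order gives the retrieval result for that query.</p>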
            </sec>
        </sec>
        <sec id="sec4" sec-type="methods">
            <title>Methods</title>
            <sec id="sec5">
                <title>Implementation of proposed method</title>
                <p>The proposed model was developed in the Python programming language. For data pre-processing, the Pandas library was used to create and analyse the training and testing gait datasets, and the 
                    <ext-link ext-link-type="uri" xlink:href="https://docs.python.org/3/library/pickle.html">Pickle</ext-link> library was used to load the data. The proposed model is a convolutional neural network, which was built with the 
                    <ext-link ext-link-type="uri" xlink:href="https://keras.io/">Keras</ext-link> deep learning library on the 
                    <ext-link ext-link-type="uri" xlink:href="https://www.tensorflow.org/">TensorFlow</ext-link> backend. Keras makes it straightforward to construct the neural network and to train and test the proposed model. The other important libraries used in the development of the model are the 
                    <ext-link ext-link-type="uri" xlink:href="https://numpy.org/">NumPy</ext-link> and 
                    <ext-link ext-link-type="uri" xlink:href="https://docs.python.org/3/library/math.html">Math</ext-link> libraries for array manipulation and mathematical formula construction. Finally, the 
                    <ext-link ext-link-type="uri" xlink:href="https://matplotlib.org/">Matplotlib</ext-link> and 
                    <ext-link ext-link-type="uri" xlink:href="https://seaborn.pydata.org/">Seaborn</ext-link> libraries were used to visualize the performance. The source code used for the analysis can be found at Zenodo.
                    <sup>
                        <xref ref-type="bibr" rid="ref26">26</xref>
                    </sup>
                </p>
            </sec>
            <sec id="sec6">
                <title>Datasets and experimental setting</title>
                <p>We evaluated our proposed deep gait retrieval framework on public benchmark datasets (CASIA-B, OUISIR-LP, and OUISIR-MVLP). Since we are dealing with a retrieval problem, we considered both short-term and long-term retrieval. For short-term retrieval, we grouped the data by identical conditions (same view, same clothing, same carrying condition), since the gait is captured by one camera within a short period of time. In real-world gait applications, people are captured at different times and from different views by different cameras. Therefore, we prepared the datasets for both same-condition and mixed-condition settings to further evaluate our proposed gait retrieval framework.</p>
                <p>
                    <italic toggle="yes">CASIA-B dataset</italic>
                </p>
                <p>The CASIA-B gait dataset contains 124 subjects with 11 views and three walking conditions (normal walking, wearing a coat and carrying a bag).
                    <sup>
                        <xref ref-type="bibr" rid="ref23">23</xref>
                    </sup> We evaluated only the same view (90
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>) with the same walking condition for this setting. For the normal walking condition, four walking sequences (nm1 to nm4) are used for the training set, and the remaining sequences (nm5 and nm6) are used for the testing set. There are only two walking sequences each for wearing a coat (cl1, cl2) and carrying a bag (bg1, bg2). Therefore, we used cl1 as the training set and cl2 as the testing set. Likewise, bg1 was used for training and bg2 for testing in the carrying-a-bag condition. To evaluate the long-term retrieval problem, we mixed the three conditions (NM, CL, BG): the training set consists of nm1, nm2, nm3, nm4, cl1 and bg1, and the testing set consists of nm5, nm6, cl2 and bg2.</p>
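                <p>The CASIA-B partition described above can be written down as a small configuration. The sketch below is only an illustration: the dictionary layout and key names are hypothetical, and the sequence labels follow the text, with bg2 as the testing sequence for the carrying-a-bag condition.</p>
                <p>
                    <code language="python" xml:space="preserve">
```python
VIEW_DEGREES = 90  # only the 90-degree view is evaluated in this setting

NM = [f"nm{i}" for i in range(1, 7)]  # nm1 .. nm6

# Same-condition (short-term) splits
SPLITS = {
    "nm-nm": {"train": NM[:4], "test": NM[4:]},
    "cl-cl": {"train": ["cl1"], "test": ["cl2"]},
    "bg-bg": {"train": ["bg1"], "test": ["bg2"]},
}

# Mixed-condition (long-term) split: union of the three same-condition splits
SPLITS["mixed"] = {
    part: [seq for name in ("nm-nm", "cl-cl", "bg-bg") for seq in SPLITS[name][part]]
    for part in ("train", "test")
}
```
                    </code>
                </p>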
                <p>
                    <italic toggle="yes">OUISIR Large Population (LP) dataset</italic>
                </p>
                <p>To evaluate the proposed framework, we used a subset of the OUISIR-LP dataset with 1912 subjects and four different views (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>55</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:mn>65</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:mn>75</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:mtext mathvariant="italic">and</mml:mtext>
                            <mml:mspace width="0.25em"/>
                            <mml:mn>85</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>) following a published protocol.
                    <sup>
                        <xref ref-type="bibr" rid="ref24">24</xref>
                    </sup> The subjects are divided equally into training and testing sets of 956 subjects each. Each subject has eight GEIs (4 views &#x00d7; 2 sequences). For the evaluation under the same condition, we prepared the datasets with only the same views (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>55</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>55</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:mn>65</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>65</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:mn>75</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>75</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:mtext mathvariant="italic">and</mml:mtext>
                            <mml:mspace width="0.25em"/>
                            <mml:mn>85</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>85</mml:mn>
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>). Under mixed conditions, the four views of each subject are combined for evaluation, so that all eight GEIs of a subject (2 walking sequences &#x00d7; 4 views) are used for long-term gait retrieval.</p>
                <p>
                    <italic toggle="yes">OUISIR-Multiview Large Population (MVLP) dataset</italic>
                </p>
                <p>OUISIR-MVLP includes 10307 subjects, of whom 5114 are male and 5193 are female. For the experiments on the OUISIR-MVLP dataset, we followed a published protocol
                    <sup>
                        <xref ref-type="bibr" rid="ref25">25</xref>
                    </sup> that divides the dataset into two nearly equal groups, with 5153 subjects for the training set and 5154 subjects for the testing set. After pre-processing the gait sequences into gait energy images (GEIs) with normalized dimensions (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>128</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:mn>88</mml:mn>
                        </mml:math>
                    </inline-formula>), we obtained 28 GEIs per subject (14 views &#x00d7; 2 walking sequences). To evaluate our proposed method, we used only four views (0
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>, 30
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>, 60
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>, and 90
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula>) for both the same-condition and mixed-condition settings since the GEIs with 180
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x00b0;</mml:mo>
                        </mml:math>
                    </inline-formula> view differences are flipped versions of each other and are considered same-view pairs from the perspective of the projection. Only same-view data were used for short-term gait retrieval. Under mixed conditions, we created datasets containing all four views of each subject. The number of subjects therefore remained the same, and each subject had eight GEIs (2 walking sequences &#x00d7; 4 views).</p>
            </sec>
            <sec id="sec7">
                <title>Evaluation criteria</title>
                <p>We adopted the Hamming space retrieval approach to evaluate the performance of our proposed model. In hashing, data points are represented by similarity-preserving hash codes to increase the retrieval speed and reduce the storage space. The two common methods for searching hash codes are hash lookup and Hamming ranking. Hamming ranking computes the Hamming distance between the query image and every image in the database using a bitwise operation and generates a ranked list according to that distance. The ranked list is returned in ascending order of distance, so the nearest neighbours of the query in the database come first. In the hash lookup approach, lookup tables are constructed to return the data points in the database that lie within the Hamming radius (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>r</mml:mi>
                        </mml:math>
                    </inline-formula>) of the query. Hash lookup, also known as bucket searching, tries to retrieve all 
                    <italic toggle="yes">r</italic>-neighbours for each query.</p>
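                <p>The two search strategies can be contrasted with a short pure-Python sketch. It is an illustration under stated assumptions rather than the evaluation code: the function names are hypothetical, each hash code is packed into one integer so that the distance is an XOR followed by a bit count, and a linear scan stands in for a real lookup table.</p>
                <p>
                    <code language="python" xml:space="preserve">
```python
def pack_bits(bits):
    # Pack a sequence of 0/1 bits into a single Python integer
    value = 0
    for b in bits:
        value = value * 2 + int(b)
    return value


def hamming(a, b):
    # XOR marks the differing bits; counting the ones gives the distance
    return bin(a ^ b).count("1")


def hamming_ranking(query, database):
    # Database indices in ascending order of distance (sorted() is stable)
    dist = [hamming(query, code) for code in database]
    order = sorted(range(len(database)), key=lambda i: dist[i])
    return order, dist


def radius_lookup(query, database, r):
    # All database indices within Hamming radius r (linear scan, not a real hash table)
    return [i for i, code in enumerate(database) if r >= hamming(query, code)]


query = pack_bits([1, 0, 1, 1])
database = [pack_bits(b) for b in ([1, 0, 1, 1], [0, 1, 0, 0], [1, 0, 0, 1], [1, 1, 0, 1])]
order, dist = hamming_ranking(query, database)
neighbours = radius_lookup(query, database, r=2)
```
                    </code>
                </p>
                <p>Hamming ranking always returns the whole database in order, whereas the radius search returns only the codes within the radius and may return nothing when no code is close enough.</p>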
                <p>The proposed model is evaluated with k-bit hash codes of 16, 32, 48 and 64 bits. The retrieval results are evaluated with three different metrics: precision curves with respect to the Hamming radius, precision curves for different numbers of returned images, and the mean average precision (MAP). The precision within Hamming radius 
                    <italic toggle="yes">r</italic> (P@r) is the precision of the returned database images whose Hamming distance to the query lies within the radius (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mi>r</mml:mi>
                        </mml:math>
                    </inline-formula>). Here, we set 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>r</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>2</mml:mn>
                            <mml:mspace width="0.25em"/>
                        </mml:math>
                    </inline-formula>for the hash lookup and constructed the precision curves within the Hamming radius (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi mathvariant="normal">P</mml:mi>
                            <mml:mo>@</mml:mo>
                            <mml:mi mathvariant="normal">r</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>2</mml:mn>
                            <mml:mo stretchy="true">)</mml:mo>
                            <mml:mspace width="0.25em"/>
                        </mml:math>
                    </inline-formula>with respect to different bit lengths. In Hamming ranking, the precision is calculated over a given number of top returned images. Therefore, we analysed the precision curves for different numbers of top returned images (P@N).</p>
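                <p>Both precision measures can be sketched for a single query as follows. This is a minimal NumPy illustration with hypothetical function names. It assumes that a boolean array marks which database images share the identity of the query, and that a query with no image inside the radius scores zero precision, a convention the text does not state.</p>
                <p>
                    <code language="python" xml:space="preserve">
```python
import numpy as np


def precision_within_radius(distances, relevant, r=2):
    # P@r: fraction of correct matches among database items within Hamming radius r
    hit = np.less_equal(distances, r)
    if not hit.any():
        return 0.0  # assumption: an empty result set counts as zero precision
    return float(relevant[hit].mean())


def precision_at_n(distances, relevant, n):
    # P@N: fraction of correct matches among the n nearest database items
    order = np.argsort(distances, kind="stable")[:n]
    return float(relevant[order].mean())


# Distances from one query to five database items, and their ground truth
distances = np.array([0, 4, 1, 2, 3])
relevant = np.array([True, False, True, False, True])
p_r2 = precision_within_radius(distances, relevant, r=2)   # items 0, 2 and 3 are returned
p_at_4 = precision_at_n(distances, relevant, n=4)          # items 0, 2, 3 and 4 are returned
```
                    </code>
                </p>
                <p>Averaging these per-query values over all testing queries, once per code length or per value of N, gives the points of the corresponding precision curves.</p>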
                <p>The mean average precision (MAP) is the most important metric for evaluating hashing algorithms. For each query, the list ranked by the Hamming distance to the database is evaluated over the top returned images. First, we compute the average precision for each query as
                    <disp-formula id="e10">
                        <mml:math display="block">
                            <mml:mi mathvariant="italic">AP</mml:mi>
                            <mml:mo>@</mml:mo>
                            <mml:mi mathvariant="normal">N</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>n</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>N</mml:mi>
                                    </mml:msubsup>
                                    <mml:mi>P</mml:mi>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>n</mml:mi>
                                    </mml:mfenced>
                                    <mml:mi>&#x03b4;</mml:mi>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>n</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:msubsup>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:msup>
                                                <mml:mi>n</mml:mi>
                                                <mml:mo>&#x2032;</mml:mo>
                                            </mml:msup>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>N</mml:mi>
                                    </mml:msubsup>
                                    <mml:mi>&#x03b4;</mml:mi>
                                    <mml:mfenced close=")" open="(">
                                        <mml:msup>
                                            <mml:mi>n</mml:mi>
                                            <mml:mo>&#x2032;</mml:mo>
                                        </mml:msup>
                                    </mml:mfenced>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mspace width="0.25em"/>
                        </mml:math>
                        <label>(10)</label>
                    </disp-formula>
                </p>
                <p>where 
                    <italic toggle="yes">N</italic> is the number of top returned images in the Hamming ranking and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>P</mml:mi>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>n</mml:mi>
                            </mml:mfenced>
                        </mml:math>
                    </inline-formula> is the precision of the top-n retrieved results. The indicator
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mspace width="0.25em"/>
                            <mml:mi>&#x03b4;</mml:mi>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>n</mml:mi>
                            </mml:mfenced>
                        </mml:math>
                    </inline-formula> is 1 if the n-th retrieved result is a correct match for the query and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>&#x03b4;</mml:mi>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>n</mml:mi>
                            </mml:mfenced>
                        </mml:math>
                    </inline-formula> = 0 otherwise. We then average the AP over all testing queries to obtain the mean average precision (MAP). The larger the MAP, the better the retrieval performance.</p>
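                <p>Equation (10) and the MAP can be sketched in a few lines of NumPy. The function names are hypothetical. The sketch assumes that the indicator in equation (10) flags a correct match at each rank and that a query with no correct match in its top N results contributes an AP of zero.</p>
                <p>
                    <code language="python" xml:space="preserve">
```python
import numpy as np


def average_precision_at_n(ranked_relevance, n):
    # ranked_relevance[i] is 1 if the (i+1)-th returned image is a correct match
    delta = np.asarray(ranked_relevance[:n], dtype=float)
    if delta.sum() == 0:
        return 0.0  # assumption: no correct match in the top n gives AP = 0
    ranks = np.arange(1, delta.size + 1)
    precision = np.cumsum(delta) / ranks  # P(n) for n = 1 .. N
    return float((precision * delta).sum() / delta.sum())


def mean_average_precision(all_ranked_relevance, n):
    # MAP: mean of the per-query AP values
    return float(np.mean([average_precision_at_n(r, n) for r in all_ranked_relevance]))


# Two queries: correct matches at ranks 1 and 3, and no correct match at all
ap_first = average_precision_at_n([1, 0, 1, 0], n=4)            # (1/1 + 2/3) / 2
map_value = mean_average_precision([[1, 0, 1, 0], [0, 0, 0, 0]], n=4)
```
                    </code>
                </p>
                <p>Because the precision is weighted by the indicator, only the ranks at which a correct match appears contribute to the AP, which rewards rankings that place correct matches early.</p>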
            </sec>
        </sec>
        <sec id="sec8" sec-type="results|discussion">
            <title>Results and discussion</title>
            <p>The mean average precision (MAP) values of the different datasets for code lengths of 16, 32, 48 and 64 bits in the same-condition setting are reported in 
                <xref ref-type="table" rid="T2">Table 2</xref>, 
                <xref ref-type="table" rid="T3">Table 3</xref> and 
                <xref ref-type="table" rid="T4">Table 4</xref>. In these tables, the MAP is calculated from the top-100 returned images for each query. The precision curves within the Hamming radius (
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mi mathvariant="normal">P</mml:mi>
                        <mml:mo>@</mml:mo>
                        <mml:mi mathvariant="normal">r</mml:mi>
                        <mml:mo>=</mml:mo>
                        <mml:mn>2</mml:mn>
                        <mml:mo stretchy="true">)</mml:mo>
                    </mml:math>
                </inline-formula> are also illustrated for different code lengths in 
                <xref ref-type="fig" rid="f3">Figures 3</xref> to 
                <xref ref-type="fig" rid="f5">5</xref>. Additionally, precision curves for the top-N returned images are shown for further evaluation of the proposed model.</p>
            <table-wrap id="T2" orientation="portrait" position="float">
                <label>Table 2. </label>
                <caption>
                    <title>MAP of the CASIA-B dataset in the same-condition setting.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="4" rowspan="1" valign="top">Mean average precision (MAP)</th>
                        </tr>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Dataset</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">16 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">32 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">48 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">64 bits</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>nm-nm</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.62</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.67</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.78</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.84</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>cl-cl</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.57</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.63</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.76</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.81</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>bg-bg</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.56</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.59</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.68</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.78</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <table-wrap id="T3" orientation="portrait" position="float">
                <label>Table 3. </label>
                <caption>
                    <title>MAP for the OUISIR-LP dataset in the same-condition setting.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="4" rowspan="1" valign="top">Mean average precision (MAP)</th>
                        </tr>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Dataset</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">16 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">32 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">48 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">64 bits</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>CASIA-B</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.69</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.68</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.81</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.87</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>OUISIR-LP</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.81</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.84</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.97</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.98</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>OUISIR-MVLP</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.47</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.53</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.56</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.65</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <table-wrap id="T4" orientation="portrait" position="float">
                <label>Table 4. </label>
                <caption>
                    <title>MAP for the OUISIR-MVLP dataset in the same-condition setting.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th colspan="1" rowspan="1"/>
                            <th align="left" colspan="4" rowspan="1" valign="top">Mean average precision (MAP)</th>
                        </tr>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Dataset</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">16 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">32 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">48 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">64 bits</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>0-0</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.32</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.39</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.43</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.48</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>30-30</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.48</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.54</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.56</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.52</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>60-60</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.47</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.51</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.57</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.59</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>90-90</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.51</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.49</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.54</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.56</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                <label>Figure 3. </label>
                <caption>
                    <p>Comparison of the (a) precision curves at a Hamming radius of 2 (P@r = 2) with different bit lengths and the (b) precision curves of the top-N returned images on the CASIA-B dataset.</p>
                </caption>
                <graphic id="gr3" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure3.gif"/>
            </fig>
            <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                <label>Figure 4. </label>
                <caption>
                    <p>Comparison of the (a) precision curves at a Hamming radius of 2 (P@r = 2) with different bit lengths and the (b) precision curves of the top-N returned images on the OUISIR-LP dataset.</p>
                </caption>
                <graphic id="gr4" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure4.gif"/>
            </fig>
            <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                <label>Figure 5. </label>
                <caption>
                    <p>Comparison of the (a) precision curves at a Hamming radius of 2 (P@r = 2) with different bit lengths and the (b) precision curves of the top-N returned images on the OUISIR-MVLP dataset.</p>
                </caption>
                <graphic id="gr5" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure5.gif"/>
            </fig>
            <p>The MAP results for the CASIA-B dataset are shown in 
                <xref ref-type="table" rid="T2">Table 2</xref>. Since the model is evaluated for short-term retrieval, both training and testing use the same condition. According to the results, the CASIA-B dataset has the lowest MAP values for the carrying condition (bg-bg): even though carrying a bag does not affect the appearance of the human gait, it disturbs the motion of the gait signatures. Normal walking (nm-nm) achieves the highest MAP because of its uncorrupted gait features, and the clothing condition (cl-cl) achieves the second-best result.</p>
            <p>The OUISIR large population dataset with four different views is evaluated under the same condition. The dataset is prepared with only same-view pairs (55&#x00b0;&#x2013;55&#x00b0;, 65&#x00b0;&#x2013;65&#x00b0;, 75&#x00b0;&#x2013;75&#x00b0;, 85&#x00b0;&#x2013;85&#x00b0;) for both the training set and the testing set. The MAP for the OUISIR-LP dataset is quite similar for all of the views since the motion features of the different views are observable in the side view. The highest MAP values are achieved at 85&#x00b0; with a 64-bit code length. On the other hand, the lowest is recorded at 55&#x00b0;, which captures a side view of the gait with limited walking motion. Nevertheless, the average MAP over all views still gives a desirable retrieval performance. The precision curve within a Hamming radius of 2 (P@r = 2) is also illustrated for different code lengths in 
                <xref ref-type="fig" rid="f4">Figure 4</xref>. The precision is highest for the 85&#x00b0;&#x2013;85&#x00b0; pair, at 0.72. Within a Hamming radius of 2 in the lookup table, the precision is lowest for the 16-bit code length. The precision curves based on the top-N returned images are also shown in 
                <xref ref-type="fig" rid="f4">Figure 4</xref> for further evaluation of the proposed model. These curves are constructed by Hamming ranking over the top returned images (10&#x2013;100). The precision falls as the number of returned images rises. As the curves show, the evaluation pair (55&#x00b0;&#x2013;55&#x00b0;) records the lowest value for 100 returned images.</p>
            <p>Compared to the other datasets, the MAP of the OUISIR-MVLP dataset is the lowest overall for the training and testing pairs since this dataset has the largest number of subjects. The highest MAP is in the lateral view pair (90&#x00b0;&#x2013;90&#x00b0;), and the lowest is in the frontal view pair (0&#x00b0;&#x2013;0&#x00b0;) because the frontal view fails to capture the full gait motion. For the bitwise comparison, the hashing performance is better for 48 bits and 64 bits. As the bit length increases, the MAP generally increases for the proposed model. Therefore, the precision curves @ top-N are constructed based on the precision values at 64 bits with different numbers of returned images (10, 50, 100), as shown in 
                <xref ref-type="fig" rid="f5">Figure 5</xref>.</p>
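            <p>To make the two precision measures used above concrete, the following minimal Python sketch computes the precision within a Hamming radius of 2 (P@r = 2) and the precision over the top-N items ranked by Hamming distance. It is an illustration only, not the released implementation: the function names, the toy 8-bit codes and labels, and the convention of returning zero when no code falls inside the radius are all assumptions made for this example.</p>
            <code language="python">
```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def precision_within_radius(query, query_label, db_codes, db_labels, radius=2):
    """P@r: fraction of same-label items among database codes within the radius."""
    hits = [lab for code, lab in zip(db_codes, db_labels)
            if radius >= hamming(query, code)]
    if not hits:
        return 0.0  # assumed convention: an empty bucket counts as zero precision
    return sum(lab == query_label for lab in hits) / len(hits)


def precision_at_n(query, query_label, db_codes, db_labels, n):
    """Precision over the top-n items after ranking by Hamming distance (n must be positive)."""
    ranked = sorted(zip(db_codes, db_labels),
                    key=lambda item: hamming(query, item[0]))
    top = ranked[:n]
    return sum(lab == query_label for _, lab in top) / len(top)


# Hypothetical toy database of 8-bit codes with subject labels.
db_codes = [0b10110010, 0b10110011, 0b10100010, 0b01001101, 0b10110110]
db_labels = ["A", "A", "A", "B", "B"]
query, query_label = 0b10110010, "A"

print(precision_within_radius(query, query_label, db_codes, db_labels, radius=2))
print(precision_at_n(query, query_label, db_codes, db_labels, n=3))
```
            </code>
            <p>With these toy codes, four database items lie within radius 2 of the query and three of them share its label, so P@r = 2 is 0.75, while the three nearest items all match the query, giving a top-3 precision of 1.0.</p>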
            <p>The evaluation results for the mixed condition are shown in 
                <xref ref-type="table" rid="T5">Table 5</xref>. The purpose of the mixed condition is to address long-term gait retrieval. The CASIA-B dataset covers different walking conditions, such as carrying a bag and wearing a coat, in the same view. The objective is to retrieve an individual's gait captured at different times, when the appearance of the gait has changed. Carrying a bag may affect the motion of the gait because of the weight of the bag, and wearing a coat disturbs the outline of the gait. According to the results, the MAP of the mixed condition on the CASIA-B dataset is higher than that of the same condition. This suggests that retrieval by Hamming distance is able to overcome the covariate factors. To analyse the retrieval performance under view changes, the OUISIR-LP and OUISIR-MVLP datasets are subjected to experiments with four different views. The angular difference between adjacent views in the OUISIR-LP dataset is 10&#x00b0;, and the largest difference is only 30&#x00b0;. Consequently, the MAP for the OUISIR-LP dataset is the highest among the datasets, and the results of the precision curves for a Hamming radius of 2 and for the top-returned images are also desirable. The MAP for the OUISIR-MVLP dataset is the lowest among the datasets, which is probably due to the large angular difference in the view changes. The gait is captured from both frontal and lateral views ranging from 0&#x00b0; to 90&#x00b0;. The motion and gait features might not be observed as well in the frontal view as in the lateral view. However, the retrieval performance is still desirable given the large population and the larger view changes. In terms of the bit comparison, the 64-bit scheme again achieves better precision results than the other bit lengths. Therefore, the precision curves for the top-returned images are illustrated based on the precision at 64 bits, and the results are shown in 
                <xref ref-type="fig" rid="f6">Figure 6</xref>. According to the results, the highest precision across the different bit lengths is obtained on the OUISIR-LP dataset, which has four views and small angular differences. The CASIA-B dataset also achieves a desirable result, and the lowest value is obtained on the OUISIR-MVLP dataset. In 
                <xref ref-type="fig" rid="f6">Figure 6</xref>, we also compare the precision values based on the number of top-returned images at 64 bits. The OUISIR-LP dataset again achieves a high precision value, while the OUISIR-MVLP dataset, with its wide range of view angles, has the lowest precision.</p>
            <table-wrap id="T5" orientation="portrait" position="float">
                <label>Table 5. </label>
                <caption>
                    <title>MAP for different datasets with the mixed condition.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="4" rowspan="1" valign="top">Mean average precision (MAP)</th>
                        </tr>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Dataset</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">16 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">32 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">48 bits</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">64 bits</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>55-55</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.59</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.57</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.62</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.64</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>65-65</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.64</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.63</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.68</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.72</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>75-75</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.61</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.66</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.68</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.74</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">
                                <bold>85-85</bold>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.69</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.76</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.74</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.78</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                <label>Figure 6. </label>
                <caption>
                    <p>Comparison of the (a) precision curves at a Hamming radius of 2 (P@r = 2) with different bit lengths and the (b) precision curves of the top-N returned images on the different datasets.</p>
                </caption>
                <graphic id="gr6" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure6.gif"/>
            </fig>
            <sec id="sec9">
                <title>Comparison with other existing methods</title>
                <p>The performance of the deep gait retrieval hashing (DGRH) method is compared against existing gait retrieval work on the same datasets. Only two papers address the gait retrieval problem.
                    <sup>
                        <xref ref-type="bibr" rid="ref6">6</xref>,
                        <xref ref-type="bibr" rid="ref19">19</xref>
                    </sup> Therefore, the first method to be evaluated is the kernel-based semantic hashing method (KSH) proposed by Zhou 
                    <italic toggle="yes">et al</italic>.
                    <sup>
                        <xref ref-type="bibr" rid="ref6">6</xref>
                    </sup> The KSH uses the handcrafted feature learning approach with the Gaussian kernel function to generate the hashing function, and a semantic ranking list is used to retrieve the gait data. Another gait retrieval method that is compared is the deep hashing method presented by Rauf 
                    <italic toggle="yes">et al</italic>.,
                    <sup>
                        <xref ref-type="bibr" rid="ref19">19</xref>
                    </sup> who used deep supervised hashing with triplet deep learning channels. This method feeds triplets of gait data into the shared triplet channels to compute the hash function, and a triplet ranking loss is used to retrieve the query from the ranking list.</p>
                <p>The proposed framework is compared with the existing methods in 
                    <xref ref-type="fig" rid="f7">Figure 7</xref>. The mean average precision (MAP) of the top-100 returned images is used as the evaluation criterion for the retrieval performance.</p>
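                <p>As an illustration of this criterion, the sketch below computes the average precision over the top-k items of each ranked list and averages it over all queries. It is a hedged example rather than the evaluation script used in the paper: the function names are hypothetical, and normalising by the number of relevant items found within the top-k is an assumed convention (some works divide by the total number of relevant items instead).</p>
                <code language="python">
```python
def average_precision(relevance, k):
    """AP over the top-k of one ranked list.

    relevance[i] is True when the i-th returned item has the same identity
    as the query. Normalised by the number of relevant items found in the
    top-k (assumed convention).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists, k=100):
    """MAP@k: mean of the per-query average precision values."""
    return sum(average_precision(r, k) for r in ranked_lists) / len(ranked_lists)


# Two hypothetical queries with their ranked relevance flags.
ranked_lists = [
    [True, False, True, False],  # AP = (1/1 + 2/3) / 2
    [False, True],               # AP = (1/2) / 1
]
print(mean_average_precision(ranked_lists, k=100))
```
                </code>
                <p>For the two toy queries the per-query values are 5/6 and 1/2, so the resulting MAP is 2/3.</p>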
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Comparison of the mean average precision (MAP) of different methods.</title>
                    </caption>
                    <graphic id="gr7" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/134760/842b0ed6-07bc-4002-974c-c44e4e5ff454_figure7.gif"/>
                </fig>
                <p>The mixed condition on the different datasets is analysed since gait retrieval is mostly a long-term retrieval problem. According to the results, our proposed method outperforms the other two methods. Its retrieval performance is better because of the deep representation of the gait features: the CNN can learn richer information about the gait motion. The pairwise-based or triplet-based loss in hashing might cause a data imbalance problem because of the complex data preparation needed to form suitable pairs or triplets. In addition, these approaches can also suffer from optimization problems. In contrast, the combination of the classification loss and quantization loss in the proposed method can effectively predict gait labels and control the quality of the hash codes.</p>
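                <p>The interplay of the two loss terms can be sketched as follows. This is a minimal illustration under stated assumptions, not the loss defined in the paper: the classification term is taken to be softmax cross-entropy, the quantization term is taken to be the mean squared gap between the magnitude of each hash-layer output and 1, and the weight <monospace>lam</monospace> and all function names are hypothetical.</p>
                <code language="python">
```python
import math


def classification_loss(logits, label):
    """Softmax cross-entropy for one sample (numerically stabilised)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def quantization_loss(hash_outputs):
    """Mean squared gap between |h_i| and 1.

    Zero when every output is exactly +1 or -1, so minimising it pushes
    the real-valued outputs towards binary values (assumed form).
    """
    return sum((abs(h) - 1.0) ** 2 for h in hash_outputs) / len(hash_outputs)


def total_loss(logits, label, hash_outputs, lam=0.1):
    """Weighted sum of the two terms; lam is a hypothetical trade-off weight."""
    return classification_loss(logits, label) + lam * quantization_loss(hash_outputs)


# Hypothetical sample: two-class logits and a 2-bit hash layer output.
print(total_loss([0.0, 0.0], 0, [0.5, -0.5], lam=0.1))
```
                </code>
                <p>In this sketch the classification term ties the codes to the gait labels, while the quantization term penalises outputs that are far from binary, which reduces the information lost when the outputs are later converted to hash codes.</p>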
            </sec>
        </sec>
        <sec id="sec10" sec-type="conclusion">
            <title>Conclusion</title>
            <p>This paper proposed the deep gait retrieval hashing (DGRH) model to address gait retrieval. The DGRH uses supervised deep hashing to retrieve an individual's gait for a given query. A deep convolutional neural network is used to extract the gait features and generate the hash codes from the last layer of the network. The hash function is learned by optimizing the classification loss and the quantization loss, and gait retrieval is then performed in the Hamming space. The end-to-end hashing model is able to learn discriminative gait features and is efficient in terms of storage and speed. The proposed method is evaluated on three large public datasets under different covariate factors and records better performance than the other methods. However, there is room for further improvement. This research addressed two kinds of covariate factors: subject-related factors (wearing condition and carrying condition) on the CASIA-B dataset, and camera viewpoint on the OUISIR-LP and OUISIR-MVLP datasets. The proposed model did not include environment-related covariate factors, such as rain or snow, which may occur in real-life surveillance. The walking speed of the individual is another covariate factor that needs to be addressed.</p>
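            <p>The storage argument can be illustrated with a short sketch. It assumes that the binary code is obtained by thresholding the real-valued outputs of the last layer at zero, which is a common choice but is not confirmed here; the function names and the gallery size of 10,000 are hypothetical.</p>
            <code language="python">
```python
def to_hash_code(outputs):
    """Pack the signs of real-valued hash-layer outputs into one integer.

    A bit is 1 when the corresponding output is positive (assumed threshold).
    """
    code = 0
    for value in outputs:
        code = code * 2 + (1 if value > 0 else 0)
    return code


def storage_bytes(n_items, n_bits):
    """Bytes needed to store n_items tightly packed codes of n_bits each."""
    return n_items * n_bits // 8


print(bin(to_hash_code([0.9, -0.8, 0.7, 0.2])))  # 0b1011
print(storage_bytes(10000, 64))                  # 80000
```
            </code>
            <p>Under these assumptions a gallery of 10,000 gait samples with 64-bit codes occupies 80,000 bytes, since each sample needs only 8 bytes.</p>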
        </sec>
        <sec id="sec11">
            <title>Data availability</title>
            <sec id="sec12">
                <title>Underlying data</title>
                <p>CASIA-B Dataset: The dataset is provided by the Institute of Automation, Chinese Academy of Sciences (CASIA) for research purposes. We used Dataset B from the CASIA Gait Dataset, which is available at 
                    <ext-link ext-link-type="uri" xlink:href="http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp">http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp</ext-link> by signing the release agreement.</p>
                <p>OU-ISIR LP Dataset: The OU-ISIR Gait Database, Large Population Dataset is provided by the Institute of Scientific and Industrial Research (ISIR), Osaka University (OU). We obtained the dataset from 
                    <ext-link ext-link-type="uri" xlink:href="http://www.am.sanken.osaka-u.ac.jp/BiometricDB/GaitLP.html">http://www.am.sanken.osaka-u.ac.jp/BiometricDB/GaitLP.html</ext-link> by signing the release agreement for research purposes.</p>
                <p>OU-ISIR MVLP Dataset: The OU-ISIR Gait Database, Multi-View Large Population Dataset (OU-MVLP) is provided by The Institute of Scientific and Industrial Research (ISIR), Osaka University (OU). We used the dataset from 
                    <ext-link ext-link-type="uri" xlink:href="http://www.am.sanken.osaka-u.ac.jp/BiometricDB/GaitMVLP.html">http://www.am.sanken.osaka-u.ac.jp/BiometricDB/GaitMVLP.html</ext-link> by signing the release agreement for research purposes.</p>
                <p>All the datasets can be obtained by signing the release agreement for research purposes.</p>
            </sec>
        </sec>
        <sec id="sec13">
            <title>Software availability</title>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/papamin/Deep-Supervised-Hashing-for-Gait-Retrieval/tree/v1.0.1">https://github.com/papamin/Deep-Supervised-Hashing-for-Gait-Retrieval/tree/v1.0.1</ext-link>.</p>
            <p>Archived source code at the time of publication: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.5256521">https://doi.org/10.5281/zenodo.5256521</ext-link>.
                <sup>
                    <xref ref-type="bibr" rid="ref26">26</xref>
                </sup>
            </p>
            <p>License: 
                <ext-link ext-link-type="uri" xlink:href="https://opensource.org/licenses/GPL-3.0">GPL 3.0</ext-link>.</p>
        </sec>
    </body>
    <back>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chao</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>He</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>GaitSet: Regarding gait as a set for cross-view gait recognition.</article-title>
                    <source>

                        <italic toggle="yes">Proc AAAI Conf Artificial Intelligence.</italic>
</source>
                    <year>2019</year>;<volume>33</volume>:<fpage>8126</fpage>&#x2013;<lpage>8133</lpage>.
                    <pub-id pub-id-type="doi">10.1609/aaai.v33i01.33018126</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rida</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Almaadeed</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Almaadeed</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>Robust gait recognition: A comprehensive survey.</article-title>
                    <source>

                        <italic toggle="yes">IET Biometrics.</italic>
</source>
                    <year>2018</year>;<volume>8</volume>(<issue>1</issue>):<fpage>14</fpage>&#x2013;<lpage>28</lpage>.
                    <pub-id pub-id-type="doi">10.1049/iet-bmt.2018.5063</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yan</surname>
                            <given-names>WQ</given-names>
                        </name>
</person-group>:
                    <article-title>Human gait recognition based on frame-by-frame gait energy images and convolutional long short-term memory.</article-title>
                    <source>

                        <italic toggle="yes">Int J Neural Syst.</italic>
</source>
                    <year>2019</year>;<volume>30</volume>(<issue>1</issue>):<fpage>1950027</fpage>.
                    <pub-id pub-id-type="pmid">31747820</pub-id>
                    <pub-id pub-id-type="doi">10.1142/S0129065719500278</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Xiao</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Joint Detection and Identification Feature Learning for Person Search.</article-title>
                    <source>

                        <italic toggle="yes">2017 IEEE Conf Computer Vision Pattern Recognition (CVPR).</italic>
</source>
                    <year>2017</year>.
                    <pub-id pub-id-type="doi">10.1109/CVPR.2017.360</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Elharrouss</surname>
                            <given-names>O</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Almaadeed</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Al-Maadeed</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>Gait recognition for person re-identification.</article-title>
                    <source>

                        <italic toggle="yes">J Supercomput.</italic>
</source>
                    <year>2021</year>;<volume>77</volume>:<fpage>3653</fpage>&#x2013;<lpage>3672</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s11227-020-03409-5</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhou</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Q</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Kernel-Based Semantic Hashing for Gait Retrieval.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Transactions on Circuits and Systems for Video Technology.</italic>
</source>
                    <year>2018</year>;<volume>28</volume>(<issue>10</issue>):<fpage>2742</fpage>&#x2013;<lpage>2752</lpage>.</mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shen</surname>
                            <given-names>HT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Song</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Hashing for similarity search: A survey.</article-title>
                    <source>

                        <italic toggle="yes">CoRR.</italic>
</source>
                    <year>2014</year>.</mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kulis</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Grauman</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>Kernelized locality-sensitive hashing for scalable image search.</article-title>
                    <source>

                        <italic toggle="yes">2009 IEEE 12th Int Conf Computer Vision.</italic>
</source>
                    <year>2009</year>.
                    <pub-id pub-id-type="doi">10.1109/ICCV.2009.5459466</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kumar</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Hashing with graphs.</article-title>
                    <source>

                        <italic toggle="yes">Proc ICML.</italic>
</source>
                    <year>2011</year>.</mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Neighborhood Discriminant Hashing for Large-Scale Image Retrieval.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Transactions on Image Processing.</italic>
</source>
                    <year>2015</year>;<volume>24</volume>(<issue>9</issue>):<fpage>2827</fpage>&#x2013;<lpage>2840</lpage>.
                    <pub-id pub-id-type="doi">10.1109/TIP.2015.2421443</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ji</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Supervised hashing with kernels.</article-title>
                    <source>

                        <italic toggle="yes">2012 IEEE Conf Computer Vision Pattern Recognition.</italic>
</source>
                    <year>2012</year>.
                    <pub-id pub-id-type="doi">10.1109/CVPR.2012.6247912</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shen</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shen</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>W</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Supervised Discrete Hashing.</article-title>
                    <source>

                        <italic toggle="yes">2015 IEEE Conf Computer Vision Pattern Recognition (CVPR).</italic>
</source>
                    <year>2015</year>.</mixed-citation>
            </ref>
            <ref id="ref13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lin</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yang</surname>
                            <given-names>H-F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hsiao</surname>
                            <given-names>J-H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Deep learning of binary hash codes for fast image retrieval.</article-title>
                    <source>

                        <italic toggle="yes">2015 IEEE Conf Computer Vision Pattern Recognition Workshops (CVPRW).</italic>
</source>
                    <year>2015</year>.
                    <pub-id pub-id-type="doi">10.1109/CVPRW.2015.7301269</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Xia</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pan</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lai</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Supervised hashing for image retrieval via image representation learning.</article-title>
                    <source>

                        <italic toggle="yes">Proc AAAI Conf Artificial Intelligence.</italic>
</source>
                    <year>2014</year>:<fpage>2156</fpage>&#x2013;<lpage>2162</lpage>.</mixed-citation>
            </ref>
            <ref id="ref15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lai</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pan</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Simultaneous feature learning and hash coding with deep neural networks.</article-title>
                    <source>

                        <italic toggle="yes">2015 IEEE Conf Computer Vision Pattern Recognition (CVPR).</italic>
</source>
                    <year>2015</year>.
                    <pub-id pub-id-type="doi">10.1109/CVPR.2015.7298947</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lin</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Bit-Scalable Deep Hashing With Regularized Similarity Learning for Image Retrieval and Person Re-Identification.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Transactions on Image Processing.</italic>
</source>
                    <year>2015</year>;<volume>24</volume>(<issue>12</issue>):<fpage>4766</fpage>&#x2013;<lpage>4779</lpage>.
                    <pub-id pub-id-type="doi">10.1109/TIP.2015.2467315</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <label>17</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhu</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Long</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Deep hashing network for efficient similarity retrieval.</article-title>
                    <source>

                        <italic toggle="yes">Thirtieth AAAI Conference on Artificial Intelligence.</italic>
</source>
                    <year>2016</year>.</mixed-citation>
            </ref>
            <ref id="ref18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cao</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Long</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Deep Cauchy Hashing for Hamming Space Retrieval.</article-title>
                    <source>

                        <italic toggle="yes">2018 IEEE/CVF Conference Computer Vision Pattern Recognition.</italic>
</source>
                    <year>2018</year>.
                    <pub-id pub-id-type="doi">10.1109/CVPR.2018.00134</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rauf</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <article-title>Gait Retrieval: A Deep Hashing Method for People Retrieval in Video.</article-title>
                    <source>

                        <italic toggle="yes">Communications in Computer and Information Science Pattern Recognition.</italic>
</source>
                    <year>2016</year>:<fpage>383</fpage>&#x2013;<lpage>391</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-981-10-3002-4_32</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Han</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bhanu</surname>
                            <given-names>B</given-names>
                        </name>
</person-group>:
                    <article-title>Individual recognition using gait energy image.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Transactions on Pattern Analysis and Machine Intelligence.</italic>
</source>
                    <year>2006</year>;<volume>28</volume>(<issue>2</issue>):<fpage>316</fpage>&#x2013;<lpage>322</lpage>.
                    <pub-id pub-id-type="doi">10.1109/TPAMI.2006.38</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <label>21</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Min</surname>
                            <given-names>PP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sayeed</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ong</surname>
                            <given-names>TS</given-names>
                        </name>
</person-group>:
                    <article-title>Gait Recognition Using Deep Convolutional Features.</article-title>
                    <source>

                        <italic toggle="yes">2019 7th Int Conf Information Communication Technology (ICoICT).</italic>
</source>
                    <year>2019</year>.
                    <pub-id pub-id-type="doi">10.1109/ICoICT.2019.8835194</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gong</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lazebnik</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>Iterative quantization: A procrustean approach to learning binary codes.</article-title>
                    <source>

                        <italic toggle="yes">2011 IEEE Conf Computer Vision Pattern Recognition (CVPR).</italic>
</source>
                    <year>2011</year>.
                    <pub-id pub-id-type="doi">10.1109/CVPR.2011.5995432</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <label>23</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Yu</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tan</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tan</surname>
                            <given-names>T</given-names>
                        </name>
</person-group>:
                    <article-title>A Framework for Evaluating the Effect of View Angle, Clothing and Carrying Condition on Gait Recognition.</article-title>
                    <source>

                        <italic toggle="yes">18th Int Conf Pattern Recognition (ICPR'06).</italic>
</source>
                    <year>2006</year>.
                    <pub-id pub-id-type="doi">10.1109/ICPR.2006.67</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Muramatsu</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Makihara</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yagi</surname>
                            <given-names>Y</given-names>
                        </name>
</person-group>:
                    <article-title>Cross-view gait recognition by fusion of multiple transformation consistency measures.</article-title>
                    <source>

                        <italic toggle="yes">IET Biometrics.</italic>
</source>
                    <year>2015</year>;<volume>4</volume>:<fpage>62</fpage>&#x2013;<lpage>73</lpage>.
                    <pub-id pub-id-type="doi">10.1049/iet-bmt.2014.0042</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <label>25</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Takemura</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Makihara</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Muramatsu</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Multi-view large population gait dataset and its performance evaluation for cross-view gait recognition.</article-title>
                    <source>

                        <italic toggle="yes">IPSJ Transactions on Computer Vision and Applications.</italic>
</source>
                    <year>2018</year>;<volume>10</volume>(<issue>1</issue>).</mixed-citation>
            </ref>
            <ref id="ref26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sayeed</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Min</surname>
                            <given-names>PP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ong</surname>
                            <given-names>TS</given-names>
                        </name>
</person-group>:
                    <article-title>Deep Supervised Hashing for Gait Retrieval (v1.0.1).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.5256521</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report141676">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.134760.r141676</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Islam</surname>
                        <given-names>Md. Rajibul</given-names>
                    </name>
                    <xref ref-type="aff" rid="r141676a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0565-6917</uri>
                </contrib>
                <aff id="r141676a1">
                    <label>1</label>Department of Computer Science and Engineering, University of Asia Pacific, Dhaka, Bangladesh</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>6</month>
                <year>2022</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2022 Islam MR</copyright-statement>
                <copyright-year>2022</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport141676" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.51368.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors have addressed my concerns. The quality of the manuscript has improved greatly with these modifications. I believe it now meets the standards expected for F1000Research articles.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Machine learning, Artificial intelligence, Data mining.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report96765">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.54529.r96765</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Islam</surname>
                        <given-names>Md. Rajibul</given-names>
                    </name>
                    <xref ref-type="aff" rid="r96765a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0565-6917</uri>
                </contrib>
                <aff id="r96765a1">
                    <label>1</label>Department of Computer Science and Engineering, University of Asia Pacific, Dhaka, Bangladesh</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>15</day>
                <month>12</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Islam MR</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport96765" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.51368.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This study presents the deep gait retrieval hashing (DGRH) model to address the gait retrieval problem using public benchmark datasets (CASIA-B, OUISIR-LP, and OUISIR-MVLP). The method is based on supervised hashing with a deep convolutional network. I think this paper proposes an interesting scheme for gait retrieval. The analysis process and results presented in this paper are technically sound. I do have a few comments for the authors, and I would recommend the article for indexing if they are addressed.
                <list list-type="order">
                    <list-item>
                        <p>Supervised hashing methods with a well-designed deep convolutional neural network have been used by various researchers in the literature. However, the authors have compared their results with only two papers that applied a hashing method to the gait retrieval problem. What about the other techniques that have been used to solve the gait retrieval problem? Does the proposed method outperform those techniques too?</p>
                    </list-item>
                    <list-item>
                        <p>The &#x201c;Comparison with other existing methods&#x201d; section requires improvement. The comparison statements in this section should be supported by analytical outcomes. For instance, the authors should state which result in the article demonstrates that the proposed method can solve the data imbalance and optimization problems, and how.</p>
                    </list-item>
                    <list-item>
                        <p>The statement &#x201c;The hashing method is efficient in terms of the storage memory and speed&#x201d; appears at the beginning and end of the article, but no results or theoretical explanation are presented to support it.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Machine learning, Artificial intelligence, Data mining.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report96767">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.54529.r96767</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Amalarethinam</surname>
                        <given-names>D. I. George</given-names>
                    </name>
                    <xref ref-type="aff" rid="r96767a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1315-1495</uri>
                </contrib>
                <aff id="r96767a1">
                    <label>1</label>Jamal Mohamed College, Tiruchirappalli, Tamil Nadu, India</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>28</day>
                <month>10</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Amalarethinam DIG</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport96767" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.51368.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>
                <list list-type="bullet">
                    <list-item>
                        <p>The salient features of gait are analysed using three different benchmark datasets in connection with storage memory and speed.</p>
                    </list-item>
                    <list-item>
                        <p>The data in Tables 3, 4, and 5 need to be spelled out to provide more clarity for others who wish to build on this work.</p>
                    </list-item>
                    <list-item>
                        <p>Though the paper is technically sound, the limitations or negative aspects of the proposed methodology need to be included.</p>
                    </list-item>
                    <list-item>
                        <p>Recent references (from 2020) could be added to strengthen the paper.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
