Network-based Observability and Controllability Analysis of Dynamical Systems: the NOCAD toolbox [version 2; peer review: 2 approved]

The network science-based determination of driver nodes and sensor placement has become increasingly popular in the field of dynamical systems over the last decade. In this paper, the applicability of the methodology in the field of life sciences is introduced through the analysis of the neural network of Caenorhabditis elegans. Simultaneously, an Octave and MATLAB-compatible NOCAD toolbox is proposed that provides a set of methods to automatically generate the relevant structural controllability and observability associated measures for linear or linearised systems and compare the different sensor placement methods.

This article is included in the Mathematical, Physical, and Computational Sciences collection.

Introduction
In the life sciences, the determination of driver nodes in networks that play a significant role in the emergence or treatment of diseases is an intensively researched field 1. Determining the proper driver nodes, i.e. the ones that ensure physically feasible controllability with the minimum cardinality and energy requirement, is important in biological networks and, more generally, in any dynamical system, and the amount of research concerning network science has increased rapidly. A detailed study of the control principles in biological networks has already been published 2. A review of the network science-based determination of driver nodes has also been published 3. It presents the results of the analysis of protein-protein interaction (PPI) networks, the Caenorhabditis elegans neuronal network, the neurochemical rat brain network, Saccharomyces cerevisiae cell cycle networks, the Epithelial Mesenchymal Transition (EMT) network, the myeloid differentiation regulatory network and the Th differentiation network, as well as the identification of drug targets.
The network science-based analysis of dynamical systems has spread rapidly as it provides simple and efficient tools to analyse the structural controllability of any linear or linearised system 1. In terms of controlling the human signalling network, the role of different proteins was systematically analysed with the toolset of network controllability in 4 to highlight the role of cancer-associated genes. Target control with objective-guided optimisation (TCO) was introduced to control a set of variables (or targets) of interest while minimising the number of driver nodes and maximising the number of constrained nodes. This method is capable of determining the leading phenotype transitions in biological networks that can be identified as drug targets 5. In large-scale human liver metabolic networks (HLMN), the driver metabolites have essential functions. Moreover, the role of transport reactions and extracellular metabolites in controlling HLMN has revealed the importance of the environment of human liver metabolism with regard to the health of the liver 6. Using statistical analysis, a subset of critical control nonprotein-coding RNAs (ncRNAs) enriched by human disease can also be determined 7.
In intra-cellular networks, to understand the information flow, a natural control system was utilised and the robustness of such a control was analysed 8 .
The contribution of this paper is to introduce the novel toolbox, NOCAD 9, and its applicability in the life sciences through the example of the local network of 131 frontal neurons of Caenorhabditis elegans 10. The proposed toolbox is also suitable for the comprehensive analysis of any linear or linearised dynamical system through its static network representation [11][12][13]. Although the phrase dynamical network is commonly used in the literature, it does not mean that the nodes or connections are temporal but refers to a network of dynamical systems. In the nonlinear case, the methodology needs further clarification because for small nonlinear examples the results can be incorrect 14 and the cardinality of the assigned sensors underestimated 15. As a result, this toolbox deals only with the linear case; nonlinear system-related methods will be implemented later. In the following sections, the representation of linear systems as well as their structural controllability and observability are introduced. Then the theoretical background of the methodology is presented and the implemented functions and measures are introduced through the network of rostral ganglia of C. elegans.

Existing software
Although considerable research has utilised this method 16 , a flexible software tool which may be used to support research in this field has yet to be designed. Parallel studies have resulted in a collection of applications, toolboxes, plug-ins and scripts that analyse and determine several structural properties of genes, protein-protein interactions and even social or urban networks. Most of these applications only analyse the structural properties of static networks and just a handful of them utilise these structural properties to draw conclusions concerning the dynamics of the system investigated. As our toolbox belongs to the second group, in the following section, the available applications and programs of this group are elaborated on.
A brief summary of the available tools with expanded functionalities is given in Table 1. The applications or software packages implemented in Python that are capable of analysing the controllability and observability of dynamical systems are graph-control 17 and WDNfinder 18. The advantage of Python-based development lies in its widespread use and the countless methods and packages implemented in this language, including the tools developed for network analysis 19. Although in Python the focus is on developing a broad software package for complex systems analysis, this has yet to be fulfilled and all of the available solutions have limitations. The graph-control toolbox only analyses the impact of network topology on the number of inputs and implements the fast matching algorithm 20. WDNfinder only determines the minimum driver node set (MDS) and classifies nodes based on the MDS, so it is incapable of facilitating extended analysis.
Additionally, the CytoCtrlAnalyser 23 plug-in for Cytoscape 25 has been developed, which was implemented in Java and offers graphical user interfaces as well. It evaluates control centrality and control capacity, and classifies nodes for biomolecular networks. Furthermore, the Ecological Network Analysis in R software package (enaR) provides some dynamical analysis functions and can generate models to analyse ecological networks in the R environment 24. As can be seen, both software packages deal with special kinds of networks. The netctrl program can determine the driver nodes and switchboard dynamics model for any complex network 21. CONTEST is a MATLAB toolbox which can analyse the dynamics of complex systems, but these dynamics do not cover the structural controllability and observability properties 22 of the analysed system. Although the presented software packages ensure the design of a controllable and observable system, they do not allow the designed system to be analysed exhaustively. These functions are helpful in terms of supporting the work of experts, but are insufficient for the sophisticated analysis of systems.

Methods
The toolbox is built on the theory of linear systems and their structural controllability and observability properties 26. A linear time-invariant (LTI) system is commonly described by its state-space representation, which consists of the state equation (Eq. 1) and the output equation (Eq. 2).
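The two equations are not reproduced in this text. For orientation, the standard continuous-time LTI state-space form and the classical Kalman rank conditions are sketched below. The symbols x, u and y (state, input and output vectors) and the matrix names A, B, C and D (dynamical, input, output and feedthrough matrices) follow common control-theory convention and are assumed, not confirmed, to match the notation of the original equations. N denotes the number of state variables.

```latex
\begin{align}
\dot{x}(t) &= A\,x(t) + B\,u(t) \tag{1}\\
y(t) &= C\,x(t) + D\,u(t) \tag{2}
\end{align}

% Kalman controllability and observability matrices (standard definitions)
\[
\mathcal{C} = \begin{bmatrix} B & AB & A^{2}B & \cdots & A^{N-1}B \end{bmatrix},
\qquad
\mathcal{O} = \begin{bmatrix} C^{T} & (CA)^{T} & \cdots & (CA^{N-1})^{T} \end{bmatrix}^{T}
\]

% The system is controllable if rank(\mathcal{C}) = N
% and observable if rank(\mathcal{O}) = N.
\[
\operatorname{rank}(\mathcal{C}) = N, \qquad \operatorname{rank}(\mathcal{O}) = N
\]
```

Structural controllability and observability ask whether these rank conditions hold for almost all numerical values of the nonzero entries, so only the zero/nonzero pattern of the matrices matters.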
To ensure controllability (or observability) using a minimum number of inputs (or outputs), a brute force approach would have to generate 2^N - 1 configurations of matrix B (or C). To solve this challenging task, the maximum set of disjoint edges is generated by the maximum matching algorithm 1. Two edges are disjoint if they share neither a common starting point nor a common endpoint. The matched nodes are the endpoints of the edges that are members of the maximum set of disjoint edges; the others are unmatched. The unmatched nodes generated based on A are the sensor nodes, where outputs should be placed to grant structural observability, while the unmatched nodes generated based on A^T are the driver nodes, where inputs should be placed to grant structural controllability. A^T is also the adjacency matrix of the network representation that is the input of the toolbox. It is very important to note that the result of maximum matching is not unique, and it is possible that the matching is perfect, i.e. no unmatched nodes result. In our implementation, the canonical Dulmage-Mendelsohn decomposition was utilised to calculate the maximum matching 28.
Figure 1 illustrates the aforementioned definitions with a small example that contains the command interneurons AVAL, AVAR, AVBL, AVBR, AVDL and AVDR of the frontal neural network of C. elegans.
With the help of the presented Octave- and MATLAB-compatible toolbox, experts can create, analyse and improve any type of dynamical system. As the structure of a dynamical system is generally represented by its adjacency matrix, and linear dynamical systems can be described by the state-space model that contains the dynamical, input, output and feedthrough matrices, the Octave/MATLAB programming language is a well-suited environment to handle these matrices and provide comprehensive functionalities based on them. With the use of NOCAD 9, experts and researchers can effectively determine the input and output matrices of state-space models, calculate system-specific measures (e.g. diameter, relative degree, control centrality and robustness of the system) and improve the system to satisfy the relative degree-based requirements. The workflow of the toolbox is displayed in Figure 2.
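The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NOCAD implementation: it uses a simple augmenting-path search instead of the Dulmage-Mendelsohn decomposition, the function names are hypothetical, and an edge (u, v) is assumed to mean that node u influences node v. A node counts as matched when a matching edge ends in it. Driver nodes are the unmatched nodes of the network, and sensor nodes are the unmatched nodes of the network with every edge reversed.

```python
def maximum_matching(n, edges):
    """Maximum set of disjoint edges of a directed graph with nodes 0..n-1.

    Returns match_to, where match_to[v] == u means that edge u -> v
    belongs to the matching and None means that v is unmatched.
    """
    succ = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
    match_to = [None] * n

    def augment(u, seen):
        # Try to give u an outgoing matching edge, re-routing earlier choices.
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_to[v] is None or augment(match_to[v], seen):
                match_to[v] = u
                return True
        return False

    for u in range(n):
        augment(u, set())
    return match_to


def unmatched_nodes(n, edges):
    """Nodes that are not the endpoint of any matching edge."""
    match_to = maximum_matching(n, edges)
    return [v for v in range(n) if match_to[v] is None]


def driver_and_sensor_nodes(n, edges):
    """Driver nodes from the graph, sensor nodes from the reversed graph."""
    drivers = unmatched_nodes(n, edges)
    sensors = unmatched_nodes(n, [(v, u) for u, v in edges])
    return drivers, sensors
```

For a directed path 0 -> 1 -> 2 the sketch selects node 0 as the only driver and node 2 as the only sensor. For a pair of nodes joined by edges in both directions the matching is perfect and both lists are empty, in which case any single node can be chosen as driver and as sensor. Because a maximum matching is not unique, a different search order may return a different but equally valid set.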

Implementation
According to the aforementioned approach, the implemented functions of the toolbox were divided into three modules as follows: (1) network mapping module, (2) system characterisation module and (3) improvements and robustness module. The input of the first module is the adjacency matrix of the network to be analysed. The second module requires the matrices of the dynamical system generated by the first module. The result of the second module is a structure that is also the input of the third module.
The network mapping module creates a dynamical system from a given network structure, i.e. the necessary matrices of the state-space model are generated for the topology in such a way that the created system is structurally controllable and structurally observable. The determination of the input and output matrices can be achieved by the path finding and signal sharing methods 11 , which modify the result of the maximum matching algorithm.
The system characterisation module performs the calculation of 49 numerical measures to qualify the dynamical system based on its structure. The implemented measures, on the one hand, are well-known static measures (e.g. the number of nodes and edges, closeness and betweenness centralities), and, on the other hand, measures that characterise the dynamics of the system (e.g. structural controllability, observability, control centrality and relative degree). This module can also be used for the purpose of simple network analysis.
Figure 1. In this example, due to the symmetric edge pairs between the nodes, the matching is perfect, i.e. all the nodes are matched. In this case, structural controllability and observability can be granted by selecting any node as a driver node and any node as a sensor node.
Figure 2. Workflow of the utilisation of the NOCAD toolbox. The network mapping module provides two methods to create a dynamical system based on the topology of the state variables. The system characterisation module generates more than 49 measures to analyse, classify and characterise the developed system. The improvement and robustness module offers five algorithms to improve the system with additional inputs (observers) as well as outputs (controllers) and can analyse the robustness of the designed system.
The improvement and robustness module integrates two main functions. On the one hand, it enables the input and output configurations of the system to be extended in such a way that the relative degree of the modified system does not exceed the initially defined threshold. For this purpose, this module implements five methods, namely the set covering-based grassroot and retrofit methods 12 , the centrality measures-based method 12 , the modified Clustering Large Applications based on Simulated Annealing algorithm (mCLASA), and the Geodesic Distance-based Fuzzy c-Medoid Clustering with Simulated Annealing algorithm (GDFCMSA) 12,13 . On the other hand, this module allows users to examine the robustness of the extended configurations by removing nodes from the network representation and by checking the structural controllability and structural observability of the damaged system.
Although the last module seems to be out of line at first, its existence is reasonable. The importance of the controllability of a complex system has already been addressed 1 . In terms of control theory, the relative degree is an important measure to describe how fast the system can be influenced or how sluggish it is. In the field of biology, this "speed" is also important, e.g. the time elapsed between taking a painkiller and feeling its effect. The implemented methods are introduced in detail in the cited articles and the manual of the NOCAD toolbox.
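To make the idea behind set covering-based placement more concrete, the following Python sketch selects driver nodes greedily so that every node lies within a given number of directed steps of some driver. It is a simplified illustration and not the grassroot or retrofit method of the toolbox. It assumes that the relative degree of a node can be approximated by its shortest directed distance from the closest driver, it ignores sensor placement and any cost function, and the names nodes_within and greedy_driver_cover are hypothetical.

```python
from collections import deque


def nodes_within(succ, start, max_hops):
    """Nodes reachable from start in at most max_hops directed steps."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the allowed distance
        for v in succ[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def greedy_driver_cover(n, edges, max_hops):
    """Greedy set cover: pick drivers until every node is within max_hops."""
    succ = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
    # Each candidate driver "covers" the nodes it reaches quickly enough.
    covers = [nodes_within(succ, u, max_hops) for u in range(n)]
    uncovered = set(range(n))
    drivers = []
    while uncovered:
        # Choose the candidate that covers the most still-uncovered nodes.
        best = max(range(n), key=lambda u: len(covers[u] & uncovered))
        drivers.append(best)
        uncovered -= covers[best]
    return drivers
```

On the directed path 0 -> 1 -> 2 -> 3 -> 4 with a limit of two steps, the sketch picks nodes 0 and 2. A placement obtained this way bounds the distance from the drivers but, taken alone, does not guarantee structural controllability, which is why the retrofit-type methods start from a configuration that is already controllable and observable.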

Operation
In order to use the NOCAD toolbox 9, an installation of Octave or MATLAB is required. Then the directories of the toolbox must either be copied into the working directory or added to the search path. The functions were implemented in Octave 5.1.0 and MATLAB R2016a on a 64-bit Windows system. On other operating systems, or with other Octave or MATLAB versions, proper operation is not guaranteed. Our toolbox is independent of other MathWorks toolboxes; it uses only the octave-networks-toolbox 29 and the greedy set covering implementation 30.

Use cases
In this section, the main functionalities of the NOCAD toolbox 9 are presented through the analysis of the local network of 131 frontal neurons of Caenorhabditis elegans. The first step in the workflow is to create a state-space model based on the adjacency matrix, which gives the structural description of the system and, in this case, has a size of 131×131 according to the 131 frontal neurons.
Two methods, path finding and signal sharing, are proposed and were implemented to correct the insufficient result of maximum matching. Both methods are modified versions of the maximum matching algorithm. The maximum matching method determined the following 12 neurons to be driver nodes: RMEL, RMER, SIADL, SIADR, SIAVL, SIAVR, SIBDL, SIBDR, SIBVL, SIBVR, SMDDR and URYDR. It also determined 12 sensor nodes that correspond to the following neurons: AINL, ASHL, ASIR, ASJR, AWAL, IL2DL, IL2DR, IL2L, SIBDL, URBL, URBR and URYDL. As no critical strongly connected components were present, the results were identical in the case of both the path finding and signal sharing methods.
After utilising the second module of the toolbox, the measures that qualify the whole network with a single value are obtained, as presented in Table 2. The network contains 131 neurons and 764 synapses. The density shows that the number of edges is less than a twentieth of the possible maximum, and the diameter of the system, namely the longest shortest path in the network that represents its structure, is 9. The degree variance is 44.3299, which is relatively high given the size of the network, while Freeman's centrality is 0.2057. The relative degree of the system is 4. The Pearson correlation coefficients show that the in-in, in-out and out-out correlations are slightly assortative in nature, while the out-in correlation is likely to be disassortative. The system is controllable and observable. As no loop is present in the network, the percentage of loops relative to edges is 0%. As 77 symmetrical connections are present between 687 connected node pairs, the percentage of symmetric edge pairs is 11.2082%.
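Several of the network-level figures quoted above follow directly from the adjacency matrix. The Python sketch below shows one way to compute them. It is not the toolbox code, the function name network_summary is hypothetical, and it assumes that the density is taken relative to the N(N - 1) possible directed edges without loops and that a node pair counts as symmetric when edges exist in both directions.

```python
def network_summary(adj):
    """Basic whole-network measures from a square 0/1 adjacency matrix.

    adj[i][j] != 0 is read as an edge from node i to node j.
    The network is assumed to have at least two nodes.
    """
    n = len(adj)
    n_edges = sum(1 for i in range(n) for j in range(n) if adj[i][j])
    n_loops = sum(1 for i in range(n) if adj[i][i])
    connected = 0  # node pairs joined by at least one edge
    symmetric = 0  # node pairs joined by edges in both directions
    for i in range(n):
        for j in range(i + 1, n):
            forward, backward = bool(adj[i][j]), bool(adj[j][i])
            connected += forward or backward
            symmetric += forward and backward
    return {
        "nodes": n,
        "edges": n_edges,
        "density": (n_edges - n_loops) / (n * (n - 1)),
        "loop_percentage": 100.0 * n_loops / n_edges if n_edges else 0.0,
        "symmetric_pair_percentage":
            100.0 * symmetric / connected if connected else 0.0,
    }
```

Under these assumptions the reported values are mutually consistent: 764 edges with 77 symmetric pairs give 764 - 77 = 687 connected node pairs, 77 / 687 is about 11.2082%, and 764 / (131 × 130) is about 0.045, which is below one twentieth.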
The second module generates node centrality measures that can reveal structurally important nodes. Since the generated measures can only be presented in large tables, they are attached to the toolbox 9 in Excel format. The highest node degree belongs to RIAR, an interneuron located in the nerve ring 31. As Scott's centrality is a normalised degree, the most important node is once again RIAR. The closeness of node x_i is calculated as the ratio of the number of nodes reachable from x_i to the sum of their distances from x_i. A higher value indicates a more central position of the node, and here RIAL is the most central element. The betweenness centrality shows how many shortest paths pass through the given node. If a node has a high value, then it is a critical node in the structure. The highest value belongs to RIH, a neuron associated with the neurotransmitter serotonin 32. PageRank assigns a percentage value to each node, based on its centrality role when the network is modelled as a Markov chain. The measure referred to as correlation shows the ratio of the number of edges of the neighbours to the number of neighbours. This information is useful to determine the assortativity of the system. The control centrality and observe centrality measures determine how many state variables can be influenced or observed by the nodes.
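The closeness definition given above can be written down directly. The following Python sketch is an illustration rather than the toolbox routine. It assumes an unweighted directed network in which an edge (u, v) points from u to v, measures distance as the number of edges on a shortest directed path, and assigns a closeness of zero to nodes that cannot reach any other node. The function name closeness is hypothetical.

```python
from collections import deque


def closeness(n, edges):
    """Closeness of each node: reachable node count / sum of their distances."""
    succ = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
    values = []
    for source in range(n):
        # Breadth-first search gives shortest directed distances from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in succ[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        distances = [d for node, d in dist.items() if node != source]
        total = sum(distances)
        values.append(len(distances) / total if total else 0.0)
    return values
```

On the directed path 0 -> 1 -> 2, node 0 reaches two nodes at distances 1 and 2 and obtains 2/3, node 1 reaches one node at distance 1 and obtains 1, and node 2 reaches nothing and obtains 0.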
The determined driver and sensor nodes can be classified into four groups 33, according to the phenomenon that makes a driver or sensor node necessary. Firstly, source nodes are nodes with no incoming edges, for which a dedicated input is needed. Secondly, dilation occurs when the generated set of child nodes has a higher cardinality than the set of parent nodes. A distinction is made between internal dilation and external dilation: in the former the child node is not a leaf, i.e. it has children, while in the latter the child is a leaf node, i.e. it has no children. The last type is the inaccessible node, which has an incoming edge and is not affected by dilation, but is not reachable by a directed path from any of the inputs. These types are important properties, e.g. the existence of dilation or inaccessibility is detrimental to complete structural controllability 3. The controlling and observing matrices are sparse matrices as only the columns of drivers and sensors contain nonzero values. The values show the number of differentiations necessary to influence or observe a state variable in the system. Next, the similarity of the driver and sensor nodes is presented. This similarity is based on how similar the sets of nodes are that can be reached for driving or observing. Furthermore, the number of differentiations necessary to influence or observe them is also part of the comparison. R_c and R_o are the simple reachability matrices. They show which nodes can be controlled or observed by a given node in a structural sense, i.e. the existence of a directed path between the nodes is shown. In R_c, the i-th column shows which nodes can control node i. From the other viewpoint, the elements in row i highlight those nodes which can be controlled by node i.
It is important to note that R_c is only a reachability matrix: the structural controllability of the reachable nodes is not granted by a node that can reach them, although in some cases the structural controllability problem can be reduced to a reachability problem 34. The R_o matrix can be interpreted analogously with regard to observability.
Finally, measures of edge centrality are generated by the system characterisation module. The betweenness has the same meaning as in the case of nodes, that is, it yields the number of shortest paths that pass through the edge 35. From this perspective, the most critical synapse is the one between the command interneuron AVAL and the amphid neuron ADLL with a value of 640.5833. The endpoint similarity shows how similar the influenced and observed sets of state variables are with regard to the endpoints of an edge. This metric has a high value if the edge is part of a cycle or creates a bridge in the network. As no bridges are present in this network, only cycles can be recognised by this measure. The edge similarity shows how similar the roles of edges are, and it allows redundancies to be located.
For the demonstration of the last module, five methods (four plus one) were applied to the neural network of C. elegans. The set covering-based grassroot method (SetCovGr) optimises the placement of driver nodes and sensor nodes to provide an initially demanded relative degree, but it does not take into account the original input and output configurations, thus structural controllability and observability are not granted. The other four methods grant controllability and observability by expanding the minimal configurations.
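A reachability matrix of this kind can be obtained as the transitive closure of the network. The Python sketch below uses Warshall's algorithm and is meant only as an illustration under stated assumptions: an edge (u, v) is taken to mean that node u influences node v, entry [i][j] is true when a directed path with at least one edge leads from node i to node j, and whether the toolbox also marks a node as reaching itself is not assumed here. The function names are hypothetical.

```python
def reachability_matrix(n, edges):
    """Transitive closure: reach[i][j] is True if a directed path i -> j exists."""
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    # Warshall's algorithm: allow node k as an intermediate stop, one k at a time.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def transpose(matrix):
    """Reachability in the reversed network, the analogue used for observation."""
    return [list(row) for row in zip(*matrix)]
```

With this orientation, row i lists the nodes that can be influenced from node i and column i lists the nodes that can influence node i, which mirrors the reading of R_c given above. The transposed matrix plays the corresponding role for R_o.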
They are the centrality measures-based (CentMeas) retrofit, set covering-based retrofit (SetCovRet), modified Clustering Large Applications based on Simulated Annealing (mCLASA) and Geodesic Distance-based Fuzzy c-Medoid Clustering with Simulated Annealing algorithm (GDFCMSA) methods 13. These methods were utilised with the following parameters: the required relative degree was set at 2, while the alpha parameter of the cost function was set at 0.5 13. The results can be seen in Table 3. The number of assigned driver nodes varies significantly when different methods are applied. The centrality measures-based method assigned the most driver nodes to the system. Thus, this method results in the smallest cost, but the difference is negligible, as most of the methods resulted in a cost of 1.5. Increasing the number of driver nodes decreases the mean relative degree, which is the lowest in the case of the centrality measures-based method.
The robustness of the configuration was also analysed. In each scenario, a node was removed from the network. Using the leave-one-out strategy, the network with the altered configuration remains controllable in 115 scenarios. As for the sensor nodes, the difference between the methods is not as significant as in the case of the driver nodes. Critical nodes were also generated. A node is critical if its removal makes the system uncontrollable or unobservable. The determined critical nodes and the names of the selected driver and sensor nodes can be found in the Excel file attached to the toolbox.
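The leave-one-out analysis can be sketched as follows in Python. This is an illustrative reconstruction and not the toolbox routine. It assumes that every driver node receives its own independent input, that an edge (u, v) means that u influences v, and that structural controllability is tested with Lin's two classical conditions: every node must be reachable from a driver, and there must be no dilation, which is checked here with a maximum matching in which each node is matched either by an incoming edge or by its own input. The function names are hypothetical.

```python
from collections import deque


def is_structurally_controllable(nodes, edges, drivers):
    """Check accessibility and absence of dilation for the given driver set."""
    node_set = set(nodes)
    drivers = [d for d in drivers if d in node_set]
    succ = {u: [] for u in nodes}
    for u, v in edges:
        if u in node_set and v in node_set:
            succ[u].append(v)

    # Condition 1: every node is reachable from at least one driver node.
    seen = set(drivers)
    queue = deque(drivers)
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    if seen != node_set:
        return False

    # Condition 2: a matching covers every node (inputs count as extra sources).
    sources = {("input", d): [d] for d in drivers}
    sources.update({("state", u): succ[u] for u in nodes})
    match_to = {}

    def augment(source, visited):
        for v in sources[source]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_to or augment(match_to[v], visited):
                match_to[v] = source
                return True
        return False

    for source in sources:
        augment(source, set())
    return len(match_to) == len(node_set)


def critical_nodes(nodes, edges, drivers):
    """Nodes whose removal makes the remaining system structurally uncontrollable."""
    critical = []
    for removed in nodes:
        remaining = [u for u in nodes if u != removed]
        if not is_structurally_controllable(remaining, edges, drivers):
            critical.append(removed)
    return critical
```

For the directed path 0 -> 1 -> 2 driven at node 0, removing node 0 or node 1 breaks controllability while removing node 2 does not, so nodes 0 and 1 are critical. The same functions can be applied to the reversed network with the sensor set to examine observability.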

Conclusions
Although numerous papers have utilised the network-based determination of driver and sensor nodes, a flexible toolbox implementing this approach has yet to be designed. To fill this gap, we propose the Octave- and MATLAB-compatible NOCAD toolbox 9, which supports the network-based controllability and observability analysis of dynamical systems. The applicability of the toolbox in the life sciences was demonstrated through the analysis of the neural network of C. elegans. The toolbox offers two methods to design a structurally controllable and observable system based on the adjacency matrix (A^T). The designed system can be analysed by 49 qualitative measures from both structural and dynamical points of view. The toolbox offers five methods to improve the designed system by adding new inputs and outputs to it, so that its relative degree can be decreased. The robustness of the individual designs can then also be evaluated. The modular structure of the toolbox makes it easy to improve the modules by adding new functions, and the toolbox can be extended by new modules as well. Even though the modules build on each other, most of their functions can also be used independently.
Although our goal in this paper is to draw the attention of life science researchers to the services provided by the NOCAD toolbox, it can also be utilised in practice in various other fields of science. For example, it enables social networks to be controlled in the economy, transaction networks to be analysed in finance or dynamical systems to be designed in engineering.

Data availability
All data underlying the results are available as part of the article and no additional source data are required.

Software availability
Source code available from: https://github.com/abonyilab/NOCAD

The revised version is clearly more focused and directed to researchers in life science, although its application is much wider. It has been shortened and some figures removed. Discussions are more objective and some technicalities corrected.
Some more general discussions that have been suggested in the first review were simply added as assertions. This helps keep the paper compact, which is understandable.
As a second note, though, we still think that the third module could be discussed in more depth. The toolbox provides five ("four plus one") methods to analyze the network robustness (from a structural controllability/observability point of view) and to place driver/sensor nodes in such a way that the relative degree of the network system does not exceed a specified threshold. Although the relevance of the third module is now better justified, it is not yet clear what the difference between these five methods is. What are the pros and cons of each method? How can this be perceived in the studied neural network of C. elegans?
We found no compatibility issues with Octave in this version.
Minor problems:
1. In the definition of the observability matrix there is a "^" missing. The last T within the square brackets should be a transpose.
2. "Since the generated measures can be presented by large tables, they are attached in Excel format to the toolbox [9]." Ref. [9] is a link to the old toolbox (in Zenodo website). It would be nice to link the site with current version (GitHub).
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Control theory, control of networked, nonlinear dynamics.
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Gilles Didier
Institut Montpelliérain Alexander Grothendieck (IMAG), CNRS, University of Montpellier, Montpellier, France
Thanks for your revised manuscript which addressed most of my comments. In particular, the structure of the paper was improved and the notions used in your toolbox are now introduced. I also appreciated the new example which provides a better illustration of the possibilities of your package. Though I am not a native speaker myself, I feel that some issues remained in your text. Please find below suggestions to improve the writing.

Abstract:
I suggest the following alternative: "The network science-based determination of driver nodes and sensor placement has become increasingly popular in the field of dynamical systems over the last decade. We developed an Octave and MATLAB-compatible NOCAD toolbox which implements various methods to compute relevant structural, controllability and observability measures associated to linear or linearised systems and to compare different sensor placement methods. We illustrated the use of our toolbox in life sciences by applying it to the analysis of the neural network of Caenorhabditis elegans."

Introduction:
An alternative first paragraph: "In the life sciences, the determination of nodes that play a significant role in networks (e.g., in the emergence or the treatment of diseases) is an intensively researched field^1. In particular, determining the proper driver nodes, i.e. the ones that ensure physically feasible controllability with the minimum cardinality and energy requirement, in biological networks, or more generally in dynamical systems, is an important question. We refer to 2 for a detailed study of control principles in biological networks. A recent review paper presents the application of the network science-based determination of driver nodes to the analysis of various biological networks (protein-protein interaction -PPI-networks, Caenorhabditis elegans neuronal network, neurochemical rat brain network, Saccharomyces cerevisiae cell cycle networks, Epithelial Mesenchymal Transition -EMT-network, myeloid differentiation regulatory network and Th differentiation network) and to the identification of drug targets^3."
I am not sure I understand your sentence well: "This method is capable of determining the leading phenotype transitions in biological networks that can be identified as drug targets 5". Do you mean something like: "This method is capable of determining genes controlling phenotype transitions in biological networks, thus may help to identify drug target candidates^5"?
Please reformulate: "In large-scale human liver metabolic networks (HLMN), the driver metabolites have essential functions, moreover, the role of transport reactions and extracellular metabolites in terms of controlling HLMN have revealed the importance of the environment of human liver metabolism with regard to the health of the liver^6. Using statistical analysis, a subset of critical control nonprotein-coding RNAs (ncRNAs) enriched by human disease can also be determined^7. In intracellular networks, to understand the information flow, a natural control system was utilised and the robustness of such a control was analysed^8". The paragraph is a bit confusing and the relation with your work is not obvious. I guess that the examples presented were analysed with the same approaches as those implemented in your toolbox but this should be stated somewhere.
Find below an alternative suggestion for the last paragraph. I am not sure that it is the right place for the second sentence. "The aim of this paper is to introduce the novel toolbox, NOCAD^9, and to illustrate its applicability in the life sciences through the analysis of the local network of 131 frontal neurons of Caenorhabditis elegans^10. The NOCAD toolbox is also suitable for the comprehensive analysis of any linear or linearised dynamical systems through their static network representation^11-13. We emphasise that in our context, the expression "dynamical network" does not imply that the nodes or connections change with time but refers to networks arising from dynamical systems. The current version of NOCAD deals with only the linear case. Nonlinear system-related methods will be implemented later. In the nonlinear case, the methodology needs further clarification because for small nonlinear examples the results can be incorrect 14 and the cardinality of the assigned sensors underestimated^15. Theoretical background and formal definitions of the representation of linear systems and of their structural controllability and observability are provided below."
Other comments:
○ page 4 col 1 first paragraph "provide the opportunity" -> "allow".
○ page 4 col 1 second paragraph "stood" -> "fixed".
○ page 4 col 2 first paragraph "if the rank of the controllability matrix is equal to the number of state variables, rank(C) = N" -> "if the rank of the controllability matrix C is equal to the number of state variables, i.e., rank(C) = N" (same suggestion for the sentence about the observability matrix).
○ page 4 col 2 "For a better understanding, we illustrate the aforementioned definitions by a small example in Figure 1 that contains the command interneurons AVAL, AVAR, AVBL, AVBR, AVDL and AVDR from the frontal neural network of neurons and synapses in C. elegans." -> "Figure 1 illustrates the aforementioned definitions with a small example which contains the command interneurons AVAL, AVAR, AVBL, AVBR, AVDL and AVDR of the C. elegans' frontal neural network."
○ page 5 col 1 end of paragraph 1: "can be seen" -> "is displayed".
○ page 5 col 2 par 2: remove the comma after "in such a way".
○ page 6 col 1 last par.: your last sentence is too long. ", moreover," -> ". It also".
○ page 7 col 1 par 1: "when determining" -> "to determine".
○ page 7 col 1 par 2: "source nodes when the node has no incoming edges, thus," -> "source nodes are nodes with no incoming edges, for which".
○ page 7 col 1 par 2: "dilation, when" -> "dilation occurs when".
○ page 7 col 1 par 2: "between internal dilation and external dilation, in the former the child node is not a leaf, i.e. it has children, while in the latter the child is a leaf node, i.e. it has no children." -> "between internal dilation nodes and external dilation nodes according to whether their child node is an internal node, i.e., has children, or a leaf, i.e., has no children, respectively."
○ page 7 col 2 last par.: "Although numerous papers used the network-based determination of driver and sensor nodes, a flexible toolbox that may be used to support the analysis has yet to be designed. To fill this gap, in this article the Octave-and MATLAB-compatible NOCAD toolbox 9 was proposed to support the network-based controllability and observability analysis of dynamical systems, and through the analysis of the neural network of C.elegans, the applicability of the toolbox in the life sciences was presented." -> "Although numerous papers have utilised the network-based determination of driver and sensor nodes, a flexible toolbox implementing this approach has yet to be designed. To fill this gap, we propose the Octave-and MATLAB-compatible NOCAD toolbox 9 which performs the network-based controllability and observability analysis of dynamical systems. The applicability of the toolbox in the life sciences was demonstrated through the analysis of the neural network of C.elegans."
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: My main area of research is applied mathematics. I worked on biological networks (regulatory and interaction networks).

Version 1
Reviewer In this paper the authors describe the NOCAD toolbox developed for the analysis of some aspects of networks as, for instance, structural controllability and observability. The topic of the paper is not only very interesting but also timely as issues relating to controllability and observability of networks could be fundamental in a number of practical situations. Hence to have a nice set of tools to analyze dynamical networks is welcome.
The toolbox described is useful in a specific range of problems. Here we focus on the controllability and observability properties of networks, as prompted by the title. The tools presented focus on structural controllability and observability of linear systems. The assumption of linearity is not mentioned in the abstract. As a matter of fact, the opening phrase of the last paragraph in the introduction should be copied to the abstract.
The nonlinear case and other definitions of controllability and observability, such as, dynamical and symbolical are not mentioned in the paper nor are handled by the toolbox. This is an important remark as it is now known that some algorithms underestimate the cardinality of the set of sensor nodes when applied to nonlinear systems.

The paper
In the second paragraph of the Introduction, the authors mention the importance of "determining the proper driver nodes". We wonder if the average reader would know what that is. In clarifying this point the authors would like to address first what is "a" proper set of driving nodes (e.g. one that will guarantee full controllability). However, in order to specify "the" proper set, possibly some more detailed measure of controllability should be employed. For instance, suppose two sets of driving nodes S1 and S2, both with the same cardinality, such that the network is fully controllable either from S1 or from S2. Hence S1 is "a" proper set of driving nodes (assuming that by proper the authors are referring to full controllability) and S2 is another one. Now, it could well be that using S1 less energy is required to drive the network from one state x(ti) to another x(tf) when compared to S2. In that case, S1 and S2 are not totally equivalent. Structural controllability and observability are unable on their own to provide this distinction.
Following the same vein, we wonder if the average reader of F1000Research would know the distinction of static and dynamic networks. A word about this would be profitable, especially because the toolbox refers mainly to the second class.
In some parts of the paper the authors refer to matrix A as the "state transition" matrix. In continuous-time this is incorrect. In discrete-time this is only correct for a transition time of one sampling period. Matrix A is called the "dynamical matrix". State transition matrix is something else.
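To make the distinction concrete, the standard linear-systems relations can be written out as follows. This is a brief sketch in textbook notation (the symbols \Phi, t_0 and k_0 are introduced here for illustration and are not taken from the manuscript):

```latex
% Continuous time: A is the dynamical matrix, \Phi is the state transition matrix
\dot{x}(t) = A\,x(t) + B\,u(t), \qquad \Phi(t, t_0) = e^{A (t - t_0)}

% Discrete time: \Phi coincides with A only for a single sampling step
x(k+1) = A\,x(k) + B\,u(k), \qquad \Phi(k, k_0) = A^{\,k - k_0}, \qquad \Phi(k_0 + 1, k_0) = A
```

In both cases A defines the dynamics, while \Phi maps the state at one time instant to the state at another in the absence of inputs.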

Relevance to the journal
The MS introduction is written in such a way that it instigates interest from the target audience of this journal (mathematical biology). However, the techniques implemented in the toolboxes are usually found around the network science community, which includes several other fields that range from power systems to social networks. To show more coherence to this journal, the authors should provide some example of application on a "real-world network" under a biological context. This is also interesting to show how the results provided by the toolbox can be useful to draw conclusions under a practical context other than a "toy problem" as presented.

Background
The MS content is not sufficient to understand the techniques implemented in the toolbox. We know that the paper does not aim at providing extensive background, but some could be helpful. A reader coming from a control theory background might expect the toolbox to provide results based on Kalman's definition of controllability and observability. However, the toolbox is based on Liu and coworker's maximum matching algorithm which is based, in turn, on Lin's structural definition (1974) (which is not even mentioned in the MS). Which definition of controllability and observability the toolbox is based on should be crystal clear in the main text and abstract. Moreover, the notions of structural controllability and observability should be presented to the reader and how they interplay with the more well-known notion in Kalman's sense (e.g. Lin's definition is only a necessary condition for Kalman's definition).
Although we think that the definitions of controllability and observability should be mentioned on the main text, the authors provide some background on the implemented maximum matching algorithms in the toolbox. This, however, should be mentioned explicitly in the MS.
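The interplay between the structural and the Kalman notions can be illustrated with a small numerical sketch. The code below is only an illustration under stated assumptions: it uses NumPy, the function names `controllability_matrix` and `is_controllable` are hypothetical and are not part of the NOCAD toolbox, and the two 3-state matrices are made-up examples that share one zero/non-zero pattern.

```python
import numpy as np

def controllability_matrix(A, B):
    """Return the Kalman controllability matrix [B, AB, ..., A^(n-1)B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank condition: rank of the controllability matrix equals n."""
    n = A.shape[0]
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == n

# Hypothetical example: x1 drives x2 and x3, which both have self-loops.
# A[i, j] != 0 means that state j influences state i.
B = np.array([[1.0], [0.0], [0.0]])

# Same structure, two different weight choices.
A_generic = np.array([[0.0,  0.0,  0.0],
                      [1.0, -1.0,  0.0],
                      [1.0,  0.0, -2.0]])
A_symmetric = np.array([[0.0,  0.0,  0.0],
                        [1.0, -1.0,  0.0],
                        [1.0,  0.0, -1.0]])

generic_ok = is_controllable(A_generic, B)      # distinct self-loop weights
symmetric_ok = is_controllable(A_symmetric, B)  # identical self-loop weights
```

Both matrices have the same structure, yet only the one with distinct self-loop weights passes the Kalman rank test. A structural analysis, which looks only at the pattern of non-zero entries, cannot distinguish between the two, which is why the definition in use should be stated explicitly.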

Examples
The paper furnishes examples to illustrate some of the features of the new toolbox. The network topology is the same in each example, which facilitates understanding and comparison. In some cases, more discussion would be welcome. For instance, in the last example (Figure 7) not all the indices are clearly defined to the user, for instance the critical nodes are given as x2, x4 and x7. From the context it seems that if any of these nodes is lost, the network would become uncontrollable, but this is not directly stated.
In referring to Figure 5 the authors speak in terms of Reachability matrices: Rc and Ro. It is clear from Figure 5 that Rc shows which nodes have a path to node i (the ith column). As for Ro, it seems that we should look row-wise instead of column-wise, is that right? Anyhow, we do not think it is adequate to say that Rc shows which nodes can be controlled from another one. The word "control" is perhaps too general. We would suggest just to say that Rc shows which nodes can be reached from another one.
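The row-wise versus column-wise reading can be checked on a toy graph. The following is a minimal sketch, assuming the convention that `adj[i][j] = 1` denotes an edge from node i to node j; the function name `reachability` is hypothetical, and the orientation used for Rc and Ro in the toolbox may differ from this convention.

```python
def reachability(adj):
    """Transitive closure (Warshall's algorithm) of a directed graph.

    reach[i][j] is True if there is a directed path from node i to node j
    (every node is taken to reach itself).
    """
    n = len(adj)
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

# Toy graph: 0 -> 1 -> 2, node 3 is isolated.
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
reach = reachability(adj)

# Row 0: nodes that can be reached from node 0.
reached_from_0 = [j for j in range(4) if reach[0][j]]
# Column 2: nodes that have a path to node 2.
can_reach_2 = [i for i in range(4) if reach[i][2]]
```

Under this convention a row lists what a node can reach and a column lists what can reach a node, so the transposed matrix swaps the two readings.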

Toolbox
The toolbox is subdivided into three modules. The first one, "network mapping module", implements the maximum matching algorithm (and related modifications) to return a structurally controllable (observable) network with the smallest set of driver (sensor) nodes. This is a welcome feature, especially for a MATLAB environment, for which we are unaware of any alternative.
The second module, "system characterization module", provides several graph measures that are available on other MATLAB-based toolboxes, but are indeed useful to assess the network controllability and observability properties of a system. Thus, its presence is justifiable.
The third module, "improvements and robustness module", is a set of functions which specifically implement previous results of the authors (e.g. Refs. 20^1, 21^2). It seems quite specific, but nevertheless the toolbox relevance is justifiable in great part for its module one.
The toolbox seems fast and no bugs were found in its implementations as far as the MATLAB environment is concerned. See further comments on some compatibility issues when using Octave.

The target audience of the toolbox
The toolbox is applicable to any kind of network (graph), be it directed or undirected, weighted or unweighted, and so on. However, although general, the authors should discuss in the MS what are the kinds of networks where a structural analysis of controllability and observability is more useful. For instance, a linearization of a power system modelled by interconnected Kuramoto oscillators yields a dynamical matrix "A" whose corresponding adjacency graph is not only highly connected but also undirected. Consequently, the toolbox points out that only one driver node is needed to structurally control the network, independently of its size "n" (which is true in this case according to Lin's definition). However, this does not give insight into the problem since basically any node can be chosen as a driver node, and only one node being sufficient seems quite unrealistic. The question is: For what kind of networks is the structural approach more interesting (and hence the toolbox)? It could be that the toolbox could be extra-helpful in the context of more sparse and directed networks (with higher hierarchy).
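The effect described above can be reproduced with a short sketch of the maximum matching approach. The code is an illustration only, not the toolbox's implementation: it uses Kuhn's augmenting-path algorithm on the bipartite representation of the graph, the function names `max_matching_size` and `driver_node_count` are hypothetical, and the driver node count is taken as max(N - |M|, 1), where |M| is the size of a maximum matching.

```python
def max_matching_size(n, edges):
    """Size of a maximum matching of a directed graph on nodes 0..n-1.

    Each node is split into an out-copy (left side) and an in-copy
    (right side); Kuhn's augmenting-path algorithm is applied.
    """
    succ = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
    match_in = [-1] * n  # match_in[v] = u if edge u -> v is in the matching

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_in[v] == -1 or try_augment(match_in[v], seen):
                match_in[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(n))

def driver_node_count(n, edges):
    """Unmatched nodes must be driven directly; at least one driver is needed."""
    return max(n - max_matching_size(n, edges), 1)

# Dense undirected network: complete graph on 5 nodes, each edge in both directions.
dense_edges = [(i, j) for i in range(5) for j in range(5) if i != j]
dense_drivers = driver_node_count(5, dense_edges)

# Sparse directed network: a star with edges 0 -> 1, 0 -> 2, 0 -> 3.
star_edges = [(0, 1), (0, 2), (0, 3)]
star_drivers = driver_node_count(4, star_edges)
```

For the dense undirected example a single driver node suffices regardless of which node is picked, whereas the sparse directed star needs three, which is the kind of contrast the question about suitable network classes is aimed at.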

Some nitpicking:
Sometimes it is not clear whether the authors refer to the dynamical matrix "A" or the adjacency matrix "A^T". We recommend that different nomenclatures be used, such as "A" for the dynamical matrix and "A_{Adj}" for the adjacency matrix. This should be changed in the MS and manual.
Regarding the installation of the NOCAD toolbox, we noticed that NOCAD already comes with an "octave-network-toolbox" folder which does not have all the necessary functions to use Module 2. Thus, we had to download the "octave-network-toolbox-master" folder in Ref. 22 to have access to all needed functions. Is this necessary or is the NOCAD toolbox really missing some functions?
Some statements in Section "Use Cases", Paragraph 5, are redundant with the information already present in Fig. 4. Instead of repeating the same information, it might be more interesting to make some comments on how useful some of these network measures can be to design better and more robust networks from a control and observation point-of-view.
There are some issues in the Octave version of this toolbox. For instance, function "heatmaps" has some bugs in Octave but works well in MATLAB. This happens because the function "colormap" from Octave accepts the argument "hot" but not "Hot". The authors should do a careful review of the Octave toolbox and check for further compatibility issues.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Control theory, control of networked, nonlinear dynamics.
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Author Response 10 Sep 2019
Janos Abonyi, University of Pannonia, Veszprém, Hungary
Dear Reviewers,
We are grateful for your useful remarks. In the following, we provide a detailed report about how we improved the paper based on your valuable comments and suggestions. We hope that the modifications have significantly improved the understandability of the paper.
Sincerely yours,
Janos Abonyi
In this paper the authors describe the NOCAD toolbox developed for the analysis of some aspects of networks as, for instance, structural controllability and observability. The topic of the paper is not only very interesting but also timely as issues relating to controllability and observability of networks could be fundamental in a number of practical situations. Hence to have a nice set of tools to analyze dynamical networks is welcome.
The toolbox described is useful in a specific range of problems. Here we focus on the controllability and observability properties of networks, as prompted by the title. The tools presented focus on structural controllability and observability of linear systems. The assumption of linearity is not mentioned in the abstract. As a matter of fact, the opening phrase of the last paragraph in the introduction should be copied to the abstract.
The nonlinear case and other definitions of controllability and observability, such as, dynamical and symbolical are not mentioned in the paper nor are handled by the toolbox. This is an important remark as it is now known that some algorithms underestimate the cardinality of the set of sensor nodes when applied to nonlinear systems.
Thank you for this invaluable remark. We have emphasised in both the Abstract and Introduction that the toolbox is applicable to the analysis of linear and linearized systems rather than nonlinear systems, for which incorrect results may be suggested.

The paper
In the second paragraph of the Introduction, the authors mention the importance of "determining the proper driver nodes". We wonder if the average reader would know what that is. In clarifying this point the authors would like to address first what is "a" proper set of driving nodes (e.g. one that will guarantee full controllability). However, in order to specify "the" proper set, possibly some more detailed measure of controllability should be employed. For instance, suppose two sets of driving nodes S1 and S2, both with the same cardinality, such that the network is fully controllable either from S1 or from S2. Hence S1 is "a" proper set of driving nodes (assuming that by proper the authors are referring to full controllability) and S2 is another one. Now, it could well be that using S1 less energy is required to drive the network from one state x(ti) to another x(tf) when compared to S2. In that case, S1 and S2 are not totally equivalent. Structural controllability and observability are unable on their own to provide this distinction.
Thank you for your valuable remark. We have expanded the relevant part of the paper and emphasized that during optimisation the cardinality and energy demand should be minimised, while the configurations should provide structural controllability and observability.

Following the same vein, we wonder if the average reader of F1000Research would know the distinction of static and dynamic networks. A word about this would be profitable, especially because the toolbox refers mainly to the second class.
We are grateful for this suggestion. In the Introduction, we have clarified the nomenclature as well as the system and network classes used in the manuscript.

In some parts of the paper the authors refer to matrix A as the "state transition" matrix. In continuous-time this is incorrect. In discrete-time this is only correct for a transition time of one sampling period. Matrix A is called the "dynamical matrix". State transition matrix is something else.
Thank you for this invaluable remark. We have corrected the incorrect nomenclature.

Relevance to the journal
The MS introduction is written in such a way that it instigates interest from the target audience of this journal (mathematical biology). However, the techniques implemented in the toolboxes are usually found around the network science community, which includes several other fields that range from power systems to social networks. To show more coherence to this journal, the authors should provide some example of application on a "real-world network" under a biological context. This is also interesting to show how the results provided by the toolbox can be useful to draw conclusions under a practical context other than a "toy problem" as presented.
We are grateful for this useful suggestion. We have improved the Introduction and, with the aid of references, introduced the networks that are already utilised in network-based structural controllability and observability analysis. In addition, the example problem has been replaced by the frontal neural network of C. elegans, a well-known standard biological dataset.

Background
The MS content is not sufficient to understand the techniques implemented in the toolbox. We know that the paper does not aim at providing extensive background, but some could be helpful.
We are grateful for your useful remarks. We have improved the section entitled Methods and extended it with the theoretical background to dynamical systems, the definitions of structural controllability and observability, and the applied methods. The proposed works from the literature have been cited in the relevant sections, and for this suggestion we would like to express our sincere gratitude.

Examples
The paper furnishes examples to illustrate some of the features of the new toolbox. The network topology is the same in each example, which facilitates understanding and comparison. In some cases, more discussion would be welcome. For instance, in the last example (Figure 7) not all the indices are clearly defined to the user, for instance the critical nodes are given as x2, x4 and x7. From the context it seems that if any of these nodes is lost, the network would become uncontrollable, but this is not directly stated.
In referring to Figure 5 the authors speak in terms of Reachability matrices: Rc and Ro. It is clear from Figure 5 that Rc shows which nodes have a path to node i (the ith column). As for Ro, it seems that we should look row-wise instead of column-wise, is that right? Anyhow, we do not think it is adequate to say that Rc shows which nodes can be controlled from another one. The word "control" is perhaps too general. We would suggest just to say that Rc shows which nodes can be reached from another one.
We are grateful for your valuable remark and have expressed the results provided by the toolbox as well as inserted a more thorough introduction to the results. According to the reachability matrices, control and reach derived from several dynamical systems overlap to some extent when the problem with regard to controllability was reduced to one concerning reachability.

Toolbox
The toolbox is subdivided into three modules. The first one, "network mapping module", implements the maximum matching algorithm (and related modifications) to return a structurally controllable (observable) network with the smallest set of driver (sensor) nodes. This is a welcome feature, especially for a MATLAB environment, for which we are unaware of any alternative.
The second module, "system characterization module", provides several graph measures that are available on other MATLAB-based toolboxes, but are indeed useful to assess the network controllability and observability properties of a system. Thus, its presence is justifiable.
The third module, "improvements and robustness module", is a set of functions which specifically implement previous results of the authors (e.g. Refs. 20^1, 21^2). It seems quite specific, but nevertheless the toolbox relevance is justifiable in great part for its module one.
The toolbox seems fast and no bugs were found in its implementations as far as the MATLAB environment is concerned. See further comments on some compatibility issues when using Octave.
We are grateful for your useful remarks and have highlighted in the section entitled Implementation the importance of the third module for dynamical systems in general and biological systems specifically.

The target audience of the toolbox
The toolbox is applicable to any kind of network (graph), be it directed or undirected, weighted or unweighted, and so on. However, although general, the authors should discuss in the MS what are the kinds of networks where a structural analysis of controllability and observability is more useful. For instance, a linearization of a power system modelled by interconnected Kuramoto oscillators yields a dynamical matrix "A" whose corresponding adjacency graph is not only highly connected but also undirected. Consequently, the toolbox points out that only one driver node is needed to structurally control the network, independently of its size "n" (which is true in this case according to Lin's definition). However, this does not give insight into the problem since basically any node can be chosen as a driver node, and only one node being sufficient seems quite unrealistic. The question is: For what kind of networks is the structural approach more interesting (and hence the toolbox)? It could be that the toolbox could be extra-helpful in the context of more sparse and directed networks (with higher hierarchy).
Thank you for your valuable remark. We believe no "better" types of networks for structural analysis are known since the importance of a node in a complex system from a structural point of view can be excessive or critical with regard to reliability, independent of the directedness in the network. The hardest aspect of these analyses is the evaluation of the results, which should be provided by experts; moreover, this may cause undirected or directed and sparser or denser networks to be preferred.

Some nitpicking:
Sometimes it is not clear whether the authors refer to the dynamical matrix "A" or the adjacency matrix "A^T". We recommend that different nomenclatures be used, such as "A" for the dynamical matrix and "A_{Adj}" for the adjacency matrix. This should be changed in the MS and manual.
Thank you for this invaluable remark. We have used the nomenclature "A^T" to emphasise the transposition, i.e. the connection between the dynamical and adjacency matrices.
We are grateful for the thorough testing of the toolbox. Unfortunately, some functions from the octave-networks-toolbox were excluded during its upload, but this has been corrected.
○ Some statements in Section "Use Cases", Paragraph 5, are redundant with the information already present in Fig. 4. Instead of repeating the same information, it might be more interesting to make some comments on how useful some of these network measures can be to design better and more robust networks from a control and observation point-of-view.
Thank you for your remark. As we have replaced the example network, the diagrams have been removed and the relevant parts improved. We hope that this problem has been resolved.

There are some issues in the Octave version of this toolbox. For instance, function "heatmaps" has some bugs in Octave but works well in MATLAB. This happens because the function "colormap" from Octave accepts the argument "hot" but not "Hot". The authors should do a careful review of the Octave toolbox and check for further compatibility issues.
Thank you for your remark. Interestingly, our system did not alert us to this problem. We have corrected the source code as suggested since both parameters were applied to our configurations, however, we kindly ask you to check your version of Octave as the problem can also be caused by the use of an older version of this program.
The ideas implemented in the toolbox are interesting, but I feel that their presentation has to be improved. First, I suggest you get your text proof-read by a native speaker. Second, many concepts used in the manuscript should be at least briefly recalled. Third, the article lacks a detailed description of the input (in the general sense, including user interaction) of the toolbox. The basic input seems to be the incidence matrix of a directed network but, if I well understood, the toolbox allows to set input and output nodes and has network design capabilities.
Thank you for this remark. We used Grammarly and asked a professional proofreader to correct the second version of the manuscript. We have tried our best to minimize the number of grammatical errors.
At the beginning of the section entitled Methods, the representation of linear systems, definitions of structural controllability as well as observability, and introductions to the methodology in addition to maximum matching were included; therefore, a review of the cited literature was deemed unnecessary. The inputs of the modules were clarified in the section entitled Implementation.
○ representation. In addition, in the section entitled Implementation, each of the inputs to the modules was described separately.
I found the paragraph about the modularity of the toolbox is not very useful. It could be shortened or even discarded.
We are grateful for your valuable suggestion and have deleted this paragraph.

Use cases
If I well understood, the input is the incidence matrix A from which are computed the matrices B, C and D as illustrated with the first example. It should be interesting to apply the approach on a biological network, for instance a protein-protein interaction network or a regulatory network, in order to illustrate the interest of the results obtained in this context.
Thank you for your valuable remark. We have replaced the example network with the frontal neural network of C. elegans to draw attention to the easily understandable measures rather than the useful results. Therefore, the results from the analysis of the frontal neural network of C. elegans are presented, while in the manual the example network is still available.

In the second example (Figure 3), input and output nodes seem to be set arbitrarily in order to get a system complex-enough to show the function of the second module of the toolbox (incidence matrix is the same as that of the first example). Is it possible to illustrate the toolbox with an example in which the input and outputs are not fixed? I am actually concerned with the fact that the use cases do not really show how the toolbox can be used on a real dataset, which may not have natural input and outputs.
Since we replaced the example network with the frontal neural network of C. elegans, it is no longer necessary to apply fixed inputs and outputs. We therefore hope that this problem has been resolved.

The different types of nodes displayed in Figure 5 should be defined rather than just refer to [24]^3.
Thank you for this invaluable remark. We have expanded this part of the paper and, with references, mentioned the importance of these types in biological networks.

Could you please give an idea about what kind of biological networks can be studied by the means of your third module? This part is not very intuitive from a biological point of view.
We are grateful for this useful suggestion. We have dedicated a paragraph to the importance of this module in the section entitled Implementation. Given the improved theoretical background, we hope that the presence of this module is more suitable.

Is the rationale for developing the new software tool clearly explained? Partly
As no widespread and approved tool for the network-based structural controllability and observability analysis of dynamical systems is known, we propose our toolbox for this purpose, and its applicability is introduced with the aid of a real network from a standard biological dataset.

Is the description of the software tool technically sound? Partly
We have improved the introduction to the theoretical background; therefore, the paper can be regarded as an independent work that introduces the advantages of the methodology and implemented toolbox.

Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others? Partly
We have stated the input used in this article and the toolbox; moreover, a well-known example was used to maximise reproducibility.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? No
We have included a more detailed introduction to the theoretical background in order to provide sufficient information concerning the expected outputs and results.

Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Partly
We hope that the frontal neural network of C. elegans and its results represent well the applicability of the toolbox, and believe in the importance of such a novel toolbox, even though numerous publications utilise this methodology.