Research Article
Revised

Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU)

[version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]
PUBLISHED 22 Dec 2022

This article is included in the Research Synergy Foundation gateway.

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Background: Smart grid systems require high-quality phasor measurement unit (PMU) data for proper operation, control, and decision-making. Missing PMU data may lead to improper actions or even blackouts. While the conventional cubic interpolation methods based on the solution of a set of linear equations to solve for the cubic spline coefficients have been applied by many researchers for the interpolation of missing data, the computational complexity increases non-linearly with increasing data size.
Methods: In this work, a modified recurrent equation-based cubic spline interpolation procedure for recovering missing PMU data is proposed. The recurrent equation-based method simplifies the computation of the spline constants. Using PMU data from the State Load Despatch Center (SLDC) in Madhya Pradesh, India, the root mean square error (RMSE) and time of calculation (ToC) of both methods are compared.
Results: The modified recurrent relation method retrieved missing values about 10 times faster than the conventional cubic interpolation method based on the solution of a set of linear equations. The RMSE values show that the proposed method is effective even for special cases of missing values (edges, continuous missing values).
Conclusions: The proposed method can retrieve any number of missing values at any location using observed data with a minimal number of calculations.

Keywords

phasor measurement unit, missing data, data recovery, smart grid, interpolation, cubic spline, data quality, data pre-processing

Revised Amendments from Version 1

The data and plots in the Results and Discussion section have been updated after some issues were highlighted by the reviewers. Briefly, we re-examined our implementation and due to an oversight, we determined that we had incorrectly calculated the imputation values, thus affecting the RMSE and ToC values. This issue has been rectified and the updated results reflect the correct values.

We also made grammatical and typographical corrections to improve the readability of the paper.

See the authors' detailed response to the review by Wun She Yap
See the authors' detailed response to the review by Mathias Foo
See the authors' detailed response to the review by Shaik Mullapathi Farooq

Introduction

The worldwide growth of power systems highlights the need for better monitoring and control mechanisms to avoid major blackouts. Smart grids are intelligent systems that facilitate the development of communication, network, and computing technologies, protocols, and standards to integrate power system elements for two-way communication. The Phasor Measurement Unit (PMU), a time-synchronized, high-precision measurement device also known as a synchrophasor, gives clear information on the working of the entire grid. The PMU is used to monitor and control the power grid, and its real-time measurements can help avoid adverse conditions such as blackouts. The combined characteristics of data availability, timeliness, and the communication network contribute to the better performance of the PMU system. Although the role, impact [1], architecture, technology [2], applications, functionality, standards, and evolution of PMU (timing, measurement, communication, and data storage) have been documented since 1995, the North American Synchro Phasor Initiative (NASPI) has highlighted the importance of data quality [3]. Data quality issues, their potential causes, and their consequences have been elaborated [4-6]. Generally, incomplete or missing data might affect the functionality of the entire system [7]. Hence, a way to handle missing values in PMU data is essential for the effective functioning of the entire grid system.

In this paper, a modified recurrent equation-based method, termed the Alpha Method (AM), is proposed for the PMU missing data problem. In this approach, the system of linear equations for the spline coefficients is solved using a modified recurrent equation to obtain a relationship between points on a spline, which is then used to estimate any missing values on the spline. The results are compared with the conventional cubic spline interpolation, which solves a tri-diagonal matrix system for the spline coefficients and is termed the Linear Equations Method (LE) in this paper.

Literature review

The need to fill in the missing values in PMU data, and the potential causes of such gaps, have been reviewed [5-7]. These works imply the need for missing data recovery techniques for PMU data to enhance the accuracy of the decision-making process, and show the data quality and security risks associated with missing PMU data. One popular approach is matrix completion (MC)-based missing data recovery [8-12]. MC is the most exploited technique; however, some of these approaches were only theoretical, and others were tested only with simulated data.

Interpolation-based missing data recovery techniques [13-15] reconstruct missing values by spatial or spatio-temporal interpolation. Yet they require historical data from the same channel or the same time for the interpolation. A few advanced or hybrid approaches [16,17], such as k-nearest-neighbor and recurrent relation-based interpolation, have not yet been applied to PMU data.

Missing data is a common problem in all fields of study; hence, a variety of solutions have been found effective depending on the data pattern, data processing model, and data quality needs. However, adopting conventional techniques for treating missing values can become complex, especially when handling the high precision and volume of PMU data [15]. Therefore, there is a need for a missing data recovery method for PMU data. NASPI presents a variety of data requirements, attributes, and data quality problems for both static data and real-time data. There is a need for an effective data recovery method that works without historical data processing and training [3]. A data-driven recovery technique capable of recovering missing entries from the available or observed data is therefore much needed. Moreover, the technique should not become complex and time-consuming as the size of the data grows.

Methods

Cubic spline interpolation is a widely used polynomial interpolation method for functions of one variable. Let $f$ be a function from $\mathbb{R}$ to $\mathbb{R}$. It is assumed that the value of $f$ is known only at $x_1 \le x_2 \le \dots \le x_i \le \dots \le x_n$, and let $f(x_i) = a_i$. Piecewise cubic spline interpolation is the problem of finding the $b_i$, $c_i$ and $d_i$ coefficients of the cubic polynomials $SF_i$ for $0 \le i \le n-1$ written in the form:

(1)
$$SF_i(x) = a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3$$

where $x$ can take any value between $x_i$ and $x_{i+1}$. That is,

(1a)
$$SF_i(x_i) = a_i$$

Let the first-order derivative of equation (1) be:

(2)
$$SF_i'(x) = b_i + 2 c_i (x - x_i) + 3 d_i (x - x_i)^2$$

The first-order derivative at $x_i$ for values of $1 \le i \le n-1$ will be:

(2a)
$$SF_i'(x_i) = b_i$$

And let the second-order derivative be:

(3)
$$SF_i''(x) = 2 c_i + 6 d_i (x - x_i)$$

The second-order derivative at $x_i$ for values of $1 \le i \le n-1$ will be:

(3a)
$$SF_i''(x_i) = 2 c_i$$

For a smooth fit between the adjacent pieces, the cubic spline interpolation requires that the following conditions hold:

  • 1. Adjacent cubic functions should meet at the data point they share, for $i = 0$ to $n-1$:

    (4)
    $$SF_i(x_{i+1}) = SF_{i+1}(x_{i+1}) = a_{i+1}$$

  • 2. For each cubic function to join smoothly with its neighbors, the splines should have continuous first and second derivatives at the data points $i = 1, \dots, n-1$:

    (5)
    $$SF_i'(x_{i+1}) = SF_{i+1}'(x_{i+1}) = b_{i+1}$$
    (6)
    $$SF_i''(x_{i+1}) = SF_{i+1}''(x_{i+1}) = 2 c_{i+1}$$

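If $h_i = x_{i+1} - x_i$ and if $h_i$ is equal to a common value $h$ for all $i$, following Revesz [17], the relation between the coefficients $a_i$ and $c_i$ can be resolved:

(7)
$$c_{i-1} + 4 c_i + c_{i+1} = \frac{3}{h^2}\left(a_{i-1} - 2 a_i + a_{i+1}\right)$$
(8)
$$b_i = \frac{a_{i+1} - a_i}{h_i} - \frac{2 c_i + c_{i+1}}{3}\, h_i$$
(9)
$$d_i = \frac{1}{3 h_i}\left(c_{i+1} - c_i\right)$$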

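Equation (7) represents a system of linear equations for the unknowns $c_i$ for $0 \le i \le n$. As the values of $a_i$ are known, the values of $c_i$ can be found by solving the tri-diagonal matrix-vector equation $Ax = B$. While there are $n+1$ constants $c_i$, equation (7) yields only $n-1$ equations, one for each interior point. Two more equations, representing the boundary conditions, are supplied according to the type of spline assumed. In general, two types of splines may be considered: the natural cubic spline and the clamped cubic spline.

For natural cubic spline interpolation, the following boundary conditions are assumed: $c_0 = c_n = 0$. That is, the second derivatives of the splines at the endpoints are assumed to be zero. Based on equation (7), a system of $n+1$ linear equations in $n+1$ variables can be formulated as:

(10)
$$A = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 4 & 1 & 0 \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}, \quad
x = \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{bmatrix}, \quad \text{and} \quad
B = \begin{bmatrix} 0 \\ \frac{3}{h^2}(a_0 - 2a_1 + a_2) \\ \vdots \\ \frac{3}{h^2}(a_{n-2} - 2a_{n-1} + a_n) \\ 0 \end{bmatrix}$$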
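The natural-spline system above can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the knots are assumed equally spaced with spacing `h`, and the tri-diagonal system is solved with the Thomas algorithm, which is one of the standard methods for such systems.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm: forward elimination, then back-substitution."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def natural_spline_coefficients(a, h=1.0):
    """Return (b, c, d) for knots 0..n with equal spacing h (assumed)."""
    n = len(a) - 1
    # Boundary rows encode c_0 = 0 and c_n = 0; interior rows are 1, 4, 1.
    lower = [0.0] + [1.0] * (n - 1) + [0.0]
    diag = [1.0] + [4.0] * (n - 1) + [1.0]
    upper = [0.0] + [1.0] * (n - 1) + [0.0]
    rhs = [0.0] * (n + 1)
    for i in range(1, n):
        rhs[i] = 3.0 / h**2 * (a[i - 1] - 2.0 * a[i] + a[i + 1])
    c = solve_tridiagonal(lower, diag, upper, rhs)
    # Equations (8) and (9) give b_i and d_i on each of the n pieces.
    b = [(a[i + 1] - a[i]) / h - (2.0 * c[i] + c[i + 1]) * h / 3.0
         for i in range(n)]
    d = [(c[i + 1] - c[i]) / (3.0 * h) for i in range(n)]
    return b, c, d


def evaluate_piece(a, b, c, d, i, dx):
    """Equation (1) on piece i, with dx = x - x_i."""
    return a[i] + b[i] * dx + c[i] * dx**2 + d[i] * dx**3
```

For data lying on a straight line, the right-hand side vanishes, so every $c_i$ and $d_i$ is zero and each piece reduces to the line itself, which is a quick sanity check on the construction.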

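For clamped cubic spline interpolation, the following boundary conditions are assumed: $b_0 = f'(x_0)$ and $b_n = f'(x_n)$, where the derivatives $f'(x_0)$ and $f'(x_n)$ are known constants. Thus, with the boundary conditions assumed, both natural and clamped cubic splines result in a system of $n+1$ linear equations. The resulting system can be solved for a unique solution by any of the standard methods for solving a system of linear equations.

Once the values of $c_i$ are found, the $b_i$ and $d_i$ values can be obtained using equations (8) and (9), respectively. Under clamped spline interpolation, the corresponding system is:

(11)
$$A = \begin{bmatrix}
2 & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 4 & 1 & 0 \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & 2
\end{bmatrix}, \quad
x = \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{bmatrix}, \quad \text{and} \quad
B = \begin{bmatrix} \frac{3}{h^2}(a_1 - a_0) - \frac{3}{h} f'(x_0) \\ \frac{3}{h^2}(a_0 - 2a_1 + a_2) \\ \vdots \\ \frac{3}{h^2}(a_{n-2} - 2a_{n-1} + a_n) \\ \frac{3}{h} f'(x_n) - \frac{3}{h^2}(a_n - a_{n-1}) \end{bmatrix}$$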

Recurrence equation-based solution

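Revesz [17] chose boundary conditions that lead to the tri-diagonal system $Ax = b$ in equation (12), where the $x_i$ are rational variables, the $e_i$ are rational constants, $r$ is a non-zero rational constant, and $A$, $x$, and $b$ are:

(12)
$$A = \begin{bmatrix}
r & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 4 & 1 & 0 \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix}, \quad \text{and} \quad
b = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_{n-1} \\ e_n \end{bmatrix}$$

The first row of the new system in equation (12) is shown to be equivalent to the first row of the clamped system when $e_1$ is

(13)
$$e_1 = \frac{3r}{2h}\left(\frac{a_1 - a_0}{h} - f'(x_0)\right) + \left(1 - \frac{r}{2}\right)\tilde{c}_1$$
where $\tilde{c}_1$ is an estimate of $c_1$ and $r = 2 + \sqrt{3} \approx 3.732$ [17].

The boundary conditions are chosen such that the first row of the new matrix is the same as that of the clamped cubic spline, while the last row is that of the natural cubic spline, fixing the value of $c_n$ as 0.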
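As a hedged aside (this derivation is reconstructed from the definitions in this section and is not quoted from Revesz), the value $r = 2 + \sqrt{3}$ appears to be chosen because it satisfies $r + 1/r = 4$. That identity lets each interior row of the tri-diagonal system collapse into a two-term relation between neighboring unknowns. Here $\alpha_{i-1}$ is simply a label for the right-hand side obtained at the previous row:

```latex
\begin{align*}
r = 2+\sqrt{3} \;&\Rightarrow\; \frac{1}{r} = 2-\sqrt{3}, \qquad r + \frac{1}{r} = 4 \\
\text{first row:}\quad r x_1 + x_2 = e_1 \;&\Rightarrow\; x_1 + \frac{x_2}{r} = \frac{e_1}{r} \\
\text{row } i:\quad x_{i-1} + 4x_i + x_{i+1} &= e_i \\
\text{previous step:}\quad x_{i-1} + \frac{x_i}{r} &= \alpha_{i-1} \\
\text{subtracting:}\quad \left(4-\frac{1}{r}\right) x_i + x_{i+1} &= e_i - \alpha_{i-1} \\
\text{since } 4-\tfrac{1}{r} = r:\quad x_i + \frac{x_{i+1}}{r} &= \frac{e_i - \alpha_{i-1}}{r}
\end{align*}
```

Each step therefore costs one subtraction and one division, with no matrix elimination.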

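(14)
$$x_i + \frac{x_{i+1}}{r} = \sum_{0 \le k \le i-1} \frac{(-1)^k\, e_{i-k}}{r^{k+1}}$$

Let $\alpha_0$, $\alpha_i$ for $1 \le i \le n-1$, and $\alpha_n$, respectively, be:

$$\alpha_0 = 0$$
(15)
$$\alpha_i = \frac{e_i - \alpha_{i-1}}{r} = \sum_{0 \le k \le i-1} \frac{(-1)^k\, e_{i-k}}{r^{k+1}}$$
$$\alpha_n = e_n$$

Based on the above, the closed form of the solution for $x_i$ can be given as:

(16)
$$x_i = \sum_{0 \le k \le n-i} \left(-\frac{1}{r}\right)^{k} \alpha_{i+k}$$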

The above equation solves for $x_i$ whatever the initial values of $e_i$. This leads to a faster evaluation of the cubic spline than solving a tri-diagonal system. The major advantage of the method appears when new measurements are added to the system. While the conventional tri-diagonal matrix-based algorithm requires the entire computation to be redone, the closed form in equation (16) allows a faster update for each $i \le n$ with only the addition of the term:

(17)
$$\left(-\frac{1}{r}\right)^{n+1-i} \alpha_{n+1}$$
and $x_{n+1} = \alpha_{n+1}$. Similarly, the $\alpha_i$ constants can be updated by adding a single term $e_{n+1}$.

The system of linear equations given in equation (7) is, in general, solved by the standard solution of linear equations in the matrix form $Ax = b$. Alternatively, it can be solved for $n$ variables by the recurrence relations given in equations (15) and (16). The first method, which uses the tri-diagonal matrix-based solution for the spline coefficients, is termed the Linear Equations Method (LE method), and the second, which uses the recurrence relations, is termed the Alpha Method (AM). The algorithmic procedures for the LE method and the AM are given below.
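The recurrence can be sketched as follows, as an illustrative check rather than the authors' code. All function names are hypothetical. The sketch builds the alpha values, evaluates the closed-form sum, and also shows an equivalent backward pass ($x_n = \alpha_n$, then $x_i = \alpha_i - x_{i+1}/r$) that follows from unrolling the same sum; the backward pass is an observation added here, not something stated in the text. Substituting the result back into the tri-diagonal rows confirms that both forms solve the system.

```python
import math

R = 2.0 + math.sqrt(3.0)  # r = 2 + sqrt(3), so that r + 1/r = 4


def alpha_vector(e):
    """alpha_0 = 0, alpha_i = (e_i - alpha_{i-1}) / r, alpha_n = e_n.

    `e` holds e_1..e_n in a 0-based list; the result holds alpha_0..alpha_n.
    """
    n = len(e)
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (e[i - 1] - alpha[i - 1]) / R
    alpha[n] = e[n - 1]
    return alpha


def solve_closed_form(alpha):
    """x_i = sum over 0 <= k <= n-i of (-1/r)^k * alpha_{i+k}."""
    n = len(alpha) - 1
    return [sum((-1.0 / R) ** k * alpha[i + k] for k in range(n - i + 1))
            for i in range(1, n + 1)]


def solve_backward(alpha):
    """Same values in one pass: x_n = alpha_n, x_i = alpha_i - x_{i+1} / r."""
    n = len(alpha) - 1
    x = [0.0] * (n + 1)
    x[n] = alpha[n]
    for i in range(n - 1, 0, -1):
        x[i] = alpha[i] - x[i + 1] / R
    return x[1:]


def residuals(x, e):
    """Substitute x_1..x_n back into the rows of the tri-diagonal system."""
    n = len(e)
    res = [R * x[0] + x[1] - e[0]]                      # first row: r, 1
    res += [x[i - 1] + 4.0 * x[i] + x[i + 1] - e[i]     # interior rows: 1, 4, 1
            for i in range(1, n - 1)]
    res.append(x[n - 1] - e[n - 1])                     # last row: x_n = e_n
    return res
```

Evaluating the closed-form sum separately for every $i$ costs a number of operations that grows quadratically with $n$, whereas the backward pass visits each index once, so the latter is the form one would likely use in practice.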

Algorithmic procedure for regular tridiagonal matrix-based Linear Equation Method (LE)

Step 1: Given the initial vector with missing values, separate it into two vectors, the observed values vector Robs and the missing values vector RMiss, having sizes of NO and NM, respectively, such that NO + NM = N.

Step 2: The Robs vector at the $x_i$ values of the (NO-1) splines is the $a_i$ coefficient vector.

Step 3: Using $a_i$, generate the RHS vector B given in equation (11).

Step 4: Generate the square coefficient matrix A as given in equation (11).

Step 5: Solve for the $c_i$ vector given in equation (11), using the relation $Ax = B$.

Step 6: Applying $c_i$ in equations (8) and (9), compute the $b_i$ and $d_i$ coefficient vectors for n-2 points of Robs.

Step 7: Using the values of $a_i$, $b_i$, $c_i$ and $d_i$, missing values can be found by equation (1), re-written in powers of $x$ as:

(18)
$$SF_i(x) = \left(a_i - b_i x_i + c_i x_i^2 - d_i x_i^3\right) + \left(b_i - 2 c_i x_i + 3 d_i x_i^2\right) x + \left(c_i - 3 d_i x_i\right) x^2 + d_i x^3$$

where $x$ represents the missing positions, between $x_i$ and $x_{i+1}$ of spline $i$.
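The final step evaluates the spline piece at a missing position. A small sketch (hypothetical function names, illustrative coefficient values that are not taken from the PMU dataset) shows that multiplying equation (1) out into powers of $x$ gives the same value as evaluating it in powers of $(x - x_i)$:

```python
def spline_nested(a, b, c, d, xi, x):
    """Equation (1): powers of (x - x_i)."""
    dx = x - xi
    return a + b * dx + c * dx**2 + d * dx**3


def spline_expanded(a, b, c, d, xi, x):
    """Equation (1) multiplied out into powers of x."""
    k0 = a - b * xi + c * xi**2 - d * xi**3
    k1 = b - 2.0 * c * xi + 3.0 * d * xi**2
    k2 = c - 3.0 * d * xi
    k3 = d
    return k0 + k1 * x + k2 * x**2 + k3 * x**3


# Illustrative values only: one piece starting at x_i = 3.0, evaluated at x = 3.4.
value_nested = spline_nested(1.5, -0.5, 2.0, 0.25, 3.0, 3.4)
value_expanded = spline_expanded(1.5, -0.5, 2.0, 0.25, 3.0, 3.4)
```

The two forms are algebraically identical. As a general floating-point caution rather than a claim about the authors' results, the expanded form can lose some precision when $x_i$ is large, because it subtracts large intermediate terms.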

Algorithmic procedure for recurrent equation-based Alpha Method (AM)

Step 1: Given the initial vector with missing values, separate it into two vectors, the observed values vector Robs and the missing values vector RMiss, having sizes of NO and NM, respectively, such that NO + NM = N.

Step 2: The Robs vector at the $x_i$ values of the (NO-1) splines is the $a_i$ coefficient vector.

Step 3: Using $a_i$, generate the RHS vector B given in equation (11).

Step 4: Set $\alpha_0 = 0$ and $\alpha_n = e_n$, and calculate the alpha vector using the relation

$$\alpha_i = \frac{e_i - \alpha_{i-1}}{r} = \sum_{0 \le k \le i-1} \frac{(-1)^k\, e_{i-k}}{r^{k+1}}$$ for $i$ values ranging from 1 to NO-1.

Step 5: Set $x_n = \alpha_n$ and solve for the $c_i$ values using the relation

$$c_i = \sum_{0 \le k \le n-i} \left(-\frac{1}{r}\right)^{k} \alpha_{i+k}$$

Step 6: Applying $c_i$ in equations (8) and (9), compute the $b_i$ and $d_i$ coefficient vectors for n-2 points of Robs.

Step 7: Using the values of $a_i$, $b_i$, $c_i$ and $d_i$, missing values can also be found using equation (18), re-written here again for convenience:

(18)
$$SF_i(x) = \left(a_i - b_i x_i + c_i x_i^2 - d_i x_i^3\right) + \left(b_i - 2 c_i x_i + 3 d_i x_i^2\right) x + \left(c_i - 3 d_i x_i\right) x^2 + d_i x^3$$

where $x$ represents the missing positions, between $x_i$ and $x_{i+1}$ of spline $i$.
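Step 1, which is common to both procedures, can be sketched as below. The representation of a missing entry as `None` or NaN, the function name, and the sample values are all assumptions made for illustration; the text does not say how missing entries are encoded.

```python
import math


def split_observed_missing(values):
    """Return (observed_positions, observed_values, missing_positions)."""
    obs_pos, obs_val, miss_pos = [], [], []
    for pos, v in enumerate(values):
        # Assumption: a missing entry is stored as None or as a float NaN.
        if v is None or (isinstance(v, float) and math.isnan(v)):
            miss_pos.append(pos)
        else:
            obs_pos.append(pos)
            obs_val.append(float(v))
    return obs_pos, obs_val, miss_pos


# Made-up sample values, not taken from the PMU dataset.
sample = [50.01, None, 49.98, float("nan"), 50.03]
obs_pos, obs_val, miss_pos = split_observed_missing(sample)
```

The observed values then play the role of the $a_i$ coefficients, and the sizes satisfy NO + NM = N by construction.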

The modifications are as follows. In the AM method, rather than computing the E vector, the alpha vector, and the $c_i$ coefficients for the full range of NO-1 data points, only the RHS vector E was calculated for the full range of NO-1 data points, while the alpha vector and $c_i$ were calculated only for the data elements $i$ and $i+1$, where $i$ is the missing data element. For the imputation of the $i$th element, only the E vector for all NO-1 data points, the alpha and $c$ values for $i$ and $i+1$, and the $b_i$ and $d_i$ coefficients were needed.

In addition, using the AM, an effective procedure was demonstrated for the computation of the following cases: (i) missing first and last elements of the data vector, (ii) multiple missing data points at the beginning and the end, and (iii) multiple missing elements anywhere in the data vector. That is, in equation (18), when the current values of A[i] are replaced with either A[N-1] or A[i-1], depending on the position of the missing edge values or continuous values, the ToC and RMSE values improve significantly.

The formula for RMSE is:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left(\mathrm{Predicted}_i - \mathrm{Actual}_i\right)^2}{N}}$$
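A minimal sketch of this formula in Python follows; the function name and the input validation are assumptions added for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

In an imputation study, `predicted` would hold the recovered values and `actual` the true values at the positions that were masked.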

Results and discussion

A comparison between the LE method and the AM method is shown here for the imputation of one minute of real PMU system data, comprising 1490 data points for each of the 25 heterogeneous variables obtained from five different PMUs. Since our data does not have any missing values, we artificially introduced missing values at random, at rates of 10%, 20%, and 30%.

A sample of one-minute PMU data for five PMUs was used in the study [18]. For each of the five PMUs, one minute of data with 10%, 20%, and 30% missing data was evaluated.
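One way to introduce missing values at random is sketched below. The text does not specify how the positions were drawn, so the uniform sampling without replacement, the fixed seed, the `None` marker, and the function name are all assumptions.

```python
import random


def mask_at_random(values, fraction, seed=0):
    """Replace a given fraction of entries with None at random positions."""
    rng = random.Random(seed)                 # seeded for repeatability
    n_missing = round(fraction * len(values))
    missing = set(rng.sample(range(len(values)), n_missing))
    masked = [None if i in missing else v for i, v in enumerate(values)]
    return masked, sorted(missing)


# Placeholder series of the same length as one variable in the study.
series = [float(i) for i in range(1490)]
masked_10, missing_10 = mask_at_random(series, 0.10)
```

Keeping the list of masked positions alongside the masked series makes it straightforward to compare recovered values with the originals afterwards.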

When the AM method was employed, the average root mean squared error (RMSE) values were 0.83, 1.47, and 2.16 for 10%, 20%, and 30% of missing PMU data, respectively. This can be seen in Figure 1. Moreover, for the same performance, the AM method showed significant improvements in its time of calculation (ToC), as shown in Figure 2. The average ToCs for the proposed AM method were 1.35, 1.41, and 1.23 s when recovering 10%, 20%, and 30% of missing data, respectively.

By comparison, the LE method had ToC values of 18.83, 16.02, and 16.58 s for 10%, 20%, and 30% of missing data, respectively. The proposed method reduced the ToC by a factor of approximately 10.
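For reference, the speed-up implied by the reported average ToC values can be worked out directly as the ratio of the LE time to the AM time at each missing-data rate:

```latex
\begin{align*}
10\%:&\quad \frac{18.83}{1.35} \approx 13.9 \\
20\%:&\quad \frac{16.02}{1.41} \approx 11.4 \\
30\%:&\quad \frac{16.58}{1.23} \approx 13.5
\end{align*}
```

On these averages the ratio lies between about 11 and 14, which is consistent with the order-of-magnitude reduction stated above.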


Figure 1. Comparison of RMSE values.


Figure 2. Comparison of Time of Calculation (ToC).

Conclusions

In this study, the proposed AM method was compared with the LE method. Because of the proliferation of PMU data, the conventional technique needs customization to handle a high volume of data with reduced computational time and power. The proposed method demonstrated a reduced computational effort and time of calculation for solving the coefficient vectors. This study has made the following contributions: (i) the recurrent relation-based alpha method has been effectively employed in the imputation of PMU data, and its advantages are demonstrated as an effective and efficient alternative to the conventional technique, and (ii) an effective procedure for handling missing values in special cases (edge, continuous values) is shown, which has not been addressed clearly in other methods. The proposed method has proven effective, and it requires only about 10% of the effort of the LE method. Future research will focus on the application of the modified recurrent method in the analysis of real-time or stream PMU data.

Data availability

Underlying data

Harvard Dataverse: Underlying data for ‘Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU)’, ‘PMU data’, https://doi.org/10.7910/DVN/Y2LLJJ [18].

This project contains the following underlying data:

  • Data file: pmu1-1m-10.tab – One minute of data from PMU1 with 10% missing data
  • Data file: pmu1-1m-20.tab – One minute of data from PMU1 with 20% missing data
  • Data file: pmu1-1m-30.tab – One minute of data from PMU1 with 30% missing data
  • Data file: pmu2-1m-10.tab – One minute of data from PMU2 with 10% missing data
  • Data file: pmu2-1m-20.tab – One minute of data from PMU2 with 20% missing data
  • Data file: pmu2-1m-30.tab – One minute of data from PMU2 with 30% missing data
  • Data file: pmu3-1m-10.tab – One minute of data from PMU3 with 10% missing data
  • Data file: pmu3-1m-20.tab – One minute of data from PMU3 with 20% missing data
  • Data file: pmu3-1m-30.tab – One minute of data from PMU3 with 30% missing data
  • Data file: pmu4-1m-10.tab – One minute of data from PMU4 with 10% missing data
  • Data file: pmu4-1m-20.tab – One minute of data from PMU4 with 20% missing data
  • Data file: pmu4-1m-30.tab – One minute of data from PMU4 with 30% missing data
  • Data file: pmu5-1m-10.tab – One minute of data from PMU5 with 10% missing data
  • Data file: pmu5-1m-20.tab – One minute of data from PMU5 with 20% missing data
  • Data file: pmu5-1m-30.tab – One minute of data from PMU5 with 30% missing data
  • README.txt

Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).

The dataset presented in the work was obtained as real-world data from a regional Electricity authority in India.

how to cite this article
Thangaraj S, Goh VT and Yap TTV. Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU) [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]. F1000Research 2022, 11:246 (https://doi.org/10.12688/f1000research.73182.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations: A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions
Version 2
VERSION 2
PUBLISHED 22 Dec 2022
Revised
Reviewer Report 27 Nov 2023
Wun She Yap, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Kajang, Malaysia 
Approved with Reservations
VIEWS 12
The paper proposed a method to perform missing data recovery on PMU data. It uses a modified approach based on an established technique i.e. recurrent equation-based cubic spline interpolation. Although the paper can be difficult to follow in some sections ... Continue reading
HOW TO CITE THIS REPORT
Yap WS. Reviewer Report For: Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU) [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]. F1000Research 2022, 11:246 (https://doi.org/10.5256/f1000research.142368.r220241)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 18 Jan 2024
    Vik Tor Goh, Faculty of Engineering, Multimedia University, Cyberjaya, 63100, Malaysia
    18 Jan 2024
    Author Response
    The literature review has been amended to improve its readability and conciseness.

    We will include such visual aids in our future publications. Thank you for the suggestion.

    The necessary ... Continue reading
Reviewer Report 03 Jan 2023
Mathias Foo, School of Engineering, University of Warwick, Coventry, UK 
Approved
VIEWS 14
The authors have addressed all my comments ... Continue reading
HOW TO CITE THIS REPORT
Foo M. Reviewer Report For: Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU) [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]. F1000Research 2022, 11:246 (https://doi.org/10.5256/f1000research.142368.r158516)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Version 1
VERSION 1
PUBLISHED 28 Feb 2022
Reviewer Report 12 Jul 2022
Shaik Mullapathi Farooq, Department of Computer Science and Engineering, K. S. R. M. College of Engineering (UGC-Autonomous), Kadapa, Andhra Pradesh, India 
Not Approved
VIEWS 35
The manuscript proposes recurrent relation based alpha method to interpolate missing PMU data. Further, the authors try to prove that the proposed method reduces computational complexity. 

However, the comments are as follows, 
  1. The
... Continue reading
HOW TO CITE THIS REPORT
Farooq SM. Reviewer Report For: Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU) [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]. F1000Research 2022, 11:246 (https://doi.org/10.5256/f1000research.76818.r139819)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 22 Dec 2022
    Vik Tor Goh, Faculty of Engineering, Multimedia University, Cyberjaya, 63100, Malaysia
    22 Dec 2022
    Author Response
    The purpose of this preliminary paper is to introduce our work in missing data recovery using cubic spline interpolation, namely the mathematical foundation and algorithmic logic. These details have been ... Continue reading
Reviewer Report 10 Mar 2022
Mathias Foo, School of Engineering, University of Warwick, Coventry, UK 
Approved with Reservations
VIEWS 31
In general, there is promising aspect of the proposed method but it has to be conveyed in a clearer manner. Here are my comments.
  1. In Introduction section, the authors state that the comparison will be made
... Continue reading
HOW TO CITE THIS REPORT
Foo M. Reviewer Report For: Modified recurrent equation-based cubic spline interpolation for missing data recovery in phasor measurement unit (PMU) [version 2; peer review: 1 approved, 1 approved with reservations, 1 not approved]. F1000Research 2022, 11:246 (https://doi.org/10.5256/f1000research.76818.r125681)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Dec 2023
    Vik Tor Goh, Faculty of Engineering, Multimedia University, Cyberjaya, 63100, Malaysia
    27 Dec 2023
    Author Response
    The idea of cubic spline is the development of a series of unique cubic polynomials that are fitted between the data points. Based on four continuity relations between points in ... Continue reading
