COVID-19 forecast for 13 Caribbean countries using ARIMA modeling for confirmed, death, and recovered cases [version 1; peer review: awaiting peer review]

Background: The rapid spread of COVID-19 in the Caribbean region has prompted increased surveillance, with confirmed cases showing an increasing trend in 13 Caribbean countries. Our study aims to analyze the impact of COVID-19 (SARS-CoV-2) in 13 Caribbean countries in terms of the number of confirmed cases, deaths, and recovered cases. Methods: The study uses the ARIMA model on time series data retrieved from Johns Hopkins University. The data were analyzed using Stata 14 SE software for the period from January 22, 2020, to August 16, 2021, and forecasted until December 31, 2021. All chosen models were compared with other candidate models on criteria including AIC/BIC, log-likelihood, p-value significance at the 5% level, and coefficients < 1. The ACF and PACF graphs were plotted to reduce bias and select the best-fitting model. Results: The results show the predicted trend in confirmed, death, and recovered cases of COVID-19 for 13 Caribbean countries. The projected ARIMA model forecast for the period December 16, 2021, to December 31, 2021, shows 2470272 (95% CI 2438965 to 2501579) confirmed cases, 27220 (95% CI 26886 to 27555) deaths, and 818105 (95% CI 818085 to 818125) recovered cases related to COVID-19. The final ARIMA models chosen for confirmed COVID-19 cases, number of deaths, and recovered cases are ARIMA (9,2,4), ARIMA (1,2,1), and ARIMA (1,2,1), respectively. Conclusions: According to the forecasted COVID-19 models, there is a steady rise in confirmed, death, and recovered cases during the periods June 1, 2020, to November 30, 2020, and April 1, 2021, to June 15, 2021. The forecast shows an increasing trend for confirmed and recovered COVID-19 cases and a slowing of the number of deaths.


Introduction
According to the World Health Organization (WHO) and the Caribbean Public Health Agency (CARPHA), the ongoing pandemic caused by the SARS-CoV-2 virus is a grave concern worldwide, including in the 13 Caribbean countries, namely Antigua and Barbuda, Bahamas, Barbados, Cuba, Dominica, Dominican Republic, Grenada, Haiti, Jamaica, Saint Kitts and Nevis, Saint Lucia, Saint Vincent and the Grenadines, and Trinidad and Tobago. The risk of transmission of COVID-19 in the Caribbean is high due to the large inflow of tourists and the return of nationals from all over the world. The islands also have a large number of inter-island migrants who returned to their home states. Interestingly, the islands have built-in advantages in controlling the pandemic because individual islands can be isolated and inter-island spread can be controlled. However, the region is also at a particular disadvantage because its small population size causes its data to be overlooked in global discussions. The Caribbean is home to only 0.56% of the world population. The number of confirmed COVID-19 cases is low compared to the U.S., which had more than nine million confirmed cases as of Oct 15, 2020. There is an inevitable risk of a second wave, as new cases can emerge in the region due to the opening of borders. In addition, many of these countries have illegal migrants who are unwilling to approach the health system even when they realize that they need help. Porous borders are an unacknowledged source of worry in this crisis.
WHO, the Pan American Health Organization (PAHO), CARPHA, local health administrations, and other international organizations have developed risk assessment levels and established COVID-19 information centers in different regions [1,2]. All CARPHA member states have been advised to strengthen surveillance mechanisms, including random tests and lockdown measures, increase public awareness, and implement national preparedness plans. PAHO, based in Trinidad and Tobago, Barbados, and Jamaica, is proactively collaborating with global organizations including the WHO and the Ministries of Health in each of the Caribbean countries to convey information concerning the latest developments, public health policies, lockdown measures, therapeutics, and diagnostic modalities, including an overview of rapid testing among vulnerable populations [3]. PAHO is also collaborating with various Caribbean countries regarding the state of public awareness, the use of masks, maintenance of social distancing, and relaxation of lockdown norms in a phased manner [4-6]. Regional and international health authorities are assisting with infection control strategies, essential drugs, and the supply and training of essential health care workers. The Caribbean Community (CARICOM), along with its member states, has been working with the chief medical officers and local health administrators of various Caribbean countries to establish COVID-19 testing facilities, quarantine zones, and the National Emergency Operations Center for COVID-19 briefings [7]. In addition to the health sector, the growing COVID-19 pandemic has also affected tourism and businesses in the Caribbean.
To assess the spread of the SARS-CoV-2 virus in the region, we produce a short-term forecast of the number of confirmed cases, deaths, and recovered cases in the Caribbean region. Data analysis involves the autoregressive integrated moving average (ARIMA) model, which forecasts future values in a time series based on its own past (lagged) values and past forecast errors. The forecast will help CARPHA, PAHO, and other regional health organizations in the Caribbean organize resources, assess the risk of community transmission from imported cases, coordinate regional preparedness with Incident Management Teams (IMT), and prepare travel guidelines for local populations.
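For reference, a general ARIMA(p, d, q) model can be written in standard textbook notation as shown here. The symbols are conventional and are assumptions of this sketch rather than notation taken from the study: B is the backshift operator (B y_t = y_{t-1}), \phi_i are the autoregressive coefficients, \theta_j are the moving-average coefficients, c is a constant, and \varepsilon_t is the forecast error at time t.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right)(1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right)\varepsilon_t
```

Here p is the number of autoregressive lags, d is the order of differencing, and q is the number of moving-average lags, so a model written as ARIMA (1,2,1) has one autoregressive term, second-order differencing, and one moving-average term.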

Data
The data for the Stata analysis are freely available for academic and research purposes through the Johns Hopkins University GitHub repository. It is one of the largest online communities for health care professionals, health institutions, and medical researchers. The data are maintained by Johns Hopkins University in collaboration with the Center for Systems Science and Engineering (CSSE), with technical support from ESRI and the Johns Hopkins University Applied Physics Laboratory [8,9]. The data include information concerning confirmed, death, and recovered cases of COVID-19 from 13 Caribbean countries, namely Antigua and Barbuda, Bahamas, Barbados, Cuba, Dominica, Dominican Republic, Grenada, Haiti, Jamaica, Saint Kitts and Nevis, Saint Lucia, Saint Vincent and the Grenadines, and Trinidad and Tobago (in alphabetical order). The time frame of the data starts from January 22, 2020, since WHO situation reports about SARS-CoV-2 were published starting from January 21, 2020. Our forecast extends until December 31, 2021.

Study design and ARIMA modeling
The data consist of time series with information on province/state, country, longitude, latitude, and confirmed COVID-19 cases for each country. Similar time series were provided for COVID-19 deaths and recovered cases, from January 22, 2020, to August 16, 2021. The data were compiled in a Microsoft Excel spreadsheet and merged using a 'sum' formula entered for each date from January 22, 2020, with the forecast extending until December 31, 2021. The spreadsheet was converted to .dta format for ARIMA time series analysis in Stata 14 SE. The analysis can also be performed using PSPP, which is written in C, offers both a conventional command line and a graphical interface, uses the GNU Scientific Library, and is freely available. To declare the data as time series, the date variable was formatted using the Stata command 'format' with '%td' in DDMMYYYY and quarterly format. The Stata command 'tsset' then declares the data to be a time series, which enables the use of lead and lag operators. The Stata command 'tsfill' was used to fill any gaps in the time series. The stationarity of the time series was checked using the Stata 'd' differencing operator [10]. The differencing operator generates the difference between the current and the previous value. Second-order differencing was performed using the 'd2' operator, which differences the first-order differences once more. First-order and second-order differences were plotted as line plots using the Stata commands 'twoway' and 'tsline', as shown in Figure 1. Graphical plotting is a useful check of stationarity, as a stationary second-order differenced series fluctuates around zero. These plots are essential for checking stationarity, as they display the differenced values against time on the y- and x-axes, respectively (Figure 1, Figure 2).
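The merging and differencing steps described above can be illustrated with a minimal Python sketch. It is not the authors' Stata workflow: the function names `regional_total` and `difference`, the country labels, and all counts are hypothetical, and the sketch assumes that the 'sum' formula adds the country series together date by date.

```python
from typing import Dict, List


def regional_total(series_by_country: Dict[str, List[int]]) -> List[int]:
    """Sum cumulative counts across countries, date by date."""
    lengths = {len(values) for values in series_by_country.values()}
    if len(lengths) != 1:
        raise ValueError("all country series must cover the same dates")
    return [sum(day) for day in zip(*series_by_country.values())]


def difference(series: List[float], order: int = 1) -> List[float]:
    """Difference a series `order` times (order=2 differences the differences)."""
    out = list(series)
    for _ in range(order):
        out = [current - previous for previous, current in zip(out, out[1:])]
    return out


# Made-up cumulative counts for three hypothetical countries over six days.
toy = {
    "A": [1, 2, 4, 7, 11, 16],
    "B": [0, 1, 3, 6, 10, 15],
    "C": [2, 2, 3, 5, 8, 12],
}

total = regional_total(toy)   # regional cumulative series
d1 = difference(total, 1)     # first-order differences (daily new cases)
d2 = difference(total, 2)     # second-order differences
print(total, d1, d2)
```

In this toy series the first-order differences still trend upward, while the second-order differences are constant, which is the kind of pattern that motivates choosing second-order differencing before fitting an ARIMA model.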
Based on the graphical representation, the ARIMA model can be built from ACF and PACF graphs, produced with the Stata commands 'ac' and 'pac' on the first-order and second-order differenced variables, respectively, as shown in Figure 2. ACF plots show the autocorrelation of a time series with its own prior values at various lags [11,12]. The MA order can be read graphically from the ACF plot and used in building ARIMA models for further analysis. In an ACF graph, lines inside the shaded region fall within the 95% CI, while lines outside the shaded region mark lags at which the series is significantly autocorrelated; these lags are taken as MA values for ARIMA modeling. ACF plots use Bartlett's formula for MA(q) processes to compute pointwise confidence intervals. Similarly, PACF graphs show the partial autocorrelation of the selected time series, with a confidence interval calculated using the standard error 1/√n. The graph may also include the residual variance for each lag. The values of AR are derived from PACF graphs and can be used for ARIMA model forecasting. The value of I depends on whether first-order (I = 1) or second-order (I = 2) differencing is used. Models are chosen based on smaller AIC/BIC values, which indicate a better-fitting model. Other features such as the log-likelihood ratio, p-value significance at the 5% level, and coefficients < 1 can also be checked to compare models [13-16]. The survey was analyzed using regular regression and poststratification methods. It involved predictive analysis and health informatics with the help of the Farr Institute, wherein complex and adaptive modeling techniques were used for large-scale data mining, statistics, and analysis.
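To make the ACF and PACF reasoning concrete, the following minimal Python sketch computes sample autocorrelations, Bartlett standard errors for the ACF bands, and partial autocorrelations. It is an illustration under stated assumptions, not a reproduction of Stata's 'ac' and 'pac' output: the function names are hypothetical, the PACF is obtained from the ACF with the Durbin-Levinson recursion (Stata's 'pac' may use a different estimator), and the input series is made up.

```python
import math
from typing import List


def sample_acf(x: List[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [value - mean for value in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def bartlett_se(acf: List[float], n: int) -> List[float]:
    """Bartlett standard errors: se_k = sqrt((1 + 2 * sum_{j<k} r_j^2) / n)."""
    se, cumulative = [], 0.0
    for r in acf:
        se.append(math.sqrt((1.0 + 2.0 * cumulative) / n))
        cumulative += r * r
    return se


def sample_pacf(acf: List[float]) -> List[float]:
    """Partial autocorrelations from the ACF via the Durbin-Levinson recursion."""
    pacf: List[float] = []
    phi_prev: List[float] = []
    for k in range(1, len(acf) + 1):
        if k == 1:
            phi_kk = acf[0]
        else:
            num = acf[k - 1] - sum(
                phi_prev[j] * acf[k - 2 - j] for j in range(k - 1)
            )
            den = 1.0 - sum(phi_prev[j] * acf[j] for j in range(k - 1))
            phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Made-up series: an alternating sequence with strong lag-to-lag dependence.
x = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]
acf = sample_acf(x, max_lag=3)
se = bartlett_se(acf, n=len(x))
pacf = sample_pacf(acf)

# Lags whose autocorrelation falls outside the approximate 95% band.
significant = [k + 1 for k, (r, s) in enumerate(zip(acf, se)) if abs(r) > 1.96 * s]
print(acf, pacf, significant)
```

Lags flagged in `significant` correspond to the lines that extend beyond the shaded region of an ACF plot; the same comparison against ±1.96/√n could be applied to the `pacf` values when looking for candidate AR terms.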

Results
Based on our data, out of the 13 Caribbean countries, the highest number of confirmed cases of COVID-19 is seen in the Dominican Republic, followed by Cuba, Jamaica, Haiti, Trinidad and Tobago, Bahamas, Barbados, Antigua and Barbuda, Grenada, St. Lucia, St. Vincent and the Grenadines, Dominica, and St. Kitts and Nevis, in this order [17]. Similarly, the highest number of reported deaths is seen in the Dominican Republic, followed by Cuba, Haiti, Bahamas, Jamaica, Trinidad and Tobago, Barbados, and Antigua and Barbuda, in this order. The highest number of recovered patients was from the Dominican Republic, followed by Cuba, Trinidad and Tobago, Jamaica, Barbados, Bahamas, Haiti, St. Lucia, Antigua and Barbuda, Dominica, Grenada, and St. Vincent and the Grenadines, in this order. Our study demonstrates the future trend of the COVID-19 crisis in the Caribbean and its impact on health services and local communities.
The projected ARIMA forecast shows a linear increase in confirmed COVID-19 cases along with an increase in recovered patients in the 13 Caribbean countries. The time series was checked for stationarity through first-order and second-order differencing along with the Dickey-Fuller test. The results were plotted using PACF and ACF graphs to derive the AR and MA terms for building the forecasting model (Table 1). Figure 1 shows the stationarity check using Stata differencing operators, with graphs plotted over the time frame from January 22, 2020, to December 31, 2021. Figure 2 shows the ACF and PACF plots for the number of confirmed cases, deaths, and recovered cases of COVID-19, used to find the best-fitting model based on Bartlett's formula for the MA(q) 95% CI bands. Figure 3 shows current and projected forecast trends for the final selected COVID-19 models: ARIMA (9,2,4) for confirmed cases, ARIMA (1,2,1) for the number of deaths, and ARIMA (1,2,1) for recovered cases. Table 1 compares the candidate models along with p-value significance and AIC/BIC values. Table 2 shows the current and projected forecasts for the Caribbean region in terms of the number of confirmed cases, deaths, and recovered cases of COVID-19.
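The AIC/BIC comparison behind a table of candidate models can be sketched in a few lines of Python. The standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L are used; the model labels, log-likelihoods, parameter counts, and sample size below are hypothetical values chosen only for illustration and are not the study's results.

```python
import math
from typing import Dict, Tuple


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical candidates: label -> (log-likelihood, number of parameters).
candidates: Dict[str, Tuple[float, int]] = {
    "model_A": (-520.0, 3),
    "model_B": (-518.5, 6),
    "model_C": (-530.0, 2),
}
n_obs = 500  # hypothetical number of observations

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(scores, best_by_aic, best_by_bic)
```

In this toy comparison the model with the highest log-likelihood is not the one selected, because both criteria penalize the additional parameters; BIC applies the heavier penalty once n exceeds about 7 observations.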

Limitations
The study was limited to data on confirmed, death, and recovered cases of COVID-19 from 13 Caribbean countries. We were unable to access information on vulnerable populations or demographic information, and therefore the study is limited to projecting only a general trend of COVID-19 for the Caribbean.

Discussion
Our study forecasts the likely course of the pandemic over the next two months. It shows an increasing trend for confirmed and recovered COVID-19 cases and a slowing of the number of deaths. This would reflect the gradual easing of the lockdown restrictions. This is consistent with a study conducted by Sumner et al., 2020, who argue that the economic impact of COVID-19 will be drastic, specifically in the Latin America and the Caribbean (LAC) region, with potential short-term health and financial consequences [18]. Another study by Simbana-Rivera et al., 2020, performs a comparative analysis of the burden of COVID-19 among Latin American and Caribbean countries following government norms such as social distancing and wearing masks to slow the spread of the virus among local communities [19]. Our study focuses exclusively on the Caribbean and should help Caribbean countries not only prepare themselves for the upcoming impact of the spread of COVID-19 but also reduce its economic impact on the very fragile economies of the smaller islands.
Our approach has been validated by several other studies that have used ARIMA modeling for short-term predictions. A study by Benvenuto et al., 2020, focuses on the application of ARIMA modeling to the COVID-19 epidemic. It adds that various institutions will benefit from this forecast, as ARIMA modeling involves descriptive analysis of incidence data along with incidence forecasting and overall trend analysis [20]. Another article by Dehesh et al., 2020, shows the importance of prediction models and of analyzing the trend of COVID-19 as a global pandemic for various countries [16]. Chintalapudi et al., 2020, recommend a data-driven modeling approach and highlight the significance of government policies based on forecasting and modeling, which can help to reduce COVID-19 cases in Italy. The study commends preventive measures such as regular hand washing, disinfection, wearing masks, travel restrictions, and suspension of public gatherings, based on modeling and forecasts [21]. Although most Caribbean countries are dependent on tourism as a major economic activity, we recommend a guarded approach to opening borders, as the already weak economies may not be able to sustain a fresh wave of the pandemic. A study by Jenkins et al. discusses Caribbean trade relations with China in terms of foreign direct investment and its impact on the economy [22]. The COVID-19 outbreak has significantly impacted Caribbean countries, even as China continues to be a prominent supporting partner. Another study by Rodríguez-Morales, MacGregor, and Kanagarajah, 2020, on COVID-19 discusses the importance of carefully assessing the situation based on scientific knowledge sharing and coordinated efforts to revitalize and uplift Caribbean communities amid the COVID-19 crisis [23]. We agree with them and recommend that as countries gear up to open their borders, they should introduce enhanced screening and monitoring at entry and exit ports.

Conclusion
The outbreak of COVID-19 is a grave concern in the Caribbean region, specifically due to the high risk of transmission among the local community. CARPHA has issued guidelines for its member states, including the 13 Caribbean countries, based on WHO recommendations as mentioned in our study. Our cumulative projected ARIMA model for COVID-19 positive patients, deaths, and recovered cases in the 13 Caribbean countries offers insight for the government and health organizations operating in the region to assess pandemic preparedness plans and revise risk levels based on current and forecasted trends.