Investigating gateway effects using the PATH study

Background: A recent meta-analysis of nine cohort studies in youths reported that baseline ever e-cigarette use strongly predicted cigarette smoking initiation in the next 6-18 months, with an adjusted odds ratio (OR) of 3.62 (95% confidence interval 2.42-5.41). A recent e-cigarette review agreed there was substantial evidence for this "gateway effect". As the number of confounders considered in the studies was limited, we investigated whether the effect might have resulted from inadequate adjustment, using Waves 1 and 2 of the US PATH study. Methods: Our main analyses considered Wave 1 never cigarette smokers who, at Wave 2, had data on smoking initiation. We constructed a propensity score for ever e-cigarette use from Wave 1 variables, using this to predict ever cigarette smoking. Sensitivity analyses accounted for other tobacco product use, linked current e-cigarette use to subsequent current smoking, or used propensity scores for ever smoking or ever tobacco product use as predictors. We also considered predictors using data from both waves, attempting to reduce residual confounding from misclassified responses. Results: Adjustment for propensity dramatically reduced the unadjusted OR of 5.70 (4.33-7.50) to 2.48 (1.85-3.31), 2.47 (1.79-3.42) or 1.85 (1.35-2.53), whether adjustment was made as quintiles, as a continuous variable or for the individual variables. Additional adjustment for other tobacco products reduced this last OR to 1.59 (1.14-2.20). Sensitivity analyses confirmed adjustment removed most of the gateway effect. Control for residual confounding also reduced the association. Conclusions: We found that confounding is a major factor, explaining most of the observed gateway effect. However, our analyses are limited by small numbers of new smokers considered and the possibility of over-adjustment if taking up e-cigarettes affects some predictor variables.
Further analyses are intended using Wave 3 data to try to minimize these problems, and clarify the extent of any true gateway effect.


Introduction
In youths, use of e-cigarettes ("vaping") and cigarette smoking are strongly associated, as shown in, e.g., Canada (Aleyan et al., 2018), France (Dautzenberg et al., 2016), Great Britain (Eastwood et al., 2015), Korea (Lee et al., 2014) and Poland (Goniewicz et al., 2014), as well as the U.S. (e.g. Cooper et al., 2016; Dutra & Glantz, 2014; Wills et al., 2017). Since vaping significantly reduces exposure to harmful constituents compared to smoking (National Academies of Sciences Engineering and Medicine, 2018), one might expect risks from vaping to be substantially lower (Nutt et al., 2014). While the benefits of introducing e-cigarettes would seem clear for smokers switching to e-cigarettes who would have continued smoking otherwise, for established smokers who are helped to quit, and for individuals who would otherwise have started smoking who start vaping instead, there are possible downsides. While risk increases may be modest for smokers intending to quit who vape instead, and for smokers who vape but retain their usual cigarette consumption, there is concern that vaping may encourage youths to start smoking who would otherwise not have done so, this possibility being the focus of our paper.
There are two potential contributors to any observed association between vaping and the subsequent initiation of smoking. One is "common liability", with youths who choose to vape already possessing attributes which make them more likely to smoke, and the other is a true causal effect of vaping, the so-called "gateway effect".
Obtaining evidence to determine the extent to which any observed association is actually due to a true gateway effect, and not confounded by common liability, is not straightforward. It may need to be addressed not only, as here, by detailed study of data from a prospective cohort study, but also by looking at trends in smoking prevalence in countries where vaping has increased (Etter, 2018; Lee et al., 2018).
Recent analyses (Soneji et al., 2017) based on nine U.S. cohort studies in young people have linked previous vaping to subsequent initiation of smoking. This publication reported that among baseline never-smokers, ever vaping at baseline strongly predicted initiation in the next six to 18 months, with an odds ratio (OR) of 3.62 (95% confidence interval (CI) 2.42-5.41) after adjustment for various predictors of initiation. Similarly, baseline past 30-day vaping also predicted subsequent 30-day cigarette use (OR 4.25). A recent review of e-cigarettes (National Academies of Sciences Engineering and Medicine, 2018) considered this to provide "substantial evidence" of a gateway effect, noting the "wide range of covariates" that the relevant studies had considered, and thought it "unlikely" that confounding entirely accounts for the association, as reductions in the association following adjustment were not consistently observed.
While some studies (Barrington-Trimis et al., 2016; Primack et al., 2015; Primack et al., 2016) do report that the association increases following adjustment, many more (Best et al., 2018; Conner et al., 2018; Hammond et al., 2017; Hornik et al., 2016; Leventhal et al., 2015; Loukas et al., 2018; Lozano et al., 2017; Miech et al., 2017; Spindle et al., 2017; Unger et al., 2016; Watkins et al., 2018; Wills et al., 2017) report a decrease. Furthermore, though adjusted associations are usually statistically significant, adjustment is often limited. Factors never considered include, for example, school performance, parental smoking and peer attitudes to smoking. In order to gain better insight into the magnitude of any true gateway effect, information from a large cohort study which collected data on very many factors would therefore clearly be useful, as would gaining some insight into the extent of bias resulting from misclassification of such variables.
Here we report detailed analyses of the gateway effect based on Wave 1 (2013-2014) and Wave 2 (2014-2015) of the Population Assessment of Tobacco and Health (PATH) study (Berry et al., 2019b; Hyland et al., 2017), a longitudinal cohort study in the U.S. supported by federal funds. The databases provide extensive information on tobacco product use and on many other factors possibly linked to smoking initiation. At each Wave, data are separately collected for youths aged 12-17 and adults aged 18+, and our analyses, which concern smoking initiation by youths, use the youth data of each Wave, plus Wave 2 data for adults previously youths at Wave 1.

Amendments from Version 1
The abstract final sentence now reads "Further analyses are intended using Wave 3 data to try to minimize these problems and clarify the extent of any gateway effect". Minor changes also made earlier in the abstract. A new paragraph in the introduction "There are two potential contributors … where vaping has increased" cites two new references, one mentioned by Dr Hanewinkel, the other another paper discussing general considerations. The final sentence of the first paragraph now ends "… there is concern that vaping may encourage youths to start smoking who would otherwise not have done so, this possibility being the focus of our paper". The next paragraph now starts "Recent analyses (Soneji et al, 2017) based on …". The next paragraph now starts "A recent review of e-cigarettes … considered this to provide "statistical evidence" …". The final sentence of the next paragraph now starts "In order to gain better insight into the magnitude of any true gateway effect, information from …". The discussion now starts "We have described analyses aimed at deriving further insight into the magnitude of any true "gateway effect" by attempting to control better for confounding factors linked to initiation of smoking. We used a propensity score approach, which is intended to …". The third paragraph now ends "We are currently conducting additional work to try to obtain more precise answers by also using Wave 3 data".
This, and some later changes, updates the position from saying we were planning to do these analyses. The seventh paragraph of the discussion starting "While our analysis …" has been completely rewritten. The second sentence in the conclusions section of the abstract has been amended to start "Indeed, it is not completely clear whether vaping actually increases subsequent uptake of cigarette smoking …". Two new references are included.
Any further responses from the reviewers can be found at the end of the article.

The main analysis
The main analysis considers those who had never smoked cigarettes by Wave 1 and who, at Wave 2, had information available on whether initiation of cigarette smoking had occurred. Use of other tobacco products is not considered. For a youth to be considered, data should be available on each of five demographics (age, sex, Hispanic origin, race, region) and on vaping.
The analysis, which relates ever having vaped by Wave 1 to initiation of cigarette smoking by Wave 2, after adjustment for factors linked to e-cigarette use recorded at Wave 1, was conducted in two steps.
Step 1. In step 1, Wave 1 data were used to develop a propensity score for e-cigarette use based on the five demographic variables and on 60 smoking predictor variables selected from a much longer list. As described more fully in the Extended Data, we ignored questions only asked in a population subset, only really relevant to smokers, or of dubious relevance. Also, where many related questions were asked, attention was limited to those seemingly more likely to be relevant.
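We used a logistic regression model where the propensity for vaping ($P_i$) for a youth $i$ ($i = 1$ to $n$) was linked to various smoking predictors $x_{ij}$ ($j = 1$ to $m$) by

$$\ln\left(\frac{P_i}{1 - P_i}\right) = \beta_0 + \sum_{j=1}^{m} \beta_j x_{ij}$$

where $\beta_0$ is the intercept and $\beta_j$ is the fitted coefficient for predictor $j$. For a given set of predictors, we refer to the value of the term on the left as the propensity score.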
All logistic regression analyses were weighted by the person-level weights provided on the PATH database, with the weights normalized to sum to 1.
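As a minimal illustration of the weighting step (a sketch, not the ROELEE implementation), the Python below normalizes person-level weights to sum to 1 and computes a weighted odds ratio for a single binary predictor. For a model containing only one binary predictor, the maximum-likelihood OR from a weighted logistic regression equals the cross-product ratio of the weighted 2×2 cell totals, so no iterative fitting is needed in this special case. The records, weights and function names are invented for illustration and are not PATH data.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def weighted_odds_ratio(exposed, outcome, weights):
    """Weighted cross-product odds ratio for a binary exposure and outcome."""
    cells = {(e, o): 0.0 for e in (0, 1) for o in (0, 1)}
    for e, o, w in zip(exposed, outcome, weights):
        cells[(e, o)] += w
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


# Hypothetical records: (ever vaped, started smoking, survey weight).
records = [
    (1, 1, 120.0), (1, 0, 300.0), (1, 0, 250.0),
    (0, 1, 150.0), (0, 0, 900.0), (0, 0, 1100.0), (0, 0, 800.0),
]
exposed = [r[0] for r in records]
outcome = [r[1] for r in records]
raw = [r[2] for r in records]
norm = normalize_weights(raw)

or_raw = weighted_odds_ratio(exposed, outcome, raw)
or_norm = weighted_odds_ratio(exposed, outcome, norm)
print(round(or_raw, 3), round(or_norm, 3))
```

The two printed values agree because the normalizing constant cancels in the cross-product ratio; normalization affects the scale of the weighted likelihood, not the point estimate.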
Introducing all 65 variables simultaneously into the model would have involved two problems. First, as the analysis required individuals with complete data on all variables, substantial information may be lost. Second, analyses including very many variables sometimes fail to solve. We therefore introduced variables in stages, using groups of conceptually-related variables, with missing values likely to be on the same individuals.
At stage 1 the variables were divided into groups numbered 1-11. In each group, an analysis was carried out for each variable individually, followed by a forward stepwise approach with the most significant variable introduced first, then the next, until no further variables in the group significant at p<0.01 could be introduced. At stage 2 significant variables from the first stage were divided into three groups (A, B and C), and a forward stepwise approach again used to identify significant variables.
Finally, at stage 3, the stepwise approach was applied to the stage 2 significant variables to generate a final list of variables for the propensity score, which was then re-calculated based on youths with complete data on all these variables.
At each stage, the analyses involved all participants with complete data for each variable considered in the group being analysed.
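The staged selection described above can be sketched in Python as follows. This is an illustrative outline only: the `p_value` callback stands in for whatever significance test the fitted regression provides (for example a likelihood-ratio test for adding one variable), and the variable names and toy p-values are hypothetical rather than taken from Table 1.

```python
def forward_stepwise(candidates, p_value, alpha=0.01):
    """Greedy forward selection.

    candidates: iterable of variable names.
    p_value: callable (selected, candidate) -> p-value for adding
             `candidate` to a model that already contains `selected`.
    alpha: entry threshold; a variable enters only if its p-value < alpha.
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        # Score every remaining variable given the current model.
        scored = [(p_value(tuple(selected), c), c) for c in remaining]
        best_p, best = min(scored)
        if best_p >= alpha:
            break  # nothing left that is significant at the threshold
        selected.append(best)
        remaining.remove(best)
    return selected


def staged_selection(groups, p_value, alpha=0.01):
    """Run forward selection within each group and pool the survivors."""
    survivors = []
    for group in groups:
        survivors.extend(forward_stepwise(group, p_value, alpha))
    return survivors


# Toy p-values (hypothetical) standing in for fitted-model tests.
TOY_P = {"age": 0.0001, "alcohol": 0.002, "curiosity": 0.004, "region": 0.4}


def toy_p_value(selected, candidate):
    # Pretend "curiosity" loses significance once "alcohol" is in the model.
    if candidate == "curiosity" and "alcohol" in selected:
        return 0.2
    return TOY_P[candidate]


stage1 = staged_selection([["age", "region"], ["alcohol", "curiosity"]], toy_p_value)
final = forward_stepwise(stage1, toy_p_value)
print(stage1, final)
```

Grouping conceptually related variables first keeps each fit small and limits the loss of participants to those with missing values in that group, which is the motivation given in the text.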
Step 2 involved the outcome of interest, initiation of cigarette smoking between Wave 1 and Wave 2. The first analysis was a weighted logistic regression analysis to determine the unadjusted OR and 95% CI for the association of ever vaping at Wave 1 with subsequent initiation of cigarette smoking. The second analysis was similar but adjusted for propensity by dividing the youths into five quintiles of the propensity score, the separate OR (95% CI) values for each stratum being then combined to form an overall propensity-adjusted estimate. Propensity-adjusted analyses were also conducted using the score as a continuous variable, and also using the variables making up the score individually rather than combined.
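The text does not state how the stratum-specific ORs were combined. One common choice, assumed here purely for illustration, is fixed-effect inverse-variance pooling of the log ORs, with each standard error recovered from the width of the stratum's 95% CI. The per-quintile values below are hypothetical and are not results from the paper.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of log(OR) recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def pool_odds_ratios(strata):
    """Fixed-effect inverse-variance pooling of (OR, lower, upper) tuples."""
    num = 0.0
    den = 0.0
    for odds_ratio, lower, upper in strata:
        weight = 1.0 / se_from_ci(lower, upper) ** 2
        num += weight * math.log(odds_ratio)
        den += weight
    log_or = num / den
    half_width = Z95 / math.sqrt(den)
    return (math.exp(log_or),
            math.exp(log_or - half_width),
            math.exp(log_or + half_width))


# Hypothetical per-quintile results, NOT taken from the paper.
quintiles = [
    (3.0, 1.0, 9.0),
    (2.5, 1.2, 5.2),
    (2.8, 1.5, 5.2),
    (2.2, 1.3, 3.7),
    (2.4, 1.6, 3.6),
]
pooled, low, high = pool_odds_ratios(quintiles)
print(f"{pooled:.2f} ({low:.2f}-{high:.2f})")
```

Strata with narrower intervals receive more weight, so sparse quintiles (for example those with very few vapers) contribute little to the combined estimate.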

Sensitivity analyses
Five sets of sensitivity analyses were conducted linking vaping to initiation of cigarette smoking, along the lines of the main analysis. For each set, ORs (95% CIs) were again calculated with no adjustment for propensity, adjustment as quintiles, adjustment as a continuous variable, and adjustment for the variables making up the score.

Sensitivity analysis 1
Youths who, by Wave 1, had ever used any other tobacco product (i.e. other than cigarettes or e-cigarettes) were excluded. As there were considerably fewer significant predictor variables from the stage 1 analyses, the stage 2 analyses were omitted.

Sensitivity analysis 2
Here ever users of other tobacco products by Wave 1 were not excluded but use of other products was included as an extra predictor. The stage 1 and 2 analyses were not repeated. Rather the final model used was one that included the same final set of variables plus that for ever using other tobacco products.

Sensitivity analysis 3
Whereas the main analysis and sensitivity analyses 1 and 2 linked a propensity score for ever vaping by Wave 1 to ever cigarette smoking by Wave 2, sensitivity analysis 3 linked a propensity score for current e-cigarette use at Wave 1 to current cigarette smoking at Wave 2, last 30-day use being considered current. Again, as few significant variables emanated from the stage 1 analyses, the stage 2 analyses were omitted.

Sensitivity analysis 4
Whereas all the analyses described above relate to a propensity score for vaping, sensitivity analysis 4 was essentially the same as for the main analysis but based on a propensity score for ever cigarette smoking.

Sensitivity analysis 5
Sensitivity analysis 5 was also like the main analysis, but the propensity score was based on ever use of any tobacco product.
For sensitivity analyses 4 and 5, the full three-stage process described in Step 1 was used to determine the variables included in the propensity score.

Analyses investigating residual confounding
In the main analysis, the propensity score for e-cigarettes used data provided by youths at Wave 1. Although there was no gold standard to validate reported answers, it seemed possible that more accurate predictors could be based on data from Wave 1 and 2 combined. For those variables forming the propensity score we investigated whether there was further useful information available in Wave 2 and, if so, created a revised variable. How this was done is detailed in the Results section, the procedure depending on which variables were selected for the propensity score. Once the revised predictor variables were created, the main analyses were rerun using a modified propensity score.

Software
Relevant data were transferred for analysis to a ROELEE database, and analysed using the ROELEE program (Release 59, Build 49). SAS version 9.4 (SAS Institute Inc, 2017) was also used to check some results from ROELEE and generate results when ROELEE failed to converge. The glm and step functions in R (https://www.r-project.org) could be used to run all the analyses.

Main analysis
The propensity score for ever vaping by Wave 1 was developed using five demographic variables and 60 other predictors (Table 1). Each variable, except for variable 52, depended on a question involving a few possible answers (see Table 1 footnotes) with the regression analyses estimating a coefficient per level. Exceptionally, variables 10-13, where numbers of youths were very small for some levels, were treated as continuous variables in analysis.
Stage 1 in developing the propensity score involved separate regression analyses within each of the 11 groups. As Table 1 shows, 38 variables were eliminated from consideration at that stage, with 27 retained for stage 2, 8 considered in group A, 10 in group B and 9 in group C. Following eliminating 9 more variables at stage 2, 18 variables entered stage 3 with 6 more eliminated. After rerunning the regression analysis based on 10,671 youths with data on all 12 predictors, rather than 10,361 with data on 18 predictors, the final model was as shown in Table 2.
Ever e-cigarette use was independently associated with older age, male sex, use of alcohol and prescription drugs, social networking, and preferring exciting and unpredictable friends. It was also associated with cohabitants using tobacco, parents or guardians not being very upset if they found the youth using tobacco, agreeing that some tobacco products are safer than others, the youth being curious about smoking, and the youth thinking they will smoke a cigarette in the next year. Note that, for these last two variables, the grading system ascribed lower scores for greater curiosity or greater likelihood to smoke cigarettes in the next year, so the fitted ORs were <1. The results for the variable regarding enjoying using tobacco were less straightforward to interpret, as very few youths strongly agreed they would enjoy it. However, those who strongly disagreed that they would enjoy using tobacco had clearly lower odds of ever e-cigarette use than those who simply agreed or disagreed.
As Table 3 shows, the unadjusted OR for the association of vaping by Wave 1 with cigarette smoking initiation by Wave 2 was 5.702 (95% CI 4.334-7.502). The OR was markedly reduced by adjustment for the propensity score, whether as quintiles (2.476, 1.852-3.310), as a continuous variable (2.474, 1.791-3.419), or for the 12 variables making up the score (1.847, 1.347-2.533). Table 3 also shows the effects of introducing the variables successively. With one minor exception, introducing each variable reduced the OR, the largest reductions relating to the first four variables considered, which in combination already reduced the OR to 2.185 (1.608-2.969).

Table 4 summarizes the sensitivity analysis results, comparing them with those from the main analysis. While the number of significant variables included varies between analyses, all show that adjustment markedly reduces the unadjusted association, reducing ORs of over 5 to less than 3. The effect of adjustment was always greater when made for each of the individual variables making up the score. Relating ever vaping by Wave 1 to initiation of cigarette smoking by Wave 2 (main analysis, sensitivity analyses 1 and 2), the lowest adjusted ratio of 1.586 (1.194-2.198) is seen in sensitivity analysis 2, where adjustment is made for 13 predictor variables, including smoking of other products. Here, the adjustment explains 87.5% of the unadjusted association (as estimated from the ratio of the excess ORs, i.e. OR − 1). Sensitivity analysis 3, which concerns current (last 30 day) rather than ever use of both products, also produced similar results, though the estimates are more variable due to the very few new cigarette smokers among e-cigarette users. Sensitivity analysis 4, where the score was based on variables linked to Wave 1 cigarette smoking rather than vaping, also gave similar results, as did sensitivity analysis 5, where the score was based on variables linked to use of any tobacco product.
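The "proportion explained" measure used above is the reduction in the excess OR (OR − 1) expressed as a fraction of the unadjusted excess. A short Python sketch applies it to the main-analysis ORs quoted in the text; the function name is ours, and only the ORs themselves come from the paper.

```python
def proportion_explained(or_unadjusted, or_adjusted):
    """Share of the excess OR (OR - 1) removed by adjustment."""
    if or_unadjusted <= 1:
        raise ValueError("unadjusted OR must exceed 1 for this measure")
    return (or_unadjusted - or_adjusted) / (or_unadjusted - 1)


# Main-analysis ORs reported in the text (Table 3).
UNADJUSTED = 5.702
ADJUSTED = {
    "quintiles": 2.476,
    "continuous score": 2.474,
    "individual variables": 1.847,
}
for label, value in ADJUSTED.items():
    share = proportion_explained(UNADJUSTED, value)
    print(f"{label}: {100 * share:.1f}% of the excess OR explained")
```

For the main analysis, adjustment for the individual variables gives (5.702 − 1.847) / (5.702 − 1), or roughly 82%, which lies within the 75.4% to 87.5% range the paper reports across the main and sensitivity analyses.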

Residual confounding
The propensity score used in the main analyses was revised using modified versions of the 12 predictor variables. The age range of 12-14 or 15-17 at Wave 1 was modified to be 12-13, 14, 15-16, 17 depending on whether 12-14 year-olds at Wave 1 were 15 at Wave 2, and whether 15-17 year-olds at Wave 1 were adults at Wave 2. Gender was unchanged, being consistent between waves. For three ever use variables (alcohol; prescription drugs; social networking) non-users at Wave 1 were now considered users if use was reported at Wave 2. For four variables where questions were asked at both waves (reaction if parent or guardian found using tobacco; think would enjoy tobacco; relative safety of tobacco products; cohabitant uses tobacco) the level most associated with vaping use was used (i.e. maximum for reaction and minimum for the other three). These questions were only asked of youths, so if the participant became an adult at Wave 2, the Wave 1 response was used. For the other three variables

Table 1. Wave 1 predictor variables used with details of which stage they were eliminated from consideration. For each of the 65 Wave 1 predictor variables considered, Table 1 gives details of their type (graded or continuous) and of the stage at which they were eliminated from consideration for the final propensity score.
[Table 1 body not recovered; column headings: Group, Variables, Levels or continuous.]
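The two-wave revision rules described in the text for the ever-use and graded variables can be sketched in Python as follows. The function names and numeric codings are hypothetical, and treating a missing Wave 2 answer like the adult case (falling back to Wave 1) is our assumption rather than something the paper states.

```python
def revise_ever_use(wave1, wave2):
    """Ever-use flag: a Wave 1 non-user counts as a user if use is reported at Wave 2."""
    return bool(wave1) or bool(wave2)


def revise_graded(wave1, wave2, rule, adult_at_wave2=False):
    """Pick the level most associated with vaping from the two waves.

    rule is "max" or "min". If the participant was an adult at Wave 2
    (question not asked) or the Wave 2 answer is missing (our assumption),
    the Wave 1 response is kept.
    """
    if adult_at_wave2 or wave2 is None:
        return wave1
    if rule == "max":
        return max(wave1, wave2)
    if rule == "min":
        return min(wave1, wave2)
    raise ValueError("rule must be 'max' or 'min'")


# Hypothetical codings for one participant.
alcohol = revise_ever_use(wave1=0, wave2=1)
reaction = revise_graded(2, 4, "max")
enjoy = revise_graded(2, 4, "min")
reaction_adult = revise_graded(2, 4, "max", adult_at_wave2=True)
print(alcohol, reaction, enjoy, reaction_adult)
```

Taking the extreme level across waves treats an answer less associated with vaping at one wave as possible misreporting, which is the rationale given for using both waves.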

Discussion
We have described analyses aimed at gaining further insight into the magnitude of any true "gateway effect" by attempting to control better for confounding factors linked to initiation of smoking. We used a propensity score approach, which is intended to remove confounding in the analysis of outcomes by balancing exposure groups on potential confounders, the score being developed prior to, so independently of, the analysis of outcomes. This approach attempts to transpose observational data into what would have been obtained from a randomized trial, using groups balanced on baseline covariates. The main analysis, which aims to balance potential confounders across vapers and non-vapers at Wave 1 (the comparison groups in the analysis where cigarette smoking is the outcome), is strictly designed for this approach. An alternative approach, addressed using sensitivity analysis 5, views the propensity outcome more broadly, considering use of any nicotine-containing product as indicative of an inclination to initiate cigarette smoking. The difficulty in the propensity score approach, as with use of observational data generally, is to ensure that all relevant variables are considered in the score, and to account for possible inaccuracies in the variables included.
The main and five sensitivity analyses summarized in Table 4 all show that adjustment for propensity determined at Wave 1 markedly weakens the gateway effect, the association between vaping by Wave 1 and subsequent initiation of smoking. This was true whether propensity was based on variables associated with vaping (the main analyses), cigarette smoking (sensitivity analysis 4) or any tobacco product use (sensitivity analysis 5). Sensitivity analyses 1-3 also demonstrated this marked reduction, whether users of other products at Wave 1 were excluded or included, whether or not adjustment was made for such use, or whether analyses were based on ever or current use. There was no consistent difference between results adjusted for propensity as quintiles or as a continuous variable, but adjustment for the individual variables making up the score produced lower adjusted ORs. The proportion of the unadjusted excess OR (i.e. OR -1) explained by adjustment for the individual variables was at least 75.4% in the main and sensitivity analyses, with a maximum of 87.5% for sensitivity analysis 2, where other product use was adjusted for, as well as the 12 main analysis predictor variables.
Our analyses have limitations. One is the small numbers of new smokers considered, never exceeding 71 and as low as six in sensitivity analysis 3. We are currently conducting additional work to try to obtain more precise answers by also using Wave 3 data.
A second issue is the possibility of over-adjustment. One can argue that vaping by Wave 1 may have affected some answers given then. For example, taking up e-cigarettes may make youths more curious about cigarettes, or more likely to think they will smoke or enjoy them. Using Wave 3 data, we are also conducting further analyses relating initiation of cigarette smoking at Wave 3 to vaping at Wave 2, restricting attention to those who, at Wave 1, had never vaped, and using propensity indicators recorded at Wave 1.
As is well documented, inaccurately determining confounding variables may limit the ability to fully adjust. Thus, many years ago (Tzonou et al., 1986), it was demonstrated that "even misclassification rates as low as 10% can prevent adequate control of confounding" and other publications highlight the residual confounding problem (Ahlbom & Steineck, 1992; Fewell et al., 2007; Greenland, 1980; Phillips & Davey Smith, 1994; Savitz & Baron, 1989). Proper adjustment for residual confounding requires a gold standard to compare reported answers with, but such data were unavailable in the PATH study. However, some insight was given by using predictor variables derived from answers given at both Wave 1 and Wave 2. Thus, if a youth reported alcohol use at one wave and not the other, it is possible this was not reported at one wave, and a predictor based on ever reported use may better predict smoking initiation. Such analyses usually weakened the adjusted association of prior vaping with subsequent smoking initiation, but only slightly. This may, however, reflect methodological limitations rather than lack of serious residual confounding.
While our analyses make it clear that most of the observed relationship between vaping and subsequent initiation of smoking results from confounding, the significant association seen even after extensive adjustment for confounders does seem to be consistent with there being some true gateway effect. However, concerns about incomplete adjustment for confounding remain, and our results do not unequivocally demonstrate that any true effect exists. More reliable information emerging from the further analyses we are currently conducting using Wave 3 should provide a better insight into the magnitude of any true gateway effect.
Another gateway analysis based on PATH Waves 1 and 2 has recently been published (Watkins et al., 2018). This differed from ours in that their "unadjusted" model already included all non-cigarette tobacco products as indicators of cigarette initiation, and their "adjusted" models added only a restricted list of predefined variables, rather than using models to include more relevant variables. While their adjusted models did include some established determinants of cigarette use (sensation seeking; alcohol use; living with a tobacco user; and variables regarding health warnings and advertising), their adjusted ORs were higher than ours. Thus, whereas the variables we included in our main analysis reduced the OR from 5.702 to 1.847, their similar analysis reduced it only from 3.50 to 2.53. Consequently, although they recognized uncontrolled confounding may exist, they dubiously considered vaping was "independently associated with cigarette smoking one year later".
In considering whether a true important gateway effect exists, one should note the lack of any increase in the US in cigarette smoking prevalence following the rise in vaping (Levy et al., 2019), and the fact that, in the PATH study, considerably more (279 vs. 79) Wave 1 cigarette only smokers took up e-cigarettes by Wave 2, than Wave 1 e-cigarette only users who took up smoking. Despite any possible gateway effect, introducing e-cigarettes may have reduced overall youth smoking prevalence.

Conclusions
The results presented, based on Waves 1 and 2, strongly suggest that reported estimates of the gateway effect (Soneji et al., 2017) are much too high. Indeed, it is not completely clear whether vaping actually increases subsequent uptake of cigarette smoking if potential confounding effects were to be fully accounted for.

Addendum
At the time this paper was being finalised, an analysis was published (Berry et al., 2019a) investigating gateway effects based on data from Waves 1, 2 and 3 of the PATH study. The authors reported that prior e-cigarette use was associated with increases in the odds of ever and current cigarette use by, respectively, 4.09 (95% CI 2.97-5.63) and 2.75 (1.60-4.73). Though noting that they could not rule out the possibility of residual confounding, they concluded that their findings supported a gateway effect. We will examine this claim in detail based on the results of our ongoing analyses using the data from all three Waves.

Data availability
[...] market, with the large and rapidly growing vaping product (aka e-cigarette) company, Juul Labs. The acquisition price was based on a US$38 billion valuation, which was more than twice Juul Labs' August valuation, and a surprise to many investors on Wall Street."
We have to have this information in mind, because the manuscript submitted to F1000Research by Peter Lee and John Fry is financed by Philip Morris. My personal opinion is that Philip Morris has an interest in showing that the so-called gateway effect does not exist.
I am not an expert in the propensity score approach, but my understanding is that the authors tried everything to show that the gateway effect does not exist. It seems trivial to me that adjustments reduce an OR. Even after adjustment for a broad range of confounders, there is still a significant and clinically meaningful association between e-cigarette use at baseline and smoking initiation in the observational period.
My main point is that I do not agree with the conclusion and discussion of the results. Instead of saying that confounding is a major factor explaining most of the observed gateway effect, the authors should emphasize that even after controlling for a large number of confounders, the gateway effect is still present.
I count nearly two dozen cohort studies that have investigated the gateway effect. The authors cite only a portion of these studies. In detail, young people have been recruited in the US and Canada, the UK, Finland, Mexico, the Netherlands, Romania, Taiwan, and Germany. More than 100,000 young people have been recruited for these studies, with observational periods of up to 24 months. Outcome variables range from one cigarette to daily smoking. All these studies found a significant association between e-cigarette use at baseline and uptake of smoking in the observational period. Applying the well-known Bradford-Hill criteria, it seems to me that the following criteria are fulfilled: (1) Strength (effect size), (2) Consistency (reproducibility), (3) Temporality, and (4) Plausibility.
In the Introduction the authors should describe the two opposing explanations for the observed association between e-cigarette use and the later uptake of smoking: the so-called "common liability" theory vs. the "gateway" theory. Another important theoretical paper should also be discussed.
If anything, the results reported are in line with the gateway theory and not with the common liability theory.

[...] statisticians/epidemiologists our interest is in unbiased assessment of data, not in trying to reach conclusions favourable to any sponsor. The objective of our paper was to provide insight into how much the observed gateway effect could be explained by adjustment for confounding factors, in order to provide insight into the possible magnitude of any true effect. That adjustment could reduce an unadjusted OR from 5.70 to 1.59 should be interpreted as providing doubt as to whether there is any real effect, despite the 1.59 being statistically significant. Our discussion, and also our companion paper in F1000Research, already highlight the potential problems of interpretation due to residual confounding.

Dr. Hanewinkel states that "it seems trivial" to him "that adjustments reduce an OR". However, theoretically adjustment can increase or decrease an OR. He cites numerous cohort studies that reported a significant gateway effect after adjustment, and we have cited all but the most recently published ones in our present paper or in its companion. We note that many of these papers have also shown a large reduction in the OR after allowing for confounding variables. The existence of a significant association following adjustment does not necessarily mean that the adjusted association is real, for the reasons we have already discussed.
We have now extended the introduction as suggested to describe the "common liability" and the "gateway" theory, making it clear that both can contribute to the observed association between e-cigarette use and subsequent initiation of smoking.
Our abstract already made clear that a significant association remained after adjustment, and that its interpretation is not straightforward. We have now modified the conclusion in the abstract to point out that further work is needed to clarify the extent of any true gateway effect that may exist.
We have also extended the discussion section of the paper. We feel that our results adequately support our conclusion that "confounding is a major factor, explaining most [though not all] of the observed gateway effect". Further analysis of later waves of the PATH study could well help to establish whether the residual effect is or is not real.

Shu Xu
Department of Biostatistics, NYU College of Global Public Health, New York City, NY, USA

The authors aimed to evaluate the "gateway effect" of baseline e-cigarette use predicting cigarette smoking initiation among youths, addressing the inadequate adjustment for confounders in previously published studies. The first two waves of PATH data from the youth sample were used for this purpose.
The strength of this study is that the authors used multiple approaches to evaluate this gateway effect, and performed five sets of sensitivity analyses. Propensity scores for e-cigarette use at Wave 1 were created. Propensity score-based approaches are preferred for statistically controlling for numerous confounding variables; however, as applied in this study, the approach was not completely adequate, for several reasons. The major concerns are set out below.
First, due to the availability of data at the time of analyses, the confounding variables were measured concurrently with e-cigarette exposure or afterwards (i.e., analyses investigating residual confounding). Considering the Wave 3 PATH data have been available since fall 2018, the authors are encouraged to revise the manuscript using data from the first 3 waves. They state they will do so but they could probably access the data at this point.
Second, the authors aimed to address the inadequate adjustment for confounding in previous studies; however, this limitation still exists in the current study. As a result of Step 1, only a set of 13 covariates was included in the main analysis, sensitivity analyses, and residual confounding analysis. Moreover, it is unclear whether balance of covariates was achieved between the (e-cigarette) exposed and non-exposed groups.
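One common way to report covariate balance is the standardized mean difference (SMD) between exposed and non-exposed groups, computed before and after propensity adjustment (for example, within propensity quintiles). The sketch below covers a single binary covariate only. The function name and the prevalences are hypothetical and are not taken from the manuscript, and the 0.1 cut-off is a widely used convention rather than a fixed rule.

```python
from math import sqrt

def smd_binary(p_exposed, p_unexposed):
    """Standardized mean difference for a binary covariate.

    p_exposed and p_unexposed are the proportions with the covariate
    in the exposed and non-exposed groups respectively.
    """
    pooled_sd = sqrt((p_exposed * (1 - p_exposed)
                      + p_unexposed * (1 - p_unexposed)) / 2)
    if pooled_sd == 0:
        return 0.0  # covariate is constant in both groups
    return (p_exposed - p_unexposed) / pooled_sd

# Hypothetical prevalences of one covariate (illustration only).
before = smd_binary(0.40, 0.10)   # whole sample, no adjustment
within = smd_binary(0.22, 0.20)   # inside one propensity quintile

for label, value in [("before", before), ("within quintile", within)]:
    status = "balanced" if abs(value) < 0.1 else "imbalanced"
    print(f"{label}: SMD = {value:.3f} ({status})")
```

Reporting such a table for all 13 covariates would show directly whether the propensity score achieved balance.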
A few details regarding each approach need to be provided. Specifically, it was unclear what the authors meant by saying "the first analysis was a weighted logistic regression analysis." How was a weighting variable created? Was this what the authors called "no adjustment of propensity" afterwards (in the first paragraph of the Sensitivity Analysis section, and in Table 4)? The results from the "adjustment as quintiles" and "adjustment as a continuous variable" approaches typically agree, and the effect of e-cigarette exposure would be biased if only a subset of confounding variables were considered in analyses. Moreover, the "using the variables making up the score individually" approach was actually a traditional covariate approach, which assumed a linear association between the outcome and each covariate. Authors need to justify whether this assumption was satisfied or not before they concluded the findings of the study.
Third, in analyses investigating residual confounding, the authors used some covariates from Wave 2, which might be influenced by e-cigarette exposure at Wave 1. In this case, the partial coefficient of e-cigarette use on the outcome is actually the controlled direct effect in a mediation analysis. This coefficient captures only part of the effect of e-cigarette use on the outcome and so underestimates the total effect of exposure. In this set of analyses, confounding was over-adjusted.
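The over-adjustment argument can be made concrete with the standard linear mediation decomposition. This is a sketch under simplifying assumptions (linear models, no exposure-mediator interaction, no unmeasured mediator-outcome confounding). The symbols are introduced here for illustration: X is e-cigarette use at Wave 1, M is a Wave 2 covariate affected by X, and Y is the outcome.

```latex
\begin{align}
M &= \alpha_0 + a\,X + \varepsilon_M \\
Y &= \beta_0 + c'\,X + b\,M + \varepsilon_Y \\
c &= c' + a\,b
\end{align}
```

Here c is the total effect of X on Y and c' is the coefficient obtained after adjusting for M. Whenever ab > 0, c' is smaller than c, so adjusting for M understates the total effect. In logistic regression the decomposition does not hold exactly, but the direction of the problem is the same.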
Fourth, all analyses involved only participants with complete data for each variable considered in the group being analyzed. Data were therefore assumed to be missing completely at random in all analyses. The authors need to report the sample sizes of the original data and the working data, and justify whether and why this assumption was valid in this study.
course will publish a further paper. We do not propose to revise the current manuscript.
As regards the second concern, our final analyses were based on adjustment for a limited number of confounding variables. However, these variables were selected from a much longer list (shown in Supplementary File 1), itself derived from a review of the literature, and no further variables from that list added significantly to the final model. The fact that the model enormously reduced the association between vaping and subsequent initiation of cigarette smoking was the main point we wished to make. It is possible that other models may explain more of the unadjusted association, but that will be addressed in the paper currently being prepared.
Dr. Niaura refers to reporting whether the balancing of covariates was achieved between the e-cigarette exposed and non-exposed groups. Presumably, he is referring to balancing after propensity adjustment. We intend to report on this in the planned new paper.
Dr. Niaura asked how the weighting variable was created. This variable is supplied directly to users of the PATH study and allows the user to generate results representative of the U.S. population.
It is stated that "the effect of e-cigarette exposure would be biased if only a subset of confounding variables were considered in analyses". As noted above we did take into account a very large number of potential confounders in our analyses. It would be expected that adjustment for some variables would not in fact have a material effect when other variables were already included in the analyses, i.e. they would not be true confounders.
As regards linear associations it should be noted that most of the variables were yes/no variables. We will consider non-linear relationships in our planned analyses using Wave 3.
It is well known that adjustment for inaccurately measured confounding variables may be incomplete. In the absence of a gold standard we attempted to gain insight into the problem by looking at variation in answers to corresponding questions at different Waves. It is true that this may cause problems if the answers are affected by e-cigarette use, but this could only be addressed by analyses involving more than three Waves, relating e-cigarette use at Wave 3 to initiation of smoking by Wave 4, with adjustment for questions asked at Waves 1 and 2 in non-tobacco users.
We will report fuller details regarding missing data in our planned analyses based on Waves 1 to 3. As regards the other points we do not propose to modify the current paper, preferring to take all the points into account in the paper on Waves 1 to 3.
For information, the ROELEE program is one my company started to develop over 30 years ago, initially to get output in a user-friendly form not available using programs like SAS or SPSS. It has undergone numerous updates over the years and is fully tested. It is available from ROELEE Statistics Ltd. For logistic regression analyses it gives the same results as SAS, in a more convenient form. However, the version of ROELEE we used at the time sometimes failed to converge when there were large numbers of predictor variables, whereas SAS did not. We are currently modifying the code of ROELEE to avoid this non-convergence problem.
As regards the sample sizes used in the 6 sets of analyses in Table 4, these were as follows: Main, S2 and S5 9423; S1 8900; S3 9907; S4 9354.
Competing Interests: None additional to those disclosed in the paper itself.