Method Article

Using the “Uniform Scale” to facilitate meta-analysis where exposure variables are qualitative and vary between studies – methodology, examples and software

[version 1; peer review: 1 approved with reservations]
PUBLISHED 22 Jan 2020

Abstract

Meta-analyses often combine covariate-adjusted effect estimates (odds ratios or relative risks) and confidence intervals relating a specified endpoint to a given exposure.  Standard techniques are available to do this where the exposure is a simple presence/absence variable, or can be expressed in defined units.  However, where the definition of exposure is qualitative and may vary between studies, meta-analysis is less straightforward.  We introduce a new “Uniform Scale” approach allowing expression of effect estimates in a consistent manner, comparing individuals with the most and least possible exposure. 
 
In 2008, we presented methodology and made available software to obtain estimates for specific pairwise comparisons of exposure, such as any versus none, where the source paper provides estimates for multiple exposure categories, expressed relative to a common reference group.  This methodology takes account of the correlation between the effect estimates for the different levels.  We have now extended our software, available in Excel, SAS and R, to obtain effect estimates per unit of exposure, whether the exposure is defined or is to be expressed in the “Uniform Scale”.  Examples of its use are presented.

Keywords

systematic review, meta-analysis, contrast, dose response

Introduction

Results from individual studies relating an exposure of interest to risk of a disease are often recorded as a set of covariate-adjusted effect estimates (odds ratios (ORs), or relative risks (RRs)), with 95% confidence intervals (CIs), for differing levels of exposure relative to a reference (base) level. Associated with this, data on the numbers of subjects are often presented. For case-control or cross-sectional studies these are subdivided by presence of disease. For prospective studies the numbers with disease and the numbers at risk are typically presented.

For the purpose of conducting meta-analyses, it is often the situation that meta-analysts with no access to the raw data of a study require estimates of covariate-adjusted ORs or RRs for pairwise comparisons other than those presented. For example, if the base level (0) is never smokers, and levels 1 to 4 are, respectively, former smokers and current smokers of 1–10, 11–20 and 21+ cigarettes per day, one may wish to derive estimates for current vs never (levels 2 to 4 combined vs level 0), current vs non (levels 2 to 4 combined vs levels 0 and 1 combined) or current vs former (levels 2 to 4 combined vs level 1). As the effect estimates (OR or RR) for each level are not independent, having a common reference level, one cannot derive these estimates straightforwardly. Thus, for example, using fixed-effects meta-analysis to combine effect estimates for levels 2 to 4 to get an estimate for current smoking is not correct.

A solution to this problem, described in a paper we wrote in 2008 (Hamling et al., 2008), is based on methodology developed much earlier by Greenland & Longnecker (1992). Where there are k exposure levels, the method involves deriving, using the effect estimates and their 95% CIs together with the marginal totals of numbers of subjects, a set of pseudo-numbers for the relevant 2 x (k + 1) table. These pseudo-numbers (which have no direct meaning by themselves) produce the same effect estimates and 95% CIs for comparison with the reference level, and can be combined as appropriate to produce an adjusted estimate for any pairwise comparison of different sets of levels. Our earlier paper (Hamling et al., 2008) gives examples of the methodology in action. That paper not only shows how relative effect estimates can be derived for alternative comparisons, but also presents methodology for deriving alternative comparisons when results are given by categories of disease rather than categories of exposure. It also makes available software to derive the necessary estimates using both an Excel and a SAS implementation. These implementations also produce chi-squared and p-values for heterogeneity and trend corresponding to the table of pseudo-numbers of subjects and based on trend coefficients entered by the user, using formulae 4.38 and 4.39 of Breslow & Day (1980) for case-control studies and modified versions of these formulae for prospective studies described in their later publication (Breslow & Day, 1987).

While meta-analyses are often carried out for a specific comparison of exposure, such as current smokers vs never smokers, one often wishes to quantify the effect per unit exposure. Where exposure is measured in a consistent way in each study, and is known for each level of exposure considered in each study, standard techniques are available (described in the Methods section) to derive such trend estimates. However, where exposure may be measured in various ways, and is only defined semi-quantitatively (e.g. high, medium, low) this is not the case.

Here we introduce a new, “Uniform Scale”, approach to deal with this problem. It is based on the assumption that the exposures range from 0 (least possible) to 1 (most possible), and that the N participants in any study have equally spaced exposures ranging from 1/(2N) to 1−1/(2N). By attempting always to derive an effect estimate relating to a difference of 1 unit of exposure, this “Uniform Scale” approach allows the combination of effect estimates using different measures of an underlying common exposure.

We describe how effect estimates using this “Uniform Scale” approach can be derived.

We also make available an extended version of our software to allow estimation of trend estimates, whether the exposure is quantified in standard units or expressed in the “Uniform Scale”. This extended software, now available in R as well as in Excel and SAS, can be used for cross-sectional studies as well as for case-control and prospective studies. The new software avoids problems some users had with the original Excel spreadsheet in generating the pseudo-numbers, which arose from the solver routine used, a Microsoft add-in. The new Excel software is written mainly in Visual Basic, to allow matrix inversion for matrices of variable size (which is needed for the new methodology but is not readily possible using Excel formulae). Note that the extensions of the software relating to the inclusion of trend estimates make little sense for data subdivided by level of disease, so that part of the software is essentially unchanged.

Methods

Including effect estimates per unit of exposure

Given a set of pseudo-numbers for the levels of exposure for a study, together with an estimate of the mean exposure for each level, it is often required to estimate the effect (and 95% CI) corresponding to a 1 unit increase in exposure. This enables the meta-analyst to produce combined estimates over studies based on effect estimates expressed for differing levels of exposure. Thus, for example, if interest is in the risk increase per cigarette smoked, one can combine these trend estimates over study, even when one study might report results by levels of, say, 0, 1–5, 6–10, 11–15, 16–20, 21–30 and 31+ cigarettes per day and another reports results by levels of 0, 1–19, 20 and 21+ cigarettes per day. Provided one can derive reasonable mean consumption estimates for each exposure level and one assumes that there is a linear relationship between dose and the logarithm of the effect estimate, these trend estimates of risk increase per cigarette per day can then be readily combined in a meta-analysis.
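
As a simple illustration of that final combination step (which is separate from the software described here), per-unit trend estimates can be pooled on the log scale using inverse-variance (fixed-effect) weighting. The sketch below is in R, and the study values are invented purely for illustration.

# Hypothetical per-cigarette RRs (95% CIs) from three studies, pooled on the log scale
rr <- c(1.08, 1.12, 1.05)
lo <- c(1.03, 1.06, 1.00)
hi <- c(1.13, 1.18, 1.10)
beta <- log(rr)
se <- (log(hi) - log(lo)) / (2 * 1.96)   # SEs recovered from the CI widths
w <- 1 / se^2                            # inverse-variance weights
pooled <- sum(w * beta) / sum(w)
pooled_se <- sqrt(1 / sum(w))
exp(c(pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se))  # pooled RR and 95% CI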

The methodology used for case-control and cross-sectional studies is that described by Berlin et al. (1993), together with the correction for the non-independence of results by exposure level given by Greenland & Longnecker (1992). Orsini et al. (2012) provided the modifications to be used for prospective studies. The method first derives the variance of the effect estimate for each exposure level using the width of its 95% CI. The table of pseudo-numbers is then used to estimate the correlation of pairs of results, those values then being used, together with the variance values, to estimate the covariance of each pair. The variance-covariance matrix is then inverted and used, together with the effect estimates and dose values (mean exposure levels), to estimate beta, the coefficient of the linear relationship between the dose and the logarithm of the effect estimate, and its variance. Finally, beta and its confidence limits (derived from its variance) are exponentiated to give the rate of increase in the effect estimate per unit increase in dose, with its 95% CI.
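
To make the weighted-least-squares step concrete, the following is a minimal sketch in R (not the authors' published code) for a case-control study. It assumes the Greenland & Longnecker (1992) approximation in which two log odds ratios sharing a reference group have covariance 1/A0 + 1/B0, where A0 and B0 are the pseudo-numbers of cases and controls at the reference level; all function and argument names are illustrative.

# Trend per unit dose, following the Berlin et al. (1993) / Greenland & Longnecker (1992) approach
# or, lo, hi: reported ORs and 95% CI limits for the k non-reference levels
# dose: mean exposure for those levels (with the reference-level dose already subtracted)
# a, b: pseudo-numbers of cases and controls for levels 0 (reference) to k
gl_trend <- function(or, lo, hi, dose, a, b) {
  L <- log(or)                                   # log effect estimates
  s <- (log(hi) - log(lo)) / (2 * 1.96)          # SEs from the reported CI widths
  k <- length(or)
  # variances and shared-reference covariance implied by the pseudo-number table
  v_star <- 1 / a[-1] + 1 / b[-1] + 1 / a[1] + 1 / b[1]
  c_star <- 1 / a[1] + 1 / b[1]
  S <- diag(s^2, k)                              # variance-covariance matrix of the log ORs
  if (k > 1) {
    for (i in 1:(k - 1)) for (j in (i + 1):k) {
      r <- c_star / sqrt(v_star[i] * v_star[j])  # correlation estimated from the pseudo-numbers
      S[i, j] <- S[j, i] <- r * s[i] * s[j]      # rescaled to the reported variances
    }
  }
  W <- solve(S)                                  # invert the variance-covariance matrix
  vb <- 1 / sum(dose * (W %*% dose))             # var(beta) for the no-intercept GLS fit
  beta <- vb * sum(dose * (W %*% L))             # beta = (x'Wx)^-1 x'WL
  exp(c(rate = beta, lower = beta - 1.96 * sqrt(vb), upper = beta + 1.96 * sqrt(vb)))
}

For prospective studies the same structure applies, with the variance and covariance terms implied by the pseudo-numbers taking the form appropriate to relative risks (as in Orsini et al., 2012).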

The methodology described above assumes that the unexposed group has a dose value of zero. If the unexposed dose is a non-zero value, this value is subtracted from each of the dose values specified. Subtracting the same value from each dose value does not change the slope of the relationship, so the estimate of beta is unaffected.

Including effect estimates for the “Uniform Scale”

In some situations, effect estimates are presented by level of exposure where the level is merely expressed as, for example, low, medium or high, with no quantitative estimate of the extent of exposure. An example is data relating initiation of smoking in adolescents to “connectedness” (a feeling of belonging to or having affinity with a person, social group or organisation), where connectedness may be measured in various different ways, e.g. connectedness to school, connectedness to parents, or social connectedness. If these measures all relate to a common underlying scale, one could consider combining effect estimates for the various measures in a single meta-analysis. But what scale could be used?

One possible approach, and the one we suggest here, is to imagine an underlying “Uniform Scale”, where 0 indicates the least possible connectedness and 1 the most possible connectedness, with the population considered to be made up of N individuals with equally spaced scores ranging from 1/(2N) to 1−1/(2N). Thus, if there are 100 individuals in total, the individuals would have scores of 0.005, 0.015, 0.025 … 0.975, 0.985 and 0.995, with a mean score of 0.5. If there are 30 individuals in the low group, 50 in the medium group and 20 in the high group, the mean scores in the three groups would then be 0.15, 0.55 and 0.90. Or more formally, with a total of N subjects divided into k exposure groups of size N1, N2, … Nk respectively, the mean scores in the k groups would be 1/N times, respectively, N1/2, N1 + N2/2, N1 + N2 + N3/2, N1 + N2 + N3 + N4/2, and so on.
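
As a small illustration (not part of the published software, and with a name of our own choosing), these group mean scores can be computed directly from the ordered group sizes:

# "Uniform Scale" mean score for each of k ordered exposure groups, given only the
# group sizes n = (N1, ..., Nk): score_j = (N1 + ... + N(j-1) + Nj/2) / N
uniform_scores <- function(n) (cumsum(n) - n / 2) / sum(n)
uniform_scores(c(30, 50, 20))   # 0.15 0.55 0.90, as in the example above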

These scores are estimated from the numbers of controls for case-control studies, from the numbers at risk for prospective studies and from the total numbers (of cases and non-cases) for cross-sectional studies. The method of estimating the increase in risk per unit exposure is then identical to that described in the previous section.

Note that, within the software provided, the estimated rate of increase given for the “Uniform Scale” is based on scores derived from the table of pseudo-numbers. If the user wishes to see the estimate based on scores derived from the actual distribution, these scores have to be calculated by the user, and then entered as the dose values, the estimated rate of increase then appearing as the “Dose as entered” estimate per unit exposure. The software gives the rate of increases in risk per unit exposure for both methods, so that the user can decide which method is more appropriate and so select the relevant results.

Including studies that provide other forms of dose assessment in “Uniform Scale” meta-analysis

Sometimes results by dose may be presented in ways for which the “Uniform Scale” values can be calculated without the need to use the table of pseudo-numbers or the numbers of individuals in the study.

In some studies, the effect estimate presented relates to a comparison of two groups, one with low and one with high exposure, where the groups together cover the whole population. To include these in “Uniform Scale” meta-analysis one should square the effect estimate (and its 95% CI). To demonstrate this, assume that a proportion x has the high value, and 1−x the low value. As the mean scores for the low and high value are respectively (1−x)/2 and 1−x/2, they differ by 0.5, regardless of x, and as a linear relationship is assumed between the score and the logarithm of the effect estimate, the effect estimate should be raised to the power (1/0.5), i.e. squared.

Note that where the source only provides an effect estimate for high vs low exposure, with intermediate exposures possible, application of the “Uniform Scale” methodology requires additional assumptions. Even if, for example, high and low represent the upper and lower thirds of the distribution, giving mean scores of 0.833 and 0.167 respectively, raising the effect estimate to the appropriate power, 1/(0.833 − 0.167) = 1.5, would assume that the dose-response relationship adequately fitted the complete data, even though no information on the effect for the intermediate exposures was provided.

In other studies, the effect estimate (and CI) may be presented in relation to a one point increase in a continuous scale ranging from 0 (least possible exposure) to m (most possible exposure). Here the effect may be simply converted to our “Uniform Scale”, with a range of 0 to 1 corresponding to the difference between least and most possible exposures, by raising the effect estimate (and CI) to the mth power.

Where the effect estimate (and CIs) presented relates to a one level increase in a scale with m levels, where each level relates to a range of exposures (e.g. five levels with 1=very low, 2=low, 3=medium, 4=high and 5=very high), approximate effect estimates (and CIs) may also be obtained by mth power transformation. Thus, in the example with five levels (and hence a range of 4 between the extreme levels), transforming to the 5th power, rather than the 4th, is appropriate. If the levels represent m-tiles, this approximation would be exact, as a one level increase would then be equivalent to an increase of 1/m on the “Uniform Scale”.

In the situations described above in this section the software we provide is not required, as the meta-analyst can readily convert effect estimates (and 95% CIs) to the required “Uniform Scale” by simply raising them to the required power.
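
For these simple conversions a one-line helper is enough; the sketch below (the name and the input values are purely illustrative) raises the estimate and both confidence limits to the chosen power.

# Express a reported effect estimate and its CI per 1 unit of the "Uniform Scale"
to_uniform <- function(est, lower, upper, power) c(est, lower, upper)^power
to_uniform(1.30, 1.05, 1.61, 2)   # high vs low split of the whole population: square it
to_uniform(1.10, 1.02, 1.19, 5)   # per-level estimate on a 5-level (quintile-like) scale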

In these calculations of the “Uniform Scale” effect estimates, as illustrated in the Results section below, it is sometimes necessary to know the standard deviation (SD) of the N individual values on the “Uniform Scale”, namely 1, 3, 5, 7, …, (2N−1), each divided by 2N. As the numbers are equally spaced, the mean is 0.5, so that Σx = N/2. To estimate the SD we also need Σx². Since the sum of squares of the first N odd numbers is N(2N+1)(2N−1)/3, we can divide this by 4N² to get the required value of Σx². For large N, Σx² = N/3. Based on the formula for the SD, it can then be shown that for large N, the SD is 1/√12 or 0.2887. This approximation is very good for the values of N usually reported for studies. Thus, for N = 50, SD = 0.2915 and for N = 100, it is 0.2901.
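
These values are easy to verify numerically; the figures quoted above correspond to the sample SD (the n − 1 form), which is what R's sd() returns.

# SD of the N equally spaced "Uniform Scale" scores 1/(2N), 3/(2N), ..., (2N-1)/(2N)
usd <- function(N) sd((2 * (1:N) - 1) / (2 * N))
usd(50)        # 0.2915
usd(100)       # 0.2901
1 / sqrt(12)   # 0.2887, the large-N limit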

Implementation of the software provided

Allowing entry of results from cross-sectional studies. The original version of the software (Hamling et al., 2008) considered data only from case-control and prospective studies. The updated software also allows entry of data from cross-sectional studies. In nearly all aspects the methodology for cross-sectional studies is identical to that for case-control studies, with ORs for case-control studies based on the relative frequency of cases and controls replaced by ORs for cross-sectional studies based on the relative frequency of those with and without the disease of interest. Exceptionally, when using the “Uniform Scale”, the calculation differs between case-control and cross-sectional studies. Scores are based on the distribution of exposure in the whole target population, which is best approximated by the distribution of the whole study population in cross-sectional studies and by the distribution of the controls for case-control studies. This is because controls in case-control studies are selected to have a distribution of exposure relevant to the whole target population.

The Excel implementation. The Excel spreadsheet described here is similar to that made available with our earlier paper (Hamling et al., 2008) except that it provides estimates of trend per unit dose, including trend based on the “Uniform Scale”.

The methodology for estimating trend involves matrix inversion. The spreadsheet needs to handle matrices of variable size, depending on the number of exposure levels included in analysis. In Excel this is not possible using formulae, so much of the new Excel code is written in Visual Basic. Visual Basic is also used for additional validity checks on the data entered.

The updated spreadsheet attempts to avoid problems encountered in the original software. The Excel software uses a Solver routine (a Microsoft add-in) when generating the table of pseudo-numbers. Updates to the Microsoft operating system meant that this Solver routine ceased to work for some users. The new version of the software avoids these problems by using methods to ensure that the Solver add-in relevant to the user’s operating system is available for use. This has been tested on several versions of the Microsoft operating system.

The updated Excel spreadsheet and its documentation can be downloaded from Zenodo RREst_trend.xlsm and RREst_trend.pdf respectively (Lee et al., 2019). Also available at that site are details of the testing of the spreadsheet: see RREst_Trend_Test_Files.zip, which contains Testing of RREst_Trend in R SAS and Excel.pdf (describing the testing carried out) and the .xlsm files (which provide the details entered and results of each test).

The Excel spreadsheet is provided in .xlsm format because it uses Visual Basic code. This file format can be accessed using Microsoft Excel 2007 and later versions.

Before opening the spreadsheet in Excel the user should ensure that the Solver add-in has been installed. Within Excel look for the Add-Ins option within Tools or Developer.

The use of macros needs to be enabled within Excel. As the spreadsheet is opened in Excel the user may be asked to confirm that they wish to continue opening a file containing macros.

As in the previous version, the spreadsheet provides drop-boxes for selecting categorization (by exposure levels or by categories of disease) and study type, the updated version allowing for cross-sectional studies as well as for case-control and prospective studies.

The actions to be taken by the user are:

(1) Select categorization and study type using the drop-boxes

(2) Enter the 2 × 2 table of numbers of participants. For studies categorized by exposure the rows of the table are always “unexposed” and “all exposed”, while the columns vary by study type: cases and controls for case-control studies; cases and at risk for prospective studies; and cases and non-cases for cross-sectional studies. For studies categorized by disease, the rows and columns are transposed. In this 2 × 2 table the “unexposed” is the reference category presented in the study report, while “all exposed” represents the sum of all the other categories reported. Together they represent the whole study population.

(3) Enter, for each category (“exposed” level or category of disease) its name, the OR/RR estimate and its lower and upper 95% confidence limit.

(4) Enter values in the contrast column. Rows given value 0 or 1 are included in analysis, rows given value -1 are excluded. In the estimation of overall risk, rows given value 0 constitute the baseline, rows given value 1 constitute the exposed.

(5) Enter the dose values for the trend tests. The doses should be proportional to the amount of exposure. They are not meaningful for results categorized by disease type. Dose values for excluded rows are not used in analysis.

(6) Click the Calculate button to generate the estimated numbers of subjects (the pseudo-numbers which appear in the columns to the right of the entered category details) and to produce the required results. The overall risk for the specified contrast (OR/RR and 95% CI) and the results for the heterogeneity and trend test (chi-squared and p values) are as in the earlier version of the spreadsheet. The new results are the “Trend: rate of increase in risk per unit dose” (Rate and 95% CI) giving estimates both using the dose as entered and using the “Uniform Scale”.

The data entry area and the results all appear on the left-hand side of the spreadsheet, in columns A to H. This area also provides space to enter a heading (in rows 1 and 2) and notes (in rows 54–60). Saving the spreadsheet to a relevant file name and location preserves the details of the text and data entered, and the results produced.

Columns J to AF give additional information. Rows 1 to 19 give instructions to the user and notes, while rows 20 onwards give details of the underlying calculations. These include the adjusted dose values for the doses as entered (column AA) if the dose for the reference level is not zero, the dose-values using the “Uniform Scale” estimated from the pseudo-numbers (column AB) and the adjusted version of these (column AC). Adjustment simply involves subtracting the dose value for the reference level from all the other dose values.

This spreadsheet has been tested under Microsoft operating systems WIN 7, WIN 8.1 and WIN 10 using Excel 2010 and 2013. It has also been tested on MacBook using operating system Mac OS 10-13 (High Sierra) running Excel 2016.

The R implementation. This was developed as a web application using the Shiny “Web Application Framework for R” package. The application can be accessed at https://roelee.shinyapps.io/R_RRest/. The R code is available from Zenodo as the file app.R (Lee et al., 2019). Also available at that site are the method for estimating goodness of fit (described in the file Goodness of fit tests for fitted RRs.pdf) and details of the testing of the R code: see RREst_Trend_Test_Files.zip, which contains Testing of RREst_Trend in R SAS and Excel.pdf (describing the testing carried out) and the .csv files that give the input data used and the results generated in each test.

The Shiny app can be accessed in various browsers including those available for Windows, Apple and Android operating systems. The source code is provided to allow users to inspect and possibly modify the code. The code was developed in R Version 3.5.1 (2018-07-02).

Data entry and obtaining the required statistics are very similar in the R and the Excel implementations.

In the Shiny app, tabs at the top of the screen allow the user to “enter data for study”, see the “result for specified contrast”, see the table of “pseudo-numbers” or read the “notes” on using the application.

In “enter data for study”, the user enters first the number of exposed levels, the title and the study type. The user must then enter, in the “2x2 Table” and in “RRs, Contrasts and Doses”, the information described in items 2 to 5 of The Excel implementation above. Pressing the “solve” button on the left will then generate the pseudo-numbers and the required results.

The results include all those given in the Excel implementation. Additionally, for both trend types, the trend coefficient (the logarithm of the increase in risk per unit dose) and its standard error are shown; together with a measure of the goodness-of-fit of the trend (chi-squared value, its degrees of freedom and p value).

For a prospective study, where the pseudo-numbers of cases are A_i (i = 0, 1, …, k) and the pseudo-numbers at risk are N_i (i = 0, 1, …, k), the goodness-of-fit test is obtained by determining fitted numbers of cases, F_i (i = 0, 1, …, k), which satisfy the formulae

$$\sum_{i=0}^{k} F_i = \sum_{i=0}^{k} A_i \qquad \text{and} \qquad R_i = \frac{F_i N_0}{F_0 N_i}$$

where R_i are the RR values fitted using the dose and the estimated beta value. These formulae can be solved directly, and the goodness-of-fit chi-squared statistic is then derived in the usual way from the formula

$$\chi^2 = \sum_{i=0}^{k} \frac{(A_i - F_i)^2}{F_i}$$

on k − 1 degrees of freedom.
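
Because these constraints can be solved directly, the calculation for a prospective study is short. The sketch below (in R, with illustrative names rather than the authors' code) takes the pseudo-numbers of cases a and numbers at risk n for levels 0 to k, together with the fitted RRs r (with r[1] = 1 for the reference level), and returns the chi-squared statistic.

# Goodness of fit for a prospective study: fitted cases F_i proportional to r_i * n_i,
# rescaled so that sum(F) = sum(A); Pearson chi-squared on k - 1 degrees of freedom
gof_prospective <- function(a, n, r) {
  f <- r * n / n[1]
  f <- f * sum(a) / sum(f)    # satisfies sum(F) = sum(A) and R_i = (F_i * N_0) / (F_0 * N_i)
  chi2 <- sum((a - f)^2 / f)
  df <- length(a) - 2         # k - 1
  c(chi2 = chi2, df = df, p = pchisq(chi2, df, lower.tail = FALSE))
}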

For a case-control study, where the pseudo-numbers of cases are A_i (i = 0, 1, …, k) and the pseudo-numbers of controls are B_i (i = 0, 1, …, k), the goodness-of-fit test is obtained by determining fitted numbers of cases, F_i (i = 0, 1, …, k), and controls, G_i (i = 0, 1, …, k), which satisfy the formulae

$$\sum_{i=0}^{k} F_i = \sum_{i=0}^{k} A_i, \qquad \sum_{i=0}^{k} G_i = \sum_{i=0}^{k} B_i \qquad \text{and} \qquad O_i = \frac{F_i G_0}{F_0 G_i},$$

where O_i are the OR values fitted using the dose and the estimated beta value. These formulae can be solved by numerical methods (such as Newton-Raphson) and the goodness-of-fit chi-squared statistic is then derived using the formula

$$\chi^2 = \sum_{i=0}^{k} \frac{(A_i - F_i)^2}{F_i} + \sum_{i=0}^{k} \frac{(B_i - G_i)^2}{G_i}$$

on 2k − 1 degrees of freedom.

Goodness-of-fit testing for a cross-sectional study is equivalent to that for a case-control study, with controls replaced by non-cases.

The R implementation also allows the user easily to load data from a previously saved .csv file and to save the study data and results as a .csv file.

The Shiny application has been tested using Mozilla Firefox under Microsoft operating systems WIN 7, WIN 8.1 and WIN 10 and also using Microsoft Edge (under WIN 10) and Internet Explorer (under WIN 8.1). It has also been tested using Safari on MacBook under operating system Mac OS 10–13 (High Sierra).

The SAS implementation. The SAS implementation is provided as the macro RREst_trend.sas and is available from Zenodo (Lee et al., 2019). The documentation of the SAS implementation is also given on that website as RREst_trend SAS.pdf. Also available at that site are the method for estimating goodness of fit (described in the file Goodness of fit tests for fitted RRs.pdf) and details of the testing of the SAS code: see RREst_Trend_Test_Files.zip, which contains Testing of RREst_Trend in R SAS and Excel.pdf (describing the testing carried out) and the file SAS_RREst_Test_Results.pdf which provides the details of each test.

Users need a licenced installation of SAS in order to use the SAS code provided.

The macro has the following parameters:

  • ds1 - Dataset 1, the name of the input dataset containing the OR/RR, Lower CI, Upper CI, contrast and dose values for each of the exposed levels.

  • ds2 - Dataset 2, the name of the input dataset containing the 2x2 table.

  • Type - Study type. Values: CC (case-control), PR (prospective), XS (cross-sectional); (default value CC).

  • levels - How the study data is categorised. Values: EX (by exposure), DI (by disease); (default value EX).

  • out - Name of the output dataset that will hold the pseudo-numbers (default _RREst_).

  • alpha - Error probability used for the confidence intervals of the data entered (default value 0.05, equivalent to 95% CI).

  • trend - Report the trend tests? Values: 1 = Yes, 0 = No (default value 0).

  • details - Output the detailed results (details of each iteration and the final P’ and Z’ values)? Values: 1 = Yes, 0 = No (default value 0).

  • grid - Step size (out of 0–1) that should be used for finding a starting point for the iterative process (default value 0.01).

  • ini_beta - Starting point for the iterative process (default value: use the “grid” parameter’s starting point).

The first two of these parameters must be specified. If they are entered as the first two parameters, the parameter names (ds1 and ds2) are assumed and so need not be entered. The other parameters are optional. They are specified using the format parameter name = value. If not specified, the default value will be used. For example,

%RREst(mydata1,mydata2);

Here the macro is called only specifying the two input datasets containing, respectively, the details of the exposure categories and the 2x2 table. All other parameters take their default values, including study type case-control with data presented by levels of exposure and giving the pseudo-numbers dataset the name _RREst. If the study is prospective, the SAS macro could be called using

%RREst(mydata1,mydata2,type=PR) ;

Dataset 1 (ds1) should contain the details of each exposure level (or disease type) that makes up the study population, including the unexposed level. The fields within the dataset should be named as follows:

  • level - Equivalent to the Category column in the Excel implementation, giving a description of the exposure category (or disease type).

  • Est - OR/RR value. Ignored for the unexposed level.

  • lower - Lower confidence limit. Ignored for the unexposed level.

  • upper - Upper confidence limit. Ignored for the unexposed level.

  • dose - Mean exposure level (trend coefficients). Equivalent to the Dose column in the Excel implementation. If not included, dose values are assumed to be 0, 1, 2, and so on.

In addition, one or more contrast fields should be entered. These are equivalent to the Contrast column in the Excel implementation. Results will be generated for each contrast specified. Unlike the other fields, these contrast fields can be given any name.

Dataset 2 (ds2) is equivalent to the 2x2 table in the Excel implementation. It must contain two fields, their names depending on study type and categorisation:

  • Any study categorised by disease type - “Exposed” and “Unexposed”

  • Case-control study, by exposure - “Cases” and “Controls”

  • Prospective study, by exposure - “Cases” and “At_Risk”

  • Cross-sectional study, by exposure - "Cases" and "Non-cases"

In order to contain the 2x2 table values, it needs to have two data rows.

The output from the SAS macro is presented in the output window. This output includes the trend results using the “Uniform Scale” and the goodness-of-fit results as described above for the R implementation. The output is also written to output files, as described in the detailed documentation (available from Zenodo, file RREst_trend SAS.pdf) (Lee et al., 2019).

This SAS code has been tested using SAS 9.4 (64 bit) under Microsoft operating system WIN 7 and using SAS 9.4 (32 bit) under WIN 10.

Results

Data used

In this section we give examples of applying the methodology. We do this using the results presented in two papers, one a report of smoking habits and lung cancer risk in Norway (Engeland et al., 1996) and the other a report on initiation of tobacco use among adolescents in the USA (Karcher & Finn, 2005).

We used the first of these (Engeland et al., 1996) to demonstrate using the method to estimate the effect per unit dose of exposure. This paper reports several dose measures, including age of starting smoking and intensity of pipe smoking. We considered the dose measure intensity of cigarette smoking, measured as the number of cigarettes habitually smoked per day. The outcome of interest was lung cancer of any type. The paper reports a study of 8,905 men born between 1893 and 1927 who were followed up for 28 years. It provides a risk assessment with confidence interval for each of five categories of number of cigarettes smoked per day. We use these results to estimate the increase in risk associated with the consumption of one extra cigarette per day.

The second paper (Karcher & Finn, 2005) reports a cross-sectional study of 303 middle and high school students from a rural town in Midwest USA. It used a Measure of Adolescent Connectedness (MAC) instrument to assess the adolescents’ degree of caring for and involvement in specific relationships. This instrument included an assessment of parental connectedness, reported as low, medium and high connectedness. The paper reports odds ratios for experimental smoking by levels of parental connectedness. These levels of connectedness cannot be assessed as a numerical dose, but the “Uniform Scale” approach can be used.

All data input are available as underlying data (Lee et al., 2019).

Effect estimates per unit of exposure

Identifying the data to be used. In the cohort study on lung cancer risk in Norwegian men and women, Engeland et al. (1996) presented data in men on the numbers of lung cancer cases and of person-years for seven groups – never smokers (the reference group), former smokers and current smokers of, respectively, 1–4, 5–9, 10–14, 15–19 and 20+ cigarettes per day. Dividing the numbers of person-years by the number of years of follow-up (28) to give an approximate indicator of the numbers at risk, and combining the results for all the exposed groups (the last six groups), the numbers of cases were 27 in never and 306 in ever smokers, while the corresponding numbers at risk were 2,097 and 6,017. These numbers were used to populate the 2x2 table of numbers in the Excel program. Note that this table is only used as a starting point to the iterative process for estimating the pseudo-numbers so does not have to be precise.

The estimation of the pseudo-numbers also requires the adjusted RRs (95% CIs) for the seven groups, which are given in the paper as, respectively, 1.0, 1.3 (0.8 to 2.2), 1.4 (0.6 to 3.7), 4.1 (1.7 to 10.0), 7.0 (2.9 to 17.0), 11.0 (4.2 to 28.0) and 15.0 (6.1 to 37.0). For dose assessment, the midpoint consumption levels for the categories of current smokers are also needed; we derived these, using the standard distribution described by Fry & Lee (2000), as 2.5, 6.5, 10.88, 15.83 and 26.03.

As shown in Table 1, we entered the relevant data into the Excel spreadsheet. Although we subsequently excluded the result for former smokers from analysis, the whole study population was counted in the 2x2 table of numbers of participants and details of each exposure level (never smokers, former smokers and the five levels of smoking intensity) were entered because the estimation of pseudo-numbers would then be based on all the information we have about the study. For this example study we have available the numbers of participants (cases and at risk) for each of the exposure levels. Often a study will report only a summary of the numbers of participants such as the totals exposed and unexposed or the overall total numbers of participants and the proportions exposed (cases and at risk). These summary details are sufficient to allow the method to be used. The results shown indicate the following:

Table 1. Example of deriving effect estimates per unit of exposure for number of cigarettes per day (Engeland et al., 1996).

Entered data

Study type: Prospective
Categorised by: Exposure

Number of participants | Cases | At risk^c
Unexposed^a | 27 | 2,097
All exposed^b | 306 | 6,017

Exposure category | RR | Lower CI | Upper CI | Contrast^d | Dose
Never | 1.0 | - | - | 0 | 0
Former | 1.3 | 0.8 | 2.2 | −1 | 0
Current 1–4 cigs/day | 1.4 | 0.6 | 3.7 | 1 | 2.5
Current 5–9 cigs/day | 4.1 | 1.7 | 10.0 | 1 | 6.5
Current 10–14 cigs/day | 7.0 | 2.9 | 17.0 | 1 | 10.88
Current 15–19 cigs/day | 11.0 | 4.2 | 28.0 | 1 | 15.83
Current 20+ cigs/day | 15.0 | 6.1 | 37.0 | 1 | 26.03

Results

Pseudo-numbers
Exposure | Cases | At risk
Never | 19.0200 | 661.9306
Former | 61.9758 | 1659.1342
Current 1–4 cigs/day | 5.8414 | 145.2091
Current 5–9 cigs/day | 5.7557 | 48.8557
Current 10–14 cigs/day | 5.2392 | 26.0478
Current 15–19 cigs/day | 3.7340 | 11.8138
Current 20+ cigs/day | 3.5471 | 8.2298

Overall risk for the specified contrast: 3.4950 (1.9517 to 6.2585)
Heterogeneity: χ² = 69.3160, p-value 0.0000
Trend (Breslow, 1980): χ² = 66.7185, p-value 0.0000
Trend: rate of increase in risk per unit dose
  Using the doses as entered^e: 1.1228 (1.0879 to 1.1588)

^a The reference group in the study population: the never smokers

^b All non-reference participants in the study population: the sum of values for former smokers and the five groups of current smokers

^c Approximate indicators – see text

^d These should be 0 or 1 for the levels to be included in the analysis. The Overall risk result compares the exposed group levels (value 1) with the reference group level(s) (value 0).

^e We have valid dose information for the exposure categories so results based on the “Uniform Scale” are ignored.

Table of pseudo-numbers. This table is consistent with the input RR and 95% CI values in that using the standard formulae for estimating each RR (CI) from a 2 x k table will generate the results input. It is notable that the numbers of cases in smokers are much lower than the original numbers cited in the paper. This is partly due to the effects of adjustment, but also because the original numbers given in the paper are not consistent with the varying widths of the 95% CIs, which are much narrower for former smokers than for each level of current smoking.

Overall risk for the specified contrast. This calculation is identical to that provided in the original software. Given the contrast values chosen of 0, −1, 1, 1, 1, 1 and 1 the overall risk value is an estimate of the RR in all current smokers combined (contrast value 1) relative to never smokers (contrast value 0) with former smokers excluded (contrast value -1). The contrast values are not relevant to the trend analysis, except that the selection of −1 for former smokers causes that exposure level to be excluded from analysis, so the trend analysis is based only on the data for never and current smokers.

Heterogeneity and trend (Breslow, 1980). These values are identical to those given in the original software, and confirm that the trend is very highly significant, and also that the trend explains the major part of the heterogeneity between levels.

Trend: Rate of increase in risk per unit dose. Using the dose values input, the estimated RR (95% CI) is 1.1228 (1.0879 to 1.1588). This 12.28% increase per unit dose implies that the fitted risks for the five current smoking groups are 1.34, 2.12, 3.52, 6.25 and 20.26, as compared with the input values of 1.4, 4.1, 7, 11 and 15. As discussed elsewhere (Fry et al., 2013), the shape of the dose-response relationship of lung cancer risk to amount smoked may be better fitted by alternative non-linear models.

Effect estimates using the “Uniform Scale”

In the cross-sectional study relating connectedness to experimental smoking among rural youth, Karcher & Finn (2005) reported that there were 135 experimental smokers, 43 with low parental connectedness, 54 with medium connectedness and 38 with high connectedness. As these numbers formed respectively 55%, 50% and 32% of the numbers of youths at each level of connectedness, we could estimate that the numbers who were not experimental smokers were 35 for low, 54 for medium and 81 for high parental connectedness, giving totals by level of 78, 108 and 119.

The exposure scale is qualitative so the method of calculating effect estimates using actual dose values cannot be used. However, our “Uniform Scale” methodology can be used to give a result that is comparable with those for other measures of connectedness in the same or a similar population.

This study provides the numbers of participants in each exposure level (78, 108 and 119 for low, medium and high connectedness respectively) so, as a demonstration of the method, we can calculate the “Uniform Scale” scores without using the software. Calculating N1/2, N1 + N2/2 and N1 + N2 + N3/2 gives the values 39, 132 and 245.5. Scaling these to the range 0–1 is achieved by dividing by the total number of participants (305), giving scores on the “Uniform Scale” of 0.1279, 0.4328 and 0.8049.
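
The arithmetic is easily reproduced (this snippet simply repeats the calculation just described):

n <- c(78, 108, 119)            # low, medium, high parental connectedness
(cumsum(n) - n / 2) / sum(n)    # 0.1279 0.4328 0.8049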

The software uses the same method to derive “Uniform Scale” scores but bases them not on the actual numbers of participant by exposure level (which are often not available in a study report) but instead on the table of pseudo-numbers. Using this example we can compare the scores and the related trend results estimated using the pseudo-numbers with those based on the actual numbers of participants (above).

The authors also presented adjusted ORs (95% CIs) of 1.26 (0.69 to 2.27) for low vs medium parental connectedness and of 2.55 (1.40 to 4.66) for low vs high parental connectedness. As we wish to estimate ORs relative to low, we inverted these to get 0.7937 (0.4405 to 1.4493) for medium vs low and 0.3922 (0.2146 to 0.7143) for high vs low.

As shown in Table 2, we entered the relevant data into the Excel spreadsheet. In the Dose column we entered the “Uniform Scale” scores calculated above (based on the actual numbers of participants) so that the two sets of trend results generated both related to the “Uniform Scale” scores, one using the actual numbers of participants and the other based on the pseudo-numbers. Papers do not usually give numbers of participants in each exposure level and so “Uniform Scale” dose values based on actual numbers of participants cannot be calculated and entered. In these circumstances the results based on the Dose column values (Trend “Using the doses as entered”) should be ignored.

Table 2. Example of use of the “Uniform Scale” based on parental connectedness data (Karcher & Finn, 2005).

Entered data

Study type: Cross-sectional
Categorised by: Exposure

Number of participants | Cases | Non-cases
Unexposed^a | 43 | 35
All exposed^b | 92 | 135
Total | 135 | 170

Exposure category | OR | Lower CI | Upper CI | Contrast^c | Dose^d
Low | 1 | - | - | 0 | 0.1279
Medium | 0.7937 | 0.4405 | 1.4493 | 1 | 0.4328
High | 0.3922 | 0.2146 | 0.7143 | 1 | 0.8049

Results

Pseudo-numbers
Exposure | Cases | Non-cases
Low | 41.2377 | 33.6214
Medium | 51.5297 | 52.9358
High | 36.9146 | 76.7467

“Uniform Scale” values based on the pseudo-numbers: 0.1278 (low), 0.4338 (medium), 0.8060 (high)
Overall risk for the specified contrast: 0.5560 (0.3274 to 0.9443)
Heterogeneity: χ² = 11.0022, p-value 0.0041
Trend (Breslow, 1980): χ² = 10.4889, p-value 0.0012
Trend: rate of increase in risk per unit dose
  Using the doses as entered^d: 0.2382 (0.0990 to 0.5730)
  Using the “Uniform Scale” derived from the pseudo-numbers: 0.2388 (0.0994 to 0.5738)

^a The reference group in the study population: low parental connectedness

^b All non-reference participants in the study population: medium + high parental connectedness

^c These should be 0 or 1 for the levels to be included in the analysis. The Overall risk result compares the exposed group levels (value 1) with the reference group level(s) (value 0).

^d In this example these have been set to the “Uniform Scale” values calculated using the actual numbers of participants in each level
The results include the table of pseudo-numbers. This table is again consistent with the input OR and 95% CI values. The numbers are slightly lower than the original numbers, due to the increase in variance following adjustment.

Another result presented is the overall risk for the specified contrast. Given the input contrast values of 0, 1 and 1 the resulting OR relates to the pairwise comparison of medium and high combined to low. The contrast values are not relevant to the trend analysis except that entering a contrast value of −1 would have caused the exposure level to be excluded from analysis, though not from the calculation of the pseudo-numbers.

The output also shows the results of the trend analysis – the rate of increase in risk per unit dose. Using the scores calculated above, based on the actual numbers, the estimated OR (95% CI) for a 1 unit difference in exposure (most exposed compared with least possible exposure) is 0.2382 (0.0990 to 0.5730). This is very similar to the estimate based on the pseudo-numbers, of 0.2388 (0.0994 to 0.5738), because the “Uniform Scale” scores based on the pseudo-numbers are very similar to those based on the actual numbers. Note that the estimated reduction in risk is larger than that for the high vs low comparison (OR 0.3922, 95% CI 0.2146 to 0.7143), as that comparison is only based on an estimated difference in exposure of about 0.68 units rather than being based on a difference of 1 unit.

Additional examples of using the “Uniform Scale”

In addition to the example above, it is useful to give some other examples where the nature of the information presented needs particular consideration in order to provide effect estimates in the “Uniform Scale”.

One example is the study by Lloyd-Richardson et al. (2002), which reports results from a cross-sectional study based on analyses of 19 818 adolescents, of whom 10 924 had at least experimented with smoking. The authors present an OR (CI) of 1.16 (1.03–1.30) for low school connectedness, assessed using an eight item scale, the number of levels per item not being given. The authors note that, in coding the data, the variable was standardized by “subtracting off the median and dividing them by the distance between the median and the third quartile”. As the median corresponds to a mean score of 0.5 and the third quartile to a mean score of 0.625 (midway between 0.5 and 0.75) this suggests the difference in mean score is 0.125, so that the OR (CI) should be raised to the eighth power (1/0.125 = 8) to give the required result of 3.28 (1.27–8.16).

Another example is the study by Simons-Morton & Haynie (2003) in which 973 students completed surveys at the beginning and end of the sixth grade. Their Table 4 gives an OR for high v low social competence of 0.71 (0.52–0.98). The authors note that social competence is measured using eight items, with a mean of 22.37 (SD 5.39). It is unclear how many levels there are for each item, though possibly four given the mean. Nor is it clear whether high v low is just a simple breakdown of the population into two groups, in which case the OR (CI) should be squared, as noted above, to give 0.50 (0.27–0.96). Alternatively, if high is the highest quartile, and low the lowest quartile, one is comparing groups with mean scores of 0.875 and 0.125, a difference of 0.75 so that the OR (CIs) should be raised to the power of 1/0.75 = 4/3, giving a result of 0.63 (0.42–0.97). More information would be needed before this study could be included in any meta-analysis.

Finally we consider the study by Kandel et al. (2004) who compared smoking onset factors in 5,374 adolescents who had never smoked based on data from the National Longitudinal Study. They report ORs of 0.59 (0.45–0.77) for positive scholastic attitudes and 0.93 (0.75–1.16) for parent-child connectedness. For positive scholastic attitudes they averaged four 5-point items, so presumably the OR related to a 1 point difference in a scale that can vary by 4 points. This indicates that one should power up the 0.59 (0.45–0.77) by 5 to give 0.07 (0.02–0.27). For parent connectedness the authors refer to a 13 item scale, but state neither the number of points in each scale nor whether the values for individual items were summed or averaged. The authors refer to Resnick et al. (1997) as using this 13-item scale, that paper stating that this variable was standardized “to a mean of 0 and an SD of 1”, so that “parameter estimates can be interpreted as standardized B”. Assuming that the paper considered here did the same, and noting that, as derived earlier, the standard deviation of our “Uniform Scale” is approximated by 1/√12 = 0.2887 for large N, one needs to power the OR up by √12 to give the units we require. This gives an OR for parent-child connectedness of 0.78 (0.37–1.67).

Note that in all three of these last examples, the new software is not required, the user only having to raise reported effect estimates to the appropriate power to obtain the required estimate, corresponding to the difference in risk between the most and least possible exposures.
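
The conversions in these three examples reduce to raising the reported values to the appropriate power, as the following check of the figures quoted above shows (rounding to two decimal places):

pow <- function(est, lower, upper, p) round(c(est, lower, upper)^p, 2)
pow(1.16, 1.03, 1.30, 8)          # Lloyd-Richardson et al.: 3.28 1.27 8.16
pow(0.71, 0.52, 0.98, 2)          # Simons-Morton & Haynie, simple high/low split: 0.50 0.27 0.96
pow(0.71, 0.52, 0.98, 4 / 3)      # ... or extreme quartiles: 0.63 0.42 0.97
pow(0.59, 0.45, 0.77, 5)          # Kandel et al., positive scholastic attitudes: 0.07 0.02 0.27
pow(0.93, 0.75, 1.16, sqrt(12))   # Kandel et al., parent-child connectedness: 0.78 0.37 1.67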

Note also that, when conducting meta-analyses, it is important to be sure that the effect estimates are calculated in the same direction. If some give results for high v low and some for low v high, it will be necessary to invert the estimates as appropriate to ensure combinability.

Comparing results using a known dose scale and using the “Uniform Scale”

It is of interest to see what results would have been obtained in the example shown in Table 1 had we not had estimates of the dose for never smokers and for current smokers by amount smoked, but had instead used the “Uniform Scale” methodology to estimate doses for each level based on the distribution of the at risk population. Here the RR (CI) is estimated as 29.5493 (11.0943 to 78.7039), as compared with the original trend estimate of 1.1228 (1.0879 to 1.1588). The RR of 29.5493 is equal to 1.1228 raised to the power 29.23. As the original estimate is per cigarette per day, this implies that the “Uniform Scale” relates to a full range of 29.23 cigarettes per day. While no doubt some smokers smoke more cigarettes per day than this, the result does not seem implausible, given that the original calculation was based on dose means, with that for the heaviest smoking group estimated as 26.03 cigarettes per day.
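
The quoted equivalence is simply the ratio of the two slopes on the log scale:

log(29.5493) / log(1.1228)   # about 29.2, the implied full range in cigarettes per day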

Discussion/conclusions

Since we published our original paper in 2008 (Hamling et al., 2008), we have carried out numerous meta-analyses making extensive use of the software (e.g. Forey et al., 2011; Lee et al., 2012; Lee et al., 2017; Lee et al., 2016a; Lee et al., 2016b; Lee & Hamling, 2009; Lee & Hamling, 2016). We have also carried out dose-response meta-analyses (Fry et al., 2013), though we have only recently updated the Excel software. While the methodology has proved useful to us and to other researchers in situations where the raw data for a study are not available, there are some difficulties in applying it. These include presentation of the original data to insufficient accuracy; non-availability of the required 2x2 table (e.g. cases/controls x unexposed/exposed for a case-control study), so that approximate estimates have to be used; and occasional failure of the estimation of the pseudo-numbers to converge to a solution, even after using different starting points for the iterative process. Nevertheless, the method has given what appear to be plausible estimates in many practical applications.

Extending the software to estimate increases in risk per unit of exposure does bring a few additional problems. One is the difficulty in obtaining an estimate of the midpoint exposure for data grouped by ranges of exposure, especially when the highest range is open-ended. Another is that the trend estimation assumes an underlying dose-response shape that may not necessarily apply. Nevertheless, the extensions to the software to include trend estimation do provide the meta-analyst with a useful tool for combining dose-response results from studies that group dose in varying ways.

The extension of the software to use the “Uniform Scale” also allows the meta-analyst to attempt combination of results from studies using qualitative rather than quantitative estimates of exposure, possibly derived using a variety of measures of some underlying exposure. By attempting to derive an effect estimate for the range from the most exposed possible (with score 1) to the least exposed possible (with score 0), the meta-analyst is given a way of combining results on a consistent basis. Clearly, the underlying assumption - that the study population is made up of individuals with successively increasing exposure by an equal amount - is dubious, but we nevertheless feel that the method is useful. Other approaches, but with a different specification of the distribution of exposure, may well be possible, but we have not investigated these so far.

Data availability

Underlying data

Zenodo: Software for use in meta-analysis, providing Effect estimates per unit of exposure, including the Uniform scale. http://doi.org/10.5281/zenodo.3582481 (Lee et al., 2019)

This project contains the following underlying data:

  • RREst_Trend_Test_Files.zip (various files describing the testing carried out, including .xlsm files used in testing the Excel version and .pdf files describing the R and SAS testing: the Methods section above names the files relevant to the testing of each implementation).

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Software availability

Archived source code at time of publication: http://doi.org/10.5281/zenodo.3582481 (Lee et al., 2019)

License: Creative Commons Attribution 4.0 International license (CC-BY 4.0)

This contains the following files:

  • app.R (the code for the R implementation)

  • RREst_trend.SAS (the SAS implementation)

  • RREst_trend.xlsm (the Excel implementation)

  • RREst_trend.pdf (the Excel implementation documentation)

  • RREst_trend SAS.pdf (the SAS implementation documentation)

  • Goodness of fit tests for fitted RRs.pdf (a document describing how the goodness-of-fit statistics are calculated)

How to cite this article:
Lee PN, Hamling J, Fry JS et al. Using the “Uniform Scale” to facilitate meta-analysis where exposure variables are qualitative and vary between studies – methodology, examples and software [version 1; peer review: 1 approved with reservations]. F1000Research 2020, 9:33 (https://doi.org/10.12688/f1000research.21900.1)

Peer review discontinued

At the request of the author(s), this article is no longer under peer review.
Reviewer Report 11 Feb 2020
Chang Xu, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, China 
Approved with Reservations
Lee et al. have presented an interesting and important work for estimating the missing data in dose-response meta-analysis. It is so appreciated for these authors for their contribution in this area. I have some further comments that hope will be …
Author Response 04 Apr 2024
Peter Lee, Director, P.N. Lee Statistics and Computing Ltd, Sutton, SM2 5DA, UK
I thank the reviewer for his kind comments. However, I have decided not to follow his specific suggestions for amending the paper.
Competing Interests: None