The Noetic Experience and Belief Scale: A validation and reliability study

Background: Belief in the paranormal is widespread worldwide. Recent surveys suggest that subjective experiences of the paranormal are common. A concise instrument that adequately evaluates beliefs as distinct from experiences does not currently exist. To address this gap, we created the Noetic Experience and Belief Scale (NEBS), which evaluates belief and experience as separate constructs.
Methods: The NEBS is a 20-item survey with 10 belief and 10 experience items rated on a visual analog scale from 0-100. In an observational study, the survey was administered to 361 general population adults in the United States and to a subsample of 96 one month later. Validity, reliability and internal consistency were evaluated. A confirmatory factor analysis was conducted to confirm the latent variables of belief and experience. The survey was then administered to a sample of 646 IONS Discovery Lab participants to evaluate divergent validity and confirm belief and experience as latent variables of the model in a different population.
Results: The NEBS demonstrated convergent validity, reliability and internal consistency (Cronbach's alpha Belief 0.90; Experience 0.93) and test-retest reliability (Belief: r = 0.83; Experience: r = 0.77). A confirmatory factor analysis model with belief and experience as latent variables demonstrated a good fit. The factor model was confirmed as having a good fit and divergent validity was established in the sample of 646 IONS Discovery Lab participants.
Conclusions: The NEBS is a short, valid, and reliable instrument for evaluating paranormal belief and experience.


Introduction
"Paranormal beliefs pertain to phenomena that have not been empirically attested to the satisfaction of the scientific establishment" 1 . Paranormal beliefs encompass a broad range of concepts, such as ghosts or spirits, extrasensory perception (ESP), extraterrestrial beings, and mind-to-mind communication, or telepathy. Belief in the paranormal is widespread around the world 1-13 . For example, in a Gallup poll of 1,002 United States adults conducted in 2005, 55% of respondents believed in psychic or spiritual healing or the power of the human mind to heal the body, 41% believed in extrasensory perception, and 31% believed in telepathy or mind-to-mind communication 14 .
However, having a belief in the paranormal does not necessarily mean having experienced the paranormal. A paranormal experience refers to an individual's memory of an experience that one judges to be genuine. The memory of a paranormal experience relies on a different mental substrate than a belief based on environment, education and reasoning and the neural structures underlying memory of an experience and belief are likely different 15,16 . Paranormal belief and experience are often correlated when measured simultaneously, although this is rarely done 17,18 . For example, one study found a positive correlation (r = 0.61) between paranormal experience and belief scores 12 . Another interesting study found that exposure to television programs that regularly depict paranormal phenomena was positively correlated with belief, but only for respondents who had personal experiences 19 .
Prevalence of reported paranormal experiences evaluated over the last 40 years in a variety of populations has ranged from a low of 10% in Scottish citizens 20 to a high of 97% in paranormal enthusiasts in the United States 12 . Two very large prevalence studies have been conducted. One surveyed adults in 13 European countries and the United States (N = 18,607). European respondents reported experiencing telepathy (34%), clairvoyance (21%), and contact with the dead (25%). Percentages for the U.S. adults were considerably higher: 54%, 25%, and 30% respectively 21 . Another large study of British adults (n = 4,096) found that 37% of respondents reported at least one paranormal experience, defined as precognitions, extra-sensory perception, mystical experiences, telepathy, and after-death communication 22 . Other smaller prevalence studies have been conducted around the world. Haraldsson et al. conducted two surveys of prevalence in Iceland, one in 1974 with 902 participants 5 and one in 2006 with 991 participants 3 . They found that reports of psychic phenomena increased from 59% of men and 71% of women in 1974 to 70% of men and 81% of women in 2006. In Scotland, 10-16% of the general population sample (n = 241) had experienced second sight, with the exception of the Grampian area, where prevalence was more than doubled at 33% 20 . Chinese, Japanese, African-American and Caucasian-American college students (n = 1922) were surveyed and 31-47% reported having at least one experience 23 . Of 502 adults in Winnipeg, Canada, 65% 24 and of 622 Charlottesville, Virginia students and townspeople, 38% 25 reported having at least one experience. In the United States, 67% of the 1460 participants reported having had an ESP experience, 31% a clairvoyant experience, and 42% contact with the dead 26 . More recently in the United States, 89.3% of the general population, 89.5% of scientists and engineers, and 97.8% of paranormal enthusiasts reported at least one paranormal experience (n = 899) 12 .
Specificity of the work in this field is limited by the lack of questionnaires that adequately separate paranormal belief from experience and do so concisely 1,27 . Using ambiguous measures can confound the two constructs of belief and experience and blur results 1,6 . Instruments that do separate these constructs are long and not conducive to the time constraints of many studies (see the Exceptional Experiences Questionnaire 28 and the Anomalous Experience Inventory 29 ). To address these limitations and as part of a larger research program on extended human capacities, we created the Noetic Experience and Belief Scale (NEBS), a 20-item survey that evaluates paranormal beliefs and experiences separately. The present studies investigate the psychometric properties of the Noetic Experience and Belief Scale in two populations. By studying these phenomena, we aim to gain a deeper understanding of the nature of consciousness and the reach of human potential.
The objectives of the following two observational studies were to evaluate the validity and reliability of the Noetic Experience and Belief Scale (NEBS) and to confirm the two latent variables of belief and experience in a confirmatory factor analysis. In study 1, the survey was administered to 350 participants for the validity and confirmatory factor analyses and again to a subsample of 96 of these participants for a test-retest analysis. In study 2, the survey was administered to a different population where divergent validity was evaluated and the factor model reevaluated. We hypothesized that NEBS would be valid, reliable, and demonstrate good fit for a model with belief and experience as latent variables in both populations.

STUDY 1: General population sample
Methods
Initial development of the NEBS. The NEBS was developed through consensus by the authors and two expert consultants who actively work in the field. This group was informed by our own previous studies and by reviewing other studies and previously used instruments that evaluated paranormal beliefs and/or experiences. One previous study 30 evaluated the prevalence of 27 paranormal experiences listed here in decreasing order of prevalence: Claircognizance, Clairempathy, Precognition, Lucid Dreaming, Emotional Healing, Clairvoyance, Clairsentience, Animal Communication, Telepathy, Aura Reading, Astral Projection, Clairaudience, Clairalience, Mediumship, Channeling, Physical Healing, Geomancy, Retrocognition, Psychometry, Remote Viewing, Automatic Writing, Clairgustance, Psychokinesis, Pyrokinesis, Levitation, and Psychic Surgery (please see extended data for definitions of each of these terms 31 ). Based on feedback from participants and a review of these items, we removed emotional healing (very similar to physical healing), psychic surgery (very rare), and clairsentience (very similar to claircognizance), renamed channeling to psychophony and mediumship to contact with the dead, and added the item Information from Dreams. We then conducted another prevalence study in a different population. Notably, we did not use the jargon term for each paranormal belief/experience, but instead used language that was as neutral as possible to describe the experience itself. For example, rather than asking if the participant had ever experienced "pyrokinesis -the ability to create and/or manipulate fire", the item asked "Have you ever created fire using only your concentration or will?" These neutral language items were then administered to 899 participants consisting of three samples: a general population sample, scientists and engineers, and paranormal enthusiasts 12 .
In both studies, we found that some items were highly correlated and represented overlapping constructs. They could also be viewed as specific nuanced experiences within a larger extended human capacities category. For example, psychic physical healing, or the purported ability to feel other people's physical symptoms in your own body and heal, transform, or transmute them, would fall under the umbrella category of psychokinesis, or the purported ability to influence a physical system without any physical interaction or with mental effort alone. Thus, in an effort to reduce participant burden and allow for quick assessment of experiences and beliefs, we collapsed any overlapping constructs into individual items for each of the following categories: 1. Non-local consciousness (e.g. Astral Projection, Lucid Dreaming); 2. Extraterrestrials; 3. Precognition/Retrocausation; 4. Survival of Consciousness (after bodily death); 5. Contact with the dead (Mediumship); 6. Clairvoyance (Claircognizance, Clairempathy, Clairvoyance, Clairsentience, Aura Reading, Clairalience, Clairaudience, Geomancy, Clairgustance, Remote Viewing, Psychometry, Animal Communication); 7. Psychokinesis (Physical Healing, Psychokinesis, Psychic Surgery, Pyrokinesis, Levitation); 8. Telepathy; 9. Automatism (Channeling, Automatic Writing).
We also reviewed a number of existing questionnaires that measured paranormal experience and/or belief 8,26,28,29,32-45 . A summary of this review is presented as Supplemental data A (please see extended data 31 ). Each questionnaire was evaluated for the number of items, whether it assesses belief, experience or both, whether it evaluates belief and experience as separate constructs, and subscales if applicable. From this review, an additional item on intuition, representing perhaps the most common paranormal experience, was added to the new scale for a total of 10 items.
The instrument was called the Noetic Experience and Belief Scale using noetic from the Greek noēsis/noētikos, meaning inner wisdom, direct knowing, or subjective understanding; unlike a vague impression, a noetic experience carries a deep sense of authority and certainty. We included "noetic" in the title rather than "paranormal" in part because of the stigma associated with the term paranormal, which could introduce bias that might be mitigated by using an alternate term. Similarly, the paranormal categories were not stated in the scale; only descriptions of the constructs were included (please see extended data 31 ).
Procedures. The first study administered the NEBS to a randomly selected general population group in the United States to establish validity and test-retest reliability, and to confirm the two latent variables of belief and experience. We contracted with Lucid, LLC (New Orleans, Louisiana) to obtain completed surveys from an unbiased census-distributed sample of 350 participants representative of the general population in the United States. The sample was unbiased in that it was not associated with the Institute of Noetic Sciences or any other paranormal or noetic-related group. Lucid, LLC is a marketplace that connects hundreds of sample suppliers with individual primary research studies to facilitate online surveys. Lucid uses screening questions to qualify respondents for a particular study, then through programmatic technology aligns the best suppliers for that individual audience. Once a respondent qualifies through the screener, the appropriate suppliers are notified through an API and an email is triggered from the supplier directly to the survey taker. Each of the suppliers on the marketplace has approximately 200 pre-profiled mapped qualifications. These include age, gender, household income, job role, hobbies, etc.
Lucid uses these qualifications as well as the screening questions to ensure efficiency and high quality when matching survey takers with individual projects. All potential volunteers are screened, checked for validity, and emailed a link to the survey. Participants were English-speaking adults in the United States. Inclusion criteria were: adults aged 18 to 89 who could read and understand English and were willing to complete questionnaires. Exclusion criteria were: children (<18 years old) or elders >89 years old. Elders 90 years old and older were excluded because the survey was designed to be anonymous and recording ages of 90 or older is considered private health information 46 . Targets for distribution were based on United States census values and were as follows: Gender: 50% male, 50% female; Age: 18-24, 13%; 25-44, 41%; 45-64, 30%; 65+, 16%; Ethnicity: Hispanic, 11%; Black, 12%; White (non-Hispanic), 59%; Other, 18%.
The study was approved by the Institute of Noetic Sciences Institutional Review Board # WAHH_2018_06. Participants were given a link to a Health Insurance Portability and Accountability Act compliant survey on SurveyMonkey. The first page of the survey was a consent form (please see extended data 31 ). Participants were asked to read the form and check a box acknowledging that they had been informed of the procedures, and risks and benefits of participating in the study. They then completed the survey, which took approximately 15-20 minutes. Data were collected from November 9, 2018 through December 14, 2018. All data were collected anonymously, with no identifiers or IP addresses. Participants were compensated $3 for completing the survey once and $7 ($3 + $4) if they also participated in the retest administration.
In total, 444 began the survey; 26 did not agree to the consent form and 57 agreed to the consent form but did not complete the survey. The remaining 361 participants completed the survey (underlying data 47 ). Surveys were collected between November 9, 2018 and December 13, 2018. Participants were on average 44 years old ± 16.8 and had 14.5 ± 5.3 years of education. Of these, 52% were female and 56.8% were in a relationship. Participants were mostly Caucasian (67% Caucasian, 13% Black or African American, 8% Hispanic or Latino, 6% Asian or Pacific Islander, 5% American Indian or Alaskan Native, and 2% preferred not to answer). In terms of salary, 67% of participants earned between $0 and $75,000 (30% Under $30K, 37% $30K to under $75K, 11% $75K to under $100K, 11% $100K to under $150K, 7% $150K to under $250K, 3% $250K or greater, 2% Decline to answer), with an average household size of 2.6 ± 1.4.
Of the original 361 participants who completed the survey, 96 completed the same survey again approximately one month after the first administration (mean 35.3 days ± 3.7) between December 14, 2018 and January 2, 2019. Participants who completed the retest had demographics similar to those of the original sample (age 44 years old ± 16.9, education 14.3 ± 2.1 years, 54% male, 64% Caucasian, 53% in a relationship, 74% with income under $75,000, and average household size of 2.4 ± 1.4).
Measures. In addition to demographic information (e.g. age, gender, marital status, socioeconomic status), the main instrument in the survey was the Noetic Experience and Belief Scale (NEBS). The scale contains ten statements about beliefs in intuition, non-local consciousness, extraterrestrials, precognition, survival of consciousness, contact with the dead, clairvoyance, psychokinesis, telepathy, and automatism that all begin with the stem "I believe…" followed by a description of the concept. The participant rates each belief statement on a slider anchored by Disagree Strongly (0) and Agree Strongly (100). For each of the ten items, participants also answered "I have personally had this experience." on a slider scale anchored by Never (0) and Always (100). Two experience items were worded differently to accommodate the nature of the concept. The life after death experience item was worded "I have personally had an experience that I interpreted as a proof that consciousness survives the physical body." and the contact with the dead item was worded "I have personally had the experience of contact with the dead." Six of the 10 items were from the Australian Sheep-Goat Scale, three of which were exactly the same (#'s 9, 10, 11), and three were modified (#'s 4, 5, 14) 45 . The scale yields overall scores for paranormal belief and experience by averaging the ten items for each subscale. Item scores can also be used individually for scores on each specific category. Internal consistency of the NEBS scale was calculated with a Cronbach α coefficient, as described in subsequent sections (the full scale is available in extended data 31 ).
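The subscale scoring rule described above (the mean of ten 0-100 slider ratings per subscale) can be illustrated with a minimal Python sketch. The function name `score_nebs` and the example ratings are hypothetical and are not study data; the sketch is not the authors' scoring code.

```python
from statistics import mean


def score_nebs(belief_items, experience_items):
    """Return (belief_score, experience_score), each the mean of ten 0-100 ratings."""
    for name, items in (("belief", belief_items), ("experience", experience_items)):
        if len(items) != 10:
            raise ValueError(f"expected 10 {name} items, got {len(items)}")
        if any(not 0 <= value <= 100 for value in items):
            raise ValueError(f"{name} ratings must lie between 0 and 100")
    return mean(belief_items), mean(experience_items)


# Hypothetical respondent (illustrative values only).
belief = [80, 60, 20, 50, 70, 40, 30, 10, 45, 25]
experience = [70, 30, 0, 20, 10, 5, 15, 0, 25, 5]
belief_score, experience_score = score_nebs(belief, experience)
print(belief_score, experience_score)
```

Individual item ratings remain available for category-level analysis, as the text notes; only the two subscale means are computed here.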
Convergent construct validity was measured by administering pre-existing survey instruments that evaluate similar concepts: Australian Sheep-Goat Scale 45,48 , Revised Paranormal Belief Scale 43 , and Anomalous Experiences Inventory (AEI) 29 . The Australian Sheep-Goat Scale is an 18-item questionnaire on various beliefs and experiences. Respondents endorse True (2 points), Uncertain (1 point), or False (0 points) for each item. Values are then summed to form a score ranging from 0-36. The Revised Paranormal Belief Scale is a 26-item scale that measures the degree of belief in the paranormal in each of seven dimensions: Traditional Religious Belief, Psi, Witchcraft, Superstition, Spiritualism, Extraordinary Life Forms, and Precognition. Respondents endorse how strongly they believe in each item on a 7-point Likert scale. Subscales and a total score are obtained by calculating means of specific items. The AEI is a 70-item questionnaire that evaluates multiple subscales: anomalous/paranormal experience, anomalous/paranormal beliefs, anomalous/paranormal ability, fear of the anomalous/paranormal, and drug use. Respondents answer True (1) or False (0) for each item and values are summed for each scale. The scales selected have already been assessed as valid and reliable and have been used in numerous peer-reviewed publications. Correlation matrices of the scores were evaluated for expected patterns of associations between measures of the same construct.
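As an illustration of the summed scoring described for the Australian Sheep-Goat Scale (True = 2, Uncertain = 1, False = 0 across 18 items, giving a 0-36 total), a short Python sketch follows. The function name `score_asgs` and the example responses are hypothetical.

```python
# Point values as stated in the text for the Australian Sheep-Goat Scale.
ASGS_POINTS = {"True": 2, "Uncertain": 1, "False": 0}


def score_asgs(responses):
    """Sum the point values of 18 True/Uncertain/False responses (range 0-36)."""
    if len(responses) != 18:
        raise ValueError(f"expected 18 responses, got {len(responses)}")
    return sum(ASGS_POINTS[response] for response in responses)


# Hypothetical respondent: 6 True, 6 Uncertain, 6 False.
example = ["True"] * 6 + ["Uncertain"] * 6 + ["False"] * 6
asgs_total = score_asgs(example)
print(asgs_total)
```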

Statistical Methods
Test-Retest. Some participants repeated the survey approximately one month later so that test-retest reliability could be assessed with a Pearson correlation coefficient.
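A minimal sketch of the test-retest computation follows, assuming paired subscale scores from the first and second administrations. The helper `pearson_r` is written out from the textbook definition rather than taken from Stata, and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical Belief subscale scores at test and retest for five respondents.
test_scores = [43.0, 71.5, 12.0, 55.0, 88.0]
retest_scores = [40.0, 75.0, 20.0, 50.0, 80.0]
retest_r = pearson_r(test_scores, retest_scores)
print(round(retest_r, 2))
```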
Sample size. Some sources suggest at least 10 people per item for psychometric validation although a recent review suggested that sample size is rarely justified a priori 49 . We aimed for a sample size of 350 for the 20-item scale. For confirmatory factor analysis, there is also no agreement on the number of participants needed although sources 50 recommend approximately 10 participants for each estimated parameter (10 × 20 parameters = 200). We had 361 participants resulting in a ratio of 18.05 participants to each parameter estimated.
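The arithmetic behind these sample-size figures can be checked with a few lines of Python. The function names are illustrative; the inputs (20 parameters, 10 participants per parameter, 361 completers) come from the paragraph above.

```python
def minimum_sample(n_parameters, per_parameter=10):
    """Rule-of-thumb minimum N: a fixed number of participants per estimated parameter."""
    return n_parameters * per_parameter


def participants_per_parameter(n_participants, n_parameters):
    """Observed ratio of participants to estimated parameters."""
    return n_participants / n_parameters


required_n = minimum_sample(20)                     # 10 x 20 parameters
observed_ratio = participants_per_parameter(361, 20)
print(required_n, observed_ratio)
```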

Confirmatory factor analysis.
A confirmatory factor analysis was used (rather than an exploratory factor analysis) because a theoretical framework was already established for evaluating belief and experience as separate, albeit highly correlated, constructs 1,12 . The latent variables for the model were Belief and Experience. Observed variables were the 20 NEBS items. Univariate variables were tested for normality with the Shapiro-Wilk Test and any outliers assessed with scatter and box plots. Normality of residuals was evaluated with kernel density estimates and standardized normal probability plots. Outliers were evaluated with residuals, leverages, influence and Cook's distance. Multicollinearity was evaluated with the variance inflation factor (VIF), which is the quotient of the variance in a model with multiple terms by the variance of a model with one term alone and quantifies the severity of multicollinearity. An unstructured covariance matrix was used so as to not impose any constraints on the variance and covariance values. All 20 items were highly correlated and thus, covariances between unique factors for all items were included in the model and then removed if they did not reach significance. All statistical analyses were conducted with Stata 15.0 (StataCorp, LLC, College Station, TX).
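To make the VIF concrete, the sketch below uses the standard identity that the VIF of variable j equals 1/(1 - R_j^2), which is also the j-th diagonal element of the inverse of the predictors' correlation matrix. This is a pure-Python illustration under that assumption, not the Stata routine used in the study; the helper names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return one VIF per variable; `columns` is a list of equal-length score lists."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(k)]


# Hypothetical item scores for three variables (illustrative only).
items = [[10, 20, 30, 40, 50], [12, 18, 33, 41, 47], [50, 10, 40, 20, 30]]
print([round(v, 2) for v in vif(items)])
```

A VIF of 1 indicates no linear overlap with the other variables; larger values indicate stronger multicollinearity.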

Construct validity
The means and standard deviations for the paranormal belief and experience questionnaires are shown in Table 1. All Table 1 correlation pairs were positive and significant at the p ≤ 0.05 level (all but three at p < 0.00005).

Reliability
Internal consistency. Cronbach's alpha was calculated for the NEBS Belief subscale items and Experience subscale items to measure the extent to which the items within the subscales correlated with each other and measured a similar construct 51 . The ten belief items had a Cronbach's alpha of 0.90 and average inter-item covariance of 429.9. The ten experience items had a Cronbach's alpha of 0.93 and average inter-item covariance of 610.7.
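For reference, Cronbach's alpha for k items can be computed as k/(k - 1) × (1 - sum of item variances / variance of the summed score). The sketch below implements this textbook formula in Python; it is not the Stata routine used in the study, and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    `items` is a list of k item columns, each holding the scores of the same
    n respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(column) != n for column in items):
        raise ValueError("each item needs the same number (>= 2) of respondents")
    sum_item_variances = sum(variance(column) for column in items)
    totals = [sum(scores) for scores in zip(*items)]  # summed score per respondent
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings: three items answered by four respondents.
example_items = [[10, 40, 60, 90], [20, 35, 70, 85], [15, 50, 55, 95]]
print(round(cronbach_alpha(example_items), 2))
```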
Belief: On average, intuition, survival of consciousness, and non-local consciousness were the highest rated Beliefs (see means and standard deviations for each item in Table 2). All Belief construct pair correlations were significant (p < 0.00005). Telepathy Belief and clairvoyance Belief were highly correlated.
Experience: On average, intuition and non-local consciousness were the most common Experiences. All Experience pairs were significantly correlated (p < 0.00005). Seven Experience pairs were highly correlated (r = 0.70-0.89). Many Experiences were moderately correlated (r = 0.50-0.69).
Belief and Experience: Most Belief and Experience pairs were significantly correlated at the p<0.000005 level except for belief in intuition and the experience of extraterrestrials (p = 0.0004), contact with the dead (p = 0.0001), psychokinesis (p = 0.002), telepathy (p = 0.002), and automatism (p = 0.0016). Belief in telepathy was highly correlated with the Experience of telepathy (r = 0.79). Belief and Experience pairs of the same construct were all moderately correlated except for Survival of Consciousness which had a significant but low correlation. Many beliefs were moderately correlated with Experiences.
Belief and Experience as separate constructs
Confirmatory factor analysis was performed based on data from 361 respondents; there were no missing data. The retest data of the 96 participants were not included in the confirmatory factor analysis modeling. A correlation table of observed values with means and standard deviations is shown in Table 2. The a priori theoretical model of Belief and Experience items as described in the statistics section is presented in Figure 1.
We hypothesized a two-factor model to be confirmed in the measurement portion of the model where Belief and Experience were the latent variables. We evaluated the assumptions of univariate and multivariate normality and linearity. Univariate variables were not normally distributed individually. The asymptotic distribution free (ADF) estimation method was used because it makes no assumption of joint normality or even symmetry for observed or latent variables (StataCorp, 2013).
Arizona Integrative Outcomes Scale (AIOS) is a one-item, visual analogue self-rating scale (VAS) with two alternate forms (one for daily ratings, AIOS-24h; and one for monthly ratings, AIOS-1m). The daily rating version was used for this study. The instructions are: "Please reflect on your sense of wellbeing, taking into account your physical, mental, emotional, social, and spiritual condition over the past 24 hours. Mark the line below with an X at the point that summarizes your overall sense of well-being for the past 24 hours." The horizontally-displayed VAS is 100 mm in length, with the low anchor being, "Worst you have ever been" and the high anchor being, "Best you have ever been." The AIOS has demonstrated the ability to discriminate between healthy and unhealthy populations and has adequate convergent and divergent validity 54 .
Positive and negative affective well-being is measured with a variety of dichotomous indicators asking subjects whether they had experienced an emotional state for much of the day yesterday. For positive affect, the emotional states are happiness, enjoyment and smiling/laughter, which, aggregated together, have a reliability of α = 0.72. For negative affect, the emotional states are stress, worry and sadness, with a reliability of α = 0.65 55 .
Overall health is a single item question "In general, how would you rate your overall health?" which is answered by choosing one of five options: Poor; Fair; Good; Very good; Excellent 56 .
Acute sleep scale is a single item scale asking participants to rate their quality of sleep over the past 24 hours on an 11-point numeric rating scale ranging from 0 denoting "best possible sleep" to 10 denoting "worst possible sleep" 57 .
The Numeric Pain Rating Scale (NPRS) is a segmented numeric version of the visual analog scale in which a respondent selects a whole number (0-10 integers) that best reflects the intensity of his/her pain. The NPRS is anchored by terms describing pain severity extremes. Participants are asked to report pain intensity "in the last 24 hours" or an average pain intensity with 0 = "No pain" to 10 = "Worst possible pain" 58 .
Big Five Inventory-10 (BFI) scale is a 10-item measure of the Big Five (or Five-Factor Model) dimensions: Neuroticism, Extraversion, Openness to Experience, Agreeableness, Conscientiousness. The BFI-10 was developed to provide a personality inventory for research settings with time constraints. It allows assessing the Big Five with only two items per dimension. Previous research has shown that the BFI-10 possesses psychometric properties that are comparable in size and structure to longer five factor inventories such as the NEO-PI-R which has 240 items. The score for each dimension is obtained by summing standard items and reverse scored items for each scale 59 .
The Compassion scale comprises 5 items from the Dispositional Positive Emotion Scale compassion subscale. It measures respondents' dispositional tendencies to feel positive emotions toward others in their daily lives. Items are rated from strongly disagree to strongly agree and scored from 1 to 7. Items are averaged for a total score and higher scores indicate greater levels of positive emotion 60 .
Statistical Analysis. Demographic information was qualitatively described for categorical variables. Means and standard deviations were calculated for all continuous variables. Pearson correlations were conducted for relationships between measures. Cronbach's alpha was calculated for the Belief and Experience subscales. All analyses were conducted with Stata 15.0 (StataCorp, LLC, College Station, TX). The confirmatory factor analysis was conducted in the same way as in study 1.

Results
The NEBS Belief items had a Cronbach's alpha of 0.93 and an average inter-item covariance of 304.4. The NEBS Experience items had a Cronbach's alpha of 0.91 and average inter-item covariance of 476.4. The experience scale was moderately correlated with the belief scale in this sample (Table 4).

Discussion
The overall results of the two studies provide psychometric support for the validity and reliability of the NEBS as a brief assessment of self-reported paranormal beliefs and experiences. The participant demographics of study 1 reflected the general population of the United States as designated by the recruitment criteria. Construct validity of the NEBS Belief subscale was strong, as it was strongly correlated with multiple other scales measuring paranormal belief including the Australian Sheep-Goat Scale, the Psi, Spiritual and Precognition subscales of the Paranormal Belief Scale, and the AEI Paranormal Belief subscale. Construct validity of the NEBS Experience subscale was also strong, demonstrating higher correlations with experience measures such as AEI-Paranormal Experience and AEI-Paranormal Ability than with other measures such as Traditional Religious Beliefs. The NEBS did not measure paranormal fear or drug use, as reflected in the low correlations with those AEI subscales. For divergent validity, there were only negligible correlations (r's between 0 and 0.30) with all other measures, providing more evidence that the NEBS is not measuring other constructs. Interestingly, results of other studies evaluating personality traits and paranormal beliefs have been mixed 62,63 , with some studies observing positive correlations with neuroticism 64,65 (unlike our study, which found a negligible negative correlation) and others not finding any correlations 65-67 . The reliability and internal consistency of the NEBS were also demonstrated through high Cronbach's alphas for both subscales in two different samples. Our confirmatory factor analysis for two latent constructs of Belief and Experience in the general population dataset revealed good model fit (RMSEA = 0.06), controlling for covariances between specific individual items; this fit was then confirmed with the IONS Discovery Lab sample (RMSEA = 0.06).
RMSEA calculates the size of the standardized residual correlations and theoretically ranges from 0 (perfect fit) to 1 (poor fit). A model is considered satisfactory when RMSEA < 0.08 68,69 . Our conceptual model of Belief and Experience as separate constructs and as evaluated through the NEBS was confirmed.
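As a point of reference for the fit index, a commonly cited point estimate is RMSEA = sqrt(max(χ² - df, 0) / (df × (N - 1))). The Python sketch below applies that formula; some software divides by N rather than N - 1, so this is an assumption and not necessarily the exact computation Stata performed. The chi-square and degrees-of-freedom values are hypothetical because they are not reported in this section.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """RMSEA point estimate from the model chi-square, its df, and sample size n."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def is_satisfactory(value, cutoff=0.08):
    """Apply the RMSEA < 0.08 criterion cited in the text."""
    return value < cutoff


# Hypothetical chi-square and df with the study 1 sample size (N = 361).
example_fit = rmsea(chi2=300.0, df=150, n=361)
print(round(example_fit, 3), is_satisfactory(example_fit))
```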
When measured separately, Belief and Experience are highly correlated. We found this in both of our samples (study 1: r = 0.77; study 2: r = 0.64). The AEI Paranormal Belief and Paranormal Experience subscales were highly correlated in our study 1 sample as well (r = 0.77). Interestingly, the original study of this scale found a much lower, although significant, correlation between the two subscales (r = 0.57) 29 . We also found belief and experience to be highly correlated (r = 0.61) for another mixed population of scientists and engineers, the general population, and paranormal enthusiasts 12 . Other studies that have evaluated belief and experience in general have also found positive correlations 17,18 . A study examining the correlation between specific religious and classic paranormal beliefs, such as belief in heaven and hell or psychic healing, in relation to the paranormal experiences of illness cured by prayer and the use of the mind to heal the body, found mixed results. For example, belief in the devil and belief in illness cured by prayer had a low significant correlation (r = 0.38), but the relationship between illness cured by prayer and the belief in psychic healing (r = -0.04) was not significant 40 .
Paranormal belief and experience are highly correlated in most studies that assess them, and yet they are distinctly different constructs that should be evaluated separately. What we do not yet understand is the causal or temporal nature of the relationship between belief and experience. Does paranormal belief precede experience or vice versa? Does someone's belief in the paranormal prime them to experience it, or does a subjective experience of the paranormal instill belief in the phenomena? Future longitudinal studies evaluating a baseline level of people's beliefs and collecting data on how those beliefs change over time in relation to any experiences they have would be helpful in answering this question.
There are a number of limitations that should be kept in mind when reviewing the results of this study. The individual constructs included in the NEBS are highly correlated. Conceptually, the individual concepts are unique but could also be viewed as overlapping. For example, the items on non-local consciousness (B2. I believe that my consciousness is not limited by my physical brain or body. E2. I have personally had this experience.) and survival of consciousness (B5. I believe in life after death. E5. I have personally had an experience that I interpreted as a proof that consciousness survives the physical body.) could be considered as the same construct worded in a different way. The experience items are administered directly after the belief item of the same construct. The instrument was purposefully designed in this way to keep it concise. However, asking the belief question directly before the experience question could bias responses to the experience question in some way. We also acknowledge that the limited objective format of the survey (answered with a slider from 0-100) with constrained definitions is limiting. A more in-depth phenomenological approach would surely provide greater nuance and depth of understanding of belief and experience. However, the nature of such an instrument in terms of administration and scoring would not solve the problem of needing a simple and concise instrument. Any NEBS results should be interpreted with these limitations in mind.
In summary, the NEBS is a 20-item survey rated on a sliding scale from 0-100, with 10 Belief and

Florey Neuroscience Institutes, Melbourne, Australia
This is the careful revision of a thorough study on a 20-item questionnaire for assessing paranormal beliefs using a visual analogue scale with 10 items on belief and 10 items on experience. The test was applied to 361 subjects in an observational study and to a second group of 646 control subjects recruited in their lab. Statistical tests were used to determine validity, reliability, internal consistency as well as test-retest reliability. The results show that the questionnaire matches these quality measures. The study is novel and timely. But there still is room for improvement.
Likewise, the authors fail to provide a greater picture of what a belief is. In fact, natural or normal beliefs are essential products of brain function (Seitz et al. in this journal; cited in the paper). In contrast, the paranormal beliefs of the kind the authors address in this study are considered to result from abnormal brain function. As paranormal beliefs are the objective of the study, the questionnaire is not applicable as a general or comprehensive belief scale. This should be stated in the discussion.
Right from the first sentence of the introduction the authors point out that they focus on "paranormal beliefs". Also, the authors suggest that belief pertains to the abnormal or paranormal and provide a good survey about similar studies. But it remains unclear what abnormal beliefs are in comparison to delusion-like beliefs as investigated in the normal population by Pechey and Halligan (2011).
The scale items (presented in Appendix I) consist of statements beginning with "I believe …" (B-statement) and, throughout, the single statement "I have personally had this experience" (E-statement).
In contrast to what the authors note, the high correlation of the ten B- and E-statements is not surprising, as both statements involve the same types of neural information processing. The B-statements reflect a personal inference explaining a previous perception or experience of high personal relevance, which is a belief as described earlier by Seitz and Angel in this journal (cited in the paper). Therefore, it is justified that the authors call these phrases B-statements. For comparison, the E-statement focuses semantically on an experience of a paranormal perception. Asking the subjects to approve or deny this statement, however, requires that the subjects recall this very perception or experience. This recall from memory has a probability of being true for the subject, which also reflects a belief and would be expressed correspondingly by "I believe that I have personally had this experience". Consequently, this renders the E-statements essentially B-statements as well.
The authors present nicely the development of their scale (pages 3 through 4). This should go under a headline of its own. For comparison, study 1 refers to the procedures described on pages 4 through 5. Therefore, the headline of study 1 should be moved accordingly.
It is unclear who the 899 participants are (page 4 left column). If this pertains to both study populations, why is this number mentioned here? Please, clarify. Likewise, it is unclear in which subgroups these participants were analyzed.
Page 12, left column, line 1: I guess it should be "than" instead of "that".

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Neurology, Cognitive Neuroscience
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
paper). In contrast the paranormal beliefs of the kind the authors address in this study are considered to result from abnormal brain function. As paranormal beliefs are the objective of the study, the questionnaire is not applicable as a general or comprehensive belief scale. This should be stated in the discussion.
-We have added these sentences to the discussion: "Others have suggested that paranormal beliefs stem from abnormal brain function or psychopathology such as Dissociative Identity Disorder or Schizophrenia. The NEBS focuses on the phenomenology of paranormal beliefs and experiences regardless of any pathology that may have generated them."

RS:
Right from the first sentence of the introduction the authors point out that they focus on "paranormal beliefs". Also, the authors suggest that belief pertains to the abnormal or paranormal and provide a good survey about similar studies. But it remains unclear what abnormal beliefs are in comparison to delusion-like beliefs as investigated in the normal population by Pechey and Halligan (2011).
-Pechey and Halligan comment in their abstract that "Delusions are defined as false beliefs different from those that almost everyone else believes." Their study shows that many of these beliefs are commonly held by the general population and thus should not be considered delusions. In fact, their item "The soul or spirit survives death" corresponds with the NEBS item #2 and "Some people communicate with the dead" corresponds to NEBS item #6 and both these items were commonly endorsed (@64% and 55% respectively) in Pechey and Halligan's study. We also never use the word "abnormal" in the paper. We are not saying that paranormal beliefs are abnormal anywhere in the manuscript. As Pechey and Halligan's paper supports, some of these beliefs are "normal" in that many in the population believe them. Also, the first DSM-5 schizophrenia diagnosis criterion revolves around symptoms that may include delusions but do not have to. There must be two or more symptoms of delusions, hallucinations, disorganized speech, grossly disorganized or catatonic behavior, or negative symptoms for at least one month (with at least one of those symptoms being delusions, hallucinations, or disorganized speech). Most important in this conversation is that the symptoms must impair the person's work, interpersonal relations, or self-care for a significant amount of time since the symptoms began. The symptoms must also be ongoing for at least 6 months, and other mental illnesses and effects from substances or medical conditions must be ruled out. Other symptoms that contribute to a schizophrenia diagnosis beyond these key criteria include: inappropriate emotions, disturbed sleep, negative mood, anxiety and phobias, detachment or feeling of disconnection from self, a feeling that the surroundings aren't real, impairments in language, cognitive processing and memory, social deficits, and hostility and aggression.
The goal of our study was not to highlight pathology or abnormal function but to provide a tool to quickly evaluate the phenomenology of experience and beliefs of this nature.

RS:
The scale items (presented in Appendix I) consist of statements beginning with "I believe …" (B-statement) and, throughout, the single statement "I have personally had this experience" (E-statement). In contrast to what the authors note, the high correlation of the ten B- and E-statements is not surprising, as both statements involve the same types of neural information processing. The B-statements reflect a personal inference explaining a previous perception or experience of high personal relevance, which is a belief as described earlier by Seitz and Angel in this journal (cited in the paper). Therefore, it is justified that the authors call these phrases B-statements. For comparison, the E-statement focuses semantically on an experience of a paranormal perception. Asking the subjects to approve or deny this statement, however, requires that the subjects recall this very perception or experience. This recall from memory has a probability of being true for the subject, which also reflects a belief and would be expressed correspondingly by "I believe that I have personally had this experience". Consequently, this renders the E-statements essentially B-statements as well.
-Yes, this is a good point. We believe this is intrinsic in all phenomenological surveys that ask participants to reflect on their experiences.

RS:
The authors present nicely the development of their scale (pages 3 through 4). This should go under a headline of its own. For comparison, study 1 refers to the procedures described on pages 4 through 5. Therefore, the headline of study 1 should be moved accordingly.
-Thank you, we have moved the Study 1 heading to below the development of the scale.
RS: It is unclear who the 899 participants are (page 4 left column). If this pertains to both study populations, why is this number mentioned here? Please, clarify. Likewise, it is unclear, in which subgroups these participants were analyzed.
-Thank you, we see how this can be confusing. We were listing the number of participants in the referenced study. We've removed this from the manuscript as the number of participants in this referenced study is not important here. Readers can see the referenced paper if they would like to see more details of participant numbers in each group.
Page 12, left column, line 1: I guess it should be than instead of that.
-Thank you, this has been fixed.
Competing Interests: No competing interests were disclosed.

Version 1
The introduction is severely lacking any review of previous scales and how different factors have been conceptualised in the past.
The development of the NEBS should be in the methods section and not the introduction.
The methods sections would benefit from traditional subheadings.
The results section contains demographic data which should be in the participant's section in the methods.
An exploratory factor analysis would have been useful on the first sample.
For study one, the CFA should come first followed by test retests.
Double-check the fit indices (particularly the CFI, the cut-off is listed as .90 when in fact it should be .95) and was the chi-square significant?
Study two: why are there wellbeing measures? They are only mentioned briefly via a table in the results.
The items relating to experience are all the same.
The participant is asked if they have experienced something directly after they have been asked if they believe it; this could present a confound.
Overall, while this paper offers an interesting premise, there are several flaws. The scale itself does not have dedicated experience items that refer to specific phenomena, which is problematic. The methodology itself is also flawed. I suspect an Exploratory Factor Analysis (EFA) would reveal that the factors would be around the phenomena rather than the dichotomy of belief and experience. I.e., a person, who believes in particular phenomena is more likely to experience them; therefore, the factors will demonstrate this. I would recommend that this be more of an exploratory study and that an EFA be run on the first sample at the very least. However, this would probably not give the desired outcome. The items that relate to experience should ideally be able to be answered in isolation and not be dependent on the belief items. This could act as a prime, with people who state they believe in something when asked if they have experienced it directly after being more likely to agree.

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility?

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Paranormal belief
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 12 Feb 2020
Helané Wahbeh, Institute of Noetic Sciences, Petaluma, USA
We thank the reviewer for their thoughtful comments. We have made revisions to the paper as a result that we believe have strengthened it. Below you will find each of the reviewer's comments and our changes to the manuscript or responses.
The introduction is severely lacking any review of previous scales and how different factors have been conceptualised in the past.
-The original manuscript included Supplementary Data that summarized a review of 18 previous scales that informed the design of our scale. We have revised the text in the Introduction to clarify and draw attention to this. The revised text reads: "We also reviewed a number of existing questionnaires that measured paranormal experience and/or belief (8,26,28,29,32-45). A summary of this review is presented as Supplemental data A (please see extended data (31)). Each questionnaire was evaluated for the number of items, whether it assesses belief, experience or both, whether it evaluates belief and experience as separate constructs, and subscales if applicable."
The development of the NEBS should be in the methods section and not the introduction.
-The development of the NEBS has now been moved to the methods section.
The methods sections would benefit from traditional subheadings.
-The methods section now has subheadings.
The results section contains demographic data which should be in the participant's section in the methods.
-The participant demographic information was moved to the participant's section in the methods.
An exploratory factor analysis would have been useful on the first sample.
-Please see the last item for the full response to this comment.
The items relating to experience are all the same.
-The wording of the experience items is the same but reflects the belief construct expressed in the item directly before.
The participant is asked if they have experienced something directly after they have been asked if they believe it; this could present a confound.
-Please see response below about EFA and dependencies between belief and experience items.
Overall, while this paper offers an interesting premise, there are several flaws. The scale itself does not have dedicated experience items that refer to specific phenomena, which is problematic. The methodology itself is also flawed. I suspect an Exploratory Factor Analysis (EFA) would reveal that the factors would be around the phenomena rather than the dichotomy of belief and experience. I.e., a person, who believes in particular phenomena is more likely to experience them; therefore, the factors will demonstrate this. I would recommend that this be more of an exploratory study and that an EFA be run on the first sample at the very least. However, this would probably not give the desired outcome. The items that relate to experience should ideally be able to be answered in isolation and not be dependent on the belief items. This could act as a prime, with people who state they believe in something when asked if they have experienced it directly after being more likely to agree.
-Exploratory factor analysis (EFA) is a statistical technique used to find the underlying structure of a set of observed variables (Gorsuch, 2015), whereas confirmatory factor analysis (CFA) is used when researchers have formulated a hypothesis regarding the relationship between observed variables and the underlying latent factors (Gorsuch, 2015). There has been a debate regarding the circumstances in which these two analyses should be used in research (Hurley et al., 1997) and whether/when they should be used in tandem (Gerbing and Hamilton, 1996). EFA can be used prior to cross-validation with CFA for the purpose of model specification (Gerbing and Hamilton, 1996). EFA can also be used after CFA to explore poor fits in CFA models, explore factor structures when the original hypotheses are weak, and confirm factor structures when the original hypotheses were strong, but certain assumptions are not reasonable (Schmitt, 2011).
The reviewer recommends an EFA on the first sample. It is suggested, but not made clear, that an EFA on the first sample would serve the purpose of exploration regarding other possible factor structures. It is also suggested that EFA could possibly identify elevated correlation in the already correlated belief and experience factors that is due to survey structure alone. In this study, hypotheses are based on theory and practice, and are therefore strong. It is unclear whether EFA is suggested to confirm assumptions that may not hold in this study. If so, what are these specific assumptions and why are they unreasonable or not upheld? If EFA is suggested as a precursor to CFA, then EFA should be conducted followed by cross-validation with CFA on an independent data set, as suggested by Gerbing and Hamilton (1996). If this procedure is followed, either (1) the EFA will support the researchers' hypotheses or (2) the EFA will not support the researchers' hypotheses.
If (2) occurs, "this would probably not give the desired outcome", and it is highly pertinent to point out that the researchers did not undertake this study to achieve a "desired outcome" but rather to test hypotheses. If (2) occurs, it is not clear whether this invalidates NEBS as a functioning survey tool. CFA does not always confirm a factor structure obtained via EFA (van Prooijen and van der Kloot, 2001; Borkenau and Ostendorf, 1990). If a CFA fits well and satisfies all assumptions but an EFA indicates that the underlying structure of the data can be represented by different factors, is this sufficient evidence to discard the original CFA? If so, can the reviewer provide support for this claim? In Gorsuch's classic text on factor analysis, he states "Confirmatory factor analysis tests hypotheses that a specified subset of variables legitimately define a prespecified factor" (Gorsuch, 2015). If the researchers have found that a subset of variables legitimately defined their prespecified factors and this was the central goal of their paper, what then is the purpose and role of added exploratory analysis in the context of this paper?
If the reviewer's hypothesis that an EFA "would reveal that the factors would be around the phenomena rather than the dichotomy of belief and experience" is supported by an EFA on the first sample, then it is not clear whether presenting items in isolation would remedy this given the high inherent correlation between belief and experience, and therefore between the items themselves. If this statement is more than opinion or a hypothesis and there is scientific support for it, how should this isolation be achieved and how much isolation is enough isolation to guarantee that this effect is removed from the analysis? If the diagnosis of this specific issue is the only reason for the suggested EFA, then if the researchers were to reorganize and re-administer the surveys according to the reviewer's specifications, would an EFA still be necessary? It is also worth noting that the question of whether or not the survey structure introduced added dependence between items can be easily tested experimentally by providing the original and modified surveys (with some reasonable span of time in between) to a group of (new) participants and quantifying the difference in responses. If the difference is not statistically significant, then any added dependence should be negligible. Would this kind of adjustment in the methodology remedy the need for an EFA, according to the reviewer?
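The experimental check described above (giving the same participants the original and a modified survey and quantifying the difference in responses) could be analysed in several ways. The sketch below shows one option, a paired sign-flip permutation test on per-participant experience scores. The choice of test, the function name, and all scores are assumptions made for illustration; this is not an analysis that was carried out in the study.

```python
import random
from statistics import mean


def paired_sign_flip_test(original, modified, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for the mean paired difference.

    Under the null hypothesis of no order effect, each participant's
    difference is equally likely to be positive or negative, so the signs
    of the differences are flipped at random to build the null distribution.
    """
    if len(original) != len(modified):
        raise ValueError("each participant needs a score from both surveys")
    diffs = [a - b for a, b in zip(original, modified)]
    observed = abs(mean(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_permutations):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical experience scores (0-100) for the same eight participants.
original_order = [55, 40, 72, 18, 63, 80, 35, 47]
isolated_items = [50, 42, 70, 15, 60, 78, 36, 41]
print(f"p = {paired_sign_flip_test(original_order, isolated_items):.3f}")
```

A permutation approach is used here only because it avoids distributional assumptions about slider responses; a paired t-test or a Wilcoxon signed-rank test would be equally reasonable choices.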
In summary, it is unclear whether the reviewer calls for an EFA to 1) examine other factor structures in the data or 2) test additional dependencies that may have been introduced by the survey structure. If (1), it is unclear what role this EFA would play in the current paper. If (2), it may be more straightforward to address this experimentally.
Borkenau, Peter and Ostendorf, Fritz (1990). Comparing exploratory and confirmatory factor analysis: A study on the 5-factor model of personality. Personality and Individual Differences.