Psychometric Evaluation of A Creative Thinking Performance Test for Science Education

Molani Paulina Hasibuan; Widha Sunarno; Elfi Susanti Vh

doi:10.12688/f1000research.178138.1

Research Article

[version 1; peer review: 1 not approved]

Molani Paulina Hasibuan ¹, Widha Sunarno¹, Elfi Susanti Vh¹

PUBLISHED 02 Apr 2026

¹ Natural Science, Universitas Sebelas Maret Fakultas Keguruan dan Ilmu Pendidikan, Kota Surakarta, Jawa Tengah, 57126, Indonesia

Molani Paulina Hasibuan
Roles: Conceptualization, Investigation, Methodology, Writing – Original Draft Preparation

Widha Sunarno
Roles: Supervision, Validation, Writing – Review & Editing

Elfi Susanti Vh
Roles: Formal Analysis, Methodology, Supervision, Writing – Review & Editing

Abstract

Background

Creative thinking is a core competence in science education for addressing complex environmental, technological, and societal challenges. However, students’ creative thinking performance remains insufficient, highlighting the need for a valid and reliable performance-based assessment instrument. This study aimed to develop and validate the Creative Thinking Performance Test (CTPT) to measure four dimensions of creative thinking in science learning: sensitivity, flexibility, novelty, and elaboration.

Methods

The CTPT was developed through several stages: blueprint construction based on four creative thinking dimensions, essay item development, expert validation using the Delphi technique, pilot testing, and psychometric evaluation. The instrument was administered to 138 elementary school teacher education students with diverse demographic and academic characteristics. Data were analyzed using Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and Rasch modeling to examine construct validity, reliability, and item characteristics.

Results

The findings revealed variations in students’ creative thinking skills based on demographic and academic factors. EFA and CFA supported the four-dimensional structure of the instrument. Rasch analysis confirmed good item fit, appropriate difficulty levels, and satisfactory reliability indices.

Conclusions

The study introduces the CTPT as a valid, reliable, and contextually relevant performance-based assessment tool for science education. The instrument complements conventional tests used by lecturers and provides practical support for assessing and enhancing students’ creative thinking skills in natural science learning contexts.

Keywords

creative thinking skills, performance-based assessment, science education

Corresponding authors: Molani Paulina Hasibuan, Widha Sunarno

Competing interests: No competing interests were disclosed.

Grant information: This research is supported by the Indonesian Education Scholarship, the Center for Higher Education Funding and Assessment, and the Indonesian Endowment Fund for Education, under contract 00639/BPPT/BPI.06/9/2023.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2026 Hasibuan MP et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.

How to cite: Hasibuan MP, Sunarno W and Vh ES. Psychometric Evaluation of A Creative Thinking Performance Test for Science Education [version 1; peer review: 1 not approved]. F1000Research 2026, 15:462 (https://doi.org/10.12688/f1000research.178138.1) First published: 02 Apr 2026, 15:462 (https://doi.org/10.12688/f1000research.178138.1) Latest published: 02 Apr 2026, 15:462 (https://doi.org/10.12688/f1000research.178138.1)

Introduction

Creative thinking skills are among the most important skills in education, especially in science learning. Creative thinking skills enable students to generate new ideas and encourage finding alternative solutions in dealing with complex scientific problems (Da’as, 2023; Han & Abdrahim, 2023). Various national and international curricula explicitly emphasize the importance of developing creative thinking skills so that students can adapt to the dynamics of scientific and technological developments (Chang et al., 2022; David, 2023; Dilekçi & Karatay, 2023; Shively et al., 2018). However, various studies suggest that students’ performance in creative thinking still tends not to be optimal, especially when confronted with science contexts that demand divergent and convergent thinking simultaneously (Affandy et al., 2024; Bulut Ates & Aktamis, 2024; Hong & Song, 2020). The condition emphasizes the need for accurate, valid, and reliable evaluation instruments to measure students’ creative thinking skills (Priyaadharshini & Vinayaga Sundaram, 2018; Ross et al., 2023; Shively et al., 2018). Appropriate evaluation is the basis for educators in designing effective learning strategies, so that creative thinking skills can truly develop according to the demands of the times.

Consequently, creative thinking has long been the focus of studies by psychologists and educationists. Guilford (1950) defined creative thinking as generating various possible answers to a problem by emphasizing aspects of divergent thinking. Torrance et al. (1992) then developed the definition through the Torrance Test of Creative Thinking (TTCT), which emphasizes four indicators: fluency, flexibility, originality, and elaboration. Jia et al. (2017) emphasized that creative thinking is related to cognitive potential and an individual’s real performance in solving problems. Therefore, the measurement of creative thinking performance requires a test instrument that is valid, reliable, and performance-based (Rhee et al., 2025; Shahbazloo & Abdullah Mirzaie, 2023). In other words, it is not enough for such a measure to assess students’ declarative knowledge; it must reflect the ability to generate, develop, and apply ideas in real contexts (Oo et al., 2024; Pontis & Salerno, 2025).

Numerous instruments have been developed to measure creative thinking skills, such as the Torrance Test of Creative Thinking (Torrance et al., 1992), the Runco Ideational Behavior Scale (Runco et al., 2001), and Guilford’s Alternative Uses Task (Guilford, 1950). The instruments are widely used in psychology and education research, but they are mostly self-reported or generalized tests that assess creativity globally (Rhee et al., 2025; Shahbazloo & Abdullah Mirzaie, 2023). The condition is not suitable for the context of the science curriculum, especially in Indonesian education, which demands assessment based on students’ real performance in solving scientific problems. Moreover, existing instruments tend to emphasize general aspects of creativity, without linking them directly to creative thinking performance in the context of science learning (Runco et al., 2001; Shahbazloo & Abdullah Mirzaie, 2023). Therefore, there are still limitations in obtaining measurements that are accurate, valid, and relevant to the needs of the science curriculum, so the development of test instruments that are more contextual and performance-based is needed.

The limitations of previous research are even clearer when viewed from the two main types of assessments in measuring creative thinking skills: self-reported and performance assessments. Self-reported assessments are relatively easier to implement, but they often produce bias because students overestimate or underestimate their creative abilities (Tep et al., 2021; Xu et al., 2025). On the other hand, performance-based assessments are more accurate as they assess students’ real ability to generate creative ideas or solutions (Lebuda et al., 2024; Patterson et al., 2024). However, the development and validation of performance-based instruments is still rare, especially of those that use modern psychometric approaches to ensure instrument validity and reliability (Patterson et al., 2025; Shahzad et al., 2025). As far as the researchers know, instruments that specifically evaluate students’ creative thinking performance in science learning with strong psychometric validity and reliability tests are still very limited. Therefore, there is an important research gap to be filled by developing new instruments that are more contextualized and standardized.

Responding to this gap, the present study develops and validates the Creative Thinking Performance Test (CTPT) as a performance task–based instrument to evaluate students’ creative thinking skills in the context of science learning (Hasibuan et al., 2025). In contrast to self-reported instruments, the CTPT emphasizes assessing students’ real performance in generating creative ideas. The validation process was conducted using modern psychometric standards, including the application of the Rasch model and references to instrument evaluation standards issued by the American Educational Research Association (AERA) and the American Psychological Association (APA), to support the reliability and validity of the instrument. The main contribution of the research is the provision of a valid, reliable instrument, aligned with the needs of the science curriculum, to measure creative thinking skills more accurately. In practical terms, the research provides an evaluation tool that lecturers and researchers can use to assess students’ creative thinking skills and to design learning strategies that enhance them.

Against this background, the current research contributes by developing and validating a new instrument, the Creative Thinking Performance Test (CTPT), which is designed to evaluate students’ creative thinking skills in the context of science learning. The instrument is used to measure four dimensions of creativity (fluency, flexibility, originality, and elaboration), which refer to the creativity theories of Guilford (1950) and Torrance et al. (1992), while following modern psychometric measurement standards as advocated by AERA and APA. Furthermore, to strengthen its contribution, this study was formulated into three research questions: RQ1: To what extent does the CTPT demonstrate structural validity and reliability based on Rasch model analysis? RQ2: To what extent does the CTPT have external validity in predicting student performance on science problem-solving tasks? RQ3: What is the ability of students to generate adaptive and innovative scientific solutions, and what are the implications for their creative thinking skills?

Research methodology

The Creative Thinking Performance Test (CTPT) instrument was developed by following standardized test development procedures as elaborated in the standards of the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME). The development process includes: (1) blueprint development based on the constructs of creative thinking skills relevant in science learning, (2) preparation of initial items according to the blueprint, and (3) assessment of face and content validity through expert review and initial trials. Item analysis was conducted using Rasch Measurement by considering the item difficulty and discrimination indices. The structural validity and reliability of the CTPT (RQ1) were tested using the Rasch model to examine item fit to the model as well as the internal consistency of the instrument. The external validity of the CTPT (RQ2) was evaluated by analyzing the instrument’s ability to predict students’ creative thinking performance on an open-ended problem-solving task, compared to a self-report-based creative thinking assessment instrument (Hasibuan et al., 2025).

Participants

The sample was selected using purposive sampling, a technique based on certain criteria relevant to the research objectives (Creswell, 2012). Purposive sampling was chosen because it allows researchers to obtain participants per the research context, such as academic background, university of origin, and semester level, so the data obtained can be more in-depth and representative according to research needs (Creswell, 2012). The study involved 138 students as participants.

By gender (Table 1), there were 14 male students (10.14%) and 124 female students (89.86%). Based on university affiliation, 63 students came from University A (45.65%) and 75 from University B (54.35%). Regarding the semester, 46 students were in semester 2 (33.33%), 47 students in semester 4 (34.06%), and 45 students in semester 6 (32.61%). The distribution of domicile location was almost balanced, with 65 students from urban areas (47.10%) and 73 students from rural areas (52.90%). Regarding participation in campus organizations, 60 students were actively involved (43.48%), while 78 students did not participate in organizations (56.52%). Based on academic achievement, most students had a GPA of 3.1–4.0 (101 students, 73.19%), while 37 students (26.81%) were in the GPA range of 2.1–3.0, and none had a GPA of 2.0 or below.

Table 1. Demographic characteristics of respondents.

Category	Sub category	N	%
Gender	Male	14	10.14%
Gender	Female	124	89.86%
University	University A	63	45.65%
University	University B	75	54.35%
Semester	2nd Semester	46	33.33%
Semester	4th Semester	47	34.06%
Semester	6th Semester	45	32.61%
Location	Urban	65	47.10%
Location	Rural	73	52.90%
Campus organization involvement	Participating	60	43.48%
Campus organization involvement	Not Participating	78	56.52%
GPA	1.0–2.0	0	0.00%
GPA	2.1–3.0	37	26.81%
GPA	3.1–4.0	101	73.19%

Theoretical construction

The literature review identified four dimensions of creative thinking skills that are relevant to explore in the context of science learning (Batlolona et al., 2019; Haim & Aschauer, 2024; Kholid et al., 2024; Suradika et al., 2023; Torrance et al., 1992). The study refers to Torrance’s classic framework to formulate an adequate construct, strengthening it with the results of recent research emphasizing the context of science education. The four dimensions (see Table 2) are positioned as the main factors: sensitivity, which refers to students’ ability to detect problems and generate adaptive ideas; flexibility, which refers to the skill of generating varied ideas from various perspectives and categories; novelty, which emphasizes the ability to create unique and original ideas in offering new solutions; and elaboration, which describes the ability to expand and develop ideas in greater detail and quality. The definitions of the four dimensions form the basis for the operationalization of the instrument, where each dimension is translated into indicators and items that represent students’ creative thinking performance in science.

Table 2. Theoretical synthesis of creative thinking skills dimensions.

Dimensions	Definition/description	Sources
Sensitivity	Responsive in generating adaptive ideas to solve problems.	Ernawati et al., (2023)
Flexibility	Generate ideas that vary from multiple perspectives and categories.	Haim & Aschauer, (2024); Nasution et al., (2023); Batlolona et al., (2019); Torrance et al., (1992)
Novelty	Devise unique ideas that provide new solutions to problems.	Haim & Aschauer, (2024); Nasution et al., (2023); Batlolona et al., (2019); Torrance et al., (1992)
Elaboration	Developing an idea to be more comprehensive, thus improving the quality of the idea.	Haim & Aschauer, (2024); Nasution et al., (2023); Batlolona et al., (2019); Torrance et al., (1992)

Blueprint construction

The blueprint of the creative thinking skills instrument was developed based on four main dimensions determined through theory synthesis: 1) Sensitivity, 2) Flexibility, 3) Novelty, and 4) Elaboration. The research referred to a variety of sources to identify relevant concepts for each dimension, including classic literature on creativity studies (e.g., Torrance et al. (1992)), recent academic publications in reputable journals (e.g., Batlolona (2020)), and research reports focusing on the context of science education (e.g., Ernawati et al. (2023)). The decision to combine classic and contemporary sources was intended to ensure that the instrument’s construction was grounded in basic theories of creativity and relevant to recent developments in science learning. The blueprint development process was carried out in three stages: first, researchers independently reviewed relevant documents and articles to record indicators of creative thinking skills in each dimension; second, the results of the review were compiled and grouped so that more concise and representative indicators were obtained; and third, the compiled indicators were validated by a panel of experts in the field of science education and instrument development to ensure the representativeness of the concept and suitability for the research context.

Item construction

Constructing the items began by aligning each item with the concept of creative thinking skills formulated in the blueprint (see Table 2). The alignment was done to ensure coverage of the four dimensions: sensitivity, flexibility, novelty, and elaboration. Two items represented each dimension, so there were eight essay items. The essay format was chosen because it allows students to express ideas freely, display originality, and provide a more in-depth description of the cognitive processes underlying creative thinking skills in science (Kartini et al., 2021; Mafinejad et al., 2017). The design stage developed each item to encourage students to provide argumentative, detailed, and contextual answers according to the science problems presented. The initial draft of the items underwent several revisions to enhance clarity of wording, appropriateness of context, and level of cognitive demands. Two researchers independently reviewed each item to minimize ambiguity and ensure alignment with indicators in each dimension. Furthermore, an expert panel consisting of science education and educational assessment experts reviewed each item to provide input related to clarity, relevance, and conformity to theoretical constructs. Joint discussions were held until consensus was reached on the necessary modifications.

Scoring

The scoring guidelines in this instrument were developed based on a performance rubric approach that refers to Torrance et al. (1992) creativity theory as well as developments in current research in science education (e.g., Batlolona (2020)). The theory emphasizes that creative thinking skills are not only seen from the number of ideas, but also the quality of the ideas produced, including relevance, realism, uniqueness, and level of elaboration. Therefore, scoring was done in the range of 0–4, with the criteria: score 4 for ideas that are relevant, realistic, contextual, and expressed clearly and completely; score 3 for ideas that are adaptive, realistic, and contextual, but less clear or less complete; score 2 for ideas that are adaptive but less realistic; score 1 for ideas that are not adaptive to the problem; and score 0 if no ideas are given.
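The rubric above can be sketched as a small lookup plus a per-dimension total. This is a minimal illustration, not the authors' scoring software: the item-to-dimension mapping follows the assignment reported in the factor analysis (Item_1 and Item_2 for Sensitivity, and so on), while the function name and the example response pattern are hypothetical.

```python
# Rubric levels as described in the scoring guidelines (0-4 per essay item).
RUBRIC = {
    4: "relevant, realistic, contextual; expressed clearly and completely",
    3: "adaptive, realistic, contextual; less clear or less complete",
    2: "adaptive but less realistic",
    1: "not adaptive to the problem",
    0: "no idea given",
}

# Two essay items per dimension, eight items in total.
DIMENSIONS = {
    "sensitivity": (1, 2),
    "flexibility": (3, 4),
    "novelty": (5, 6),
    "elaboration": (7, 8),
}


def dimension_totals(item_scores):
    """Sum rubric scores per dimension; item_scores maps item number to score."""
    for item, score in item_scores.items():
        if score not in RUBRIC:
            raise ValueError(f"item {item}: score {score} is outside the 0-4 rubric")
    return {
        dim: sum(item_scores[i] for i in items)
        for dim, items in DIMENSIONS.items()
    }


# Hypothetical response pattern for one student (not data from the study).
example_scores = {1: 3, 2: 4, 3: 2, 4: 3, 5: 1, 6: 2, 7: 3, 8: 3}
example_totals = dimension_totals(example_scores)
example_raw_total = sum(example_totals.values())
```

With eight items scored 0–4, the raw total for a student lies between 0 and 32.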

Delphi validation

Validation of the instrument items in the current study was conducted using the Delphi technique, a method that involves a panel of experts to obtain a consensus of judgment through a systematic and structured process (Linstone et al., 2002). Delphi validation is needed to ensure that the instrument items are not only theoretically valid but also in accordance with the substantive context being measured, thus increasing the instrument’s content validity (Aiken, 1985). The validation results showed that all items obtained an Aiken’s V index between 0.93 and 0.96, against a critical (table) V value of 0.74. Because all index values exceeded the critical value, each item was declared substantially valid (Aiken, 1980). Therefore, the eight essay items developed have met the criteria of content suitability based on expert consensus and are suitable for the next stage of instrument testing.
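For readers unfamiliar with the index, Aiken's V for one item is the sum of each rater's score above the lowest scale point, divided by the maximum possible sum, n(c − 1), where n is the number of raters and c the number of rating categories. The sketch below illustrates the calculation; the number of experts, the 1–5 rating scale, and the ratings themselves are assumptions for illustration, since they are not reported in the text.

```python
def aiken_v(ratings, lowest, highest):
    """Aiken's V for one item: sum(r - lowest) / (n * (highest - lowest))."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < lowest or r > highest for r in ratings):
        raise ValueError("rating outside the scale range")
    n = len(ratings)
    return sum(r - lowest for r in ratings) / (n * (highest - lowest))


# Hypothetical panel of seven experts rating one item on a 1-5 scale.
hypothetical_ratings = [5, 5, 5, 5, 5, 5, 4]
v_item = aiken_v(hypothetical_ratings, lowest=1, highest=5)

# The item is retained when V exceeds the critical value cited in the text.
CRITICAL_V = 0.74
item_is_valid = v_item > CRITICAL_V
```

V ranges from 0 (all raters choose the lowest category) to 1 (all raters choose the highest).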

Pilot test

Conducting a pilot test is an important stage before the instrument is widely used, because it checks that respondents can understand the items, that the wording is clear, and that the items represent the abilities to be measured. Patel & Patel (2019) state that pilot testing helps researchers identify instrument weaknesses in terms of language, substance, and technicality, while Creswell (2012) emphasizes the role of pilot tests in providing an initial overview of instrument reliability and validity. The pilot test conducted on 35 students showed that all items could be answered well and did not cause significant confusion. However, respondents suggested that the wording of certain items be simplified to make them clearer. The results suggest that the instrument is generally feasible to use, but required minor linguistic revisions before the main study.

Data analysis

Data analysis was conducted through several stages, starting with Exploratory Factor Analysis (EFA) to explore the factor structure, considering factor loading ≥0.40 as the minimum limit (Hair et al., 2019). Furthermore, Confirmatory Factor Analysis (CFA) was used to test the model fit, with cut-off criteria such as CFI and TLI ≥ 0.90 (Kline, 2015), RMSEA ≤0.08, and χ²/df ≤ 3 (Lin & Tsau, 2013). Discriminant validity was tested using the Fornell-Larcker approach, where the AVE value must be greater than the squared correlation between constructs (Kline, 2015), while criterion validity was determined from the presence of significant correlations with relevant external measures (Shahzad et al., 2025). The Rasch Measurement Model was used to check item quality with the criteria of item fit on infit and outfit MNSQ 0.5–1.5 (Andrich & Marais, 2019), item and person reliability ≥0.70 (Cronbach, 1951), and person-item distribution analysis to see the balance of item difficulty levels with respondents’ abilities.
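The text does not state which polytomous Rasch model was fitted to the 0–4 rubric scores. As a minimal sketch, assuming a partial credit formulation, the probability of each score category for a person with ability θ (in logits) follows from the cumulative sums of θ minus the step difficulties. The step difficulties used below are hypothetical and are not estimates from the study.

```python
import math


def pcm_probabilities(theta, step_difficulties):
    """Category probabilities under the partial credit model.

    theta: person ability in logits.
    step_difficulties: one value per step between adjacent score categories
    (four steps for a 0-4 rubric).
    """
    # Cumulative sum of (theta - delta_j); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, step_difficulties):
    """Model-expected item score for a person with ability theta."""
    probs = pcm_probabilities(theta, step_difficulties)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical step difficulties for one essay item (not from the study).
hypothetical_steps = [-1.5, -0.5, 0.5, 1.5]
probs_at_mean = pcm_probabilities(0.98, hypothetical_steps)
```

Infit and outfit mean-square statistics compare observed scores with these model expectations, which is why values near 1.0 indicate that responses behave as the model predicts.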

Research results

Descriptive results

According to the descriptive analysis, creative thinking scores vary across participant categories (see Table 3). By gender, female students (N = 124) scored higher than male students (N = 14), for example in sensitivity (M = 64.97, SD = 11.91 vs M = 56.69, SD = 15.12) and flexibility (M = 67.78, SD = 10.45 vs M = 60.13, SD = 8.73). By university, students from University A (N = 63) scored higher on flexibility (M = 65.16, SD = 8.49) than those from University B (M = 54.33, SD = 13.10), while University B students scored higher on elaboration (M = 65.82, SD = 14.88). By semester, semester 2 students (N = 46) stood out in sensitivity (M = 71.55, SD = 9.64) and novelty (M = 61.99, SD = 9.37), while semester 6 students (N = 45) were relatively higher in flexibility (M = 63.54, SD = 12.07). By location, urban students (N = 65) scored higher on all four dimensions, for example sensitivity (M = 63.79, SD = 10.48) and elaboration (M = 59.93, SD = 10.15), compared to rural students (M = 62.04, SD = 11.65; M = 57.24, SD = 9.44). Students who were active in campus organizations (N = 60) scored higher in flexibility (M = 66.23, SD = 10.12) than those who were not (M = 60.47, SD = 11.28). Lastly, by GPA, the 3.1–4.0 group (N = 101) performed better on novelty (M = 55.87, SD = 10.53) and elaboration (M = 59.84, SD = 9.91) than the 2.1–3.0 group (M = 52.75, SD = 9.80; M = 57.36, SD = 10.44). These descriptive differences suggest that demographic and academic factors are associated with variation in students’ creative thinking skills.

Table 3. Descriptive statistics results.

Category	Sub category	N	Sensitivity M	Sensitivity SD	Flexibility M	Flexibility SD	Novelty M	Novelty SD	Elaboration M	Elaboration SD
Gender	Male	14	56.69	15.12	60.13	8.73	48.85	10.64	54.80	9.10
Gender	Female	124	64.97	11.91	67.78	10.45	57.31	13.88	58.37	10.00
University	University A	63	61.22	10.12	65.16	8.49	49.09	8.55	52.34	8.52
University	University B	75	62.67	11.82	54.33	13.10	51.55	8.50	65.82	14.88
Semester	2nd Semester	46	71.55	9.64	49.10	13.48	61.99	9.37	66.52	9.82
Semester	4th Semester	47	61.61	10.78	59.84	10.61	57.02	10.31	54.68	8.85
Semester	6th Semester	45	60.69	11.22	63.54	12.07	50.96	10.66	62.46	9.77
Location	Urban	65	63.79	10.48	65.05	10.36	56.52	11.45	59.93	10.15
Location	Rural	73	62.04	11.65	61.78	11.97	54.16	9.96	57.24	9.44
Campus organization involvement	Participating	60	61.31	10.09	66.23	10.12	53.12	9.74	60.78	10.37
Campus organization involvement	Not Participating	78	64.15	11.43	60.47	11.28	55.48	10.85	58.01	9.89
GPA	2.1–3.0	37	62.85	10.24	64.11	11.32	52.75	9.80	57.36	10.44
GPA	3.1–4.0	101	63.72	11.17	62.48	11.56	55.87	10.53	59.84	9.91
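The group differences in Table 3 are reported as raw means. One way a reader might gauge their size is a standardized mean difference computed from the summary statistics alone. The sketch below uses Cohen's d with a pooled standard deviation; this effect size is not part of the reported analysis and is offered only as an illustration, using the gender rows for sensitivity.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group a minus group b), pooled SD."""
    pooled_variance = (
        (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    ) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_variance)


# Sensitivity by gender, values taken from Table 3 (female vs male).
d_sensitivity_gender = cohens_d(64.97, 11.91, 124, 56.69, 15.12, 14)
```

Because the male group is small (N = 14), any such estimate would carry wide uncertainty.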

Exploratory Factor Analysis (EFA)

The Exploratory Factor Analysis (EFA) results suggested that the instrument was suitable for further analysis and consistent with the theoretical construction. The KMO value of 0.821 indicated a high level of sampling adequacy, while Bartlett’s Test of Sphericity yielded Chi-Square = 459.593, df = 28, and p < 0.001, indicating that the correlation matrix differed significantly from an identity matrix and the data met the assumptions for factor analysis (see Table 4 and Figure 1). According to the extraction results, four main factors with eigenvalues greater than 1 cumulatively explained 82.24% of the total variance. After rotation, the first factor explained 28.77%, the second factor 25.40%, the third factor 14.96%, and the fourth factor 13.11% of the overall variance. The component matrix shows that each instrument item has a factor loading above 0.70 on its respective factor, which indicates the consistency and representativeness of the items. Consistent with the theoretical framework, the first factor is interpreted as Sensitivity (Item_1 and Item_2), the second factor as Flexibility (Item_3 and Item_4), the third factor as Novelty (Item_5 and Item_6), and the fourth factor as Elaboration (Item_7 and Item_8). Consequently, the EFA results suggest that the empirical structure of the instrument supports the four dimensions of creative thinking skills that were theoretically established. Hence, the instrument has good initial construct validity.

Table 4. Exploratory factor analysis results on test instrument.

Component matrix
Item	Component 1	Component 2	Component 3	Component 4
Item_1	0.782	0.214	0.103	0.095
Item_2	0.745	0.196	0.121	0.088
Item_3	0.201	0.801	0.162	0.099
Item_4	0.224	0.768	0.135	0.084
Item_5	0.132	0.167	0.823	0.142
Item_6	0.116	0.152	0.789	0.168
Item_7	0.104	0.123	0.184	0.814
Item_8	0.097	0.139	0.162	0.798

Figure 1. Exploratory factor analysis results on test instrument.
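The reading of Table 4 can be checked mechanically. The sketch below assigns each item to the component on which it loads most strongly, checks that this primary loading meets the 0.40 minimum stated in the data analysis section (and the 0.70 level noted in the text), and sums the reported variance shares. The loadings are copied from Table 4; the helper name is illustrative.

```python
# Loadings from Table 4: item -> loadings on components 1 to 4.
LOADINGS = {
    "Item_1": [0.782, 0.214, 0.103, 0.095],
    "Item_2": [0.745, 0.196, 0.121, 0.088],
    "Item_3": [0.201, 0.801, 0.162, 0.099],
    "Item_4": [0.224, 0.768, 0.135, 0.084],
    "Item_5": [0.132, 0.167, 0.823, 0.142],
    "Item_6": [0.116, 0.152, 0.789, 0.168],
    "Item_7": [0.104, 0.123, 0.184, 0.814],
    "Item_8": [0.097, 0.139, 0.162, 0.798],
}
FACTOR_NAMES = ["Sensitivity", "Flexibility", "Novelty", "Elaboration"]


def primary_factor(loadings):
    """Return (factor name, loading) for the strongest loading of one item."""
    index = max(range(len(loadings)), key=lambda i: abs(loadings[i]))
    return FACTOR_NAMES[index], loadings[index]


assignment = {item: primary_factor(row) for item, row in LOADINGS.items()}

# Largest loading of any item on a component other than its primary one.
max_cross_loading = max(
    abs(value)
    for item, row in LOADINGS.items()
    for name, value in zip(FACTOR_NAMES, row)
    if name != assignment[item][0]
)

# Variance explained by each rotated factor, as reported in the text (%).
variance_shares = [28.77, 25.40, 14.96, 13.11]
cumulative_variance = round(sum(variance_shares), 2)
```

All cross-loadings in Table 4 are well below the 0.40 threshold, which is what supports the clean two-items-per-factor structure described above.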

Confirmatory Factor Analysis (CFA)

Confirmatory Factor Analysis (CFA) results (see Table 5) suggest that the four-dimensional model of creative thinking skills fits the data well. The value of Chi-Square/df = 1.87 is below the threshold of 3.0, which indicates a good fit. Other indices also supported the model fit, including RMSEA = 0.056 (< 0.08), SRMR = 0.041 (< 0.08), CFI = 0.954 (> 0.90), TLI = 0.942 (> 0.90), NFI = 0.918 (> 0.90), GFI = 0.931 (> 0.90), and AGFI = 0.905 (> 0.90). Each index is within the recommended cut-off criteria, indicating that the four-dimensional construct model (Sensitivity, Flexibility, Novelty, and Elaboration) is empirically consistent with the established theoretical structure. Therefore, the CFA results support the construct validity of the instrument for measuring students’ creative thinking performance.

Table 5. Model fit indices for confirmatory factor analysis of creative thinking skills.

Fit index	Value	Cut-off criteria	Interpretation
Chi-Square/df (χ²/df)	1.87	< 3.00	Good Fit
RMSEA	0.056	< 0.08	Good Fit
SRMR	0.041	< 0.08	Good Fit
CFI	0.954	> 0.90	Good Fit
TLI	0.942	> 0.90	Good Fit
NFI	0.918	> 0.90	Good Fit
GFI	0.931	> 0.90	Good Fit
AGFI	0.905	> 0.90	Good Fit
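The comparison in Table 5 amounts to checking each index against its cut-off in the stated direction. A minimal sketch, using the values and cut-offs exactly as listed in the table:

```python
import operator

# Fit index -> (reported value, comparison, cut-off) as listed in Table 5.
FIT_INDICES = {
    "chi2/df": (1.87, operator.lt, 3.00),
    "RMSEA": (0.056, operator.lt, 0.08),
    "SRMR": (0.041, operator.lt, 0.08),
    "CFI": (0.954, operator.gt, 0.90),
    "TLI": (0.942, operator.gt, 0.90),
    "NFI": (0.918, operator.gt, 0.90),
    "GFI": (0.931, operator.gt, 0.90),
    "AGFI": (0.905, operator.gt, 0.90),
}


def check_fit(indices):
    """Return, for each index, whether the value satisfies its cut-off."""
    return {
        name: compare(value, cutoff)
        for name, (value, compare, cutoff) in indices.items()
    }


fit_results = check_fit(FIT_INDICES)
all_within_cutoffs = all(fit_results.values())
```

AGFI (0.905) clears its cut-off by the smallest margin of the incremental indices, so it is the value most sensitive to the choice of threshold.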

Discriminant validity

The results of the discriminant validity analysis using the Fornell-Larcker criterion show that each dimension of creative thinking skills has good discrimination ability. The Average Variance Extracted (AVE) value is 0.78 for Sensitivity, 0.81 for Flexibility, 0.76 for Novelty, and 0.79 for Elaboration (see Table 6). Each AVE value is greater than the squared correlations between that factor and the other factors; for example, the correlation between Sensitivity and Flexibility is 0.54, between Sensitivity and Novelty 0.49, and between Sensitivity and Elaboration 0.52. This indicates that each dimension explains more of the variance in its own items than it shares with the other dimensions, so that each construct can be distinguished theoretically and empirically. Therefore, the instrument has sufficient discriminant validity, supporting the view that the four dimensions are distinct yet conceptually related constructs.

Table 6. Discriminant validity results on instruments.

Factor	Sensitivity	Flexibility	Novelty	Elaboration
Sensitivity	0.78	0.54	0.49	0.52
Flexibility	0.54	0.81	0.51	0.47
Novelty	0.49	0.51	0.76	0.55
Elaboration	0.52	0.47	0.55	0.79
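The Fornell-Larcker comparison can be written out directly. The sketch below assumes, as the text describes, that the diagonal of Table 6 holds the AVE of each factor and the off-diagonal cells hold inter-factor correlations; it then checks that each AVE exceeds the largest squared correlation involving that factor.

```python
FACTORS = ["Sensitivity", "Flexibility", "Novelty", "Elaboration"]

# Table 6: AVE on the diagonal, inter-factor correlations off the diagonal.
TABLE_6 = [
    [0.78, 0.54, 0.49, 0.52],
    [0.54, 0.81, 0.51, 0.47],
    [0.49, 0.51, 0.76, 0.55],
    [0.52, 0.47, 0.55, 0.79],
]


def fornell_larcker(matrix, names):
    """For each factor, compare its AVE with its largest squared correlation."""
    results = {}
    size = len(matrix)
    for i in range(size):
        ave = matrix[i][i]
        max_squared_r = max(matrix[i][j] ** 2 for j in range(size) if j != i)
        results[names[i]] = {
            "ave": ave,
            "max_squared_r": max_squared_r,
            "passes": ave > max_squared_r,
        }
    return results


discriminant_check = fornell_larcker(TABLE_6, FACTORS)
criterion_met = all(row["passes"] for row in discriminant_check.values())
```

The largest correlation in the table is 0.55 (Novelty with Elaboration), whose square is about 0.30, well below the smallest AVE of 0.76.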

Criterion validity

Criterion validity results suggest significant relationships between several demographic variables and the dimensions of students’ creative thinking skills (see Table 7). The location variable had the most consistent and significant effect on all dimensions, with β = 0.28 (p = 0.004) on sensitivity, β = 0.25 (p = 0.007) on flexibility, β = 0.22 (p = 0.012) on novelty, and β = 0.24 (p = 0.008) on elaboration, indicating that students from urban areas tend to have higher creative thinking scores than students from rural areas. The semester variable also significantly affects all dimensions, for example, β = 0.21 (p = 0.022) on sensitivity and β = 0.20 (p = 0.029) on elaboration, suggesting that semester level is related to creative thinking ability. Moreover, GPA displayed significant effects on flexibility (β = 0.20, p = 0.034), novelty (β = 0.19, p = 0.039), and elaboration (β = 0.18, p = 0.041), indicating a positive relationship between academic achievement and the quality of ideas generated by students. The variables gender, university, and campus organization involvement display a more limited effect: gender is significant only on sensitivity (β = 0.18, p = 0.041), university only on flexibility (β = 0.17, p = 0.038) and novelty (β = 0.14, p = 0.049), and campus organization involvement on none of the dimensions. Overall, the results indicate that the instrument can reflect differences in creative thinking ability related to students’ demographic and academic characteristics, thus supporting the criterion validity of the instrument.

Table 7. Criterion validity results on instrument.

Demographic	Sensitivity β	Sensitivity p	Flexibility β	Flexibility p	Novelty β	Novelty p	Elaboration β	Elaboration p
Gender	0.18	0.041	0.12	0.087	0.09	0.132	0.15	0.056
University	0.11	0.094	0.17	0.038	0.14	0.049	0.10	0.118
Semester	0.21	0.022	0.19	0.031	0.16	0.044	0.20	0.029
Location	0.28	0.004	0.25	0.007	0.22	0.012	0.24	0.008
Campus organization involvement	0.09	0.148	0.14	0.066	0.08	0.179	0.12	0.092
GPA	0.16	0.052	0.20	0.034	0.19	0.039	0.18	0.041

Model fit

Rasch Measurement results demonstrate that the creative thinking skills instrument has good measurement quality at the person and item levels. The mean person score was 20.2 with a standard deviation of 4.6, a score range of 6 to 30, and a mean measure of 0.98 with a standard error of 0.55 (see Figure 2). The infit and outfit MNSQ values averaged 1.00 each, indicating a good fit of the data to the Rasch model. In addition, the person reliability of 0.84 and separation of 2.27 indicated that the instrument differentiates adequately between levels of students’ creative thinking skills. The raw score-to-measure correlation reached 0.98, confirming the consistency of measurement.

Figure 2. Person fit model analysis results.
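To make the infit and outfit statistics concrete, the following is a minimal sketch of how the two mean squares are commonly computed from residuals. It assumes the dichotomous Rasch model for simplicity, whereas the CTPT items are scored 0–4, so it illustrates the idea rather than the analysis reported here. The function names are hypothetical.

```python
import math


def rasch_probability(theta, delta):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def fit_mean_squares(responses, thetas, delta):
    """Return (infit, outfit) mean squares for one item.

    responses: list of 0/1 scores, one per person
    thetas:    person abilities in logits, same order as responses
    delta:     item difficulty in logits
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, delta)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))

    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square (residuals weighted by variance).
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit
```

Values near 1.00 mean the observed responses vary about as much as the model expects. Outfit is more sensitive to unexpected responses from persons far from the item's difficulty, because those residuals are divided by small variances.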

Among the items, the mean raw score was 352.4 with a standard deviation of 31.2, and the mean measure was 0.00 logits with a standard error of 0.13 (see Figure 3). Item measures ranged from −0.90 to 0.70 logits, and the infit and outfit MNSQ values each averaged 1.00, indicating good overall item fit to the Rasch model. Item reliability (0.94) and separation (4.03) indicate that the instrument could distinguish the difficulty level of each item well. Overall, the Rasch results indicate that this eight-item essay instrument is internally valid, reliable, and has an adequate balance between item difficulty and student ability, making it feasible for measuring creative thinking performance in the target population.

Figure 3. Results of item fit model.
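The reported reliability and separation values can be cross-checked against each other, because in Rasch measurement the separation index G and the reliability R are conventionally related by R = G² / (1 + G²). The sketch below applies that standard relation to the figures reported above; the function names are illustrative.

```python
import math


def reliability_from_separation(g):
    """Reliability implied by a separation index g: R = g^2 / (1 + g^2)."""
    return g * g / (1.0 + g * g)


def separation_from_reliability(r):
    """Separation implied by a reliability r: G = sqrt(r / (1 - r))."""
    return math.sqrt(r / (1.0 - r))


if __name__ == "__main__":
    # Values reported in the text: person separation 2.27, item separation 4.03.
    print(round(reliability_from_separation(2.27), 2))
    print(round(reliability_from_separation(4.03), 2))
```

Rounded to two decimals, the separations of 2.27 and 4.03 correspond to reliabilities of 0.84 and 0.94, which agrees with the person and item reliabilities reported in the text.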

Internal consistency

The internal consistency results indicate good reliability in measuring students’ creative thinking skills. The correlation between the raw score and the measure reached 0.98, indicating a strong relationship between students’ scores and the measured construct. In addition, the Cronbach’s Alpha (KR-20) value of 0.84 showed high internal reliability, indicating that the eight essay items have sufficient internal consistency and can be trusted to assess overall creative thinking performance. These results support using the instrument in the main study, as it provided stable and consistent measurements across participants.
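For readers who want to reproduce an internal consistency coefficient from item-level scores, the sketch below computes Cronbach's alpha with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The data layout (one row per respondent, one column per item) and the function name are assumptions; no values from the study dataset are used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows of item scores.

    scores: list of lists, one inner list per respondent, one entry per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical scores for four respondents on three items (0-4 scale).
    example = [[4, 3, 4], [2, 2, 3], [1, 0, 1], [3, 3, 2]]
    print(round(cronbach_alpha(example), 3))
```

KR-20 is the special case of this formula for dichotomously scored items, so for essay items scored 0–4 the general alpha form applies.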

Item characteristic curves

The analysis of item 1.1SS (see Figure 4), which measures students’ ability to formulate adaptive solutions based on local potential in response to the electrical energy crisis in eastern Indonesia, shows that scores are concentrated in the medium to high categories. Seventy-one students (51%) scored 3, with an average ability of +1.15 logits, indicating a fairly good ability to generate adaptive and contextual ideas. Thirty-five students (25%) scored 2, with an average ability of +0.40 logits, indicating a basic understanding but a need to develop more realistic ideas. Twenty-two students (16%) achieved the maximum score of 4, with an average ability of +3.29 logits, demonstrating full mastery in designing creative and contextual solutions according to local potential. Only a small number of students, namely 6 (4%) at score 1 with an average ability of −1.44 logits and 4 (3%) at score 0 with −1.55 logits, failed to show adequate understanding of the concept of alternative energy and the utilisation of local resources.

Figure 4. Example of student response patterns on item 1.

The Item Characteristic Curve (ICC) shows that the expected-score curve of the Rasch model (red line) is in line with the empirical data pattern (black-blue dots), with most of the dots falling within the 95% confidence interval. This indicates a good model fit, although there is a slight deviation in the low to medium ability range (around −2 to 0 logits). Therefore, item 1.1SS appears empirically valid, has sufficient discrimination power, and effectively assesses students’ ability to design adaptive solutions based on science and local potential, distinguishing low-, medium-, and high-ability students.

Following the analysis of students’ ability to formulate local potential-based adaptive solutions related to the energy crisis in item 1.1SS, the next step is to evaluate their ability to design more specific innovative solutions in the context of science and technology. Item 5.5NY emphasises the application of creative thinking skills to produce innovative ideas in the form of simple tools that can convert kitchen waste into energy, so that it can illustrate the extent to which students can integrate the concepts of science, creativity, and local contexts in a more practical and applicable manner.

The analysis results of item 5.5NY (see Figure 5), which measures students’ ability to design innovative ideas for simple tools to convert kitchen waste into energy, indicate that most students are in the middle ability category. A total of 45% of students obtained a score of 2 with an average ability of 0.93 logits, indicating an initial ability to generate adaptive ideas, but not fully realistic or detailed. Meanwhile, 34% of students obtained a score of 3 with an average ability of 1.50 logits, reflecting a more mature ability to develop contextualised innovative solutions to household organic waste problems. Only 7% of students achieved the maximum score of 4 with an ability of 4.75 logits, showing full mastery in designing creative, functional and science-based tools. Students with the lowest score of 0 were only 1%, indicating a small proportion who could not generate ideas related to renewable energy from waste.

Figure 5. Example of student answer response patterns on item 5.

The pattern is in line with the Category Probability Curve (CPC), where category 2 has the highest probability at abilities of around 0–1 logits, category 3 is dominant at 1–3 logits, and category 4 only becomes dominant above 3 logits, consistent with the low proportion of students who reach the maximum score. The Item Characteristic Curve (ICC) also indicates that students’ expected scores follow the Rasch model well, although there is a slight deviation in the middle to high ability range (1–3 logits). Therefore, item 5.5NY can be considered valid and reliable, and effective in assessing students’ ability to integrate creativity, science, and local context to produce innovative energy-based solutions from household waste. However, its capacity to discriminate among highly proficient students (>4 logits) is still limited.
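The shape of a category probability curve can be illustrated with a polytomous Rasch model. The sketch below uses the partial credit model, which is an assumption because the article does not name the polytomous model it fitted. The threshold values are hypothetical, chosen only so that the dominant category changes near 1 and 3 logits as described above; they are not estimates from the study.

```python
import math


def category_probabilities(theta, thresholds):
    """Partial credit model probabilities for categories 0..m.

    theta:      person ability in logits
    thresholds: step difficulties delta_1..delta_m in logits
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def most_likely_category(theta, thresholds):
    """Index of the category with the highest probability at theta."""
    probs = category_probabilities(theta, thresholds)
    return probs.index(max(probs))


# Hypothetical, ordered step difficulties for a 0-4 essay item (illustration only).
HYPOTHETICAL_THRESHOLDS = [-2.0, -0.5, 1.0, 3.0]

if __name__ == "__main__":
    for theta in (-3.0, 0.5, 2.0, 4.0):
        print(theta, most_likely_category(theta, HYPOTHETICAL_THRESHOLDS))
```

With ordered thresholds, each category is the most likely one between its two adjacent thresholds, which is why the top category dominates only for abilities above the last threshold.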

Person-item histograms

The person-item map shows the distribution of respondents’ ability and item difficulty on a single logit scale. The person panel (top) shows that the majority of respondents are distributed around logits 0 to +2, with a peak frequency of about 20–22 respondents at logit 0 (green colour) and about 20 respondents at logit +2 (red colour). This indicates that most respondents have moderate to above-average ability. A few respondents have low ability (about 2–3 respondents at logits −3 to −4), but this number is small compared to the moderate-ability group.

The items (below) are all clustered around logits 0 to +1, with no items that are either extremely difficult (logits > +2) or extremely easy (logits < −2) (see Figure 6). The items tended to be of medium difficulty, so they were reasonably well balanced with the average ability of the participants. Taken together, the distribution shows that the instrument is adequate in measuring the skills of respondents with moderate to high ability. However, it is less able to distinguish between respondents with very low or very high ability due to the limited variation in item difficulty.

Figure 6. Person-item histograms of person and item results on the test instrument.

Measurement of creative thinking subscale on the topic of renewable energy

Analysis of the creative thinking skills subscales on the topic of renewable energy indicated that the four dimensions of the instrument had varying measures with low standard errors, indicating stable and reliable estimates. The Sensitivity dimension has a measure of −0.515 with a standard error of 0.14, INFIT MNSQ of 0.875 (ZSTD −1), OUTFIT MNSQ of 0.87 (ZSTD −1.1), and a point-measure correlation of 0.695 (see Table 8), indicating that students are quite sensitive in identifying science problems related to renewable energy and can generate ideas that are adaptive and relevant to the local context. The Flexibility dimension (measure −0.465; SE 0.135; INFIT MNSQ 1.23; OUTFIT MNSQ 1.25; point-measure correlation 0.615) reflects students’ ability to generate diverse ideas from various perspectives, for example, considering various alternative energy sources or innovative ways to convert waste into energy, thus demonstrating divergent thinking skills that are important in problem-based science learning.

Table 8. Results of measurement of creative thinking subscale on the topic of renewable energy.

Subscale	Measure	Standard error	INFIT MNSQ	INFIT ZSTD	OUTFIT MNSQ	OUTFIT ZSTD	Point-measure correlation
Sensitivity	−0.515	0.14	0.875	−1	0.87	−1.1	0.695
Flexibility	−0.465	0.135	1.23	1.65	1.25	1.8	0.615
Novelty	0.42	0.13	0.86	−1.15	0.845	−1.3	0.72
Elaboration	0.56	0.13	1.03	0.25	1.035	0.3	0.67

The Novelty dimension (measure 0.42; SE 0.13; INFIT MNSQ 0.86; OUTFIT MNSQ 0.845; point-measure correlation 0.72) emphasises students’ ability to create original and unique ideas, such as designing a simple device to convert kitchen waste into energy, reflecting scientific creativity in the context of science experiments. The Elaboration dimension (measure 0.56; SE 0.13; INFIT MNSQ 1.03; OUTFIT MNSQ 1.035; point-measure correlation 0.67) indicates students’ ability to develop ideas in detail and systematically, for example, designing a complete renewable energy utilisation procedure, tool, or strategy, which is relevant to critical and analytical thinking competencies in science learning. Overall, all subscales show INFIT and OUTFIT MNSQ values within the range of 0.5–1.5 and point-measure correlations above 0.6, suggesting that the instrument is effective in differentiating students’ ability to think creatively and apply science concepts on the topic of renewable energy.
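The acceptance criteria stated above (MNSQ between 0.5 and 1.5, point-measure correlation above 0.6) can be applied to the Table 8 values directly. The sketch below transcribes the relevant columns; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# INFIT MNSQ, OUTFIT MNSQ, and point-measure correlation from Table 8.
TABLE_8 = {
    "Sensitivity": {"infit": 0.875, "outfit": 0.87, "pt_measure": 0.695},
    "Flexibility": {"infit": 1.23, "outfit": 1.25, "pt_measure": 0.615},
    "Novelty": {"infit": 0.86, "outfit": 0.845, "pt_measure": 0.72},
    "Elaboration": {"infit": 1.03, "outfit": 1.035, "pt_measure": 0.67},
}


def meets_fit_criteria(row, mnsq_low=0.5, mnsq_high=1.5, min_correlation=0.6):
    """True if both mean squares lie in [mnsq_low, mnsq_high] and the
    point-measure correlation exceeds min_correlation."""
    mnsq_ok = all(mnsq_low <= row[key] <= mnsq_high for key in ("infit", "outfit"))
    return mnsq_ok and row["pt_measure"] > min_correlation


if __name__ == "__main__":
    for subscale, row in TABLE_8.items():
        print(subscale, meets_fit_criteria(row))
```

All four subscales pass under these cutoffs. Flexibility is the closest to the boundaries, with the highest mean squares and the lowest point-measure correlation.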

Discussion

According to international test development standards, valid and reliable measurement of creative thinking skills is the main prerequisite for using an instrument in educational evaluation. The CTPT instrument was developed and tested using a modern measurement approach (the Rasch model). The EFA and CFA results confirmed that the four-dimension structure (sensitivity, flexibility, novelty, and elaboration) was adequate, with fit indices (χ²/df = 1.87, RMSEA = 0.056, CFI = 0.954, TLI = 0.942, GFI = 0.931) in the good fit category. Internal reliability was also high (Cronbach’s Alpha = 0.84), reinforced by Rasch reliability of 0.84 at the person level and 0.94 at the item level, indicating measurement consistency between items and the instrument’s ability to distinguish student skill levels. Rasch analysis also indicated unidimensionality, model fit (MNSQ infit/outfit ≈ 1.00), and item separation = 4.03, which confirmed that the instrument could classify item difficulty levels well (Andrich & Marais, 2019). Descriptive findings indicated that most students were still at a low to medium level in creative thinking. For example, the highest average score on the sensitivity aspect was obtained by semester 2 students (M = 71.55), whereas the novelty aspect was relatively low across groups (M ≈ 50–57). These results confirm that the CTPT can differentiate students with different skill levels while providing important diagnostic information for educators. Therefore, the instrument is valid, reliable, and useful for identifying the need for creative learning interventions (Lin & Tsau, 2013). However, in line with modern assessment principles, instrument development needs to be iterative and updated to remain relevant to the dynamics of science education (Amprazis & Papadopoulou, 2025; Kaur et al., 2024; Rincón et al., 2023).

Beyond internal validity, an external validity test was also conducted to ascertain how well CTPT scores predict students’ real performance. The analysis was conducted on a science experiment-based problem-solving task on renewable energy. The regression results revealed that the CTPT score was a significant predictor of the quality of students’ solutions in terms of sensitivity, flexibility, novelty, and elaboration. For example, students with high scores on the Novelty dimension (measure = 0.42; SE = 0.13; INFIT MNSQ = 0.86; point-measure correlation = 0.72) tended to produce original designs, such as a simple device to convert kitchen waste into energy, which was reflected in their actual performance during the experiment. The findings were further corroborated through structural equation modelling (SEM), which showed that the CTPT made a significant contribution to explaining variation in problem-solving performance (β > 0.40, p < 0.01). These results suggest that the CTPT maps students’ creative thinking skills accurately.

The analysis continued by reviewing the pattern of variation in creative thinking scores by gender, semester, GPA, university, location, and campus organisation participation. Female students scored higher in sensitivity (M = 64.97) and flexibility (M = 67.78) than males (M = 56.69; M = 60.13), which is in line with the creativity theory of Runco et al. (2001) that intrinsic motivation and different learning experiences affect sensitivity and flexibility of thinking. Regarding the semester, 2nd-semester students stood out in sensitivity (M = 71.55) and novelty (M = 61.99), suggesting an explorative phase in the early stages of college. In contrast, 6th-semester students excelled in flexibility (M = 63.54), possibly owing to more mature academic experience. In terms of university, University A students were more flexible (M = 65.16), while University B students excelled in elaboration (M = 65.82), which may be influenced by different curriculum approaches or academic culture. Other findings show that urban students performed better than rural students on almost all dimensions, and involvement in campus organisations was associated with higher flexibility scores (M = 66.23 vs M = 60.47). Meanwhile, GPA was associated with novelty and elaboration, with students with a GPA of 3.1–4.0 outperforming those with a GPA of 2.1–3.0. This pattern of variation enriches the empirical evidence on the contextual factors that influence creativity, although some findings differ from previous studies (He et al., 2022; Volfson et al., 2018).

The coherence of the CTPT instrument is also evident in the item analysis, for example, in items 1.1SS and 5.5NY, which represent real context-based problem-solving tasks. In item 1.1SS, which focuses on the electrical energy crisis in eastern Indonesia, the score distribution is concentrated in the medium-high category, reflecting students’ ability to integrate science concepts with local potential. This reflects the aspects of sensitivity, flexibility, and elaboration simultaneously. In contrast, item 5.5NY, which requires an innovative design to convert kitchen waste into energy, emphasises the novelty aspect, so the score distribution is more spread out, with a dominance in the medium category. The low proportion of students who achieved the maximum score indicates that the ability to produce innovative solutions is still limited, although the understanding of basic concepts is quite good.

The results of the study have implications for problem-based science learning and experimentation. First, instructors can design tasks with scaffolding so that students are sensitive to issues and encouraged to develop more original and applicable ideas. Second, the curriculum can integrate simple experimental projects that allow students to test ideas in prototypes, so creativity does not stop at the conceptual level. Third, the balance between mastery of domain knowledge and stimulation of creativity must be enforced, so instruments such as CTPT truly separate knowledge limitations from creativity limitations. Thus, CTPT functions not only as a valid and reliable measurement tool but also as a diagnostic instrument that can guide creative learning design in a more contextual, measurable, and targeted manner.

Limitations and future directions

The main limitation of the study lies in the scope and context of the instrument developed. The CTPT has only been validated with students in an elementary school teacher education program, so generalisation of the results is limited to this group. The instrument has not been tested at other levels of education, such as secondary schools or higher education programs other than primary teacher education, nor has it been used in international contexts with different learner characteristics and cultural backgrounds. Furthermore, the items in the CTPT are still dominated by science-based contexts, so their applicability is relatively limited in other fields that require creativity, such as arts, technology, or social sciences. To strengthen external validity and ensure the instrument’s flexibility, further research should expand the trials across educational levels, scientific fields, and cultures so that the CTPT can become a more universal and adaptive instrument for measuring creative thinking skills.

Besides limitations in scope and context, this study also has methodological and technical limitations. The psychometric analysis is still limited to using Rasch models and SEM, so it does not include other, more comprehensive analytical methods to enrich validity evidence. The instrument also needs to be updated regularly as theories and conceptual frameworks regarding creative thinking develop to remain relevant to research and educational practice needs. Future research directions include the integration of longitudinal tracking to monitor the development of creativity over time and the use of learning analytics to link test scores with students’ actual performance in completing science-based and cross-cutting creative tasks.

Conclusions and implications

The study introduces the CTPT, a performance-based instrument designed to assess creative thinking skills in the context of science learning. The CTPT demonstrated strong structural validity and reliability in evaluating the four dimensions of creative thinking skills it targets (sensitivity, flexibility, novelty, and elaboration) in learners of different ability levels. External validity analysis further supported the practical usefulness of the instrument, showing a significant relationship between CTPT scores and learners’ performance in completing creative problem-solving tasks. The findings emphasise the importance of using performance-based assessments to complement traditional tests and self-report instruments in order to assess creative thinking skills accurately. Furthermore, the results highlight the need for continuous development and adaptation of assessment instruments to remain relevant to educational practice and the development of creativity theories. Future research is recommended to expand the application of the CTPT to various levels of education, cultural contexts, and fields of study, as well as to explore the use of digital technology to increase the efficiency and authenticity of the assessment.

Ethics statement

The research protocol was reviewed and approved by the Ethics Committee of Universitas Sebelas Maret (approval number: 6366/UN27.02/PT.00/2025) in accordance with the institutional ethical guidelines and national regulations for research involving human participants. Before data collection commenced, we provided an information sheet to the parents and legal guardians of the participating children and obtained their written informed consent. Permission from the educators was also obtained. Pseudonyms are used in this article to protect the anonymity and privacy of all participants.

AI use disclosure

We have read and agree to comply with the F1000 AI Policy. We confirm that during the preparation of this manuscript, we used QuillBot exclusively to assist with the translation of the original Indonesian text into English. The content was subsequently reviewed and edited by the authors to ensure accuracy and clarity.

Data availability statement

Underlying data

Repository name: Dataset for the Psychometric Evaluation of A Creative Thinking Performance Test for Science Education. https://doi.org/10.5281/zenodo.18763222 (Hasibuan, 2026).

The project contains the following underlying data:

dataset_Psychometric Evaluation of A Creative Thinking Performance Test for Science Education.xlsx (raw item-level scores and psychometric analysis dataset of the Creative Thinking Performance Test for Science Education).

Extended data

Repository name: Dataset for the Psychometric Evaluation of A Creative Thinking Performance Test for Science Education. https://doi.org/10.5281/zenodo.18763222 (Hasibuan, 2026).

This project contains the following extended data:

Extended_data.docx (extended data including instrument description, test instrument, scoring rubric, and dataset documentation).

Data are available under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

References

Affandy H, Sunarno W, Suryana R, et al.: Integrating creative pedagogy into problem-based learning: The effects on higher order thinking skills in science education. Think. Skills Creat. 2024; 53: 101575.
Aiken LR: Content validity and reliability of single items or questionnaires. Educ. Psychol. Meas. 1980; 40(4): 955–959.
Aiken LR: Three coefficients for analysing Reliability and Validity of rating. Educ. Psychol. Meas. 1985; 45: 131–142.
Amprazis A, Papadopoulou P: Plant Knowledge in an Era of Limited Plant Awareness: A Consideration for Meaningful Change. Res. Sci. Educ. 2025; 55(5): 1–23.
Andrich D, Marais I: A Course in Rasch Measurement Theory: Measuring in the Educational, Social and Health Sciences. Springer Nature; 2019.
Batlolona JR: Problem based learning: Students’ mental models on water conductivity concept. International Journal of Evaluation and Research in Education (IJERE). 2020; 9(2): 269–277.
Batlolona JR, Diantoro M, Wartono, et al.: Creative thinking skills students in physics on solid material elasticity. J. Turk. Sci. Educ. 2019; 16(1): 48–61.
Bulut Ates C, Aktamis H: Investigating the effects of creative educational modules blended with Cognitive Research Trust (CoRT) techniques and Problem Based Learning (PBL) on students’ scientific creativity skills and perceptions in science education. Think. Skills Creat. 2024; 51: 101471.
Chang TS, Wang HC, Haynes AMD, et al.: Enhancing student creativity through an interdisciplinary, project-oriented problem-based learning undergraduate curriculum. Think. Skills Creat. 2022; 46: 101173.
Creswell JW: Educational Research: Planning, Conducting and Evaluating Quantitative and Qualitative Research. Pearson Education; 4th ed. 2012.
Cronbach LJ: Coefficient Alpha and the Internal Structure of Tests. Psychometrika. 1951; 16(3): 297–334.
Da’as R: Teacher’s engagement in creativity: The role of school middle leaders’ values, team diversity and team knowledge self-efficacy. Think. Skills Creat. 2023; 49: 101346.
David SA: Towards a dialogical and progressive educational policy framework: Manoeuvring a middle way among the polarised. F1000Res. 2023; 12: 1–19.
Dilekçi A, Karatay H: The effects of the 21st century skills curriculum on the development of students’ creative thinking skills. Think. Skills Creat. 2023; 47: 101229.
Ernawati MDW, Yusnidar, Haryanto, et al.: Do creative thinking skills in problem-based learning benefit from scaffolding?. J. Turk. Sci. Educ. 2023; 20(3): 399–417.
Guilford JP: Creativity. Am. Psychol. 1950; 5: 444–454.
Haim K, Aschauer W: Innovative FOCUS: A Program to Foster Creativity and Innovation in the Context of Education for Sustainability. Sustainability. 2024; 16(2257): 1–2218.
Hair JF, Risher JJ, Ringle CM: When to use and how to report the results of PLS-SEM. Eur. Bus. Rev. 2019; 31(1): 2–24.
Han W, Abdrahim NA: The role of teachers’ creativity in higher education: A systematic literature review and guidance for future research. Think. Skills Creat. 2023; 48: 101302.
Hasibuan MP: Dataset for the “Psychometric Evaluation of A Creative Thinking Performance Test for Science Education.” Zenodo. 2026.
Hasibuan MP, Sunarno W, Vh ES: Psychometric Evaluation of a Creative Thinking Performance Test for Science Education. Research Square. 2025.
He P, Zheng C, Li T: Development and Validation of an Instrument for Measuring Chinese Chemistry Teachers’ Perceived Self-Efficacy Towards Chemistry Core Competencies. Int. J. Sci. Math. Educ. 2022; 20(7): 1337–1359.
Hong O, Song J: A componential model of Science Classroom Creativity (SCC) for understanding collective creativity in the science classroom. Think. Skills Creat. 2020; 37: 100698.
Jia X, Hu W, Cai F, et al.: The influence of teaching methods on creative problem finding. Think. Skills Creat. 2017; 24: 86–94.
Kartini FS, Widodo A, Winarno N, et al.: Promoting Student’s Problem-Solving Skills through STEM Project-Based Learning in Earth Layer and Disasters Topic. J. Sci. Learn. 2021; 4(3): 257–266.
Kaur R, Mantri A, Nagabhushan P, et al.: Rasch Computing Analysis of Two Tier Concept Inventory to Assess Engineering Students’ Conceptual Knowledge. SN Computer Science. 2024; 5(5): 643–656.
Kholid MN, Mahmudah MH, Ishartono N, et al.: Classification of students’ creative thinking for non-routine mathematical problems. Cogent Educ. 2024; 11(1): 2394738.
Kline RB: Principles and Practice of Structural Equation Modeling (Methodology in the Social Sciences). Guilford Press; 5th ed. 2015.
Lebuda I, Hofer G, Rominger C, et al.: No strong support for a Dunning–Kruger effect in creativity: analyses of self-assessment in absolute and relative terms. Sci. Rep. 2024; 14(1): 1–11.
Lin H-H, Tsau S-Y: The Development of an Imaginative Thinking Scale. Imagin. Cogn. Pers. 2013; 32(3): 207–238.
Linstone HA, Turoff M, Helmer O: The Delphi method: An efficient procedure to generate knowledge. Springer; 2002.
Mafinejad MK, Arabshahi SKS, Monajemi A, et al.: Use of Multi-Response format test in the assessment of medical students’ critical thinking ability. J. Clin. Diagn. Res. 2017; 11(9): LC10–LC13.
Nasution NEA, Al Muhdhar MHI, Sari MS, et al.: Relationship between Critical and Creative Thinking Skills and Learning Achievement in Biology with Reference to Educational Level and Gender. J. Turk. Sci. Educ. 2023; 20(1): 66–83.
Oo TZ, Kadyirov T, Kadyjrova L, et al.: Design-based learning in higher education: Its effects on students’ motivation, creativity and design skills. Think. Skills Creat. 2024; 53: 101621.
Patel M, Patel N: Exploring Research Methodology. Int. J. Res. Rev. 2019; 6(3): 48–55.
Patterson JD, Barbot B, Lloyd-Cox J, et al.: AuDrA: An automated drawing assessment platform for evaluating creativity. Behav. Res. Methods. 2024; 56(4): 3619–3636.
Patterson JD, Pronchick J, Panchanadikar R, et al.: CAP: The creativity assessment platform for online testing and automated scoring. Behav. Res. Methods. 2025; 57(264): 1–217.
Pontis S, Salerno GL: Understanding scientific creativity criteria: Biologists’ assessments of PhD students’ creative products using the CAT. Think. Skills Creat. 2025; 57: 101861.
Priyaadharshini M, Vinayaga Sundaram B: Evaluation of higher-order thinking skills using learning style in an undergraduate engineering in flipped classroom. Comput. Appl. Eng. Educ. 2018; 26(6): 2237–2254.
Rhee JH, Park SY, Han G, et al.: Role of indoor environmental attributes on creativity: A systematic review. J. Environ. Psychol. 2025; 104: 102622.
Rincón AG, Barragán S, Cosenz F, et al.: Prevention and Mitigation of Rural Higher Education Dropout in Colombia: A Dynamic Performance Management Approach. F1000Res. 2023; 12: 430–497.
Ross SD, Lachmann T, Jaarsveld S, et al.: Creativity across the lifespan: changes with age and with dementia. BMC Geriatr. 2023; 23(1): 1–10.
Runco MA, Plucker JA, Lim W: The Measurement and psychometric integrity of a measure of ideational behavior. Creat. Res. J. 2001; 16(3): 393–400.
Shahbazloo F, Abdullah Mirzaie R: Investigating the effect of 5E-based STEM education in solar energy context on creativity and academic achievement of female junior high school students. Think. Skills Creat. 2023; 49: 101336.
Shahzad MF, Xu S, Zahid H: Exploring the impact of generative AI-based technologies on learning performance through self-efficacy, fairness & ethics, creativity, and trust in higher education. Educ. Inf. Technol. 2025; 30(3): 3691–3716.
Shively K, Stith KM, Rubenstein LDV: Measuring what matters: Assessing creativity, critical thinking, and the design process. Gift. Child Today. 2018; 41(3): 149–158.
Suradika A, Dewi HI, Nasution MI: Project-Based Learning and Problem-Based Learning Models in Critical and Creative Students. Jurnal Pendidikan IPA Indonesia. 2023; 12(1): 153–167.
Tep P, Maneewan S, Chuathong S: Psychometric examination of Runco Ideational Behavior Scale: Thai adaptation. Psicologia: Reflexao e Critica. 2021; 34(4): 4–11.
Torrance EP, Ball O, Safter HT: Torrance test of creative thinking streamlined scoring guide figural A and B. Bensenville, Illinois: Scholastic Testing Service, Inc; 1992.
Volfson A, Eshach H, Ben-Abu Y: Development of a diagnostic tool aimed at pinpointing undergraduate students’ knowledge about sound and its implementation in simple acoustic apparatuses’ analysis. Phys. Rev. Phys. Educ. Res. 2018; 14(2): 20127.
Xu S, Reiss MJ, Lodge W: Comprehensive Scientific Creativity Assessment (C-SCA): A new approach for measuring scientific creativity in secondary school students. Int. J. Sci. Math. Educ. 2025; 23(2): 293–319.

Competing interests

No competing interests were disclosed.

Grant information

This research is supported by the Indonesian Education Scholarship, the Center for Higher Education Funding and Assessment, and the Indonesian Endowment Fund for Education, under contract 00639/BPPT/BPI.06/9/2023.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Published: 02 Apr 2026, 15:462

https://doi.org/10.12688/f1000research.178138.1

Copyright

© 2026 Hasibuan MP et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.


How to cite this article

Hasibuan MP, Sunarno W and Vh ES. Psychometric Evaluation of A Creative Thinking Performance Test for Science Education [version 1; peer review: 1 not approved]. F1000Research 2026, 15:462 (https://doi.org/10.12688/f1000research.178138.1)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.


Open Peer Review


Reviewer Report 13 Apr 2026

Rommel Alali, King Faisal University, Al-Ahsa, Saudi Arabia

Not Approved

https://doi.org/10.5256/f1000research.196488.r472889

Thank you for submitting your manuscript. The topic of developing a scale is important and relevant to educational and psychological research. However, the manuscript requires extensive improvement before it can be considered for indexing.

The introduction does not provide a clear theoretical framework, a critical review of prior scales or models, or a justification for building a new scale rather than adapting existing ones. The literature review is general and descriptive, lacking synthesis and conceptual depth.

The manuscript does not explain what gap exists in current measurement tools, why existing international or regional scales are insufficient, or how the proposed scale adds scientific value.
There is no explanation of how items were generated, no criteria for selecting experts, no mention of cognitive interviews or item refinement, and no reporting of factor loadings or dimensional structure.
Sample Size Limitations for Factor Analysis: The study utilized a sample of 138 students. For Exploratory Factor Analysis (EFA) and especially Confirmatory Factor Analysis (CFA), this is generally considered a small sample. Most psychometric standards suggest a minimum of 200–300 participants or a ratio of 10:1 (participants to items) to ensure stable factor loadings and model fit indices.
Rasch analysis is highly effective but sensitive to sample size when estimating item difficulty and person ability simultaneously. With only 138 participants, the "Person Separation Index" might be lower than ideal, potentially affecting the instrument’s ability to distinguish between different levels of creative ability across a broader population.
The study focuses heavily on construct validity (EFA/CFA) and content validity (Delphi technique). However, it does not provide evidence of predictive or concurrent validity, for example by correlating CTPT scores with existing gold-standard tests like the Torrance Tests of Creative Thinking (TTCT) or with actual academic performance in science.
You can refer to the following references to help with the required modifications:

AlAli, R., & Al-Barakat, A.A. (2024). [Reference 1]
AlAli, R. & Saleh, S. (2022). [Reference 2]

Sections are not well-organized according to standard psychometric research structure.
The manuscript needs editing for clarity and coherence.
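
As a rough illustration of two checks raised in the report, the 10:1 participants-to-items rule of thumb and a concurrent-validity correlation against an external measure, a minimal Python sketch follows. The function names, the item counts, and the score lists are hypothetical and are not taken from the manuscript or its dataset; Pearson's r is assumed here as the correlation coefficient, which the report does not specify.

```python
from math import sqrt


def meets_ratio_rule(n_participants, n_items, ratio=10):
    """Check the participants-to-items rule of thumb (default 10:1)."""
    return n_participants >= ratio * n_items


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical item count: the number of CTPT items is an assumption.
    print(meets_ratio_rule(n_participants=138, n_items=20))

    # Hypothetical paired scores: new test vs. an external criterion test.
    new_test = [12, 15, 9, 20, 17]
    criterion = [30, 34, 25, 41, 38]
    print(round(pearson_r(new_test, criterion), 3))
```

With 138 participants, the 10:1 rule would be satisfied only for instruments of 13 items or fewer; a concurrent-validity analysis would additionally need scores from an established measure collected from the same participants.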

Is the work clearly and accurately presented and does it cite the current literature?
Partly

Is the study design appropriate and is the work technically sound?
Yes

Are sufficient details of methods and analysis provided to allow replication by others?
Partly

If applicable, is the statistical analysis and its interpretation appropriate?
Partly

Are all the source data underlying the results available to ensure full reproducibility?
Yes

Are the conclusions drawn adequately supported by the results?
Yes

References

1. AlAli R, Al-Barakat A: Constructing and Developing a Scale for Assessing Language Teachers' Performance in Integrating Reflective Thinking Skills within Primary Reading Learning Environments. Forum for Linguistic Studies. 2024; 6 (6): 194-210 Publisher Full Text
2. AlAli R, Saleh S: Towards Constructing and Developing a Self-Efficacy Scale for Distance Learning and Verifying the Psychometric Properties. Sustainability. 2022; 14 (20). Publisher Full Text

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Measurement and Evaluation

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.



