Keywords
classroom assessment, academic status, assessment training, teaching experience
Higher education teachers engage in a variety of classroom assessment practices. These practices play a significant part in the teaching-learning process and in determining student achievement, and teachers' different attributes shape the nature of their classroom assessment practices. This cross-sectional survey study aimed to examine differences in assessment practices among public university teachers with respect to their academic status, teaching experience, assessment training, and specializations.
The study adopted the Assessment Practices Inventory (API), which was distributed to a sample of 380 randomly selected teachers from four public universities. The properly completed questionnaires (93.4% of those distributed) were entered into SPSS version 25 and used for analysis. Multivariate analysis of variance (MANOVA) was employed, and Pillai's Trace was used to interpret the results.
The results showed that academic status (p < .001), teaching experience (p < .001), and assessment training (p = .030) produced significant differences in the combined dependent variable (assessment practice). However, field of specialization (p = .130) did not show a significant difference in overall assessment practice, although differences were observed in assessment design and interpretation.
Academic status, teaching experience, and assessment training are important teacher attributes that shape assessment practices. Therefore, as teachers hold sole responsibility for classroom assessment, training and development courses in assessment, together with experience-sharing and learning opportunities, are necessary regardless of teachers' specializations, academic status, and teaching experience.
Assessment is a process of getting information about learners' performance, using clearly defined assessment criteria to determine what the learners know or can do, and then using that information as evidence for making judgments about learners' achievements (Department of Basic Education, 2012). The role of assessment is a critical aspect of ensuring quality education and student development in any academic institution (Brown et al., 2019). In fact, according to Brown et al. (2019), no factor influences a learning environment as much as assessment. Teachers are actively engaged in various classroom assessment practices as part of making the learning process as effective as it can possibly be. These assessment practices entail a wide range of dimensions, such as assessment design; assessment administration, scoring, and interpretation; assessment grading; and assessment data utilization or evaluation (Koloi-Keaikitse & Traynor, 2023). Thus, teachers' assessment practices are a very important aspect of the learning process: they are directly related to students' learning achievement (Crichton & McDaid, 2016) and are key components of effective teaching and learning (Pastore & Andrade, 2019).
Higher education assessment practices play a crucial role in ensuring the achievement of student learning outcomes by providing feedback for improvement. They are the cornerstone of the instructional decision-making that teachers have to rely on (Moss & Brookhart, 2019). In this regard, assessment in Ethiopian universities is one of the major components of the teaching-learning process throughout the academic year. Teachers engage in various assessment activities, ranging from devising an assessment technique to putting it into practice and giving feedback to students. This requires teachers to make appropriate decisions about, for example, what sorts of assessments are suitable for students, when and how to carry out assessments, what resources are required to conduct assessments, and how to use the information obtained from assessments effectively. This indicates that teachers are the key drivers of the classroom assessment process from start to finish. Therefore, as teachers' classroom assessment practices are a means by which student learning outcomes are ensured, they need to be given sufficient attention (Hunduma et al., 2023).
Studies such as Bailey and Garner (2010) found that teachers play a critical role in designing and implementing assessments, and their academic levels influence the quality and effectiveness of their assessment practices. Holmboe et al. (2010) also argued that teachers with higher academic levels demonstrate a deeper understanding of assessment principles and are more likely to employ effective assessment strategies in their teaching. Moreover, Furtak et al. (2016) indicated that teachers with higher academic qualifications and specialized training in formative assessment demonstrated greater proficiency in implementing effective formative assessment strategies. Similarly, students under the guidance of academically qualified teachers demonstrated higher levels of academic achievement through improved self-assessment practices (Yan, 2019). Other studies also outlined the importance of teachers’ level of academic status in designing and implementing assessment tools that effectively measure students’ skills and knowledge (Paiva et al., 2022; Brown et al., 2019; Yan & Brown, 2021).
On the other hand, teaching experience was found to play a crucial role in shaping teachers’ approaches to assessing student learning (Popham, 2011). It is reported that more experienced and academically qualified teachers provide more comprehensive and constructive assessment feedback compared to their less experienced counterparts (Bailey & Garner, 2010). Moreover, other studies (such as Gamage et al., 2020; Susan & Catherine, 2017; Moss & Brookhart, 2019; Black and Wiliam, 2018) argue that experienced teachers demonstrate a deeper understanding of assessment principles and are adept at implementing formative and summative assessment techniques more effectively.
The other attribute with a profound effect on differences in teachers' assessment practices is the assessment training teachers have received. In this regard, Jacob, Hill, and Corey (2017) indicated that training programs can significantly improve teachers' knowledge and instructional practices, which could also extend to assessment practices. Similarly, Brookhart (2011) provided insights into the educational assessment knowledge and skills required of teachers and indicated that teachers who have undergone specific training in assessment exhibit more confidence and competence in designing and implementing appropriate assessment techniques. Some aspects of the assessment practices of academic staff may be explained by subject specialization (DeLuca et al., 2016), but the importance of educational assessment knowledge and skills acquired through formal training is emphasized (Brookhart, 2011; DeLuca et al., 2016; Yan & Brown, 2021). Therefore, regardless of a teacher's field of specialization, there is a clear need to emphasize assessment literacy in teacher training programs to ensure that educators are equipped with the necessary assessment skills and preparedness.
Although teachers play a key role in the overall classroom assessment process at different levels of academic institutions, including universities, there are disparities among them. This is supported by researchers such as Bruggeman et al. (2021) and Matovu and Zubairi (2015), who found that assessment practices vary among teachers across higher education institutions as well as across their departments and faculties. As a result, they recommended consistent assessment policies and strategies among institutions and within their constituent departments and faculties. Moreover, other studies reported inconsistencies in the factors influencing teachers' assessment practices (Boud & Associates, 2010; Zhang & Burry-Stock, 2003; Koloi-Keaikitse & Traynor, 2023; Popham, 2009; Koloi-Keaikitse, 2017; Borghouts et al., 2016). However, many of these studies were conducted in primary and secondary schools, and only a few (such as Postareff et al., 2012; Xu & Liu, 2009) were conducted in universities.
In Ethiopia, a few researchers (Soromessa, 2015; Mekonnen, 2014; Chalchisa, 2014; Belay, 2017; Abera et al., 2017; Moges, 2018) have investigated some aspects of assessment, such as the implementation and challenges of continuous assessment. Tulu and Tolosa (2018) studied classroom assessment in secondary schools, but their work was limited to the techniques used and the challenges teachers faced in their assessment work. Hunduma et al. (2023) investigated perceptions and practices of continuous assessment in government higher learning institutions. However, no research exists on differences in teachers' assessment practices with respect to their attributes in public universities of Ethiopia. Understanding the assessment practices of university teachers serves as a way of finding out whether they use quality assessment practices to meet the learning needs of students and the varying needs of other stakeholders. Therefore, this research aimed to investigate the relationship of teachers' academic status, years of teaching experience, field of specialization, and levels of assessment training with their classroom assessment practices. The study formulated the following research questions and hypotheses.
• What is the relationship of teachers’ academic status, years of teaching experience, field of specialization, and levels of assessment training with their classroom assessment practices?
• Are there significant differences in teachers’ assessment practices due to their academic status, years of teaching experience, levels of assessment training and types of specializations?
• H0a: There is no significant relationship between teachers' academic status, years of teaching experience, field of specialization, or levels of assessment training and their classroom assessment practices.
• H0b: There is no significant difference in teachers' assessment practices due to their academic status, years of teaching experience, levels of assessment training, and types of specialization.
The Research Ethics Review Committee (RERC) of the College of Education, Hawassa University, approved the implementation of the research on June 03, 2023, with reference number COE-REC-007/2023 (Letter of Research Ethics Review Committee). The Committee confirmed that the research met the criteria that subject risk is minimized; that subjects are selected equitably; that respect for persons is observed, with informed consent determined and documented appropriately; and that privacy and confidentiality are maintained. Written informed consent was obtained from all participants.
This study employed a quantitative cross-sectional survey design, which entails collecting data at a specific point in time in the lives of respondents. Surveys are a powerful and useful tool for collecting data on human characteristics such as beliefs, attitudes, thoughts, and behavior (Creswell, 2012). Moreover, it has been asserted that survey research is more objective and appropriate for large-sample studies (Choy, 2014).
The population for this study was teachers in public universities of Ethiopia. According to the Ministry of Education of Ethiopia, there are around 40,000 teachers in approximately 42 public universities in the country. These universities are grouped by the Ministry of Science and Higher Education (MoSHE) into four categories (research, applied, comprehensive, and specialized universities). First, four universities were chosen by simple random sampling, one from each category: Hawassa (research), Bulehora (comprehensive/general), Wolkite (applied), and Addis Ababa Science and Technology University (specialized). Then, six specializations (Engineering and technology; Natural and computational sciences; Agricultural science; Computing and informatics; Social science and humanities; and Education and behavioral sciences) were selected using simple random sampling. Teachers (n = 380) were then drawn by simple random sampling from strata defined by gender, academic status, and specialization across the four selected universities, ensuring that the sample was representative of the relevant subgroups (see Table 1). The sample size was determined from the teacher population of public universities in Ethiopia using the sample size table of Krejcie and Morgan (1970). Samples from each academic status were included carefully because academic status also relates to teaching experience and assessment training. All teachers active during the data collection period were eligible for inclusion; teachers engaged in professional development programs elsewhere during data collection and non-Ethiopian citizens were excluded. The difference in the number of male and female participants reflects the persistent gap between the numbers of male and female teachers in public universities of Ethiopia.
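As a quick illustration, the sample size of 380 is consistent with the Krejcie and Morgan (1970) table value for a population of about 40,000. Below is a minimal sketch, assuming the commonly used finite-population formula behind that table (χ² = 3.841 for one degree of freedom at the .05 confidence level, P = 0.5, d = 0.05); the function name is ours, not from the paper.

```python
def krejcie_morgan(N, chi2=3.841, P=0.5, d=0.05):
    """Required sample size s for a finite population N (Krejcie & Morgan, 1970):
    s = chi2*N*P*(1-P) / (d^2*(N-1) + chi2*P*(1-P))."""
    return round(chi2 * N * P * (1 - P) / (d**2 * (N - 1) + chi2 * P * (1 - P)))

# Roughly 40,000 university teachers in the population, as stated above.
print(krejcie_morgan(40000))  # -> 380, matching the study's sample size
```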
For the purpose of this study, the Assessment Practices Inventory (API; Zhang & Burry-Stock, 2003) was adopted for data collection. It consists of 67 items measured on two rating scales, "Use" and "Skill". The Use scale ranges from 1 (not at all used) to 5 (used very often) and the Skill scale from 1 (not at all skilled) to 5 (very skilled). This instrument was selected because it covers a relatively wide range of issues, providing a large pool of items that fit the objective of this research. This study incorporated 58 API items and 5 demographic items, giving a total of 63 items. Based on the pilot study, 6 items were adapted to suit the purpose and context of the study (public universities of Ethiopia), 43 items were adopted from the API unchanged, and 9 items were dropped owing to validity and reliability issues or because they were not applicable at the university level in the Ethiopian context. Of the questionnaires distributed to 380 respondents, 360 were returned; 355 were used for analysis because 5 were not completed properly.
With regard to reliability, the Cronbach's alpha values of the API scales in Zhang & Burry-Stock (2003) ranged from .77 to .89 for assessment practices (Use scale) and from .85 to .91 for self-perceived assessment skills (Skill scale). A pilot study conducted prior to the actual study yielded an overall Cronbach's alpha of .94 for the Skill scale and .96 for the Use scale, indicating strong internal consistency of the items. On the final data, the overall Cronbach's alpha was .97 for the Skill scale and .95 for the Use scale, again indicating strong internal consistency. The Cronbach's alpha values of the components varied from .82 to .94 for the Skill scale and from .80 to .90 for the Use scale. All the items in the scales, as well as within the factors, were reliable, as all values were above the required minimum standard of .70.
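For readers who want to reproduce such reliability figures on item-level data, the following is a minimal sketch of Cronbach's alpha in Python using only NumPy; the function and variable names are illustrative, not taken from the study's materials.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_var / total_var)

# e.g., alpha for the 'use' scale, with use_items as a NumPy array of Likert
# responses (rows = respondents, columns = items): cronbach_alpha(use_items)
```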
Construct validity concerns how well a test measures the concept it was designed to measure and is crucial to establishing the overall validity of an instrument. To ensure construct validity, a factor analysis using Principal Component Analysis (PCA) with the varimax rotation method can be conducted (Taherdoost, 2016), because PCA is capable of reducing and eliminating factors that are less dominant or relevant in influencing a given performance. Therefore, in this study, PCA was used to determine the construct validity of the instrument. Items loading above 0.40 (the minimum recommended value in research) were considered for further analysis, while items cross-loading above 0.40 were deleted, as recommended. In this way, the factor analysis results satisfied the criteria of both construct validity types: discriminant validity (loading of at least 0.40, no cross-loading of items above 0.40) and convergent validity (eigenvalues of 1, loading of at least 0.40). As a result, 9 items that loaded below the minimum recommended value (0.40) or cross-loaded above 0.40 were removed, and 58 items were retained for the final data collection. In addition, face validity and content validity of the instrument were ascertained by subject specialists in the area of assessment.
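The item-retention rule described above (components with eigenvalues above 1, a primary loading of at least 0.40, and no cross-loading above 0.40) can be sketched in plain NumPy as follows. This is a simplified illustration of the procedure, not the study's SPSS analysis; the function names and the correlation-matrix-based PCA are our assumptions.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Varimax rotation of a (p items x k components) loading matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    var_sum = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L**3 - (gamma / p) * L @ np.diag((L**2).sum(axis=0))))
        R = u @ vt
        if s.sum() < var_sum * (1 + tol):
            break
        var_sum = s.sum()
    return loadings @ R

def retain_items(X, cutoff=0.40):
    """PCA on the item correlation matrix; keep components with eigenvalue > 1,
    then flag items loading >= cutoff on exactly one component."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)        # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    k = int((eigvals > 1.0).sum())                 # Kaiser criterion: eigenvalue > 1
    loadings = varimax(eigvecs[:, :k] * np.sqrt(eigvals[:k]))
    strong = np.abs(loadings) >= cutoff
    return strong.sum(axis=1) == 1                 # True = retain item

# X: (n_respondents x n_items) response matrix; items where the mask is False
# (no loading >= .40, or a cross-loading >= .40) would be dropped.
```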
During data collection, due attention was given to all the necessary procedures, starting with obtaining permission from the respective bodies. The researchers obtained support and approval letters from Hawassa University's School of Graduate Studies before beginning data collection. The letters were given to representatives of the selected universities, and permission to contact teachers was solicited from the deans of the sampled colleges. All participants signed an informed consent form and agreed to the processing of their data for research purposes; participants' names were anonymized. The data collectors were briefed in advance on important points regarding the study, such as its objectives, the nature of the questionnaire and the respondents, and what they should do before, during, and after data collection. Respondents were approached in person at their convenience by travelling to their respective universities (Hawassa, Wolkite, Bulehora, and Addis Ababa Science and Technology universities) and were provided with the questionnaire, in most cases through department heads and college deans. Data collection was completed in 25 days, from June 5 to 30, 2023. Care was taken at every step to minimize the non-response rate. Throughout data collection and analysis, all reasonable care was taken to maintain the confidentiality of the data: codes were used to identify universities and participants, and identifiable or sensitive information linking to universities and participants was withheld to ensure that the integrity of the study was not compromised.
Multivariate analysis of variance (MANOVA) was employed to examine differences in assessment practices among teachers with respect to their academic status, teaching experience, assessment training, and specializations. MANOVA determines whether two or more categorical grouping variables (and/or their combinations) significantly affect optimally weighted linear combinations of two or more normally distributed outcome variables. As there were four continuous dependent variables derived from the assessment practices scale and four categorical independent variables, MANOVA was the best option for the analysis.
With respect to MANOVA assumptions, all the dependent variables are continuous, and the four independent variables included in the MANOVA are categorical with more than two groups each. There was independence of observations, and the study ensured that an adequate sample size was used. The Box's M test of 533.849 indicates that homogeneity of covariance matrices across groups is not fulfilled (F(220, 6968.646) = 1.78, p < .001), since the p-value is below the commonly used threshold of α = .001. In such cases, it is recommended to use the more robust MANOVA test statistic, Pillai's Trace, when interpreting the results (Tabachnick and Fidell, 2007). Finally, the dependent variables are moderately correlated (no correlation greater than 0.9, with a maximum of .816 in this study), indicating no multicollinearity problem.
The four categorical independent variables included in the study are teachers' academic status, years of teaching experience, field of specialization, and level of assessment training received. Academic status has five categories (graduate assistant, lecturer, assistant professor, associate professor, and professor). Teaching experience, although a continuous variable, was categorized into five-year ranges for convenience of analysis and interpretation (1-5, 6-10, 11-15, 16-20, and 21 and above years). Field of specialization has six categories (Engineering and technology; Natural and computational science; Agricultural science; Computing and informatics; Social science and humanities; and Education and behavioral sciences). Finally, assessment training received has four categories (no assessment training at all, training on some topics of assessment, a course dedicated to assessment, and more than one course on assessment). The combined dependent variable for the analysis was assessment practice, which comprised four specific variables analyzed in the MANOVA: designing assessments (Usedesign); administering, scoring, and interpreting assessments (AdmnScorInterp); grading assessments (UseGrading); and application of assessment results (UseApplication).
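As an illustration of this coding scheme, the five-year experience bands could be derived from raw years of experience as below; the data frame and column names are hypothetical, not the study's actual dataset labels.

```python
import pandas as pd

# Hypothetical example data; 'years_teaching' stands in for the raw experience variable.
df = pd.DataFrame({'years_teaching': [3, 7, 12, 18, 25]})
bins = [0, 5, 10, 15, 20, float('inf')]
labels = ['1-5', '6-10', '11-15', '16-20', '21 and above']
df['Texperience'] = pd.cut(df['years_teaching'], bins=bins, labels=labels)
print(df)
```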
Pillai's Trace was used to interpret the results because it is recommended as a more robust test statistic for MANOVA when the covariance matrices are not homogeneous (Tabachnick and Fidell, 2007). Moreover, when groups differ along more than one variate, Pillai's Trace is the most powerful statistic and Roy's largest root the least (Field, 2009). Therefore, Pillai's Trace was used in the analysis instead of the more common Wilks' Lambda.
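Although the study's analysis was run in SPSS, an equivalent model can be sketched in Python with statsmodels. This is an illustrative reconstruction only: df is assumed to be a data frame holding the four practice scores and the four factors, with column names borrowed from the variable labels above.

```python
from statsmodels.multivariate.manova import MANOVA

# df: survey responses; columns follow the variable labels used in this paper.
manova = MANOVA.from_formula(
    'Usedesign + AdmnScorInterp + UseGrading + UseApplication ~ '
    'C(Acadstatus) + C(Texperience) + C(Specialization) + C(Training)',
    data=df)
# mv_test() reports Pillai's trace alongside Wilks' lambda and the other statistics.
print(manova.mv_test())
```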
The MANOVA results for the main effects on the combined dependent variable (see Table 2) were as follows: academic status, Pillai's Trace = .306, F(16, 984) = 5.09, p < .001, η² = .076; teaching experience, Pillai's Trace = .259, F(16, 984) = 4.27, p < .001, η² = .065; field of specialization, Pillai's Trace = .108, F(20, 984) = 1.37, p = .130, η² = .027; assessment training, Pillai's Trace = .091, F(12, 735) = 1.91, p = .030, η² = .030. These outcomes show that academic status, teaching experience, and assessment training produced significant differences in the combined dependent variable (assessment practices), whereas field of specialization did not. However, a significant interaction effect of academic status and specialization on the combined dependent variable was observed at p = .027.
Table 2. Multivariate tests (Pillai's Trace) for the combined dependent variables.

Effect | Pillai's Trace | F | df | Error df | Sig. | η²
---|---|---|---|---|---|---
Acadstatus | .306 | 5.090 | 16 | 984 | .000*** | .076
Texperience | .259 | 4.265 | 16 | 984 | .000*** | .065
Specialization | .108 | 1.366 | 20 | 984 | .130 | .027
Training | .091 | 1.912 | 12 | 735 | .030* | .030
Acadstatus × Texperience | .084 | 1.326 | 16 | 984 | .173 | .021
Acadstatus × Specialization | .263 | 1.445 | 18 | 984 | .027 | .066
Acadstatus × Training | .036 | .736 | 12 | 735 | .716 | .012
Texperience × Specialization | .278 | 1.227 | 60 | 984 | .120 | .070
Texperience × Training | .163 | 1.042 | 40 | 984 | .400 | .041
Specialization × Training | .237 | 1.109 | 56 | 984 | .276 | .059
Acadstatus × Texperience × Specialization | .027 | .824 | 8 | 488 | .582 | .013
Acadstatus × Texperience × Training | .028 | .855 | 8 | 488 | .554 | .014
Acadstatus × Specialization × Training | .104 | 1.636 | 16 | 984 | .054 | .026
Texperience × Specialization × Training | .142 | .752 | 48 | 984 | .893 | .035
Acadstatus × Texperience × Specialization × Training | .000 | — | — | — | — | —

*p < .05; ***p < .001. η² = partial eta squared. The four-way interaction could not be computed.
Analysis of variance (ANOVA) was conducted on each dependent variable as a follow-up test to the MANOVA (see Table 3). Significant differences by academic status were observed for three of the dependent variables: assessment design, F(4, 246) = 3.22, p = .013, η² = .050; assessment grading, F(4, 246) = 17.33, p < .001, η² = .220; and assessment application, F(4, 246) = 12.84, p < .001, η² = .173. Academic status did not reveal a significant difference for assessment administration, scoring, and interpretation. For teaching experience, significant differences were observed for two of the dependent variables: assessment grading, F(4, 246) = 9.77, p < .001, η² = .137; and assessment application, F(4, 246) = 13.81, p < .001, η² = .183. Teaching experience did not reveal significant differences for assessment design or for assessment administration, scoring, and interpretation. In contrast, field of specialization revealed a significant difference in assessment design only, F(5, 246) = 2.82, p = .017, η² = .054. Assessment training showed significant differences in all of the dependent variables: assessment design, F(3, 246) = 2.59, p = .020, η² = .031; assessment administration, scoring, and interpretation, F(3, 246) = 4.10, p = .007, η² = .048; assessment grading, F(3, 246) = 3.43, p = .018, η² = .040; and assessment application, F(3, 246) = 3.82, p = .011, η² = .045. Lastly, no significant interaction effects of the independent variables were observed for any of the dependent variables.
Table 3. Tests of between-subjects effects (follow-up univariate ANOVAs).

Independent variable | Dependent variable | SS | df | MS | F | Sig. | η²
---|---|---|---|---|---|---|---
Acadstatus | Usedesign | 307.151 | 4 | 76.788 | 3.220 | .013* | .050
 | UseAdmnScorInterp | 32.256 | 4 | 8.064 | 2.142 | .076 | .034
 | UseGrading | 213.996 | 4 | 53.499 | 17.329 | .000*** | .220
 | UseApplication | 212.729 | 4 | 53.182 | 12.839 | .000*** | .173
Texperience | Usedesign | 62.710 | 4 | 15.678 | .657 | .622 | .011
 | UseAdmnScorInterp | 4.891 | 4 | 1.223 | .325 | .861 | .005
 | UseGrading | 120.603 | 4 | 30.151 | 9.766 | .000*** | .137
 | UseApplication | 228.789 | 4 | 57.197 | 13.808 | .000*** | .183
Specialization | Usedesign | 336.996 | 5 | 67.399 | 2.827 | .017* | .054
 | UseAdmnScorInterp | 24.752 | 5 | 4.950 | 1.315 | .258 | .026
 | UseGrading | 24.621 | 5 | 4.924 | 1.595 | .162 | .011
 | UseApplication | 27.545 | 5 | 5.509 | 1.330 | .252 | .016
Training | Usedesign | 185.613 | 3 | 61.871 | 2.595 | .020* | .031
 | UseAdmnScorInterp | 46.314 | 3 | 15.438 | 4.100 | .007** | .048
 | UseGrading | 31.765 | 3 | 10.588 | 3.430 | .018* | .040
 | UseApplication | 47.516 | 3 | 15.839 | 3.824 | .011* | .045

*p < .05; **p < .01; ***p < .001. SS = sum of squares; MS = mean square; η² = partial eta squared.
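The follow-up univariate ANOVAs reported in Table 3 could be reproduced along the following lines in statsmodels; as with the MANOVA sketch above, df and its column names are assumed stand-ins for the study's data.

```python
import statsmodels.api as sm
from statsmodels.formula.api import ols

# One univariate model per dependent variable, mirroring the MANOVA factors.
for dv in ['Usedesign', 'UseAdmnScorInterp', 'UseGrading', 'UseApplication']:
    model = ols(f'{dv} ~ C(Acadstatus) + C(Texperience) + '
                'C(Specialization) + C(Training)', data=df).fit()
    print(dv)
    print(sm.stats.anova_lm(model, typ=3))  # Type III sums of squares, as in SPSS output
```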
Post hoc tests were conducted to find out which groups differed significantly on the dependent variables for which significant univariate effects were found. Academic status showed differences with respect to the four assessment practice dependent variables. For example, in designing assessments, graduate assistants and lecturers were significantly different from all the other groups at p < .001; assistant professors, associate professors, and professors differed significantly from graduate assistants and lecturers at p < .001, but not from each other. In administration, scoring, and interpretation of assessments, graduate assistants and lecturers differed significantly from all the other groups at p < .001; assistant professors and associate professors were significantly different from graduate assistants and lecturers at p < .001, and the former also differed significantly from professors at p = .035, while the latter did not. In assessment grading, graduate assistants and lecturers differed significantly from all the other levels at p < .001; assistant professors, associate professors, and professors were significantly different from graduate assistants and lecturers at p < .001, but not from each other. Similarly, in application of assessment results, assistant professors, associate professors, and professors differed significantly from graduate assistants and lecturers at p < .001, but not from each other (see Table 4 of underlying data).
Moreover, differences were observed across years of teaching experience with respect to the assessment practice dependent variables. For example, in designing assessments, teachers with up to 10 years of experience were significantly different from all the groups above 10 years (i.e., 11-15, 16-20, and 21 and above) at p < .001. Similarly, in administering, scoring, and interpreting assessments, teachers with below 10 years of experience differed significantly from all the groups with 11 and above years at p < .001. Teachers with 11 to 15 years of experience differed significantly from the 1-5, 6-10, and 21-and-above groups at p < .001, but not from the 16-20 group. Teachers with 16 to 20 years of experience differed significantly from the groups below 10 years of experience at p < .001, but not from the other groups. Lastly, teachers with 21 and above years of experience differed significantly from the groups below 15 years of experience at p < .001, but not from those with 16 to 20 years. In grading assessments, the groups below 10 years of experience differed significantly from the groups with 16 and above years at p < .001; the pattern was similar for application of assessments. Overall, these results show that teaching experience matters for teachers' assessment practices (see Table 5 of underlying data).
Significant differences were also observed across the levels of assessment training teachers received. For example, in designing assessments, teachers who received no assessment training at all differed significantly from all the other groups at p < .001. Those who received training on some topics of assessment differed significantly from all the other groups at p < .001, except those who had taken a course in assessment. Those who received more than one course on assessment differed significantly from all the other groups at p < .001. In assessment administration, scoring, and interpretation, teachers with no assessment training at all and teachers trained on only some topics of assessment differed significantly from all the other groups at p < .001. Teachers who received one course dedicated to assessment differed significantly from all the other groups at p < .001, except those trained on some topics of assessment, from whom no significant difference was observed. Lastly, teachers trained on more than one course of assessment differed significantly from all the other groups at p < .001. In grading assessments, teachers with no assessment training at all and those trained on only some topics of assessment differed significantly from all the other groups at p < .001; the pattern was similar for those who received one course and more than one course of assessment. Likewise, in application of assessment results, all the groups with any level of assessment training differed significantly from those with no training at all at p < .001 (see Table 6 of underlying data).
Lastly, the post hoc test results showed significant differences in assessment design between education and behavioral sciences and the other specializations at p < .001. In administering, scoring, and interpreting assessments as well, education and behavioral sciences differed significantly from all the other specializations at p < .05.
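The paper does not name the specific post hoc procedure run in SPSS; Tukey's HSD is one common choice, and a comparable set of pairwise comparisons could be obtained in Python as follows (again assuming the hypothetical df used in the earlier sketches).

```python
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Pairwise comparisons of assessment-design scores across academic statuses.
print(pairwise_tukeyhsd(endog=df['Usedesign'],
                        groups=df['Acadstatus'],
                        alpha=0.05))
```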
The study revealed that level of academic status, years of teaching experience, and assessment training received were associated with significant differences in teachers' assessment practices. Moreover, a significant interaction effect of academic status and specialization on teachers' assessment practices was observed. In a more detailed analysis, the levels of each independent variable (academic status, teaching experience, field of specialization, and assessment training) were compared on the four dependent variables of assessment practice (designing assessments; administering, scoring, and interpreting assessments; grading assessments; and application of assessment results).
Academic status (graduate assistant, lecturer, assistant professor, associate professor, and professor) was compared against each of the dependent variables, and significant differences were observed among the levels. For instance, in designing assessments, graduate assistants and lecturers were significantly different from all the higher academic statuses and from each other. The higher academic statuses (assistant professor, associate professor, professor) were significantly different from the relatively lower statuses (graduate assistant and lecturer), but not from each other.
In administering, scoring, and interpreting assessments, graduate assistants and lecturers differed significantly from all the higher academic statuses and from each other. Assistant professors were significantly different from graduate assistants, lecturers, and professors, but not from associate professors. Associate professors were significantly different from graduate assistants and lecturers, but not from assistant professors and professors. Lastly, professors were significantly different from graduate assistants, lecturers, and assistant professors, but no significant difference was observed with associate professors.
In assessment grading, graduate assistants and lecturers were significantly different from all the other levels. Assistant professors were significantly different in assessment grading from graduate assistants and lecturers, but not from associate professors and professors. Similarly, associate professors and professors were significantly different in assessment grading from graduate assistants and lecturers, but not from assistant professors. Lastly, in application of assessment results, graduate assistants and lecturers were significantly different from all the other academic levels; assistant professors, associate professors, and professors likewise differed significantly from graduate assistants and lecturers, but not from each other.
Overall, the findings revealed that the academic status of university teachers is significantly related to their assessment practices. This is supported by Park et al. (2016), who revealed that teachers' academic level is a major contributor to their assessment practices because teachers with higher academic levels demonstrate a more nuanced understanding and use of assessment practices. Gable et al. (2012) also found that teachers with higher academic levels and status exhibited greater preparedness to implement evidence-based assessment practices. Other studies have likewise suggested that teachers with higher academic levels demonstrate a deeper understanding of assessment principles and are more likely to employ effective assessment strategies in their teaching (Bailey and Garner, 2010; Scarino, 2013; Holmboe et al., 2010).
Differences were also observed among university teachers with different years of teaching experience (1-5, 6-10, 11-15, 16-20, and 21 and above) with respect to the assessment practice dependent variables. For example, in designing assessments, teachers with 1-5 and 6-10 years of experience were significantly different from all the groups with more than 10 years of experience. Teachers with more than 10 years of experience differed significantly from those with below 10 years, but were not significantly different from one another.
In administering, scoring, and interpreting assessments, teachers with below 10 years of experience differed significantly from all the groups with more than 10 years. Teachers with 11 to 15 years of experience differed significantly from those with below 10 years and those with 21 and above years, but not from those with 16 to 20 years. Teachers with 16 to 20 years of experience differed significantly from those with less than 10 years, but not from the other groups with more than 10 years. Lastly, teachers with 21 and above years of experience differed significantly from those with less than 15 years, but not from those with 16 to 20 years. In grading assessments, the groups with below 10 years of experience differed significantly from the groups with 16 and above years; similarly, those with 21 and above years of experience differed significantly from all the other groups except those with 16 to 20 years. The pattern in grading was similar for application of assessments.
Overall, these findings show that teaching experience matters in teachers' classroom assessment practices and has practical implications for institutional administrators who work with both inexperienced and experienced teachers. In this respect, studies have found that teaching experience significantly influences teachers' assessment practices and that experienced teachers demonstrate a deeper understanding of assessment principles and are adept at implementing formative and summative assessment techniques (Berendonk et al., 2012; Gamage et al., 2020; Cook and Ellaway, 2015). Similarly, other studies have emphasized the dynamic nature of assessment practices and the need for teachers to develop multifaceted assessment literacy through training and practical experience, building the confidence to adapt to changing educational landscapes (Willis et al., 2013; Brookhart, 2011). As teachers with more teaching experience have been shown to gain more confidence in using different assessment practices, they can be asked to mentor novice teachers in their institutions and departments.
Across the levels of assessment training (no assessment training at all, training on only some topics of assessment, training on one course dedicated to assessment, and training on more than one course of assessment), significant differences were observed on the four assessment practice dependent variables. For example, in designing assessments, teachers who received no assessment training at all showed significant differences from all the other groups. Teachers trained on some topics of assessment differed significantly from all the other groups except those who had taken a course dedicated to assessment, and vice versa. Those who received more than one course on assessment differed significantly from all the other groups.
In assessment administration, scoring, and interpretation, teachers who received no assessment training at all differed significantly from all the other groups. Teachers trained on only some topics of assessment differed significantly from all the other groups except those who took a course dedicated to assessment, from whom no significant difference was observed. Teachers who received one course dedicated to assessment differed significantly from all the other groups except those trained on some topics of assessment. Lastly, teachers trained on more than one course of assessment differed significantly from all the other groups.
In relation to grading of assessments, those who received no assessment training at all and those trained on only some topics of assessment differed significantly from all the other groups; the pattern was similar for those who received one course and more than one course of assessment. Lastly, in application of assessment results, teachers who received no assessment training at all differed significantly from all the other groups, and the same held for the other groups. Therefore, the findings of this study clearly show that the assessment training teachers acquire fundamentally influences their assessment practices and has implications for teacher educators.
With regard to training, it has been noted that the quality of assessment training attained by academic staff can determine the effectiveness of the assessments they use in any learning institution (Yan and Brown, 2021). For this reason, delivering assessment training courses is one of the ways institutions can equip their academic staff with the required assessment competencies and skills (DeLuca et al., 2016; Livingston & Hutchinson, 2016). Similarly, Shernoff et al. (2017) emphasized the necessity of integrating a course in assessment with an appropriate text to provide beginning teachers with a more comprehensive understanding of assessment.
Generally, no significant differences were observed among the specializations with respect to teachers' overall assessment practices. However, a significant interaction effect of teachers' academic status and specialization was observed, meaning that teachers of different specializations at different academic levels differed in their assessment practices. Moreover, significant differences were observed in assessment design between education and behavioral sciences and the other specializations (engineering and technology, agricultural sciences, computational sciences, computing and informatics, and social sciences and humanities). In administering, scoring, and interpreting assessments as well, education and behavioral sciences differed significantly from all the other specializations. Similarly, Matovu and Zubairi (2015) reported statistically significant differences in the interpretation of assessment results between the Education and Arts specializations. This difference is attributed to the fact that teachers from the education discipline receive assessment training as part of their university coursework. It highlights the importance of developing teachers' general assessment knowledge and skills that are applicable across disciplines, because assessment skills and practices are not limited to subject-specific knowledge but require a broader understanding of the assessment process that is not heavily influenced by the teacher's field of specialization (Brookhart, 2011; Scarino, 2013; Matovu and Zubairi, 2015; Popham, 2011). Therefore, prospective teachers need to be equipped with sufficient assessment knowledge and expertise to undertake effective assessments irrespective of their specializations.
As teachers play a pivotal role in the classroom assessment process, understanding how their attributes, such as academic status, teaching experience, assessment training received, and specialization, relate to their classroom assessment practices is very important. Findings from this study may therefore provide valuable insights into teachers' personal attributes with respect to their classroom assessment practices in Ethiopia and elsewhere. They also help to emphasize, during teacher training, teachers' readiness to conduct classroom assessments, given the significant amount of time teachers spend on assessment-related activities.
This study revealed some important findings that can inform both assessment policy and practice, especially in the context of higher education institutions. However, the findings were limited by the use of a self-reported survey questionnaire only; the study would have benefited from analysis of relevant documents or interviews with teachers. The gender disparity in the respondent pool was another constraint. Future studies may use multiple methods of data collection, including classroom observation, document analysis (such as analysis of teacher-made assessments), and interviews, to validate teacher self-report data.
As teachers are primarily responsible for conducting classroom assessment, their classroom assessment practices are fundamental to the education process. This study therefore examined differences in the assessment practices of public university teachers in Ethiopia with respect to their academic status, teaching experience, assessment training, and specializations. The study concluded that the different levels of academic status, teaching experience, and assessment training have significant relationships with assessment practices and trigger significant differences in teachers' overall assessment practices. There are significant differences in the various assessment practice variables between teachers of higher academic status and those of relatively lower status; the same holds for years of teaching experience and levels of assessment training received. Therefore, the null hypothesis that there is no significant difference in teachers' assessment practices with respect to academic status, teaching experience, and assessment training is rejected, as is the claim that there is no relationship between these variables. Moreover, significant differences were observed among some specializations with respect to some specific assessment practice variables, although no significant difference was observed with respect to the combined dependent variable (assessment practice).
The findings from this study indicate that professional development programs with practical implications for classroom assessment should be devised and provided to university teachers based on identified needs. Real-time assessment experiences and training opportunities for teachers should be arranged according to curriculum needs and existing gaps in assessment, irrespective of their fields of specialization. The content of teacher education course requirements should be analyzed against curriculum needs to ensure the inclusion of assessment courses. In addition, in-service workshop offerings should consider teachers' needs based on curriculum requirements in order to narrow persistent assessment practice gaps among teachers of different academic statuses, experience levels, and specializations. Training should be targeted directly at the critical assessment skills teachers are required to have with respect to their specializations and the academic level of their students; this is fundamental to ensuring that all learners are assessed appropriately. Future studies of teachers' classroom assessment practices should corroborate quantitative findings with appropriate qualitative methods. Besides teacher-related factors, future studies would also do well to consider other variables potentially related to assessment practices in universities.
The datasets used in our analysis are available from:
Zenodo: Teacher attributes as predictors of assessment practices in public universities of Ethiopia, https://doi.org/10.5281/zenodo.12518529 (Yilma et al., 2024).
The study contains the following underlying data:
• Table 1.xlsx: depicts demographic characteristics of the respondents.
• Table 2.xlsx: depicts multivariate tests for the combined dependent variables.
• Table 3.xlsx: depicts the follow-up analysis using univariate ANOVA.
• Table 4.xlsx: depicts multiple comparisons among different academic statuses of teachers.
• Table 5.xlsx: depicts multiple comparisons for teaching experience.
• Table 6.xlsx: depicts multiple comparisons for assessment training teachers received.
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
Zenodo: Teacher attributes as predictors of assessment practices in public universities of Ethiopia, https://doi.org/10.5281/zenodo.12518529 (Yilma et al., 2024).
The study contains the following extended data:
• Items used for pilot study.xlsx: depicts the original sets of items used in the pilot study.
• Modified items.xlsx: depicts items retained with some modifications based on the pilot study.
• Items excluded.xlsx: depicts items excluded based on results from the pilot study.
• Items used for final data collection.xlsx: depicts the final sets of items used for the final data collection.
• STROBE Checklist.xlsx: depicts STROBE checklist for cross-sectional study.
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).