:: Search published articles ::
Showing 4 results for Rasch Measurement

Purya Baghaei, Reza Pishghadam, Safoora Navari,
Volume 12, Issue 1 (3-2009)
Abstract

Due to deficiencies in the traditional models of standard setting, this study suggests a new method for setting standards employing Rasch measurement. Precise and efficient methods for setting performance standards and linking tests to ability scales are a much-felt need in today's educational contexts. The introduction of the Common European Framework of Reference as a common paradigm for language teaching and assessment has stressed the need for such methods. The suggested method combines the classic test-centered method of standard setting with the probabilistic properties of the Rasch model to set several cut points on the ability continuum. The Wright map, which jointly depicts the difficulty location of items and the ability location of persons on a common scale, is the cornerstone of this method.
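The probabilistic property the abstract refers to is the dichotomous Rasch model, in which the chance of success on an item is a logistic function of the difference between person ability and item difficulty (both in logits). A minimal sketch of how a cut point can be read off this relationship, using hypothetical difficulty values rather than the authors' actual procedure:

```python
import math

def rasch_prob(theta, b):
    """P(correct) under the dichotomous Rasch model for a person of
    ability theta on an item of difficulty b (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def cut_point(item_difficulty, target_prob=0.5):
    """Ability at which a borderline candidate has target_prob of
    succeeding on the criterion item (inverting the logistic)."""
    return item_difficulty + math.log(target_prob / (1.0 - target_prob))

# Hypothetical criterion item located at 0.7 logits on the Wright map:
# a 50% mastery criterion places the cut score at the item's own location,
# while a stricter 80% criterion pushes the cut higher up the ability scale.
fifty_cut = cut_point(0.7)        # 0.7 logits
eighty_cut = cut_point(0.7, 0.8)  # higher than 0.7 logits
```

Because persons and items share one scale, several such cut points can be placed along the ability continuum, one per performance level.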
Mahnaz Saeidi, Mandana Yousefi, Purya Baghayei,
Volume 16, Issue 1 (3-2013)
Abstract

Evidence suggests that variability in the ratings of students’ essays results not only from differences in their writing ability, but also from certain extraneous sources. In other words, the outcome of essay rating can be biased by factors relating to the rater, task, and situation, or an interaction of any or all of these factors, which makes the inferences and decisions made about students’ writing ability undependable. The purpose of this study, therefore, was to examine variability in rater judgments as a source of measurement error in the assessment of EFL learners’ essay writing. Thirty-two Iranian sophomore students majoring in English language participated in this study. The learners’ narrative essays were rated by six different raters, and the results were analyzed using many-facet Rasch measurement as implemented in the computer program FACETS. The findings suggest that there are significant differences among raters in their harshness, as well as several cases of bias due to rater-examinee interaction. This study provides valuable insight into how effective and reliable rating can be realized, and how the fairness and accuracy of subjective performance assessment can be evaluated.
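Many-facet Rasch measurement, as implemented in FACETS, extends the Rasch model so that the log-odds of a higher score depend on examinee ability minus task difficulty minus rater severity, which is how rater harshness becomes a measurable facet. A minimal sketch with hypothetical logit values (not the study's data):

```python
import math

def success_prob(ability, task_difficulty, rater_severity):
    """P(higher score) under a simple many-facet Rasch model:
    log-odds = ability - task difficulty - rater severity (in logits)."""
    logit = ability - task_difficulty - rater_severity
    return 1.0 / (1.0 + math.exp(-logit))

# Two hypothetical raters judging the same examinee on the same task:
lenient = success_prob(ability=1.0, task_difficulty=0.0, rater_severity=-0.5)
harsh = success_prob(ability=1.0, task_difficulty=0.0, rater_severity=0.8)
# The harsher rater yields a lower probability of the higher score,
# even though the examinee and the task are identical.
```

In a FACETS analysis, systematic departures from this model for particular rater-examinee pairs are what surface as bias/interaction effects.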
Volume 18, Issue 2 (9-2015)
Abstract

In this study, the researcher used the many-facet Rasch measurement model (MFRM) to detect two pervasive rater errors among peer-assessors rating EFL essays. The researcher also compared the ratings of peer-assessors to those of teacher assessors to gain a clearer understanding of the ratings of peer-assessors. To that end, the researcher used a fully crossed design in which all peer-assessors rated all the essays written by MA students enrolled in two Advanced Writing classes at two private universities in Iran. The peer-assessors used a 6-point analytic rating scale to evaluate the essays on 15 assessment criteria. The results of Facets analyses showed that, as a group, peer-assessors did not show a central tendency effect or a halo effect; however, individual peer-assessors showed varying degrees of both effects. Further, the ratings of peer-assessors and those of teacher assessors were not statistically significantly different.

Wander Lowie, Houman Bijani, Mohammad Reza Oroji, Zeinab Khalafi, Pouya Abbasi,
Volume 26, Issue 2 (9-2023)
Abstract

Performance testing involving the use of rating scales has become widespread in the evaluation of second/foreign language oral proficiency. However, few studies have used a pre-/post-training design to investigate the impact of a training program on reducing raters’ biases toward the rating scale categories and thereby increasing their consistency measures. Moreover, no study has used MFRM with the facets of test takers’ ability, raters’ severity, task difficulty, group expertise, scale category, and test version all in a single study. Twenty EFL teachers rated the oral performances produced by 200 test takers before and after a training program, using an analytic rating scale comprising fluency, grammar, vocabulary, intelligibility, cohesion, and comprehension categories. The outcome of the study indicated that MFRM can be used to investigate raters’ scoring behavior and can enhance rater training and validate the functionality of the rating scale descriptors. Training can also result in higher levels of interrater consistency and reduced levels of severity/leniency; while it cannot turn raters into duplicates of one another, it can make them more self-consistent. Training also helped raters use the scale’s various band descriptors more efficiently, resulting in a reduced halo effect. Finally, the raters improved in consistency and reduced their rater-scale category biases after the training program. The remaining differences in bias measures could probably be attributed to different ways of interpreting the scoring rubrics, owing to raters’ confusion in the accurate application of the scale.
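With an analytic rating scale such as the six-category one described above, MFRM is typically combined with a rating scale (Andrich) formulation: each step up the scale has its own threshold, and the log-odds of category k over k-1 is ability minus task difficulty minus rater severity minus that threshold. A sketch with hypothetical thresholds for a 6-point scale, illustrating how rater severity shifts the category probabilities (not the study's estimated values):

```python
import math

def category_probs(theta, task_diff, severity, thresholds):
    """Probabilities of categories 0..K under a many-facet rating scale
    model: log-odds of k over k-1 = theta - task_diff - severity - tau_k."""
    logit = theta - task_diff - severity
    # Cumulative sums of (logit - tau_k) are the unnormalized log-probabilities.
    psi = [0.0]
    for tau in thresholds:
        psi.append(psi[-1] + (logit - tau))
    exps = [math.exp(p) for p in psi]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical step thresholds for a 6-point (5-step) analytic scale:
taus = [-2.0, -1.0, 0.0, 1.0, 2.0]
lenient = category_probs(theta=0.5, task_diff=0.0, severity=-0.5, thresholds=taus)
severe = category_probs(theta=0.5, task_diff=0.0, severity=0.5, thresholds=taus)
# A more severe rater shifts probability mass toward the lower categories,
# lowering the expected score for the same performance on the same task.
exp_lenient = sum(k * p for k, p in enumerate(lenient))
exp_severe = sum(k * p for k, p in enumerate(severe))
```

Rater-by-category bias terms of the kind the study examines would appear as systematic residuals against this baseline model for particular rater-category pairs.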



Iranian Journal of Applied Linguistics