Using Multifacet Rasch Analysis to Examine the Effectiveness of Rater Training

This paper sought to provide a comprehensive analysis of a rater training program through the use of multifacet Rasch measurement. The purpose was to show how such an analysis can provide specific information on raters that is useful for feedback, as well as important information about the performance of the rating form and training materials. This information is particularly useful for the ongoing development of a rater training program.

The interaction analysis indicated that several rater trainees rated specific crews inconsistently. This is particularly valuable information for the training facilitator, who can pass it back to these raters and investigate their reasons for the ratings they provided. It also raises the question of how consistent raters will remain following a training program. If follow-up training were provided, the consistency of raters over time could be analyzed using the multifacet Rasch bias analysis.
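
For readers less familiar with the bias analysis, a rater-by-crew interaction can be sketched as an extra term in a rating-scale form of the multifacet Rasch model. The notation, the sign convention, and the inclusion of an item facet are assumptions introduced here for illustration; they are not taken from the analysis reported in this paper.

```latex
% Assumed notation (illustrative, not from the original text):
%   \theta_n   proficiency of crew n
%   \delta_i   difficulty of rating-form item i
%   \alpha_j   severity of rater j
%   \tau_k     threshold between categories k-1 and k
%   \phi_{jn}  rater-by-crew interaction (bias) term
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \delta_i - \alpha_j - \tau_k - \phi_{jn}

% Standardized bias estimate used to screen individual interactions
z_{jn} = \frac{\hat{\phi}_{jn}}{\mathrm{SE}(\hat{\phi}_{jn})}
```

Under this sign convention, a positive \phi_{jn} would mean that rater j was harsher with crew n than that rater's overall severity predicts. A frequently used rule of thumb, which is not stated in this paper, is to follow up on interactions with |z_{jn}| greater than about 2.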

One benefit of this type of bias analysis is its ability to identify discrepant and unexpected interactions between raters and ratees. Feedback can be given to, and just as importantly sought from, raters concerning their perceptions of crews with whom they have unexpected interactions. It is this individual level of interaction analysis that makes the multifacet Rasch approach useful for the evaluation of rater training. Although interactions can be modeled using G-theory, G-theory does not provide information about the interactions of individual raters and ratees. If it were acceptable to the parties involved, raters’ total scores for specified crews could be adjusted based upon the results of the bias analysis. For the actual evaluations given to aircrews following training, such corrections could be made based upon a rater’s estimated severity. In multifacet Rasch parlance, this would result in a more “objective” assessment.
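
The severity-based correction described above can be illustrated with a minimal Python sketch. It assumes a rating-scale form of the model with a single crew and rater (no item facet), and all numeric values (thresholds, crew proficiency, rater severities) are hypothetical; it is not the procedure used in this study. The idea is to compare the rating expected from a severe rater with the rating expected from a rater of average severity (set to 0 logits here).

```python
import math

def category_probabilities(ability, severity, thresholds):
    """Probabilities of categories 0..len(thresholds) under a rating-scale Rasch model."""
    # Log-numerator for category k is the running sum of (ability - severity - tau)
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + ability - severity - tau)
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(ability, severity, thresholds):
    """Model-expected rating: sum over categories of category value times probability."""
    probs = category_probabilities(ability, severity, thresholds)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical values in logits, chosen only for illustration
thresholds = [-1.0, 0.0, 1.0]   # a four-category rating scale (0-3)
crew_ability = 0.5
severe_rater = 1.0              # harsher than average
average_rater = 0.0             # reference severity

rating_from_severe_rater = expected_rating(crew_ability, severe_rater, thresholds)
severity_adjusted_rating = expected_rating(crew_ability, average_rater, thresholds)

print(f"expected rating from severe rater: {rating_from_severe_rater:.2f}")
print(f"severity-adjusted rating:          {severity_adjusted_rating:.2f}")
```

Because the expected rating rises as proficiency minus severity rises, the adjusted rating is higher than the rating expected from the severe rater. This is the sense in which removing rater severity yields a more “objective” assessment of the crew.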