European Journal of Educational Research

EU-JER is a peer-reviewed, online academic research journal.

Publisher (HQ)

Eurasian Society of Educational Research
7321 Parkway Drive South, Hanover, MD 21076, USA
Implementation of the Omega (ω) Index to Detect Large-Scale Systematic Cheating

Alvin Vista

Cheating detection is an important issue in standardized testing, especially in large-scale settings. Statistical approaches are often computationally intensive and require specialized software. We present a two-stage approach that quickly filters suspected groups using statistical testing on an IRT-based answer-copying index. We also present an approach to mitigate data contamination and improve the performance of the index. The computation of the index was implemented through a modified version of an open-source R package, thus enabling wider access to the method. Using data from PIRLS 2011 (N=64,232), we conduct a simulation to demonstrate our approach. Type I error was well controlled and no control group was falsely flagged for cheating, while 16 (combined n=12,569) of the 18 (combined n=14,149) simulated groups were detected. Implications for system-level cheating detection and further improvements of the approach are discussed.

Keywords: Answer-copying indices, item response theory, PIRLS, cheating detection, standardized testing, test integrity.
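
For readers unfamiliar with the index named in the title, the following is a minimal sketch of how an ω statistic for one copier-source pair is commonly computed, following the general definition in Wollack (1997): the observed number of matching answers is compared with the number expected under the response model, scaled by its standard deviation. It is not the article's implementation and does not reproduce the two-stage group filtering. The function name `omega_index`, the toy answer strings, and the flat match probabilities of 0.25 are assumptions made purely for illustration; in practice the match probabilities would come from a fitted nominal response model evaluated at the suspected copier's ability estimate.

```python
import math


def omega_index(copier, source, match_probs):
    """Return (omega, one-sided p-value) for a single copier-source pair.

    match_probs[i] is the model-implied probability that the copier would
    select the source's answer on item i without copying. Here these values
    are supplied by the caller (an assumption of this sketch).
    """
    if not (len(copier) == len(source) == len(match_probs)):
        raise ValueError("all inputs must cover the same items")

    # Observed number of items on which the two answer strings agree.
    observed = sum(1 for c, s in zip(copier, source) if c == s)

    # Expected matches and their variance, treating items as independent.
    expected = sum(match_probs)
    variance = sum(p * (1.0 - p) for p in match_probs)
    if variance <= 0.0:
        raise ValueError("match probabilities must not all be 0 or 1")

    omega = (observed - expected) / math.sqrt(variance)

    # Upper-tail probability under a standard normal approximation.
    p_value = 0.5 * math.erfc(omega / math.sqrt(2.0))
    return omega, p_value


# Toy data (hypothetical): ten four-option items, nine identical answers.
copier = list("ABCDABCDAB")
source = list("ABCDABCDAC")
match_probs = [0.25] * 10

omega, p_value = omega_index(copier, source, match_probs)
print(f"omega = {omega:.3f}, p = {p_value:.2e}")
```

A large positive ω with a small p-value indicates more agreement than the model predicts for that pair. How pair-level results are aggregated to flag whole groups is specific to the article and is not sketched here.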

...