catIRT tools: A “Shiny” Application for Item Response Theory Calibration and Computerized Adaptive Testing Simulation


Authors

  • E. C. Aybek, Faculty of Education, Pamukkale University

Keywords:

Computerized Adaptive Testing, Item Response Theory, Post-Hoc Simulation, Shiny Application

Abstract

This study introduces catIRT tools, a Shiny application that facilitates researchers’ Item Response Theory (IRT) and Computerized Adaptive Testing (CAT) simulations. catIRT tools provides an interface to the mirt and catR packages through the shiny package in R. With this interface, researchers can run IRT calibrations and CAT simulations even if they have no coding skills. Dichotomous and polytomous IRT models are supported in IRT calibration, and Yen’s Q3 statistic is calculated to evaluate local independence. In CAT simulation, researchers can use their own item parameters and responses, or generate items or responses within the application. In addition to several item selection and ability estimation methods, researchers can also choose the specific stopping rule to be used.
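The workflow the abstract describes — calibrating an item bank with mirt, checking local independence with Yen’s Q3, and then running a post-hoc CAT simulation with catR — can be sketched directly in R. The code below is a minimal illustration of that pipeline, not part of catIRT tools itself; the simulated data set and all option values (2PL model, MFI item selection, EAP scoring, fixed-length stopping rule) are assumptions chosen for the example.

```r
# Minimal sketch of the pipeline catIRT tools wraps:
# calibrate with mirt, then simulate a CAT with catR.
library(mirt)
library(catR)

set.seed(42)

# Simulated dichotomous responses: 500 examinees, 20 items
a     <- rlnorm(20, 0.2, 0.2)   # discrimination parameters
b     <- rnorm(20)              # difficulty parameters
theta <- rnorm(500)             # true abilities
p     <- plogis(sweep(outer(theta, b, "-"), 2, a, "*"))
resp  <- (matrix(runif(500 * 20), 500, 20) < p) * 1

# IRT calibration (2PL) and Yen's Q3 for local independence
mod  <- mirt(resp, 1, itemtype = "2PL", verbose = FALSE)
q3   <- residuals(mod, type = "Q3")
pars <- coef(mod, IRTpars = TRUE, simplify = TRUE)$items

# Item bank for catR: columns a, b, c (guessing), d (upper asymptote)
bank <- cbind(pars[, "a"], pars[, "b"], 0, 1)

# Post-hoc CAT for one examinee: MFI item selection, EAP scoring,
# stopping after 15 items
cat1 <- randomCAT(trueTheta = 0.5, itemBank = bank,
                  test  = list(method = "EAP", itemSelect = "MFI"),
                  stop  = list(rule = "length", thr = 15),
                  final = list(method = "EAP"))
cat1$thFinal   # final ability estimate
```

A precision-based stopping rule (e.g. stop when the standard error falls below a threshold) can be substituted in the `stop` list; catIRT tools exposes these same choices through its graphical interface.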


Published

2021-03-23

How to Cite

Aybek, E. C. (2021). catIRT tools: A “Shiny” Application for Item Response Theory Calibration and Computerized Adaptive Testing Simulation. Journal of Applied Testing Technology, 22(1), 23–27. Retrieved from http://jattjournal.net/index.php/atp/article/view/155939

Section

Articles

References

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561-573. https://doi.org/10.1007/BF02293814.

Babcock, B. and Weiss, D. (2012). Termination criteria in computerized adaptive tests: Do variable-length CATs provide efficient and effective measurement? Journal of Computerized Adaptive Testing, 1(1), 1-18. https://doi.org/10.7333/1212-0101001.

Barton, M. A. and Lord, F. M. (1981). An upper asymptote for the three-parameter logistic item-response model. ETS Research Report Series, 1981(1), i-8. https://doi.org/10.1002/j.2333-8504.1981.tb01255.x.

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37(1), 29-51. https://doi.org/10.1007/BF02291411.

Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1-29. https://doi.org/10.18637/jss.v048.i06.

Chalmers, R. P. (2016). Generating adaptive and non-adaptive test interfaces for multidimensional item response theory applications. Journal of Statistical Software, 71(5), 1-39. https://doi.org/10.18637/jss.v071.i05.

Chang, W., Cheng, J., Allaire, J. J., Xie, Y. and McPherson, J. (2018). shiny: Web application framework for R. R package version 1.2.0. https://CRAN.R-project.org/package=shiny.

Chen, S. K., Hou, L. and Dodd, B. G. (1998). A comparison of maximum likelihood estimation and expected a posteriori estimation in CAT using the partial credit model. Educational and Psychological Measurement, 58(4), 569-595. https://doi.org/10.1177/0013164498058004002.

Chen, S. Y., Ankenmann, R. D. and Chang, H. -H. (2000). A comparison of item selection rules at the early stages of computerized adaptive testing. Applied Psychological Measurement, 24(3), 241-255. https://doi.org/10.1177/01466210022031705.

Choi, S. W. (2009). Firestar: Computerized adaptive testing simulation program for polytomous item response theory models. Applied Psychological Measurement, 33(8), 644-645. https://doi.org/10.1177/0146621608329892.

Choi, S. W. and Swartz, R. J. (2009). Comparison of CAT item selection criteria for polytomous items. Applied Psychological Measurement, 33(6), 419-440. https://doi.org/10.1177/0146621608327801. PMid: 20011456, PMCid: PMC2791416.

Davey, T. (2011). A guide to computer adaptive testing systems. Council of Chief State School Officers.

De Ayala, R. J. (2009). The Theory and Practice of Item Response Theory. New York: The Guilford Press.

Eggen, T. J. H. M. (1999). Item selection in Adaptive Testing with the sequential probability ratio test. Applied Psychological Measurement, 23(3), 249-261. https://doi.org/10.1177/01466219922031365.

Hambleton, R. K., Swaminathan, H. and Rogers, H. J. (1991). Fundamentals of Item Response Theory. Newbury Park, CA: Sage Publications.

Han, K. T. (2007). WinGen: Windows software that generates IRT parameters and item responses. Applied Psychological Measurement, 31(5), 457-459. https://doi.org/10.1177/0146621607299271.

Kosinski, M. and Rust, J. (2011). The development of Concerto: An open-source online adaptive testing platform. Paper presented at the International Association for Computerized Adaptive Testing, Pacific Grove, CA.

Linacre, J. (2000). Computer-Adaptive Testing: A Methodology whose Time has Come. In: S. Chae, U. Kang, E. Jeon, & J. Linacre (Eds.), Development of Computerized Middle School Achievement Tests. Seoul, Korea: Komesa Press.

Lorenzo-Seva, U. and Ferrando, P. J. (2013). FACTOR 9.2 a comprehensive program for fitting exploratory and semiconfirmatory factor analysis and IRT models. Applied Psychological Measurement, 37(6), 497-498. https://doi.org/10.1177/0146621613487794.

Magis, D. and Barrada, J. R. (2017). Computerized adaptive testing with R: Recent updates of the package catR. Journal of Statistical Software, Code Snippets, 76(1), 1-19. https://doi.org/10.18637/jss.v076.c01.

Magis, D. and Raiche, G. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48(8), 1-31. https://doi.org/10.18637/jss.v048.i08.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174. https://doi.org/10.1007/BF02296272.

Meyer, J. P. (2014). Applied Measurement with jMetrik. Routledge. https://doi.org/10.4324/9780203115190.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. ETS Research Report Series, 1992(1), i-30. https://doi.org/10.1002/j.2333-8504.1992.tb01436.x.

R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses. Journal of Statistical Software, 17(5), 1-25. https://doi.org/10.18637/jss.v017.i05.

Samejima, F. (1996). Polychotomous responses and the test score. Tennessee: The University of Tennessee.

Van der Linden, W. J. (2017). Introduction. In: W. J. Van der Linden (Ed.), Handbook of Item Response Theory Volume One: Models. London: CRC Press. https://doi.org/10.1201/b19166.

Wang, T. and Vispoel, W. P. (1998). Properties of ability estimation methods in computerized adaptive testing. Journal of Educational Measurement, 35(2), 109-135. https://doi.org/10.1111/j.1745-3984.1998.tb00530.x.

Weiss, D. J. (2004). Computerized adaptive testing for effective and efficient measurement in counseling and education. Measurement and Evaluation in Counseling and Development, 37(2), 70. https://doi.org/10.1080/07481756.2004.11909751.

Weiss, D. J. and Kingsbury, G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21(4), 361-375. https://doi.org/10.1111/j.1745-3984.1984.tb01040.x.

Weiss, D. J. (Ed.). (1983). New Horizons in Testing: Latent Trait Test Theory and Computerized Adaptive Testing. New York: Academic Press.
