 The Impact of Parameter Estimation Error on Test Equating with Polytomous Items

Psychological Exploration (心理学探新) [ISSN: 1003-5184 / CN: 36-1228/B]

Issue:
 2024, No. 06
Pages:
 550-558, 565
Column:
 
Publication date:
 2024-12-20

Article Info

Title:
 The Impact of Parameter Estimation Error on IRT Linking Methods with Polytomous Items
Article ID:
1003-5184(2024)06-0550-09
Author(s):
 Wang Shaojie (王少杰)1, Zhang Minqiang (张敏强)2, Huang Feifei (黄菲菲)3, Liu Ying (刘颖)4
 (1. School of Education, Guangdong University of Education, Guangzhou 510303; 2. School of Psychology, South China Normal University, Guangzhou 510631; 3. School of Educational Science, Guangdong Polytechnic Normal University, Guangzhou 510665; 4. School of Teacher Education, Guangdong University of Education, Guangzhou 510303)
Keywords:
 parameter estimation error; polytomous items; IRT linking; information weighting; characteristic curve methods
Classification number:
 B841.2
DOI:
 -
Document code:
 A
Abstract:
 Information-weighted characteristic curve methods have performed very well in IRT linking with dichotomous items. However, few studies have examined the effect of parameter estimation error on test linking. This paper extends the information-weighted characteristic curve methods to IRT linking with polytomous items and uses simulation studies to examine how parameter estimation error, differences in examinee ability, and test length affect linking. Linking performance was evaluated with indices based on characteristic curves and on error. The results indicated that the test-information-weighted characteristic curve method performed slightly better than the traditional characteristic curve methods, while the other new methods performed on par with the traditional methods. Linking performance improved as parameter estimation error and ability differences decreased and as the number of items increased. The bias-variance tradeoff observed here suggests a new direction for test linking and equating.

Memo

 Funding: Guangzhou Municipal Philosophy and Social Sciences Co-construction Project (2023GZGJ169).
Corresponding author: Wang Shaojie, E-mail: wang021112@126.com.
Last Update: 2024-12-20