Medianta Tarigan, Fadillah Fadillah


Intelligence, an individual ability that plays a large role in everyday life, has been extensively studied and measured with psychological instruments. One of these is the Intelligenz Struktur Test (IST). However, the IST has by now been leaked through discussions published by many parties. Moreover, the adaptation of the IST into Indonesian, which tended to translate word for word, is suspected of introducing a bias of meaning that can affect the validity of the instrument. This study therefore aims to evaluate the current quality of the IST by testing the feasibility of the Indonesian-version items for verbal ability, namely the SE (Satzergaenzung), WA (Wortauswahl), and AN (Analogien) subtests. Item Response Theory (IRT) was used as the research method. Data were collected from 2,064 participants living in Bandung. The analysis showed that the SE, WA, and AN subtests are still valid. Of the 60 items analyzed, 43 (71.67%) had an acceptable estimate of the discrimination parameter (a). In addition, item-fit statistics showed that 78.33% of the items fit the IRT model. This indicates that the Indonesian version of the IST remains valid for use, particularly for measuring verbal comprehension (V) through the three subtests (SE, WA, and AN). However, the items flagged for differential item functioning (DIF) need to be revised: 25% of the items were found to show gender bias.
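As a rough illustration of the quantities reported above, the sketch below evaluates a two-parameter logistic (2PL) item characteristic curve and recomputes the reported percentages. The abstract does not say which IRT model was fitted, so the 2PL form is an assumption made here only because a discrimination parameter (a) is mentioned. The function names are hypothetical. The counts 47 and 15 are not stated in the abstract; they are back-calculated from the reported 78.33% and 25% on the assumption that both percentages refer to all 60 items.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response under an assumed 2PL model.

    theta: examinee ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to 2 decimals."""
    return round(100.0 * count / total, 2)


TOTAL_ITEMS = 60  # items analyzed across SE, WA, and AN (from the abstract)

# 43 of 60 items with acceptable discrimination is stated in the abstract.
acceptable_discrimination = percent(43, TOTAL_ITEMS)

# Back-calculated counts (assumption: percentages are out of all 60 items).
fitting_items = round(0.7833 * TOTAL_ITEMS)
dif_items = round(0.25 * TOTAL_ITEMS)

if __name__ == "__main__":
    print("acceptable discrimination (%):", acceptable_discrimination)
    print("items fitting the model:", fitting_items, percent(fitting_items, TOTAL_ITEMS))
    print("items flagged for gender DIF:", dif_items, percent(dif_items, TOTAL_ITEMS))
    # A more discriminating item separates abilities around b more sharply.
    for a in (0.5, 2.0):
        print("a =", a, [round(icc_2pl(t, a, 0.0), 3) for t in (-1.0, 0.0, 1.0)])
```

Under this assumed model, an item with a larger a gives a steeper curve around its difficulty b, which is why a low estimate of a marks an item as weak at distinguishing examinees of different ability.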


Intelligenz Struktur Test (IST); item response theory; verbal comprehension

Copyright (c) 2021 Jurnal Muara Ilmu Sosial, Humaniora, dan Seni
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

