Evaluating Research Reports on the Qualities of Tests of English Language Skills in Indonesian Schools: A Systematic Review

Bibliographic Details
Title: Evaluating Research Reports on the Qualities of Tests of English Language Skills in Indonesian Schools: A Systematic Review
Language: English
Authors: Patrisius Istiarto Djiwandono, Daniel Ginting
Source: Language Education & Assessment. 2025 8(1).
Availability: Castledown Publishers. Ground Level, 470 St Kilda Road, Melbourne, 3004, Australia. Tel: 646-520-0676; e-mail: contact@castledown.com; Web site: https://www.castledown.com/journals/lea/
Peer Reviewed: Y
Page Count: 18
Publication Date: 2025
Document Type: Journal Articles; Information Analyses
Descriptors: Foreign Countries, Language Tests, English (Second Language), Second Language Learning, Second Language Instruction, Databases, Research Reports, Construct Validity, Content Validity, Test Validity, Periodicals, Psychometrics, Test Items, Item Analysis, Test Reliability, Faculty Development, Testing, Specialists, Test Construction, Multiple Choice Tests, Testing Problems, Formative Evaluation, Summative Evaluation, Writing Tests
Geographic Terms: Indonesia
ISSN: 2209-3591
Abstract: The teaching of English as a foreign language in Indonesia has a long history, and it is always important to ask whether the assessment of students' language skills has been valid and reliable. A screening of articles in several prominent databases reveals that a number of evaluation studies have been conducted by Indonesian scholars in the last 14 years. This paper reports a systematic review that critiques those evaluation studies to assess the soundness of their methods and results. The PRISMA framework was used to screen a large number of articles from the databases, yielding 14 research papers published in various journals. The findings indicate that most of the studies focused on the analysis of items in multiple-choice tests and on the content validity, reliability, and construct validity of those tests. Further scrutiny revealed that many of these studies lacked methodological rigor, including the absence of expert judgment in content validation, limited application of psychometric frameworks such as Aiken's V formula, and insufficient procedures for construct validation. While the measurement of item difficulty, item discriminatory power, and distractor efficiency was relatively adequate, the approaches to determining the content validity, construct validity, and reliability of the tests remained overly subjective and inconsistent. These findings highlight the need for improvements in language test research practices in Indonesia, including structured training for teachers in language assessment, the adoption of psychometric-based validation methods, and the systematic involvement of expert judgment in test development processes.
Abstractor: As Provided
Entry Date: 2025
Accession Number: EJ1481093
Database: ERIC
ERIC Full Text: https://eric.ed.gov/contentdelivery/servlet/ERICServlet?accno=EJ1481093
Permalink: https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1481093