Why quality is crucial in language tests
Every day, recipients of test results base far-reaching decisions on them: decisions that shape careers, educational paths, and professional prospects. Because these high-stakes decisions profoundly affect people's lives, they must be sound, fair, and empirically robust.
Our language certificates provide you with more than just a snapshot of a candidate's performance. They give you an evidence-based foundation for confident decisions. We understand quality not as an abstract seal of approval, but as a scientifically grounded process. Based on the international standard ISO 29992 and the standards of psychometric research, we ensure that our tests deliver exactly what you expect: valid data for your specific decision-making purpose.
1. The purpose of the test: Precision instead of one-size-fits-all solutions
A core principle of modern aptitude assessment is that no test is universally valid. Validity depends on how the results are interpreted and used. A test result is only valuable if the testing procedure is precisely tailored to the decision-making situation.
Our approach: Each exam format has a clearly defined purpose, and we determine which language skills are relevant to that purpose.
Your advantage: You will not receive a general assessment, but a precise, domain-specific statement about whether the tested person can meet the requirements of your field.
2. Validity: The foundation of your decision
The most important measure of a test's quality is its validity. From a scientific perspective, this means: How well do empirical evidence and theoretical foundations support the correctness of the conclusions you draw from the test results?
We work according to the principle of argument-based validation, i.e., we demonstrate quality through a chain of evidence-based arguments:
Content validity: Our tasks represent the relevant language skills without gaps (avoiding “construct underrepresentation”) and do not require irrelevant knowledge.
Criterion validity: Test results demonstrably correlate with actual language proficiency in practice.
Construct validity: Our tasks measure the relevant language skills specifically, rather than unrelated abilities that would distort the result (“construct-irrelevant variance”).
3. Standards for quality assurance: ISO 29992 and DeuZert
To make quality verifiable, we adhere to the international standard ISO 29992:2018. This standard covers the entire testing lifecycle: from item creation through administration and evaluation to the archiving of results.
In 2018, the ISO standards committee ISO/TC 232 "Education and Learning Services" published ISO 29992:2018, a standard aimed at organizations that provide learning services, as well as those that select, use, or develop assessments. The international standard has been adopted as a national standard by the German Institute for Standardization (DIN) and is published under the title DIN ISO 29992. The standard addresses the planning, development, administration, evaluation, and quality control of all types of assessments of learning outcomes and explicitly includes the assessment of language skills. When implementing ISO 29992, test providers must ensure that the assessments they develop meet key quality requirements, particularly regarding validity, reliability, objectivity, fairness, and transparency. Notably, the standard emphasizes the use and consequences of the test results.
For independent quality assurance, we collaborate with DeuZert Deutsche Zertifizierung in Bildung und Wirtschaft GmbH. DeuZert audits our processes annually according to ISO/IEC 17065, confirms compliance with ISO 29992, and awards the "DeuZert – ISO 29992" quality seal.
4. From test result to responsible decision
A test result is more than just a number: it is the basis for decisions about people. We meet this responsibility through a rigorous, scientifically sound quality process:
Objectivity & Standardization: All procedures are standardized, and raters are regularly trained and calibrated. Inter-rater reliability is continuously monitored, so that results remain comparable regardless of the rater, the location, or the rater's form on a given day.
Fairness & Bias Control: Our tasks are statistically tested for construct-irrelevant variance. Using methods such as paired t-tests, many-facet Rasch measurement (MFRM), and differential item functioning (DIF) analyses, we ensure that results reflect only language proficiency, independent of cultural background, gender, or world knowledge.
Reliability & Consistency: We check the stability of the results over time (test-retest reliability), the comparability of different test versions (parallel-forms reliability), and the internal consistency of the tasks (Cronbach's alpha).
Validity & Significance: The results are systematically analyzed to confirm that they measure the intended competencies (construct validity, factor analyses) and allow reliable predictions about real requirements (criterion validity, extrapolation validity).
Item quality: Each task is tested for difficulty, discrimination, and the effectiveness of distractors, so that only valid and differentiating items are included in the evaluation.
Scientifically based results: Our certificates provide you with the result of a scientifically sound testing procedure. You can rely on the results being objective, reliable, and valid.
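To make the item-level statistics named above more concrete, here is a minimal Python sketch of three of them: Cronbach's alpha (internal consistency), item difficulty (share of correct answers per item), and item discrimination (here computed as the corrected item-total correlation). It is an illustration under stated assumptions, not a description of our production analysis: the function names and the sample responses are hypothetical, items are assumed to be scored 0 or 1, and the choice of the corrected item-total correlation as the discrimination index is one common option among several.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (one row per test taker, one column per item)."""
    k = len(scores[0])                       # number of items
    item_columns = list(zip(*scores))        # transpose: one tuple per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def item_difficulty(scores):
    """Proportion of test takers who solved each item (assumes 0/1 scoring)."""
    return [mean(col) for col in zip(*scores)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def item_discrimination(scores):
    """Corrected item-total correlation: each item vs. the total of all other items."""
    totals = [sum(row) for row in scores]
    result = []
    for col in zip(*scores):
        rest = [t - c for t, c in zip(totals, col)]  # total score without this item
        result.append(pearson(col, rest))
    return result


if __name__ == "__main__":
    # Hypothetical responses: 6 test takers x 4 items, 1 = correct, 0 = incorrect.
    responses = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
    ]
    print("alpha:", round(cronbach_alpha(responses), 3))
    print("difficulty:", [round(p, 2) for p in item_difficulty(responses)])
    print("discrimination:", [round(r, 2) for r in item_discrimination(responses)])
```

In practice, such figures are computed on far larger samples, and items with very low discrimination or extreme difficulty are the ones flagged for review.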
5. Common misconceptions: What the CEFR can do and what it can't
The CEFR (Common European Framework of Reference for Languages) is a Council of Europe instrument for describing language skills, not a "language police" or accreditation authority. There is no central body that "officially approves" test providers for the CEFR.
Quality is not demonstrated by simply mentioning the CEFR, but by verifiable alignment. We document that our tasks accurately reflect the "can-do" descriptors of levels A1–C1 and assess the required competencies.
Licensing vs. quality assurance: Licensing models organize testing centers, but do not replace independent quality assurance.
Educational institutions: Even state-approved institutions do not automatically possess the competence to independently develop or evaluate examinations. Quality is based on scientific standards and proven procedures.
