
Criterion validity: 2 Types & Examples


Criterion validity indicates the extent to which a test is correlated with an already-proven standard known as a criterion.

If the results of a measurement instrument match those of another accepted instrument, called a “gold standard”, then that instrument has criterion-related validity.

The criterion variable, or gold standard, measures the same construct or behavior. Establishing criterion validity is straightforward when a gold standard exists. For example, in medical research, test scores can be compared against clinical assessments.

However, in many cases there is no gold standard. For pain measurement, for instance, no gold standard exists, and the researcher has to rely on respondents’ self-reports. In such situations, criterion validity cannot be established.

The important point to note is that criterion validity is only as good as the gold standard or reference measure itself. If the gold standard or reference measure is biased, a comparison against it will yield misleading results: two biased measures can simply confirm one another, making a flawed test look valid even though its actual criterion validity is low.

So criterion validity alone does not guarantee that a tool is valid.

Types of Criterion Validity


The different types of criterion validity are explained as follows:

Predictive validity

Predictive validity is a way to demonstrate that a test score can predict future performance on another criterion. Good predictive validity is important when choosing measures for employment or educational purposes, because it increases the likelihood of selecting individuals who will perform well in those areas.

It is established by demonstrating that a measure correlates with an external criterion measured at a later point in time.

For example, by correlating the grades of students in the same program at the same university with their earlier A-level scores, a researcher could determine the predictive criterion validity of A-level assessments.

If there is a high correlation, predictive criterion validity would be high; if there is little or no correlation, predictive criterion validity would be low.

Concurrent validity

Concurrent criterion validity is established by demonstrating that a measure correlates with an external criterion that is measured simultaneously. For example, concurrent validity could be measured if scores on a math test correlate highly with scores on another math test administered simultaneously.

This approach is useful when the constructs being measured are similar but do not perfectly overlap. In this case, it is important to demonstrate that the measure under study explains variance in the criterion beyond what other measures of the same construct already predict.

This approach is also useful when the focus is on practical outcomes; a measure’s concurrent validity then needs to be demonstrated in relation to other measures of a similar outcome.

Concurrent validity is usually demonstrated through correlational analyses, although other approaches, such as regression, may also be used.

There are two ways to establish criterion validity. In the first, the researcher administers both the new measure and a validated measure to the same respondents and compares the results. In the second, the researcher administers the new measure to the respondents and compares the results with experts’ judgments.

 

Measurement of criterion validity

Criterion validity can be assessed in two ways.

  1. Concurrent validity: the new measure is statistically tested against an independent criterion or standard measured at the same time.
  2. Predictive validity: the new measure is statistically tested against future performance on the criterion.

It is important to note that the new measure to be validated should correlate with a well-established measure of the construct under study. This well-established measure is called the criterion variable.

This criterion variable will be used to find the correlations with the test’s scores. For this purpose, a correlation coefficient like Pearson’s r will be used. This “r” indicates the strength of the correlation between the two variables. Its values lie between -1 and +1. The different values of “r” show the degree of correlation between the two variables in the following situations:

  • r = -1 indicates a perfect negative correlation
  • r = 0 indicates no correlation
  • r = +1 indicates a perfect positive correlation

This correlation coefficient can be directly calculated through Excel, R, SPSS, or any other software.
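As an illustration, Pearson’s r can be computed in a few lines of Python. The scores below are made-up numbers used only to show the calculation, not data from any real study:

```python
import numpy as np

# Hypothetical scores: six respondents take a new test and an
# established criterion measure (illustrative data only).
new_test = [12, 15, 9, 20, 17, 11]
criterion = [30, 38, 25, 49, 42, 28]

# Pearson's r is the off-diagonal entry of the 2x2 correlation matrix.
r = np.corrcoef(new_test, criterion)[0, 1]
print(f"r = {r:.2f}")  # a value near +1 signals a strong positive correlation
```

In R the equivalent is `cor(new_test, criterion)`, and SPSS offers the same statistic through its Bivariate Correlations dialog.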

A strong positive correlation indicates a valid test, while a zero or negative correlation between the measure and the criterion variable indicates the test is not valid and that the measure and the criterion do not capture the same concept.

An example of measuring criterion validity:

Imagine that a researcher is interested in developing a new test for measuring self-efficacy. To establish criterion validity, the new test is compared with a criterion variable. Both tests, the new test and the criterion measure, are given to the same sample of respondents, and a correlation is then computed between the two sets of scores. A high correlation indicates that the new test is valid.
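A sketch of this workflow in Python, using simulated respondents (the data and the `pearson_r` helper are illustrative assumptions, not part of any named library):

```python
import random

random.seed(42)  # reproducible illustration

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Simulated concurrent study: 100 respondents take both the new
# self-efficacy test and the validated criterion measure at the same time.
true_efficacy = [random.gauss(50, 10) for _ in range(100)]
new_test = [t + random.gauss(0, 5) for t in true_efficacy]   # new measure + noise
criterion = [t + random.gauss(0, 5) for t in true_efficacy]  # criterion + noise

r = pearson_r(new_test, criterion)
print(f"r = {r:.2f}")
```

A high value of r would support the validity of the new test; what counts as “high” is a judgment call that varies by field.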

Examples of Criterion-Related Validity

The following examples will clarify the concept of criterion validity.

Intelligence tests (including emotional intelligence)

Intelligence tests have criterion validity when they can correctly identify individuals who will succeed or excel in a particular area. For example, the Stanford-Binet Intelligence Scale is often used to identify students who may need special education services.

Emotional intelligence tests, on the other hand, can be useful predictors of job performance in customer service or management positions, or in any context where people are expected to work successfully in teams.

Intelligence tests often assess criterion-related validity in comparison to a known standard. For example, a new intelligence test might be validated against the Stanford-Binet Intelligence Scale, or an emotional intelligence test might be validated against a measure of job performance.

If the new test is found to be a good predictor of the criterion, it can be considered to have criterion validity. When the criterion is measured at the same time as the new test, this is an example of concurrent criterion validity.

Job applicant tests

Other examples of criterion-related validity include measures of physical fitness related to on-the-job safety and measures of memory or knowledge related to academic performance.

As with intelligence tests, the best prediction of job performance is usually obtained when using a combination of different types of measures.

In general, criterion-related validity is strongest when the criterion (what you are trying to predict) is objective and quantifiable, such as test scores or sales figures (Schmidt, 2012).

Psychiatric diagnosis

Psychiatric diagnosis is the process of classifying individuals with psychological disorders using both clinical assessments and symptomatology.

The most common methods used to diagnose psychological disorders are the Diagnostic and Statistical Manual of Mental Disorders (DSM) and the International Classification of Diseases (ICD).

Here, assessments of criterion validity can range from evaluating the diagnostic criteria themselves to evaluating the external measures used to confirm a diagnosis. However, according to many researchers, the DSM has failed to achieve its goal of validity.
