
A computer-adaptive test (CAT) is a method for administering tests that adapts to the examinee's ability level. For this reason, it has also been called tailored testing.

How CAT works


CAT successively selects questions so as to maximize the precision of the exam, based on what is known about the examinee from previous questions. From the examinee's perspective, the difficulty of the exam seems to tailor itself to their level of ability. For example, if an examinee performs well on an item of intermediate difficulty, they will then be presented with a more difficult question; if they perform poorly, they will be presented with an easier one. Compared to static multiple-choice tests, computer-adaptive tests require fewer test items to arrive at equally accurate scores.

The basic computer-adaptive testing method is an iterative algorithm with the following steps:

  1. The pool of available items is searched for the optimal item, based on the examinee's current ability estimate
  2. The chosen item is presented to the examinee, who then answers it correctly or incorrectly
  3. The ability estimate is updated, based upon all prior answers
  4. Steps 1–3 are repeated until a termination condition is met
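The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any particular testing program: it assumes a Rasch (one-parameter) response model, a grid-based expected a posteriori (EAP) ability estimate with a standard normal prior, and a termination rule based on test length or standard error. The names `run_cat`, `estimate_ability`, `max_items`, and `se_target` are hypothetical.

```python
import math

# Candidate ability values used for the grid-based estimate (an assumption).
GRID = [-4.0 + 0.1 * k for k in range(81)]


def p_correct(theta, b):
    """Rasch probability that an examinee of ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses):
    """Return (EAP estimate, posterior SD) from a list of (difficulty, correct) pairs."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior, unnormalized
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(pool, answer, max_items=20, se_target=0.3):
    """Administer items from `pool` (a list of difficulties) until a stopping rule is met."""
    remaining = list(pool)
    responses = []
    theta, se = estimate_ability(responses)  # prior only: starts at medium ability
    while remaining and len(responses) < max_items and se > se_target:
        # Step 1: under the Rasch model the most informative item is the one
        # whose difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        # Step 2: present the item and record a correct/incorrect answer.
        correct = answer(b)
        responses.append((b, correct))
        # Step 3: update the ability estimate from all answers so far.
        theta, se = estimate_ability(responses)
    # Step 4 is the loop condition above.
    return theta, se, responses


# Hypothetical pool of 25 items with difficulties from -3 to 3, and a
# deterministic stand-in examinee who answers correctly below difficulty 1.0.
pool = [-3.0 + 0.25 * k for k in range(25)]
theta, se, responses = run_cat(pool, answer=lambda b: b < 1.0)
print(round(theta, 2), round(se, 2), len(responses))
```

A real examinee answers probabilistically rather than by a fixed cutoff; the deterministic stand-in is used here only so that the run is repeatable.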

Nothing is known about the examinee prior to the administration of the first item, so the algorithm is generally started by selecting an item of medium, or medium-easy, difficulty as the first item.

As a result of adaptive administration, different examinees receive quite different tests. The psychometric technology that allows equitable scores to be computed across different sets of items is item response theory (IRT). IRT is also the preferred methodology for selecting optimal items, which are typically chosen on the basis of the information they provide at the examinee's estimated ability rather than on difficulty per se.
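To illustrate why information and difficulty can point to different items, the sketch below uses the two-parameter logistic (2PL) IRT model, in which an item's Fisher information at ability theta is a^2 * P * (1 - P). The choice of the 2PL model and the two example items are assumptions made for illustration; the article does not say which IRT model a given test uses.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def most_informative(theta, items):
    """Pick the item with the highest information at the current ability estimate."""
    return max(items, key=lambda item: item_information(theta, item["a"], item["b"]))


# Two hypothetical items for an examinee whose estimated ability is 0.0.
items = [
    {"id": "A", "a": 0.5, "b": 0.0},  # difficulty matches ability, weak discrimination
    {"id": "B", "a": 2.0, "b": 0.5},  # somewhat harder, strong discrimination
]
best = most_informative(0.0, items)
print(best["id"])  # prints "B"
```

Item A matches the examinee's ability exactly, yet item B is selected because its higher discrimination yields more information at that ability level.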

The TOEFL and the GRE General Test are currently administered primarily as computer-adaptive tests. This methodology is also used in the Uniform Certified Public Accountant Examination.

Advantages


Adaptive tests can provide uniformly precise scores for most test-takers. In contrast, standard fixed tests almost always provide the best precision for test-takers of medium ability and increasingly poorer precision for test-takers with more extreme test scores.
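The difference in precision can be made concrete with a small calculation. The sketch below assumes a Rasch model, where the standard error of an ability estimate is roughly one over the square root of the summed item information. It compares a hypothetical 20-item fixed test centered on medium difficulty with an idealized adaptive test whose 20 items all sit exactly at the examinee's ability. Real adaptive tests only approximate that ideal, so the figures are illustrative.

```python
import math


def rasch_information(theta, b):
    """Information of a Rasch item of difficulty b at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)


def standard_error(theta, difficulties):
    """Approximate standard error of the ability estimate at theta for a set of items."""
    return 1.0 / math.sqrt(sum(rasch_information(theta, b) for b in difficulties))


# Hypothetical fixed form: 20 items with difficulties from -0.95 to 0.95.
FIXED_TEST = [-0.95 + 0.1 * k for k in range(20)]

for theta in (0.0, 1.5, 3.0):
    fixed_se = standard_error(theta, FIXED_TEST)
    # Idealized adaptive test: every item is perfectly targeted at theta,
    # so each contributes the maximum information of 0.25 (SE is about 0.447).
    adaptive_se = standard_error(theta, [theta] * 20)
    print(theta, round(fixed_se, 3), round(adaptive_se, 3))
```

Under these assumptions the fixed form's standard error grows as ability moves away from the middle of the scale, while the idealized adaptive test's standard error stays constant.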

An adaptive test can typically be about 60% shorter than a fixed version and still maintain a higher level of precision. This translates into time savings for the test-taker, who does not waste time attempting items that are too hard or trivially easy.

Like any computer-based test, adaptive tests may show results immediately after testing.

Disadvantages


Review of past items is generally disallowed. Adaptive tests tend to administer easier items after a person answers incorrectly. Supposedly, an astute test-taker could use such clues to detect incorrect answers and correct them. Alternatively, test-takers could be coached to deliberately pick wrong answers, leading to an increasingly easier test. After tricking the adaptive test into building a maximally easy exam, they could then review the items and answer them correctly, possibly achieving a very high score. Test-takers frequently complain about the inability to review.

Although adaptive tests have exposure control algorithms to prevent overuse of a few items, the exposure rate conditional on ability is often not controlled and can easily become close to 1. That is, some items frequently appear on nearly every test given to people of the same ability. This is a serious security concern because groups sharing items may well have a similar functional ability level. In fact, a completely randomized exam is the most secure (but also the least efficient).
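The distinction between overall and conditional exposure can be shown with a small calculation over administration records. The record format, the ability bands, and the function names below are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def exposure_rates(logs):
    """Fraction of examinees who saw each item.

    `logs` is a list of (ability_band, set_of_item_ids), one entry per examinee.
    """
    counts = Counter()
    for _, items in logs:
        counts.update(items)
    return {item: count / len(logs) for item, count in counts.items()}


def conditional_exposure_rates(logs):
    """Exposure rates computed separately within each ability band."""
    by_band = defaultdict(list)
    for band, items in logs:
        by_band[band].append((band, items))
    return {band: exposure_rates(group) for band, group in by_band.items()}


# Hypothetical records: four examinees, two per ability band.
logs = [
    ("high", {"h1", "h2"}),
    ("high", {"h1", "h3"}),
    ("low", {"l1", "l2"}),
    ("low", {"l1", "l3"}),
]
overall = exposure_rates(logs)
conditional = conditional_exposure_rates(logs)
print(overall["h1"], conditional["high"]["h1"])  # prints "0.5 1.0"
```

Item `h1` is seen by only half of all examinees, which may look acceptable overall, yet every high-ability examinee sees it.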

In order to model the characteristics of the items (e.g., to pick the optimal item), all the items of the test must be pre-administered to a sizable sample and then analyzed. To achieve this, new items must be mixed into the operational items of an exam (the responses are recorded but do not contribute to the test-takers' scores). This presents significant logistical, ethical, and security issues. For example, it is impossible to field an operational adaptive test with brand-new, unseen items. Each program must also decide what percentage of the test can reasonably be composed of unscored pilot test items.


This article is licensed under the GNU Free Documentation License. It uses material from the "Computer-adaptive test".
