2.5.2 reliability

If test validity is defined as accuracy of measurement, test reliability concerns consistency of measurement. A reliable test score is consistent across different characteristics of the testing situation. Unless test scores are relatively consistent, they cannot give any information about the ability measured. Another aspect of overall test reliability is rater reliability. Raters must maintain consistency in their own marking standards; this kind of reliability is called intra-marker reliability (Underhill, 1987). In addition, the same work marked by different raters should produce similar results, which Underhill (1987) calls inter-marker reliability. If some raters rate more severely than others, the ratings are not consistent across raters, and the resulting scores cannot be considered reliable. Oral tests call for subjective judgement on the part of the marker, so the scores awarded in an oral test cannot always be assumed to be highly reliable.
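
As an illustration of the inter-marker consistency described above, the following minimal sketch compares two raters' scores for the same set of oral performances. The scores, the function names, and the choice of Pearson correlation plus a mean score difference as indicators are assumptions made here for illustration; they are not taken from Underhill (1987) or from any particular testing programme.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a rater gives constant scores")
    return cov / sqrt(var_x * var_y)


def mean_severity_gap(xs, ys):
    """Average of (first rater's score - second rater's score) per candidate."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty score lists of equal length")
    return sum(x - y for x, y in zip(xs, ys)) / len(xs)


# Hypothetical scores awarded by two raters to the same six candidates.
rater_a = [6, 7, 5, 8, 4, 7]
rater_b = [5, 6, 4, 7, 3, 6]

print("correlation:", round(pearson(rater_a, rater_b), 3))
print("mean gap (A - B):", mean_severity_gap(rater_a, rater_b))
```

In these made-up data the two raters rank the candidates identically, so the correlation is at its maximum, yet rater B is one point more severe on every candidate. This is why a check on rater severity is useful alongside a correlation when judging whether scores from different raters are comparable.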

It is also necessary to recognize that inconsistencies cannot be eliminated entirely. Nevertheless, it is possible to minimize the effects of those potential sources of inconsistency that are under the test developer's control through test design (Bachman and Palmer, 1996). Among the factors affecting test performance, the characteristics of the test tasks are at least partly under control. In language test design and development, it is therefore possible to minimize variations in the test task characteristics that do not correspond to variations in target language tasks.

Test administration, which also bears on reliability, has not been given proper attention at some universities at present. Administering a test involves exam invigilators and such test conditions as classrooms, equipment, materials, exam rules, and procedures for dealing with cheating by test takers.

2.5.3 practicality

Test practicality pertains to ‘the ways in which a test will be implemented, and, to a large degree, whether it will be developed and used at all’ (Bachman and Palmer, 1996, p. 35). It concerns practical matters such as the time, human resources, and material resources available for constructing a test, administering it, marking it, and interpreting the results. If the resources required for implementing a test exceed the resources available, the test will be impractical. Human resources are a crucial component of test construction and administration, involving such individuals as test writers, scorers or raters, and test administrators, as well as clerical and technical support personnel. In practice, not all institutions have sufficient staff to fill all these well-defined roles, and one person may be in charge of several functions. Test writers, the key personnel in the process of test development, are involved not only in writing tests but also in collecting materials, editing, and recording. Material resources include space (the number of classrooms and language labs needed), equipment (typewriters, computers, cassette players, overhead projectors), and test materials (test booklets, answer sheets, audiotapes). Time comprises overall test development time and the time required to complete each part of every stage of the test development process.
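
The condition stated above, that a test is impractical when the required resources exceed the available ones, can be written compactly as a ratio. The symbols below are introduced here purely for illustration and assume that time, human resources, and material resources can be expressed on a common scale:

```latex
\[
P = \frac{R_{\text{available}}}{R_{\text{required}}},
\qquad
\text{the test is practical if } P \ge 1 \text{ and impractical if } P < 1 .
\]
```

Read this way, practicality is not a fixed property of a test: the same test may be practical for a well-staffed institution and impractical for one where a single person must fill several roles.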

