This is an excerpt from Measurement and Evaluation in Human Performance 6th Edition With HKPropel Access by James R. Morrow, Jr., Dale P. Mood, Weimo Zhu & Minsoo Kang.
Recall that, in the Measurement and Evaluation Challenge, John was asked by the head coach to select an agility test for their team. As we learned in chapters 6 and 7, the first two things to consider when selecting a test should be validity (i.e., whether a test measures what it is supposed to measure) and reliability (i.e., whether the test is consistent). Though we ideally prefer a test with validity and reliability coefficients of at least .80, these selection criteria could vary somewhat based on the nature of the performance test. If participants can easily demonstrate their maximal effort in a test (e.g., 50-yard dash), coefficients of .80 should be expected. If, on the other hand, significant skills are required to perform a test (e.g., Harre circuit test), a relatively low coefficient (e.g., .70) might be considered acceptable. Other factors that could affect validity and reliability include sex, age, ability level, and familiarity with the test. The best way to learn about the validity and reliability of a test is to conduct a literature search. For example, by checking Kirby’s Guide to Fitness and Motor Performance Tests mentioned earlier, John could easily determine that the SEMO Agility Test had a good reliability of r = .88 between trials 1 and 2 but relatively low validity, as indicated by its correlations with the dodging run (r = .72), shuttle run (r = .63), and side-step test (r = .61).
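Test–retest reliability coefficients such as the r = .88 reported for the SEMO Agility Test are typically Pearson correlations between two trials of the same test. The following is a minimal sketch of that calculation in Python; the trial scores are hypothetical, not data from the SEMO study:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical agility times (seconds) for 8 players on two trials
trial1 = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3, 13.4, 11.7]
trial2 = [12.0, 11.6, 12.8, 12.7, 11.8, 12.5, 13.1, 11.9]

r = pearson_r(trial1, trial2)
print(f"Test-retest reliability: r = {r:.2f}")
```

A high coefficient indicates that players keep roughly the same rank order across trials, which is what test–retest reliability summarizes.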
Besides validity and reliability, other secondary features related to practicality should also be considered. For example, is the test easy to administer? Will the test require expensive equipment? Can the test be administered to a large group and therefore be completed in a short time period? Can the test be used as an exercise in daily physical education classes? Can the test be administered by the teacher? Can the test be easily and objectively scored? With all these considerations and information collected together, the most appropriate test can be selected.
After selecting a test, the next step is to plan the administration of the test, which itself involves several complex considerations. First, the participant must understand the test instructions and then be able to translate these instructions into the movements required by the test. Assume, for example, that John is going to run a pilot of the shuttle run test before recommending it to the coach. On the first attempt the participant makes several errors—perhaps he or she threw the block across the line instead of placing it on the floor or forgot the number of times he or she should change direction. Did these errors occur because he or she did not understand the instructions or because he or she was unable to follow them? Before actual testing, the participant should see a demonstration of the test, first in slow motion and then at full speed. Then he or she should be allowed to move through the test slowly while the test administrator gives cues on his or her movements. The test instructions can also be written on an index card or posted on a bulletin board. If the participant has difficulty with the instructions, a review of the written instructions might be helpful. Finally, the participant should have the opportunity to practice the test at maximum speed. Now he or she is ready to be tested. Remember that if the participant cannot perform the test properly, the trial must be thrown out and another taken. This is time consuming and inefficient. The time taken to prepare participants for the test is time well spent.
Preparation for Testing
When fitness tests are administered, testers expect the participant to perform at his or her maximum ability. If the participant has not received the preparation described previously, his or her true ability may not be measured—rather, the ability to perform a novel task would be measured, which is not the purpose of a fitness test. Although proper preparation for testing should be emphasized, it can be carried to extremes. Requiring the participants to practice a specific test day after day as part of the regular training program serves no purpose. If practicing the test increases the participants’ overall performance-related fitness, some practice may be justifiable. Sometimes, other means of improving this specific component of performance-related fitness should be incorporated into the training program. This is because no developed test can measure all the components of fitness. Furthermore, some components of fitness are likely to be highly task-specific. This means that performance on one test might not reflect how well (or poorly) one would perform on a different test even though both tests are meant to measure the same component. Yet it is still very popular in the physical education and training fields to select a single test as a measure of a component of fitness, assuming that this test measures one’s “general” ability in this component. This is not a fault of the test, but rather of the user’s erroneous assumption. Other testing preparations include wearing appropriate clothes and shoes, performing the tests on nonslippery surfaces, and preparing methods for recording scores ahead of time.
Because fitness testing is usually conducted in large groups, assistants are often required. Students, teachers, parents, or other members of the community can be recruited as assistants. They should be thoroughly familiarized with the tests they will administer, which will not occur if they are simply given the test instructions to read. The following is a good rule of thumb: Preparing properly for testing always takes longer than you think it will! Do not assume anything. Bring assistants together for a practice session. Review and demonstrate the tests, giving each assistant experience in administering the tests or performing the assigned administrative task. Even when a test is administered year after year in a school, a brief review of procedures is essential.
Number of Trials
When a test is developed, information should always be included on the number of trials to be administered. The number of trials required is usually determined by how many trials are needed to obtain a reliable score. Let’s suppose John is also interested in testing his players’ dynamic balance ability and decided in advance that the test should have a reliability coefficient of at least .80. John decided to use the Johnson modification of the Bass Test described previously, but found that the reported reliability of this test was only .75. This is very close to the target reliability, and with the addition of one or two trials it should be possible to reach it. By experimenting with different numbers of trials (using the Spearman–Brown prophecy formula learned in chapter 6), John should be able to determine the number of trials needed for this test. Again, the main point is that the number of trials of a test should not be determined arbitrarily.
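The Spearman–Brown prophecy formula estimates the reliability of a test lengthened by a factor k as r_kk = k·r / (1 + (k − 1)·r), and it can be solved for k to find how much longer the test must be to reach a target reliability. A short sketch applied to John’s numbers (observed reliability .75, target .80):

```python
def spearman_brown(r, k):
    """Predicted reliability when the test is lengthened by a factor of k."""
    return k * r / (1 + (k - 1) * r)

def length_factor(r, r_target):
    """Factor by which the number of trials must be multiplied to reach r_target."""
    return r_target * (1 - r) / (r * (1 - r_target))

k = length_factor(0.75, 0.80)
print(f"Lengthening factor: {k:.2f}")  # about 1.33
print(f"Doubling the trials gives r = {spearman_brown(0.75, 2):.2f}")  # about 0.86
```

A factor of roughly 1.33 means that if the .75 coefficient came from three trials, about four trials should reach the .80 target, which matches the text’s observation that one or two additional trials would suffice.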
Use of Norms
It is very helpful if norms are available for the test you plan to administer, especially if you wish to determine the effectiveness of your program. To be used properly, norms should be available for the same age and sex as your participants. The norms should be used to determine the percentile scores of individual participants only. Using tables of norms to interpret group data, although appealing, is not appropriate, because the variability of individually based norms differs from the variability of group means. Caution should also be taken if outdated norms are used—in fact, many norms of commonly used performance-related fitness tests were developed decades ago (Hoffman, 2006).
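Converting an individual score to a percentile against a norm sample can be sketched as follows. The norm scores here are hypothetical; in practice the norm table should match the participants’ age and sex, and, as noted above, should be applied to individuals rather than to group means:

```python
def percentile_rank(score, norm_scores, higher_is_better=True):
    """Percent of the norm sample that the given score equals or exceeds."""
    if higher_is_better:
        at_or_below = sum(1 for s in norm_scores if s <= score)
    else:  # e.g., timed events, where lower scores are better
        at_or_below = sum(1 for s in norm_scores if s >= score)
    return 100.0 * at_or_below / len(norm_scores)

# Hypothetical norm sample of sit-up counts for one age and sex group
norms = [22, 25, 27, 28, 30, 31, 33, 35, 38, 41]
print(percentile_rank(33, norms))  # 70.0: the score equals or exceeds 7 of 10
```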