Study Guide - Error and Measurement

Click Here to view my study guide for the midterm.

Partial notes are below:

What can be said categorically about error and measurement:

- All measurement contains error.

True score theory says that:

- Every measurement is an additive composite of two components: true ability and random error. The true ability is unknown.
- If an individual could be measured many times, each time using a different parallel form, the average of the resulting errors would approach 0.

The formula:

- Observed score (X) = True Score (T) + Error (E) (Remember, all measurement contains error)
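The claim that errors average out over many parallel-form testings can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true score of 100, the error spread of 3, and the normal distribution of the errors are hypothetical choices made for the illustration, and in practice the true score is never known.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Return simulated observed scores X = T + E.

    Each administration adds an independent random error E, drawn here
    from a normal distribution with mean 0 (an assumption made for this
    illustration), to the same fixed true score T.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_administrations)]


TRUE_SCORE = 100  # hypothetical value; unknowable for a real test taker
ERROR_SD = 3      # hypothetical spread of the random error

few = simulate_observed_scores(TRUE_SCORE, ERROR_SD, 5)
many = simulate_observed_scores(TRUE_SCORE, ERROR_SD, 100_000)

# Average error = average observed score minus the true score.
mean_error_few = statistics.mean(few) - TRUE_SCORE
mean_error_many = statistics.mean(many) - TRUE_SCORE

print("average error over 5 testings:      ", mean_error_few)
print("average error over 100,000 testings:", mean_error_many)
```

With only a handful of testings the average error can still be noticeably different from 0; with a very large number it settles close to 0, so the average observed score approaches the true score.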

Validity is:

- The extent to which a test measures what it is supposed to measure, and therefore the appropriateness with which we can make inferences based on the test results.

Describe what is meant by construct underrepresentation and construct irrelevance, two threats to construct validity.

- Construct underrepresentation is when the assessment is too narrow and fails to include important dimensions or facets of the construct. For example, a test on color recognition covers only green and orange: a subject who knows red, yellow, brown, blue, purple and violet, but not green, will score very poorly because the full range of the construct was not included in the test.
- Construct irrelevance is when the assessment is too broad, containing excess reliable variance associated with other distinct constructs, as well as method variance (such as response sets or guessing propensities) that affects responses in a manner irrelevant to the interpreted construct. This can show up as difficulty: some aspect of the task that is not relevant to the construct makes the task harder. For example, a student with reading troubles will get low math scores on a math test made up of story problems. It can also show up as easiness: clues in the item let the test taker guess an answer he or she would not otherwise have known, or the testing material is too familiar. For example, if a comprehension test uses a reading passage from a play that the class just studied in depth, the student has already studied the answers.

Messick distinguishes between the evidential basis and the consequential basis for both test interpretation and test use. What are the relevant concepts in each of the cells of this matrix?

See the MS Word notes for the table

- Cell 1: Construct Validity - the test measures what it is supposed to measure. We interpret the test as being a sound statement of the construct at large. This is empirical.

- Cell 2: CV + Relevance/Utility - When used in a given situation, the test will still measure what it is supposed to measure. This means that not only is the test valid in theory, but that it is valid when actually used with subjects. This is also empirical.

- Cell 3: CV + Value Implications - Concern for the value-laden labels that will be used to describe the test taker as a result. This means that we must look not only at the validity of the test, but also at the consequences the test will have if we use it to label someone. Does the validity of the test make the resulting label valid?

- Cell 4: CV + Relevance/Utility + Value Implications + Social Consequences - Considering the social consequences of test use or the application of inferences. Taking cell three further, once we label someone because of this test, what will that do to the person socially? How will they be seen by society because of this test? MR? DD? - those labels can have a far-reaching and life-changing impact. Are we considering that with this test? This is not just empirical, because the consequences differ from person to person: the moral and social consequences of using a test are unique to each person it is used with.

What is the purpose of the standard error of measurement (SEM):

- The SEM is used to show the range of possible scores around the single score obtained from the subject. It is a function of the reliability and the SD of the total test.
- The SD is the "spread" of scores for a group of test takers. The SEM is the "spread" of scores for one individual over an infinite number of test takings, always as if for the first time. It is the SD of the repeated test scores.
- The formula, where SD = standard deviation and r = reliability coefficient, is:
- SEM = SD * sqrt(1 - r)
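The SEM formula above can be sketched in Python as follows. The function name is made up for this illustration, and the example values (an SD of 15 and a reliability of 0.96) are assumptions chosen because they give an SEM of about 3; they are not taken from any test manual.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Return SEM = SD * sqrt(1 - r).

    sd          -- standard deviation of the test scores (must be >= 0)
    reliability -- reliability coefficient r, between 0 and 1
    """
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Illustrative (assumed) values: SD = 15, r = 0.96.
sem = standard_error_of_measurement(15, 0.96)
print(round(sem, 2))  # prints 3.0
```

Two limiting cases follow directly from the formula: a perfectly reliable test (r = 1) has an SEM of 0, and a test with no reliability (r = 0) has an SEM equal to the full SD.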

In order to acknowledge that the observed score is not the true score when interpreting test scores, what do we do? How do we correctly interpret a test score?

- We use confidence intervals. The confidence interval is the range of +/- some number of SEMs around the obtained score. We can say with X% confidence that the subject's true score falls in that range.
- Remember, 68% of scores fall between +/-1 SD, 95% fall between +/-2 SD, and 99.7% fall between +/-3 SD.
- So, if the SEM for a score on a test is 3 (as in the case of the WISC), then we can say with about 95% confidence that the true score falls within +/-6 of the obtained score.
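The +/-6 example above can be worked through in a few lines of Python. This is a minimal sketch: the function name and the obtained score of 100 are hypothetical, and the 68/95/99.7 percentages assume the errors are roughly normally distributed.

```python
def confidence_interval(observed_score, sem, n_sems=2):
    """Return (low, high) = observed score -/+ n_sems * SEM.

    n_sems = 1, 2 or 3 corresponds to roughly 68%, 95% and 99.7%
    confidence when errors are approximately normally distributed.
    """
    margin = n_sems * sem
    return observed_score - margin, observed_score + margin


# Hypothetical obtained score of 100 with the SEM of 3 from the notes.
low, high = confidence_interval(100, 3, n_sems=2)
print(low, high)  # prints 94 106
```

So instead of reporting the score as exactly 100, we would report that the true score most likely lies somewhere between 94 and 106.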

Filed under: EDC 512-513 Cognitive Assessment and Practicum
Copyright: October, 2003 - David Profitt