The aim of testing is to match each student with a score that reflects 1) the amount learned relative to a norm group (the current class or previous classes), or 2) an absolute proportion of the amount learned, based on a very well-constructed mastery test. For either style of testing, the only method that will provide information on the quality of the test is an item-by-item examination. Careful study of the statistical analysis output provides a basis for assessing the reliability and validity of a test and improving the quality of future classroom tests.
The following are examples and explanations of the statistics that appear on Opscan results.
Return to Statistics and Printout Questions
There are two printouts of scores: one alphabetized by last name and another
sorted by University ID. The percentage correct, percentile rank, T-scores,
mean, and standard deviation are also printed on the lists of scores.

A T-score is a standardized score that lets you compare scores across tests with different scales and across different classes. T-scores are scaled so that the test mean is 50 and the standard deviation is 10, so a T-score indicates how far a particular score lies from the average. When the scores are normally distributed, approximately 68% of the students have T-scores between 40 and 60. This is similar to the Z-score, which is scaled to a mean of 0 and a standard deviation of 1.
The relation between a Z-score and a T-score is as follows: if μ is the mean of the test scores and σ is the standard deviation, then the Z-score z of an individual test score x is calculated as
$$z = \frac{x - \mu}{\sigma}$$
and then the T-score t is calculated as
$$t = 50 + 10z.$$
On the standard output, this score is listed in the last column with the student’s actual score and percentage correct.
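As a rough illustration of the two formulas, the sketch below converts a small set of raw scores to Z-scores and T-scores. The scores are made up, the function name `t_scores` is hypothetical, and the use of the population standard deviation is an assumption; the Opscan software may compute the standard deviation differently.

```python
from statistics import mean, pstdev

def t_scores(scores):
    """Return a (z, T) pair for each raw score, using T = 50 + 10 * z."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population SD; a sample SD is also plausible
    if sigma == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    pairs = []
    for x in scores:
        z = (x - mu) / sigma
        pairs.append((z, 50 + 10 * z))
    return pairs

raw = [60, 70, 80, 90, 100]  # hypothetical raw scores
results = t_scores(raw)
for x, (z, t) in zip(raw, results):
    print(f"raw={x:3d}  z={z:+.2f}  T={t:.1f}")
```

A score equal to the class mean always maps to a T-score of 50, whatever the scale of the original test.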
The frequency distribution printout is a table showing how often each score
occurred on the test. The printout includes the following columns.
A histogram is a graphical representation of a frequency distribution. The histogram below has the approximate shape of a normal distribution, or a bell curve. The x-axis represents the weighted scores and the y-axis indicates the frequency (how many students received each score on the test).
In most cases student scores will not form a perfect bell curve for a variety
of reasons (including small class size). A histogram, therefore, is more
useful for classes in which there are 30 or more students.
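To make the link between a frequency distribution and a histogram concrete, here is a minimal sketch that counts how many students received each score and prints one row of asterisks per score. The scores are invented and the layout is only illustrative; it is not the format of the Opscan printout.

```python
from collections import Counter

scores = [70, 80, 80, 90, 90, 90, 100]  # hypothetical weighted scores

# Frequency distribution: (score, number of students with that score), lowest first.
table = sorted(Counter(scores).items())

# Text histogram: one asterisk per student at each score.
for score, freq in table:
    print(f"{score:>5}  {freq:>2}  {'*' * freq}")
```

With only seven scores the shape is jagged, which is the reason a histogram tells you more for larger classes.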

An item analysis takes each item, or question, on the test and gives you a variety of statistics regarding the answers chosen by the students. The item analysis allows you to evaluate a question and decide whether to use it on future tests.
An item-by-item analysis of tests is available in either short or long form.
The short form item analysis gives information for the overall class and
covers the information listed below. The short form item analysis also offers
one statistic not found on the long form: point-biserial correlation.

If an item carries a high validity, it means that overall, high scoring individuals (i.e., those with high scores on the total test) answered the item correctly while low scoring individuals tended to miss the question. Therefore, a question with high validity has a high correlation with the total test score. If one considers the total test score to be a better indicator of a student’s knowledge, then the higher the relationship between the item and the total test, the more valid the item.
There are a number of factors to consider when examining an item’s validity. In contrast to standardized entrance exams, classroom tests often contain some items that discriminate poorly. For example, it may be an instructor’s intention to begin a test with several easy items in order to put students at ease or to establish a baseline. In cases where everyone answers a question correctly, the item validity is zero. However, it may be desirable to keep the item anyway.
A high negative validity indicates that something is definitely wrong: either the item itself is flawed (for example, it has an ambiguous distracter) or the item has been keyed incorrectly. A validity of zero or a very low negative value (e.g., -.10) may mean that the item is very easy (a difficulty close to 1.0), or so difficult that even a few good students got it wrong. It may also be due to random guessing.
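The behavior described above can be sketched in a few lines, assuming that item validity is reported as the point-biserial correlation, which is the ordinary Pearson correlation between a right/wrong (1/0) item score and the total test score. The function name `point_biserial`, the sample data, and the choice to return zero when there is no variation are assumptions made for illustration; the Opscan software may, for example, exclude the item from the total before correlating.

```python
from statistics import mean, pstdev

def point_biserial(item, totals):
    """Pearson correlation between a 0/1 item vector and total test scores."""
    sd_item, sd_total = pstdev(item), pstdev(totals)
    if sd_item == 0 or sd_total == 0:
        # No variation (e.g., everyone answered correctly): treated as zero here.
        return 0.0
    m_item, m_total = mean(item), mean(totals)
    cov = mean([(i - m_item) * (t - m_total) for i, t in zip(item, totals)])
    return cov / (sd_item * sd_total)

totals = [9, 8, 7, 4, 3, 2]          # hypothetical total scores, one per student
good_item = [1, 1, 1, 0, 0, 0]       # high scorers right, low scorers wrong
miskeyed_item = [0, 0, 0, 1, 1, 1]   # the reverse pattern, as with a wrong key
easy_item = [1, 1, 1, 1, 1, 1]       # everyone answered correctly

r_good = point_biserial(good_item, totals)
r_miskeyed = point_biserial(miskeyed_item, totals)
r_easy = point_biserial(easy_item, totals)
print(f"good={r_good:+.2f}  miskeyed={r_miskeyed:+.2f}  easy={r_easy:+.2f}")
```

The miskeyed item produces the same magnitude as the good item with the opposite sign, which is why a strongly negative value is a prompt to check the answer key first.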
The long-form item analysis gives information for the overall class and
covers the information listed below. The long-form analysis shows almost
all the same information as the short-form analysis, but with a different
layout and with some additional information.

The difficulty index is a printout included in both the Long Form and Short
Form Item Analysis options that displays the range of difficulty values over
the entire test. The questions are grouped together based on their difficulty
values to help you analyze how students handled the test.
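As a sketch of the idea, the example below treats an item's difficulty as the proportion of students who answered it correctly (so a value close to 1.0 means a very easy item) and then groups the items into bands. The band labels and cut-offs (0.8 and 0.3) are hypothetical and chosen only for illustration; the ranges on the actual printout may differ.

```python
def item_difficulty(responses):
    """responses: one list of 0/1 item scores per student.
    Returns the proportion of students answering each item correctly."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[i] for student in responses) / n_students
            for i in range(n_items)]

def group_by_difficulty(difficulties, easy_cutoff=0.8, hard_cutoff=0.3):
    """Group 1-based item numbers into hypothetical difficulty bands."""
    groups = {"easy": [], "moderate": [], "hard": []}
    for number, p in enumerate(difficulties, start=1):
        if p >= easy_cutoff:
            groups["easy"].append(number)
        elif p < hard_cutoff:
            groups["hard"].append(number)
        else:
            groups["moderate"].append(number)
    return groups

# Hypothetical data: 4 students, 3 items (1 = correct, 0 = incorrect).
responses = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [1, 0, 1]]
difficulties = item_difficulty(responses)
groups = group_by_difficulty(difficulties)
print(difficulties, groups)
```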

The discrimination index is a printout included in both the Long Form and
Short Form Item Analysis options that displays the range of point-biserial
correlation coefficient values over the entire test. These coefficient values
are shown within the short form item analysis.

A summary of test statistics is located at the bottom of any page of the
item analysis (short-form or long-form) and includes the following information:

Reliability describes the extent to which the test scores can be depended on to provide an actual measurement of the students’ abilities and knowledge. The Kuder-Richardson formula (KR20) produces one such measure of reliability. The reliability coefficient ranges from 0.0 to 1.0. The closer the coefficient is to 0, the weaker the relationship between the test scores and the students’ true abilities. In other words, a coefficient close to 0 means the test scores are largely random and do not accurately reflect the students’ knowledge. The closer the coefficient is to 1, the more closely the obtained scores reflect the students’ actual knowledge.
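For readers who want to see how such a coefficient is obtained, the sketch below applies the standard KR20 formula, k/(k-1) × (1 − Σpq / variance of total scores), where k is the number of items, p is the proportion answering an item correctly, and q = 1 − p. It assumes every item is scored right/wrong with equal weight and uses the population variance; the data are invented, and the Opscan software may differ in these details.

```python
from statistics import pvariance

def kr20(responses):
    """responses: one list of 0/1 item scores per student. Returns the KR20 coefficient."""
    n_students = len(responses)
    k = len(responses[0])
    totals = [sum(student) for student in responses]
    var_total = pvariance(totals)  # assumption: population variance of total scores
    if k < 2 or var_total == 0:
        raise ValueError("KR20 needs at least two items and some spread in total scores")
    sum_pq = 0.0
    for i in range(k):
        p = sum(student[i] for student in responses) / n_students
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical data: 5 students, 4 items, ordered from strongest to weakest student.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(responses)
print(f"KR20 = {reliability:.2f}")
```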
In determining acceptable levels of reliability, several factors must be considered:
The student test responses report lists all the students, their social security
numbers, their scores, and a compact printout of their chosen answers on
the test.

Each box under a student's name represents a block of ten questions on the test. Correct answers are shown only as a dash (-); a response letter appears only where the student answered incorrectly. For example, for Student01, the first box shows his responses for questions 1 through 10. He got seven correct answers but missed question 3, where he chose response D; question 7, where he chose response E; and question 9, where he chose response B. The next box to the right shows his responses for questions 11 through 20, where he got only five questions correct.
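The compact layout can be reproduced with a short sketch. The answer key below is hypothetical, chosen only so that the student misses questions 3, 7, and 9 with responses D, E, and B as in the Student01 example; the function name `compact_responses` is likewise an illustration and not part of the Opscan software.

```python
def compact_responses(key, answers, box_size=10):
    """Show '-' for each correct answer and the chosen letter for each miss,
    split into boxes of box_size questions."""
    marks = ["-" if given == correct else given
             for correct, given in zip(key, answers)]
    return ["".join(marks[i:i + box_size])
            for i in range(0, len(marks), box_size)]

key = "ABCDEABCDE"      # hypothetical answer key for questions 1-10
answers = "ABDDEAECBE"  # misses question 3 (D), question 7 (E), question 9 (B)
boxes = compact_responses(key, answers)
print(boxes)
```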
The individual student feedback output is a sheet of paper for each student
who took the test. The header gives information about the test if the instructor
filled in this information on the answer key. Below that is the student's
name, University ID number, the number of correct items, and their weighted
score. The weighted score reflects the number of points each question was
worth, as determined by the instructor, while the number right reflects a
simple count of correct answers.

For each item (i.e., question), the correct response is listed along with the student's response. If the student's response was not the correct answer, then a dollar sign ($) is listed next to it. If the student filled in more than one response bubble, an asterisk (*) is shown as their response.
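The difference between the number right and the weighted score, along with the two markers, can be sketched as follows. The key, responses, and point values are invented, the function name `feedback` is hypothetical, and the row layout is not the actual sheet format. The sketch also assumes that an item with more than one bubble filled in counts as incorrect and so receives both markers.

```python
def feedback(key, responses, weights):
    """key: correct letter per item; responses: the student's marks per item
    (e.g. 'A', or 'BD' when two bubbles were filled in); weights: points per item.
    Returns (rows, number_right, weighted_score)."""
    rows, number_right, weighted_score = [], 0, 0
    for number, (correct, given, points) in enumerate(
            zip(key, responses, weights), start=1):
        shown = "*" if len(given) > 1 else given  # multiple marks shown as '*'
        if given == correct:
            number_right += 1
            weighted_score += points
            flag = ""
        else:
            flag = "$"  # assumption: any non-matching response is flagged
        rows.append(f"{number:>3}  {correct}  {shown}{flag}")
    return rows, number_right, weighted_score

key = "ABCD"                       # hypothetical answer key
responses = ["A", "C", "C", "BD"]  # item 2 is wrong; item 4 has two marks
weights = [1, 2, 2, 5]             # hypothetical points per question
rows, number_right, weighted_score = feedback(key, responses, weights)
print("\n".join(rows))
print(f"number right = {number_right}, weighted score = {weighted_score}")
```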