
Grading Pupil Performance



Thus far we have seen that a teacher must decide what standard of comparison will be used in assigning grades. This means deciding to use either a norm-referenced or a criterion-referenced standard. Once this decision has been made, the teacher must establish a grading curve in the norm-referenced approach or a set of performance standards in the criterion-referenced approach. Having made judgments about what type of comparison and what curve or performance standards will be used, the teacher must next determine what performances will be included in the grade. Since grades are commonly intended to convey information about a pupil's mastery of the subject matter taught, rather than the pupil's affective or personality qualities, grades should be based primarily on formal assessments that provide direct information about pupil achievement. It was also recognized that teachers' perceptions and insights will influence the grading process to some extent, and that this is acceptable as long as the influence does not greatly distort the pupil's true achievement.

SUMMARIZING ASSESSMENTS

To arrive at a report card grade in a subject area, a teacher must summarize each pupil's performance on the individual assessments carried out during the marking period. Since the aim of the grade is to communicate information about the pupil's academic accomplishments in the subject area over the entire term or marking period, pupil performances over the full term should be included in the grade assigned. In some subject areas, summarization is easy and straightforward. Suppose a fourth grade teacher, Ms. Fogarty, is getting ready to assign grades in spelling. Figure 8.3 shows a page from her marking book for spelling. (As in all examples in this book, Ms. Fogarty and her pupils are fictitious.) Notice that the teacher has indicated the subject area, marking period, pupils' names, score on each spelling test given, and the lessons covered by each test. All teachers maintain marking books like this, in which they record information about pupil performance on tests, homework, projects, and quizzes. For grading purposes, it is important to maintain formal assessment information in written form, because it is difficult to remember the details of each pupil's performance on each formal assessment given during a term or marking period.

The assessment information contained in the marking book for spelling is all of one type: scores from twenty-word spelling tests given at the end of each lesson. There are eleven scores for each pupil. Ms. Fogarty's task is to summarize these scores and use the resulting number to assign a report card grade to each pupil. As can be seen from Figure 8.3, this is a relatively easy task. The only information used is pupils' scores on the eleven tests; each test was scored on the basis of 100 total points, and each test was equivalent in importance to each other test. Notice also that the distribution of scores on each spelling test was about the same, usually ranging from a high score of 100 to a low score near 60 or 70.

Let us assume that Ms. Fogarty had decided to assign grades using a criterion-referenced approach. Let us further assume that the performance standards that Ms. Fogarty set up for her class were that 100 to 90 is an A grade, 89 to 80 is a B grade, 79 to 70 is a C grade, and below 70 is a D grade. Ms. Fogarty decided not to flunk any pupils in the first term, so D was the lowest grade she wanted to give. She also decided not to use plusses or minuses, but instead to use only A, B, C, and D as the possible grades. It is important to recognize that not all teachers would have made the same decisions that Ms. Fogarty did. Other teachers might have used a norm-referenced grading system or selected different performance standards. Some teachers would give F grades to pupils who did poorly in the first marking period. Still others would use plus and minus grades. Grading is based on teacher judgment informed by knowledge of the pupils in the class and what standards and practices are appropriate for them. There is no one best way to assign grades in all situations. The example being discussed is just that, an example which allows us to look at the issues that should be considered in grading.
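
For readers who like to see the logic spelled out, the minimal sketch below expresses Ms. Fogarty's performance standards as a small Python function. It illustrates her particular scale only; the function name and structure are ours, not something from her marking book.

    def letter_grade(average):
        # Ms. Fogarty's criterion-referenced standards: 90-100 = A,
        # 80-89 = B, 70-79 = C, below 70 = D (no F in the first term).
        if average >= 90:
            return "A"
        if average >= 80:
            return "B"
        if average >= 70:
            return "C"
        return "D"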



Now we can see how Ms. Fogarty would assign spelling grades. Ms. Fogarty must first summarize each of her pupils' performances on the eleven tests. Then she will use that information to assign a grade to the pupil. For some pupils, she has only to look across their test scores to recognize what grade they should receive. For example, looking across the scores of J. Aston, W. Babcock, E. Gonzales, O. Ross, G. Stamos, and T. Yeh, one sees almost all 100s, 95s, and 90s. Ms. Fogarty does not need a calculator to know that the performance of these pupils across all eleven tests is at least 90, which is the lowest score that can receive an A grade. Thus, each of these pupils should receive an A. For other pupils, the pattern of performance is less clear and consistent. T. Cannata, F. Grodsky, and J. Saja, for example, had quite different performances from test to test. To find the overall performances of these and other pupils, Ms. Fogarty will have to add each pupil's scores together and divide the sum by 11 to find the pupil's average or mean performance over the marking period. The mean scores for T. Cannata, F. Grodsky, and J. Saja are 81, 73, and 64, respectively, so based on the performance standards, their respective grades will be B, C, and D. Other pupils' grades can be determined in the same way. The process in this example is simple and straightforward.
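
The arithmetic can be sketched in a few lines of Python. The eleven scores shown are hypothetical stand-ins (Figure 8.3's actual entries are not reproduced here), chosen so that the mean matches the 81 the text reports for T. Cannata:

    # Eleven hypothetical spelling-test scores for one pupil; the mean
    # works out to 81, the value the text reports for T. Cannata.
    cannata = [90, 85, 70, 95, 80, 75, 85, 70, 90, 75, 76]

    mean = sum(cannata) / len(cannata)   # 891 / 11 = 81.0
    print(mean)                          # 81.0 -> a B under the 80-89 standard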

Among the factors that made determining a spelling grade for each pupil fairly easy was the fact that Ms. Fogarty had entered the test scores in her mark book as numbers, not as letter grades. It is simple to compute the mean score for each pupil when the pupil's performance is expressed as a number; the teacher has only to add up the scores and divide by the total number of scores (in this case, 11) to get each pupil's mean performance. Some teachers enter pupil test performance as letter grades, checkmarks, and other nonnumerical indicators. Thus, instead of a series of numerical scores in the marking book, the teacher is confronted at grading time by a series of letter grades or checkmarks for each pupil. Although one can always convert the grades back into numerical equivalents (e.g., A equals a score of 95, B equals a score of 85, etc.), this would not be necessary if numerical scores were recorded in the marking book in the first place. Thus, it is recommended that marking book records be kept in numerical form whenever possible.

This spelling example provides a basic frame of reference for understanding the grading process. It shows how standards come into play in allocating grades, how formal assessment evidence is maintained in a marking book for use in grading, and how scores can be summarized to provide an overall indication of pupil performance to use in assigning a report card grade. However, most grading situations are not as simple as this example. Consider, for example, the more typical marking book page for Ms. Fogarty's fourth grade class in the area of social studies. Figure 8.4 shows this page.

Notice two important differences between the information Ms. Fogarty has available to grade social studies and the information that was available to grade spelling. In spelling, the only formal assessments available were the tests for each unit's words. In social studies, many different kinds of performance indicators have been collected during the term. Four homework assignments, two quiz results, four unit test results, and two projects make up the information Ms. Fogarty can use to determine pupil grades in social studies. In spelling, all test results were expressed numerically, on a scale of 0 to 100 in percents. In social studies, different formats are used to record pupils' performance on different indicators: homework assignments are rated ✓+, ✓, or ✓-; quizzes and tests are recorded on a scale of 0 to 100 in percents; and the projects are recorded as letter grades. Grading social studies will be a more complicated process than grading spelling.

However, Ms. Fogarty begins both processes with the same concerns. First, what standard of comparison will be used to award grades? Second, what aspects of performance will be included in the grade? Let us assume that in social studies, as in spelling, Ms. Fogarty wishes to use a criterion-referenced grading approach. Like most teachers, she uses the same grading approach in all the subject areas she teaches. Let us also assume that, in social studies, Ms. Fogarty wishes to use plusses and minuses in the grades she gives. With this decision made, she must next determine what aspects of performance will be included in the grades. In spelling, the choice was simple because there was only one kind of performance information available, the eleven test scores. In social studies, the choice is not as easy. There are four different kinds of performance information which might be included in the grade. She must decide which of these to include. But she also has to decide whether each kind of information should count equally in determining the grades or whether some kinds of information should count more than others. For example, should a project count as much as a unit test? Should two quizzes count as much as one unit test or four homework assignments? These are questions all teachers face when they try to combine different kinds of assessment information into a single indicator. The following sections contain suggestions for answering such questions.

What Should Be Included in a Grade?

It has already been strongly recommended that subject matter grades be based primarily on pupils' academic performances. But Figure 8.4 shows a marking book page that has four different indicators of academic performance: homework, quizzes, unit tests, and projects. Should all these be included in the grades? A general rule of thumb is that as long as the evidence being considered pertains primarily to subject matter mastery, several different kinds of evidence about learning are better than a single kind. They give pupils more opportunity to show what they know and can do. By this reasoning, all four kinds of evidence about pupils' social studies performance could be included in the grade.

Almost all teachers would include the unit test and the project results when determining their pupils' grades. These are major, summative indicators of pupil achievement, and, as such, should be reflected in the grade. Most teachers would also include quiz results and homework, although there would be less unanimity about including homework performance (Ebel and Frisbie, 1986; Stiggins, Frisbie, and Griswold, 1989).

The reason teachers make different judgments about whether to include homework and quiz results in their grades has to do with their perceptions of the purposes of homework and quizzes. Some teachers regard homework as having a formative purpose—that is, giving pupils practice in the kinds of things they are being taught. These teachers view homework performance more as a part of instruction than as a part of assessment and therefore do not include it when determining a grade. They would say that the function of homework, and to some extent quizzes, is to motivate study and practice, not to assess the outcomes of instruction. They might also say that they can never be certain about whose work a homework paper represents. Other teachers adopt a different view of homework and quizzes, saying that they do provide achievement information about pupil learning of daily lessons, and thus should be used as part of the pupil's grade. The final decision rests with the teacher.

In our example, let us assume that Ms. Fogarty has decided to include each of the four types of assessment information in her pupils' social studies grades. Having decided what pupil performances will be included, she now must determine whether each kind of information will count equally or whether some kinds should be weighted more heavily than others.

SELECTING WEIGHTS FOR ASSESSMENT INFORMATION

In order to arrive at a summary indicator of each pupil's performance in social studies, the different kinds of available evidence must be combined. An immediate concern in summarizing is how the different kinds of evidence about a pupil should be weighted. In general, a teacher should weight the more important types of pupil performance, typically tests and projects, more heavily than other types of performance. The reason for heavier weighting of tests and projects is that these kinds of assessment evidence provide more complete, integrated information about a pupil's learning than less extensive kinds of assessment like short quizzes or homework assignments. What types of evidence are most important is a matter that each teacher must decide based on his or her classroom instruction, expectations, and assessment practices. In making this judgment, however, it should be kept in mind that the main purpose of grading is to communicate information about each pupil's subject matter achievement. The indicators that do this best ought to be the ones weighted most heavily in the grades.

How, then, should Ms. Fogarty weigh the four kinds of assessment information she has? Once again, the decisions she makes will be based on her knowledge of her pupils and the nature of the instruction she has provided. Each reader will have to make similar judgments in his or her own classroom, judgments which may differ from those made by Ms. Fogarty.

Looking over the information in her marking book, Ms. Fogarty decided that unit tests and projects should count more than homework and quiz results because the former provided a broader, more representative picture of her pupils' achievements. She was fairly certain that she had used valid tests which reflected the important aspects of her instruction and that the projects assigned required each pupil to integrate knowledge about the project topic in the way she desired. Thus, she was confident using tests and projects as the main components of the social studies grade. She was also wise enough to know the importance of valid assessment instruments as a basis for grades that communicate their intended message.

She did want homework and quizzes to count in the pupils' grades, though not as much as tests and projects. She decided that homework performance and quiz results would each count as much as one unit test. She also decided that a project would count the same as a unit test.

Many teachers do not count homework directly in determining a pupil's grade, but do tell pupils that if more than three or four homework assignments are not turned in, their report card grade will be lowered. Used this way, homework is less a subject matter assessment than an affective assessment of effort or cooperation. Some teachers, who do not compute pupil averages, look at homework performance to "get an informal, nonstatistical sense" of how a pupil performed. They then use this "sense" to adjust a pupil's grade upward or downward if homework performance is very different from test, quiz, and project performance. If the teacher judges that homework performance is in line with performance on other assessment instruments, homework does not enter directly into a pupil's grade. Teachers use this approach to save themselves the trouble of calculating an average of homework performance to use in computing each pupil's grade. The danger of relying on a "sense" of pupils' performances is that factors other than performance may influence the teacher's judgment.

Regardless of how a teacher weights each kind of assessment information available, it is strongly suggested that the weightings be kept in simple-to-compute ratios. It is better to weight some things twice as much as others than it is to weight some things five times as much and other things four times as much. All things being equal, the final grades arrived at using a simple weighting scheme will not differ greatly from those arrived at using a more complex weighting scheme, so it is better to use the simpler one.

After deciding on her weightings for homework, quizzes, unit tests, and projects, Ms. Fogarty identified eight pieces of information that she would combine and average to determine her pupils' report card grades in social studies. The eight pieces of information were:

• One overall assessment of homework assignments

• One overall assessment of quiz results

• Four scores from the unit tests

• Two project grades

In the final weightings, homework and quiz results each count one-eighth of the grade, unit tests count one-half of the grade, and projects count one-quarter of the grade. Ms. Fogarty next had to combine the available information according to the selected weights.
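
Because the eight pieces are averaged with equal weight, the category weights fall out of the counts themselves. A quick check in Python (the dictionary is only an illustration of the arithmetic):

    # Eight equally weighted pieces: 1 homework summary, 1 quiz summary,
    # 4 unit tests, 2 projects.
    pieces = {"homework": 1, "quizzes": 1, "unit tests": 4, "projects": 2}
    total = sum(pieces.values())          # 8
    for category, count in pieces.items():
        print(category, count / total)    # 0.125, 0.125, 0.5, 0.25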


Combining Different Assessment Information

In combining different pieces of assessment information into a single score, many factors must be taken into account (Ebel and Frisbie, 1986; Gronlund, 1985; Hills, 1981; Nitko, 1983; Stiggins, Frisbie, and Griswold, 1989). One factor that is immediately apparent when one looks at Figure 8.4 is that pupil performance on the different assessments is represented in different ways. Homework performance is indicated by checkmarks, quizzes and unit tests by percentage score out of 100, and project quality by letter grades. Somehow Ms. Fogarty must combine these different forms of information into a single score or summary for the marking period.

Another factor to be considered before combining information for grading purposes is the quality of the assessment information gathered during the term. Grades will be only as meaningful as the information on which they are based. If the project grades were assigned subjectively, with no clear criteria in mind and with shifting teacher attention to scoring, they will not reflect pupil achievement accurately. If the unit tests were unfair to pupils or did not test a representative sample of what was taught, the scores pupils attained likely will not be a valid indication of their achievement. In this regard, Ms. Fogarty ought to look particularly at the results of the test for unit 3. In comparison to pupils' performance on the other unit tests, performance on the unit 3 test was much lower. Does this result indicate a problem with the test or a problem with the effort pupils put into preparing for the test? How should this result be handled in grading? These questions will have to be answered before information can be combined and used for grading purposes.

In order to summarize across assessment information, the information must be expressed in the same way. Thus, to arrive at a single score for each pupil based on the assessment information shown in Figure 8.4, some of the information will have to be changed into another form. Since Ms. Fogarty is going to combine different pieces of information, it is best if all the information in the marking book be expressed numerically. Then it can be easily manipulated in computations. This means that the checkmarks for homework assignments and the letter grades for the two projects will have to be converted into numerical scores. Moreover, they should be converted to numerical scores that describe pupil performance on a scale of 0 to 100 percent, so that they will correspond to the scores for the quizzes and unit tests. It is important that all performance indicators be expressed in terms of the same scale, so that they can be combined meaningfully and easily.

For example, suppose a teacher gave two tests, one with 50 items and one with 100 items. Suppose also the teacher wanted the tests to count equally in determining a pupil's grade. Sam got all 50 items right on the first test, but no items right on the second. Mike got no items right on the first test, but 100 items right on the second. Each pupil got a perfect score on one test and a zero score on the other. Since the tests are to count equally, Sam and Mike's grades should be the same, regardless of the number of items on the tests. But if the teacher calculates the average performance of Sam and Mike using the number of items they got right, the result will be quite different averages (Sam's average = (50 + 0)/2 = 25; Mike's average = (0 + 100)/2 = 50). Using these averages, Mike would get a higher grade than Sam, even though they each attained a perfect score on one test and got a zero on another. Clearly, this approach will not work. The problem with it is that the teacher did not take into account the difference in the number of items on the two tests—that is, the teacher did not put the two tests on the same scale before computing an average. If the teacher had changed the scores from number of items correct to percentage of items correct before averaging, Sam and Mike would have had the same overall performance (Sam = (100 + 0)/2 = 50; Mike = (0 + 100)/2 = 50). Or if the teacher had expressed performance on both tests in terms of the 100-item test, the averages would have been the same, since Sam's perfect score on a 50-item test would be worth 100 points on a 100-point scale. If scores are not expressed in a common scale, pupil performance can be distorted and grades will not reflect actual achievement.
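
The Sam and Mike example can be replayed in a few lines of Python. The variable names are ours, but the numbers are exactly those in the text:

    # Sam: perfect on the 50-item test, zero on the 100-item test.
    # Mike: zero on the 50-item test, perfect on the 100-item test.
    sam_raw, mike_raw = [50, 0], [0, 100]
    items = [50, 100]

    # Averaging raw item counts distorts the comparison:
    print(sum(sam_raw) / 2, sum(mike_raw) / 2)    # 25.0 50.0 -- unequal

    # Converting to percent correct first puts both tests on one scale:
    sam_pct = [100 * right / n for right, n in zip(sam_raw, items)]
    mike_pct = [100 * right / n for right, n in zip(mike_raw, items)]
    print(sum(sam_pct) / 2, sum(mike_pct) / 2)    # 50.0 50.0 -- equal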

Thus, a way must be found to express homework and project performance on a scale that corresponds to the 0 to 100 percent scale used for quizzes and unit tests. To do this, the teacher must make some judgments about what a ✓-, ✓, and ✓+ correspond to in terms of a percentage scale. Only the classroom teacher can do this because only the classroom teacher knows the nature of the homework assignments and the characteristics of the homework papers that got different levels of checkmarks. Ms. Fogarty decided that a ✓+ would correspond to a score of 95 percent correct, a ✓ would correspond to a score of 85 percent correct, and a ✓- would correspond to a score of 75 percent correct. For project grades, the following commonly used scale would be applied to give numerical scores to the projects: 95 = A, 92 = A-, 88 = B+, 85 = B, 82 = B-, 78 = C+, 75 = C, 72 = C-, 68 = D+, 65 = D, 62 = D-, less than 60 = F. If, for example, a pupil got a B- on one of the projects, that pupil's numerical score on the project would be 82. When Ms. Fogarty applied these values to the homework and projects, she ended up with the information shown in Figure 8.5. It is important to note that Ms. Fogarty's is not the only way that the different scores could be put on the same scale, nor is it without limitations (Stiggins, Frisbie, and Griswold, 1989; Terwilliger, 1971). It is, however, a way she could accomplish the task and a method she could feel comfortable using. With this task completed, she has to confront one additional issue prior to computing grades.
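
Ms. Fogarty's two conversions, written out as Python lookup tables. The values are hers as given in the text; the table names are ours. (The scale gives no single number for F, so anything below 60 is left to the teacher's judgment.)

    # Homework checkmarks -> percent scores.
    CHECKMARK_SCORES = {"✓+": 95, "✓": 85, "✓-": 75}

    # Project letter grades -> percent scores.
    LETTER_SCORES = {"A": 95, "A-": 92, "B+": 88, "B": 85, "B-": 82,
                     "C+": 78, "C": 75, "C-": 72, "D+": 68, "D": 65, "D-": 62}

    print(LETTER_SCORES["B-"])   # 82, as in the text's example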

Look at the scores from the unit 3 test in Figure 8.5. These test scores were quite low for all pupils in comparison to their performance on the other unit tests. Ms. Fogarty noticed this when she scored the test, and no doubt asked herself why the scores were so low. This is an important question to ask and answer, because if the pupils' grades are to reflect their learning of what they were taught, then it is necessary that the tests used to determine grades be good reflections of the instruction provided. Normally the question of the match between the assessment instrument and the things pupils were taught would occur before the assessment instrument was used. However, sometimes mismatches are overlooked or do not become apparent until after the instrument is administered to pupils and the teacher notices that the scores are not typical. In the end, it is only after a test is administered and scored that one is able to identify unanticipated weaknesses in the test. Generally, it is unexpectedly low scores which provoke concern and attention; rarely do unexpectedly high scores alert the teacher to possible problems with the instrument. The reason for this is that most teachers probably assume that unexpectedly low scores are the result of a faulty assessment instrument, while unexpectedly high scores are the result of their superior teaching ability.

Ms. Fogarty looked over the items in the unit 3 test, which was a textbook test, and compared the items to the topics and skills she taught during instruction on unit 3. She found that one section of the unit which she had decided not to teach contributed a large number of items to the textbook test. The match between the unit test and classroom instruction was not good. Pupils were being penalized because instruction had not exposed them to the concepts needed to answer many of the test items. To use the scores on this test would distort pupils' actual achievement on the things they had been taught in the unit and this, in turn, would reduce the validity of the grades they received.

To avoid this, Ms. Fogarty decided to change the pupils' scores on the unit 3 test to better reflect their achievement of the things she actually taught. She estimated that about 20 to 25 percent of the items on the test were taken from the section she had not taught. She checked and saw that most pupils had done poorly on these items. She decided to increase each pupil's score by 20 percentage points on this test. She correctly reasoned that the increased scores would provide a better indication of what pupils had learned from the instruction provided than the original results of the test. After 20 percentage points are added to each pupil's score on the unit 3 test, all of the assessment information in Figure 8.5 will be placed on a common scale which ranges from 0 to 100 and indicates the percentage of mastery by each pupil on each piece of assessment information.
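
The adjustment itself is a one-line change. The scores below are hypothetical (Figure 8.5's actual entries are not shown), and the cap at 100 is an added safeguard the text does not mention:

    # Raise each unit 3 score by 20 percentage points.
    unit3 = {"T. Cannata": 58, "F. Grodsky": 52, "J. Saja": 45}   # hypothetical
    adjusted = {pupil: min(score + 20, 100) for pupil, score in unit3.items()}
    print(adjusted)   # {'T. Cannata': 78, 'F. Grodsky': 72, 'J. Saja': 65}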

It is very important to point out that Ms. Fogarty adjusted the low scores on the unit 3 test upward only after reexamining the test and the instruction she had presented. She did not raise the scores to make the pupils feel better about themselves, to have them like her more, or for other, similar reasons. The test scores were raised so that they would provide a better indication of pupil learning from instruction and make the pupils' report card grades more accurately reflect their subject matter mastery. She made a judgment to increase the scores based on evidence obtained from reviewing the test and her instruction, not based on a whim or a desire to give her pupils high grades whether or not they deserved them.

Low assessment scores should not be raised simply because they are low or because the teacher is disappointed with them. There are times when pupils as a group do poorly not because of deficiencies in the assessment instrument, but because instruction was not adequate or because they didn't study hard enough. To raise low scores that occurred for these reasons would be to decrease the validity of the scores. If poor instruction is judged to be the cause of poor test performance, the teacher can reteach the objectives and retest, using the second set of test scores for grading instead of the first. Of course, a different version of the unit test should be used for the retesting.


Computing Pupils' Overall Scores

Having decided on score equivalents for the homework and project assessments and having adjusted scores on the unit 3 test in light of the partial mismatch between instruction and the test items, Ms. Fogarty is ready to compute her pupils' social studies grades. To do this, she must give each kind of assessment information the weight she decided on; sum the scores; and divide by 8, which is the number of pieces of assessment information she is combining (1 overall homework score, 1 overall quiz score, 4 unit test scores, and 2 project scores). This computation will provide an average score for each pupil for social studies during the first marking period. Figure 8.6 shows the eight components to be included in each pupil's grade, their sum, and the average performance across the eight assessments.
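
Here is the whole computation for one pupil, sketched in Python with hypothetical values (Figure 8.6's actual entries are not reproduced):

    homework = 85                     # single overall homework score
    quiz = 80                         # single overall quiz score
    unit_tests = [88, 75, 78, 90]     # unit 3 already raised by 20 points
    projects = [82, 88]               # B- and B+ converted to numbers

    pieces = [homework, quiz] + unit_tests + projects   # eight pieces
    average = sum(pieces) / len(pieces)                 # 666 / 8 = 83.25
    print(round(average))             # 83 -> a B on Ms. Fogarty's standards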

To make her task simpler, Ms. Fogarty adopted two rules. First, she would deal only with whole numbers; any numbers that were not whole numbers would be rounded off to the nearest whole number. Second, she would not compute the average homework score for each pupil, but instead would give the pupil the homework score that corresponded to the majority of that pupil's homework performances. Thus, J. Aston would receive a homework score of 85, P. Farmer a homework score of 95, and O. Ross a homework score of 85. Ms. Fogarty could have computed the average of the four homework marks, but the shortcut method she adopted will make little difference in the score a pupil receives and it will save valuable time.
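
Both shortcuts are easy to express in code. In the sketch below, the four homework marks are hypothetical and already converted to their numeric equivalents; statistics.mode picks the value that occurs most often:

    import statistics

    # J. Aston's four homework marks, already converted (95 = ✓+, 85 = ✓).
    aston = [85, 85, 95, 85]                 # hypothetical marks
    print(statistics.mode(aston))            # 85, the majority mark

    # Rule 1: round every non-whole result to the nearest whole number.
    print(round(83.25))                      # 83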

Strictly speaking, the actual weight a particular assessment carries in the determination of a grade is dependent on the spread of scores on that assessment compared to the spread of scores on other assessments (Ebel and Frisbie, 1986; Gronlund, 1985; Hills, 1981; Nitko, 1983; Sax, 1980). The greater the spread of scores on an assessment, the greater the influence that assessment will have on the final grade.
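
A small Python demonstration of this point, with invented scores: two assessments are averaged with nominally equal weight, but the widely spread one ends up driving the pupils' rank order.

    import statistics

    quiz = [78, 80, 82, 79, 81]   # tightly bunched scores (invented)
    test = [50, 95, 70, 88, 60]   # widely spread scores (invented)

    print(statistics.stdev(quiz))   # about 1.6
    print(statistics.stdev(test))   # about 18.8

    # Averaging the pairs: pupils finish in exactly the order of their
    # test scores; the quiz barely changes the ranking at all.
    print([(q + t) / 2 for q, t in zip(quiz, test)])
    # [64.0, 87.5, 76.0, 83.5, 70.5]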

