Monday, June 7, 2021

FTL: #6 Introduction to Student Assessment

Foundations of Teaching for Learning: Introduction to Student Assessment

by Commonwealth Education Trust


One of the interviews expresses my own thoughts:


I see assessment as having two very different roles. There's the assessment that needs to happen for myself, as a teacher, to meet the needs of each child. And then there's the assessment that's required at a governance level, for the government for data, so they can number crunch and see who's above, who's below - that's a number game. And that's high stakes, because there is a fear for-- I believe even for myself-- that you are getting judged for the quality and your ability as a teacher based on those numbers. So, there's two definite roles. The role for me that I think is most crucial for our students, is for me to know where our students are at, prior to the learning. I want to be able to measure and know how far they've learned and how much they've gained in the time we've learned. So, the school doesn't always provide that assessment or that test, so I create my own.



I think the bigger problem with the large scale international testing systems is they actually remove all the cultural indicators out of test questions to try to make them as culturally neutral as possible.

I would suggest that the disadvantaged children in any society probably benefit from questions that are extremely culturally laden, because they recognise and understand them. If you strip that out, one: it's boring; two: why should I care? And three: it's got nothing to do with me, so why should I try? So, I'm not convinced that de-biasing on international comparisons is actually productive.


This course will provide:

  • an overview of the importance of teacher beliefs about assessment
  • an introduction to a framework for understanding and evaluating the quality of assessment
  • the importance of feedback and reporting of achievement as key steps in improving teaching and learning
  • a review of key research on assessment purposes, goals, and effects
  • a reminder of the importance of error in all assessment practices and means for reducing the impact of that error on interpretations and decisions
  • an overview of best practice guidelines for objectively scored testing and for subjective scoring of open-ended responses
  • specific attention to how assessment can be adapted to meet the needs of cultural minority students

Week 1: Concepts and types of assessment in teaching and learning: formal, informal, diagnostic, formative, and summative

  • Video lecture 1: Conceptions of Assessment: Reflecting on and being aware of what you believe or understand to be the purposes of assessment
  • Video lecture 2: Formative & Summative: Understanding the impact of timing on assessment decisions
  • Video lecture 3: The Curriculum-Teaching-Assessment cycle: A framework for integrating assessment with instruction
  • Video lecture 4: Cultural Concern: Perspectives of minorities on educational assessment


Week 2: The nature of feedback in improving teaching and learning

  • Video lecture 1: Feedback Effectiveness: Hattie & Timperley’s framework; Gan’s Question prompts
  • Video lecture 2: Building Feedback into Teaching: Goals/Intentions; Strengths & Weaknesses
  • Video lecture 3: Teacher & Student perceptions of feedback
  • Video lecture 4: Cultural Concern: Perspectives of minorities on appropriate feedback mechanisms

Week 3: Reporting student achievement
  • Video lecture 1: Reporting useful feedback: Where now, where going, what next?
  • Video lecture 2: Grades, ranks, and scores: Problems with using these approaches
  • Video lecture 3: Reporting against objectives, not test questions or items
  • Video lecture 4: Cultural Concern: Perspectives of minorities on appropriate reporting mechanisms

Week 4: Guidelines for developing and using objectively answered question procedures
  • Video lecture 1: Anatomy of a good MCQ: The Question and Answer (MCQ: multiple choice question)
  • Video lecture 2: Anatomy of a good MCQ: The Wrong Answers
  • Video lecture 3: Valid alternatives to bad MCQ: Binary choice; Matching
  • Video lecture 4: Valid alternatives to bad MCQ: Sequencing; Sorting

Week 5: Guidelines for developing and using human judgement scoring procedures
  • Video lecture 1: Human judgement: Being aware of the errors we make
  • Video lecture 2: Guiding judgements: Analytic & Holistic Rubric design and use
  • Video lecture 3: Essay marking: Working towards more reliable and valid scoring
  • Video lecture 4: Moderation: A key to ensuring reliable and valid scoring

Week 6: Guidelines for developing and using procedures that involve students in assessment
  • Video lecture 1: Rationale and Goals of Involving Students: Improved learning & self-regulation
  • Video lecture 2: Good Peer Assessment practices
  • Video lecture 3: Good Self-Assessment practices
  • Video lecture 4: Concerns with peer and self-assessment: Keeping in mind validity


Week 1 Formative & Summative: 

quality, the formats of assessment, and the purposes of assessment.

And can those tests actually test deep learning? Or are they always only going to be memory tests?

In education, we have a set of targets, objectives, outcomes, goals in our teaching, and I hope that we have checkpoints where we take stock and say where are we up to and what do we need to do next to continue our journey towards our desired goals.

The goal of assessment during teaching is to tell us who needs to be taught what next and, conversely, who does not need to be taught this next because they already know it. There is a debate in the assessment industry about the difference between formative and summative.

who needs to be taught what, and how they should be taught.

Michael Scriven invented the terminology of formative and summative in the 1960s and he argued that the difference is when it takes place. Formative takes place before the end. Summative takes place at the end. And both are needed, and both need to be high quality in order to make good decisions.

In the 1980s, Royce Sadler came along and said that, actually, formative is a qualitatively different construct, and so a summative device, like an exam or test, can't really be used formatively.

And then, much more recently, the assessment for learning (AfL) view says that formative is only the informal interaction that happens in the classroom between the learner and the teacher.










Throughout education, the goal is to inform teaching and learning changes. How can we make it better? Who do we need to make it better for?


Assessment becomes a tool that links the curriculum with your teaching.



But in education, our primary focus is on learning, and so much of what we're going to talk about in this course and later in this series will be focused on the assessment of learning, not necessarily attitudes or personality.

Assessment is about two questions:

  • Who needs to be taught what, and how should they be taught?

  • Can those tests actually test deep learning? Or are they always only going to be memory tests?

 

In week one, I learned that Michael Scriven introduced the terms formative and summative in the 1960s, and he argued that the difference is when the assessment takes place. Formative takes place before the end. Summative takes place at the end. Both are needed, and both need to be high quality in order to make good decisions.


However, I believe both assessments should be low-stakes. Formative assessment must be low-stakes so that students aren’t afraid of making mistakes and are able to try new ideas; its goal is solely improvement. Summative assessment conducted at the end of each school year should also be low-stakes, which could prevent teaching to the test. High-stakes summative tests should be taken only at the end of each stage, for example at the end of elementary school when students are ready for middle school, or at the end of middle school when students are ready for high school.


In week one, I also learned about the perspectives of minorities on educational assessment. I don’t agree with our teacher. I’ve recently finished three courses in the TOEFL specialization on Coursera. TOEFL is a standardized test that measures the English language ability of non-native speakers wishing to enroll in English-speaking universities. I went to the ETS website for listening practice and found it extremely hard. If foreigners are tested at such a high level, why do schools lower the standard for their own minority groups? When you apply for jobs, do companies lower the requirements for you? We talk about discrimination all the time, and here is one case designed right into the school system. Don’t get me wrong: I certainly understand the merit of this approach.



Week 2: The nature of feedback in improving teaching and learning

Feedback is useful information about who needs to be taught what. The purpose of feedback is to reduce the discrepancy between where students are and where we want them to be.

In week two, I learned about feedback. Feedback has three important components:

  • Where are you now?

  • Where are you going?

  • What do you need to do next?


The challenge for us as teachers is being expert enough to know the answers to those questions for our students. We have to have a clear understanding of what we want learners to learn, to do, or to understand. We need to be able to describe that so that we can give useful feedback to students.


Tests that only give us a total and a rank are insufficient because they don't give teachers or students enough information on how to improve.


According to research, students welcome feedback. They want it to be fair, and they also want it to be specific: "Tell me exactly how I can improve." The feedback has to be accurate and clearly structured around clear criteria, in words and language that students can understand. It is difficult to give students negative feedback when their work does not meet the standards or the criteria, or when they don't do what we've taught. But that difficulty can be eased if the feedback is non-judgmental and fair.


Cultural Concern: Perspectives of minorities on appropriate feedback mechanisms. I agree that teachers should give feedback in a culturally appropriate manner. For example, feedback may need to be given to an individual rather than in a group setting or in front of their peers. It may also be appropriate to give feedback in a manner that is less direct and appears less confrontational.


Video interview:

Teachers who continue to work in high-stakes exam societies, like Hong Kong or China, end up believing that "By examining children, I'm helping them improve", so they rationalize and justify the things they have to do in light of the things. New Zealand gave up using exams as a roadblock to entry to further schooling as early as the 1930s, when we abolished the standard six. And then, we have open entry to the university.




Week 3: Reporting student achievement




There is a strong tendency for teachers to focus on surface features such as presentation and neatness, quantity of work and effort, and punctuality. This is not to say that these surface indicators are unimportant, but they tend to draw attention to compliance rather than to the deeper issue: what did the students actually learn?

Another tendency is to see praise in comments written by teachers, and especially the dangerous praise, "Doing well for his or her age", and "A pleasure to teach". Now, no parent wants their child to be considered a horrible brat.




The key difference between a percentage and a percentile is that a percentage is a mathematical value expressed out of 100, while a percentile is the per cent of values that fall below a specific value. A percentage is a means of comparing quantities. A percentile is used to display position or rank.
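To make the difference concrete for myself, here is a minimal Python sketch. The class scores, the maximum mark of 40, and the function names are all made up for illustration; the percentile here uses the simple "per cent of scores strictly below" definition from the paragraph above, which is only one of several definitions in use.

```python
def percentage(score, max_score):
    """Score expressed out of 100, relative to the maximum possible mark."""
    return 100.0 * score / max_score


def percentile_rank(score, all_scores):
    """Per cent of scores in the group that fall strictly below this score."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


# Hypothetical class of ten students on a test marked out of 40.
class_scores = [12, 15, 18, 18, 20, 22, 25, 27, 28, 30]
max_score = 40
student_score = 28

student_percentage = percentage(student_score, max_score)
student_percentile = percentile_rank(student_score, class_scores)

print(student_percentage)  # 70.0 -> how much of the test was answered correctly
print(student_percentile)  # 80.0 -> position relative to classmates
```

The same student gets 70% of the marks but sits at the 80th percentile: the first number compares the score with the test, the second compares the student with the group.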



Week 4: Guidelines for developing and using objectively answered question procedures



When should you use a multiple choice question (one of the formal, teacher-driven, test-like mechanisms)?
When the question or task has a single, clear, agreed correct answer. If the question can be answered in only one way and everybody agrees that that one way is the right answer, then a multiple-choice question is suitable.

Multiple choice questions have the reputation of being recall questions, but they don't have to be.

We also use them when we want to shift the student's workload away from writing but not away from thinking: we want students to think and choose rather than think and write. And this gives us some efficiency.


 



 
And the answers are certainly not suitable for a grade one or grade two test question. But we're using this to illustrate the kind of thinking that went into choosing the wrong answers. We know that two plus three is five. But if you had subtracted instead of added, you would have got the answer minus one. If you had multiplied instead of added, you would have got the answer six. If you had given the cube of two, you would have got eight. And if you had just ignored the plus sign altogether, you would have got two-three, 23. So, each of these wrong answers is based on a plausible misunderstanding of that squiggle shape called the plus sign between the two numbers.
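The reasoning in that example can be sketched in a few lines of Python. This is only my own illustration of the idea that every wrong answer should come from a plausible misunderstanding; the function name and the labels are hypothetical, not something from the course.

```python
def plausible_answers(a, b):
    """Answer options for 'a + b = ?', each tied to one way of reading the plus sign."""
    return {
        "added (correct)": str(a + b),
        "subtracted instead": str(a - b),
        "multiplied instead": str(a * b),
        "raised to a power instead": str(a ** b),
        "ignored the sign": f"{a}{b}",  # digits simply written side by side
    }


options = plausible_answers(2, 3)
for misunderstanding, answer in options.items():
    print(f"{answer:>3}  <- {misunderstanding}")
```

Because each option is labelled with the mistake behind it, a student's wrong choice tells the teacher which misunderstanding needs to be taught next.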












Week 5: Guidelines for developing and using human judgement scoring procedures







There are lots of terms for rubrics: marking schemes, progress indicators, progress maps, matrices. All of these terms refer to the same thing: a set of rules to guide your judgment as to the quality of work.

If your scores are compared with another marker's and you can reach 70% identical or 90% approximately equal scores, then you can defend your scores to your school leaders and department heads, to the students, and to the parents.
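Here is a small Python sketch of how I would check those two thresholds for a pair of markers. The scores are invented, the function name is hypothetical, and I am assuming that "approximately equal" means the two scores differ by at most one point on the rubric scale, which the lecture did not spell out in my notes.

```python
def agreement_rates(scores_a, scores_b, tolerance=1):
    """Return (exact %, approximate %) agreement between two markers.

    'Approximate' counts pairs whose scores differ by at most `tolerance`
    points; the one-point tolerance is an assumption, not a course rule.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both markers must score the same, non-empty set of scripts")
    pairs = list(zip(scores_a, scores_b))
    exact = sum(1 for a, b in pairs if a == b)
    approximate = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return 100.0 * exact / len(pairs), 100.0 * approximate / len(pairs)


# Hypothetical rubric scores (1 to 5) given by two markers to the same ten essays.
marker_1 = [4, 3, 5, 2, 4, 3, 1, 5, 4, 2]
marker_2 = [4, 3, 4, 2, 4, 3, 1, 5, 2, 2]

exact_pct, approx_pct = agreement_rates(marker_1, marker_2)
defensible = exact_pct >= 70 or approx_pct >= 90

print(exact_pct, approx_pct, defensible)  # 80.0 90.0 True
```

In this made-up case the two markers agree exactly on eight essays out of ten and are within one point on nine, so the scores would pass either threshold.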



Three sources of feedback:
  • teacher feedback
  • self-feedback: any student thinking about their own work
  • feedback from other classmates

Assessment of learning: what we described earlier in the course as summative assessment.
Assessment for learning: high involvement of students in the process of judging and evaluating learning.

 









There is some very good practical advice on how to use it:

If the person was a good friend, they would give a false grade, an inflated grade, but they would give rich constructive comments on how to improve. On the other hand, if the person was a stranger but in the same class, the student would give an accurate grade, but only unhelpful, very vague and very general comments. What this suggests to us is if you want good quality feedback, let peers, let friends mark each other - but don't ask them to grade it, just ask them to comment on it.
On the other hand, if you actually want to make use of what a peer thinks about a student's work, make sure it's anonymous, it's a stranger, and you can use a holistic grade, but don't expect any meaningful feedback to the student based on this peer mark.


Week 6: Guidelines for developing and using procedures that involve students in assessment


Generalizability theory researchers have shown that to get reliable scoring of student writing, for example, you need anywhere between three and five pieces of writing, judged by anywhere between three and seven judges.

For example, chemistry students write chemistry questions, and other chemistry students take the questions, answer them, and then comment on their quality, as a way of learning from each other by testing each other.
