Sunday, 31 August 2014

Pointless marking?

This post is written in response to a "Thunk" from @TeacherToolkit - see here.

What's the point in marking?
Perhaps a reason that it seems nobody's answered this 'Thunk' before is that it's a bit obvious; we all know one of the basic tasks in a teacher's workload is to mark stuff. When non-teachers go on about long holidays, only working from 9 till 3 and all the standard misconceptions, teachers will universally include marking in the list of things that take up time around the taught lessons. However, if we put the preconception that marking is just a part of a teacher's being to one side, what is the actual point of it? Who gains from all this time spent? Do we do it because we want to, have to or need to? Also, is it done for the students or for the teacher?

What if we all stopped marking?
I'm a fan of thought experiments, so let's consider a system where there is no marking at all - what would we lose? Let's take it slightly further for a second - no assessment at all by the teacher.

For the sake of argument, with no marking or assessment the teacher's role would look something like this:


At the end of each lesson the teacher would have to decide what to teach in the next lesson based on an assumption of what's been understood. Here you would need to consider the fact that an intended lesson goes through a series of filters between inception, planning, and delivery, and then again from delivery to reception and recall:


Filters from intent to recall…
1) The original intention becomes the actual plan, filtered by what's possible given the constraints of timetable, school, students, staff, resources, etc.
2) The planned lesson becomes the lesson actually delivered, filtered by real life on the day: something not quite going to plan, students not following the expected route, behaviour issues, interruptions, the teacher's state of mind, detail of choices on the day, etc.
3) The lesson delivered becomes the lesson actually received, filtered by prior knowledge, attention levels, language/numeracy skills, cognitive load, method of delivery, etc.
4) The lesson as received becomes the lesson recalled, filtered by the influence of other factors such as other lessons/happenings after the event, levels of interest, and so on.

You will also see that I've separated the last three stages between the Teacher's view and the Student's view. This is important - the teacher with deep subject knowledge, knowledge of the original intention and plan, and sight of a bigger picture for the subject is likely to perceive the lesson in a different way to the students. In fact the 'Student's perspective' row should really be multiplied by the number of individual students in the class as the experience of one may well be very different to others. (Also note for reference that if the lesson is observed then there would need to be a whole extra row to cover the observer's point of view, but that's another discussion altogether...) Basically what I'm saying here is everyone in the lesson will have their own unique perspective on the learning that took place in it.

How accurate are your assumptions?
As a teacher delivering lessons with no assessment and no marking you would have to rely entirely on your assumptions of what the students receive and recall from each lesson. An inaccuracy in one lesson would likely be compounded in the next until the intended learning path is left behind entirely over a period of time. I'd suggest only the most arrogant of teachers would attempt to argue that they could keep a whole class on track and keep lessons effective without any form of marking or assessment, and frankly they'd be wrong if they tried.

Open loop control
Basically without assessment and without marking, we are using what would be called an open loop control system in engineering terms. A basic toaster is an example of a device that uses open loop control. You put the bread in and it heats on full power for a period of time, and then pops up. The resulting toast may be barely warm bread, perfect toast, or a charred mess. The toaster itself has no mechanism to determine the state of the toast, there is no feedback to tell the toaster to switch off before the toast begins to burn. To improve the system we need to close the loop in the control system; we need to observe the toast and take action if it's burning. Closed loop control is really what we want, as this uses feedback to adjust the input, which takes us to the Deming cycle...
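Before we get to Deming, if it helps to see the difference concretely, here is a minimal Python sketch of the two toasters. The browning rate and numbers are invented purely for illustration; the point is that the closed-loop version observes the toast and acts on what it sees.

```python
# Open-loop vs closed-loop toasting - all numbers are invented for illustration.

def toast_open_loop(timer_seconds, browning_per_second=0.9):
    """Heat for a fixed time with no idea how the toast actually looks."""
    browning = 0.0
    for _ in range(timer_seconds):
        browning += browning_per_second
    return browning  # could be warm bread, perfect toast, or charcoal

def toast_closed_loop(target=100.0, browning_per_second=0.9, max_seconds=600):
    """Observe the toast every second and stop as soon as it reaches the target."""
    browning = 0.0
    for second in range(max_seconds):
        if browning >= target:        # feedback: check the output...
            return browning, second   # ...and act on it
        browning += browning_per_second
    return browning, max_seconds

print(toast_open_loop(timer_seconds=60))   # whatever 60 seconds happens to give
print(toast_closed_loop(target=100.0))     # stops when the toast is actually done
```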

Deming cycle = Plan, Do, Check, Act (PDCA)
Dr W. Edwards Deming pioneered the PDCA cycle in the post-WW2 Japanese motor industry. His work on continuous improvement and quality management has become widespread across engineering sectors, and he is generally regarded as the father of modern quality management.

PDCA is simply a closed loop cycle, where you Plan something, Do it, Check if it did what you wanted it to, and then Act in response to your checking to develop things further. The ideal is this then leads into another PDCA cycle to deliver another improvement, with feedback being sought on an ongoing basis to adjust the inputs.
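To make that loop explicit, here is a toy Python sketch of PDCA applied to a sequence of lessons. The 70% "take-up" rate and all the values are invented; the point is simply that each Check feeds the next Plan, so the gap shrinks cycle by cycle.

```python
# Toy PDCA loop - the 70% take-up rate and all values are invented.
def run_pdca(target=100.0, cycles=4):
    understanding = 0.0
    plan = target                        # Plan: aim to close the whole gap
    for cycle in range(1, cycles + 1):
        taught = 0.7 * plan              # Do: only part of the plan actually lands
        understanding += taught
        gap = target - understanding     # Check: how far short are we?
        plan = gap                       # Act: the next plan targets what was missed
        print(f"Cycle {cycle}: understanding {understanding:.1f}, remaining gap {gap:.1f}")

run_pdca()
```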

As I trained in engineering and became a Chartered Engineer in my career before switching to teaching, I have always seen a series of lessons as a series of PDCA cycles. I plan a lesson, I deliver it, I find some way to check how effective it was, and I deliver another one. In my best lessons I manage to incorporate a number of PDCA cycles within the lesson, adjusting the content/activities in response to the progress being made.

Marking helps us to create a closed loop system.
The model with no marking or assessment is open loop. It would rely so heavily on making assumptions about what had or hadn't been learnt that it would become ineffective very quickly for the majority of classes.

By reviewing what students have actually done in a lesson we can determine how effective our teaching has been. We can make adjustments to future lessons, or we can provide guidance and feedback direct to the student to correct misunderstandings. (Note that there can be a vast difference between what has actually been done and what we think has been done, both at an individual and a class level.)

As a result of this need to close the loop an absolutely vital role for marking is to provide feedback to the teacher on the impact of their lessons. (As John Hattie says - "know thy impact").

Is it regular enough?
Note that if marking is the only form of feedback a teacher gets then it needs to be done regularly enough to have an impact on their teaching. Between marking cycles the teacher is running an open loop system, with all the issues that this brings with it. As such we either need to mark regularly enough to keep the PDCA cycle as short as possible, minimising the time left with an open loop, or we need to build in some other form of assessment.

Other assessment
Gaining feedback within a lesson or within a marking cycle is where AFL in its truest sense comes in. Through assessment that takes place during lessons the PDCA cycle time is reduced right down - the teacher gets feedback outside of the marking cycle, meaning changes can be made either within the lesson or for the next lesson. I'm not going to discuss AFL in detail here as this post is about marking, but this is why AFL is so important, particularly if you have a long cycle time on your marking. (Note that for the purposes of this discussion I'm drawing a distinction between AFL techniques deployed in lesson with students present and marking where a teacher is reviewing work while the students are elsewhere - I appreciate there can be and should be an overlap between AFL and marking; I'm just ignoring it right now.)

RAG123 shortens the closed loop
You may have seen my other posts on RAG123, if not see here for a quick guide, or here for all of my RAG123 related posts. I'm sure those of you that have seen my other posts will probably have been waiting for me to mention it!

For me the key thing that RAG123 does is to shorten the marking cycle time, and that's one of the reasons that it is so effective. By reviewing work after every lesson (ideally) you augment any AFL done in lesson, and can plan to make sure your next lesson is well aligned to the learning that took place in the previous one. More on RAG123 as formative planning is in this post.

Marking for the student
I'm guessing by now that some of you will be getting frustrated because I've hardly mentioned the other purpose of marking - giving feedback to the student... After all teaching is all about learning for students!

From a student's perspective I think marking can be about many things depending on their relationship with school, that subject or that teacher. Sometimes it's about checking they've done it correctly. Sometimes it's about finding out what they did incorrectly. Sometimes they engage deeply, sometimes they dismiss it entirely (or appear to).

If we go back to closed loop vs open loop control for a moment then a lack of marking leaves the students functioning in an open loop system as well as the teacher. In engineering terms their control system needs feedback, otherwise they could go off in a direction that is nowhere near correct. Just like a tennis player benefits from the input of an expert coach to help them to develop their game, a student benefits from the input from an expert to help them develop their learning.

Hit and miss
In truth though I think marking as a direct form of feedback to a student is far more hit and miss than teachers using it for feedback on their own practice. Depending on the quality of the marking and the level of engagement from the student it could range from really informative to utterly pointless. Sometimes the best students are given poor feedback, or the least engaged students fantastic feedback; arguably both are pointless. Also what seems like fantastic and detailed feedback from a teacher's (or observer's) perspective could easily be ignored or misunderstood by a student.

This potential for ineffective marking/feedback is why it is so important to try and establish dialogue in marking; again we're looking for a feedback loop, this time on the marking itself. However I'm keen to highlight that in my view dialogue doesn't always have to be written down. Discussion of feedback verbally can be much more effective than a written exchange in an exercise book, just like a face to face conversation can be more effective and result in fewer misunderstandings than an e-mail exchange.

In summary
To get back to the original question... The point of marking is to give the teacher feedback on their lessons, and to give students feedback on their learning. Both are vitally important.

The best marking has impact on the students so it changes what they do next. Good marking highlights to students that what they do is valued, highlights aspects where they have succeeded, and areas/methods to help them improve. 

However the very best marking should also have impact on the teacher and what they do next. It's not a one way street, and we have a responsibility as professionals to adjust our practice to help our students maximise their learning. For example perhaps Kylie needs to develop her skills at adding fractions, or perhaps Mr Lister needs to try a different way of describing fractions to Kylie so she understands it more fully.

In short, if you are marking in a way that doesn't change what you or they do next then you're wasting your time...

This is just what I think, of course you're welcome to agree or disagree!

Saturday, 12 July 2014

Managing with colours - SLTeachmeet presentation

These are the slides I presented at #SLTeachmeet earlier today. Click here



The info shared in the presentation picks up on aspects covered in these posts:
Using measures to improve performance

Using seating plans with student data

RAG123 basics

As always, feedback is welcome...




Saturday, 14 June 2014

Powerful percentages

Numbers are powerful, statistics are powerful, but they must be used correctly and responsibly. Leaders need to use data to help take decisions and measure progress, but leaders also need to make sure that they know where limitations creep into data, particularly when it's processed into summary figures.

This links quite closely to this post by David Didau (@Learningspy) where he discusses availability bias - i.e. being biased because you're using the data that is available rather than thinking about it more deeply.

As part of this there is an important misuse of percentages that as a maths teacher I feel the need to highlight... basically when you turn raw numbers into percentages it can add weight to them, but sometimes this weight is undeserved...

Percentages can end up being discrete measures dressed up as continuous
Quick reminder of GCSE data types - discrete data comes in chunks; it can't take values between particular points. Classic examples might be shoe sizes, where there is no size between 9 and 10, or favourite flavours of crisps, where there is no mid point between Cheese & Onion and Smoky Bacon.

Continuous data can have subdivisions inserted between values - for example, a measure of height could be in metres, centimetres, millimetres and so on; it can keep on being divided.

The problem with percentages is that they look continuous - you can quote 27%, 34.5%, 93.2453%. However the data used to calculate the percentage actually imposes discrete limits to the possible outcome. A sample of 1 can only have a result of 0% or 100%, a sample of 2 can only result in 0%, 50% or 100%, 3 can only give 0%, 33.3%, 66.7% or 100%, and so on. Even with 200 data points you can only have 201 separate percentage value outputs - it's not really continuous unless you get to massive samples.
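A quick way to see this discreteness is to list the only percentages a given sample size can actually produce - a throwaway Python snippet, nothing more:

```python
# The only percentage values a sample of size n can actually produce.
def achievable_percentages(n):
    return [round(100 * k / n, 1) for k in range(n + 1)]

for n in (1, 2, 3, 5, 10):
    print(n, achievable_percentages(n))

# n=1 -> [0.0, 100.0]
# n=2 -> [0.0, 50.0, 100.0]   (an "80% target" can only be undershot or overshot)
# n=3 -> [0.0, 33.3, 66.7, 100.0], and so on.
```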

It LOOKS continuous and is talked about like a continuous measure, but it is actually often discrete and determined by the sample that you are working with.

Percentages as discrete data make setting targets difficult for small groups
Picture a school that sets an overall target that at least 80% of students in a particular category (receipt of pupil premium, SEN needs, whatever else) are expected to meet or exceed expected progress.

In this hypothetical school there are three equivalent classes, let's call them A, B and C. In class A we can calculate that 50% of these students are making expected progress; in class B it's 100%, and in class C it's 0%. At face value Class A is 30 percentage points behind target, B is 20 ahead and C is 80 behind, but that's completely misleading...

Class A has two students in this category; one is making expected progress, the other isn't. As such it's impossible to meet the 80% target in this class - the only options are 0%, 50% or 100%. If the whole school target of 80% accepts that some students may not reach expected progress then by definition you have to accept that 50% might be on target for this specific class. You might argue that 80% is closer to 100% so that should be the target for this class, but that means this teacher has to achieve 100% where the whole school is only aiming at 80%! The school has room for error but this class doesn't! To suggest that this teacher is underperforming because they haven't hit 100% is unfair. Here the percentage has completely confused the issue, when what's really important is whether these two individuals are learning as well as they can.

Class B and C might each have only one student in this category. But it doesn't mean that the teacher of class B is better than that of class C. In class B the student's category happens to have no significant impact on their learning in that subject, they progress alongside the rest of the class with no issues, with no specific extra input from the teacher. In class C the student is also a young carer and misses extended periods from school; when present they work well but there are gaps in their knowledge due to absences that even the best teacher will struggle to fill. To suggest that either teacher is more successful than the other on the basis of this data is completely misleading as the detailed status of individual students is far more significant.

What this is intended to illustrate is that taking a target for a large population of students and applying it to much smaller subsets can cause real issues. Maybe the 80% works at a whole school level, but surely it makes much more sense at a class level to talk about the individual students rather than reducing them to a misleading percentage?

Percentage amplifies small populations into large ones
Simply because percent means "per hundred" we start to picture large numbers. When we state that 67% of books reviewed have been marked in the last two weeks it conjures up images of 67 books out of 100. However that statistic could have been arrived at having only reviewed 3 books, 2 of which had been marked recently. The percentage gives no indication of the true sample size, and therefore 67% could hide the fact that the next step up is 100%!

If the following month the same measure is quoted as having jumped to 75% it looks like a big improvement, but it could simply be 9 out of 12 this time, compared to 8 out of 12 the previous month.  Arithmetically the percentages are correct (given rounding), but the apparent step change from 67% to 75% is actually far less impressive when described as 8/12 vs 9/12. As a percentage it suggests a big move in the population; as a fraction it means only one more meeting the measure.

You can get a similar issue if a school is grading lessons/teaching and reports 72% good or better in one round of reviews, and then sees 84% in the next. (Many schools are still doing this type of grading and summary, I'm not going to debate the rights and wrongs here - there are other places for that.) However the 72% is the result of 18 good or better out of 25 seen, and the 84% is the result of 21 out of 25. So the 12-percentage-point jump is due to just 3 teachers flipping from one grade to the next.

Basically when your population is below 100 an individual piece of data is worth more than 1% and it's vital not to forget this. Quoting a small population as a percentage amplifies any apparent changes, and this effect increases as the population size shrinks. The smaller your population the bigger the amplification. So with a small population a positive change looks more positive as a percentage, and a negative change looks more negative as a percentage.
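To put a number on that amplification, here is a small illustrative snippet showing how many percentage points a single book, lesson or student is worth at different population sizes:

```python
# How many percentage points does ONE data point move the figure by?
for population in (3, 12, 25, 50, 100, 200):
    print(f"population {population:>3}: one item is worth {100 / population:.1f} percentage points")

# With 12 books, one book is worth 8.3 points (the 67% -> 75% "jump" above);
# with 25 lessons, one lesson is worth 4 points, so 3 lessons give a 12-point swing.
```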

Being able to calculate a percentage doesn't mean you should
I guess to some extent I'm talking about an aspect of numeracy that gets overlooked. The view could be that if you know the arithmetic method for calculating a percentage, then so long as you do that calculation correctly the numbers are right. Logic follows that if the numbers are right then any decisions based on them must be right too. But this doesn't work.

The numbers might be correct but the decision may be flawed. Comparing this to a literacy example might help. I can write a sentence that is correct grammatically, but that does not mean the sentence must be true. The words can be spelled correctly, in the correct order and punctuation might be flawless. However the meaning of the sentence could be completely incorrect. (I appreciate that there might be some irony in that I may have made unwitting errors in this sentence about grammar - corrections welcome!)

For percentage calculations then the numbers may well be correct arithmetically but we always need to check the nature of the data that was used to generate these numbers and be aware of the limitations to the data. Taking decisions while ignoring these limitations significantly harms the quality of the decision.

Other sources of confusion
None of the above deals with variability or reliability in the measures used as part of your sample, but that's important too. If your survey of books could have given a slightly different result if you'd chosen different books, different students or different teachers then there is an inherent lack of repeatability to the data. If you're reporting a change between two tests then anything within the test-to-test variation simply can't be assumed to be a real difference. Apparent movements of 50% or more could be statistically insignificant if the process used to collect the data is unreliable. Again the numbers may be arithmetically sound, but the statistical conclusion may not be.
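If you want a feel for how much a small sample can bounce around, here is a rough simulation sketch. It assumes, purely for the sake of argument, that every book independently has a 2-in-3 chance of meeting the standard, then redraws a 12-book sample ten times:

```python
import random

# Rough illustration: assume each book independently has a 2/3 chance of
# meeting the standard, then see how a 12-book sample varies between draws.
random.seed(1)

def sampled_percentage(n_books=12, true_rate=2/3):
    marked = sum(random.random() < true_rate for _ in range(n_books))
    return round(100 * marked / n_books)

print([sampled_percentage() for _ in range(10)])
# The underlying department hasn't changed at all, yet the reported
# percentage can differ noticeably from one sample to the next.
```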

Draw conclusions with caution
So what I'm really trying to say is that the next time someone starts talking about percentages try to look past the data and make sure that it makes sense to summarise it as a percentage. Make sure you understand what discrete limitations the population size has imposed, and try to get a feel for how sensitive the percentage figures are to small changes in the results.

By all means use percentages, but use them consciously with knowledge of their limitations.


As always - all thoughts/comments welcome...

Saturday, 25 January 2014

RAG123 as formative planning

I'm at risk of becoming a bit evangelical about RAG123 marking, but despite repeated pleas for balance to my enthusiasm I've still never EVER found anyone who has tried it and not found it beneficial to both them and their students, nor anyone who has found it a drain on their workload. Also so far nobody who has tried it has told me they are stopping! (Or maybe they've both stopped RAG123 and stopped talking to me too!)

If you don't know what I'm talking about with RAG123 then see my earlier posts here, here and here. And also this post from Mr Benney (@Benneypenyrheol), who is almost as enthusiastic about RAG123 as I am!

Without a doubt for me, moving from what I would have classed as "good" marking every 2-3 weeks with formative comments to daily RAG123 has improved my practice, AND reduced my overall marking & planning workload.

Formative marking becomes formative planning
A really important aspect is that having seen the books after one lesson I can react in the next one. I change/tweak plans, I target questions, I provide extra support, I revisit topics, I go and have a chat with specific students.

The students know I am responding to what they did in the last lesson - they link my responses in the next lesson to their actions in the last, and this also shapes their actions in the next.

I'm finding that I don't need to give a worked example and model it as part of marking - this can be done as a natural part of discussions in my next lesson, writing it while marking serves no extra benefit.

I also don't necessarily need to write specific questions in response in their books - so long as I react in the lesson content and they know what they should be doing next (but that doesn't always need to be written down while marking). Yes this brings with it some issues in terms of evidence of feedback when doing a book trawl, but come and talk to my classes - I challenge anyone to find a student that doesn't know what they need to do to improve.

So they know what to do to improve, how does RAG123 do that?
Is it possible that all students really need to DO to improve is "put enough effort in and try your best to achieve the learning objectives in each lesson" and perhaps "let your teacher know when you haven't understood something"?

If I write in a student's book "you need to improve your algebra skills" exactly how does it help them really? If they struggle with algebra then just telling them that they need to improve it is basically just stating the obvious and misses a vital element of how to improve.

If I correct some work and give a model answer without other explanations/support, etc then that could just be showing them how to solve that problem, not addressing the misconception that caused the original error.

However if I am shaping lesson tasks to help them with their algebra skills then really what they need to do to improve is to put as much effort as they can into the tasks that I am guiding them through. They know that so long as they are putting in plenty of effort and trying to do the tasks I am planning they will make progress, because the plans are based around what they have shown me they can do and therefore what they need next.

Therefore I believe the single most important bit of feedback in RAG123 is the effort grade (RAG for us, but others do it differently). That's the bit the student has direct control over. If they are trying hard then it's up to me to design tasks that maximise learning. However if they're not trying hard then the key step for improvement may well be just try harder first.

I do give them other pointers too, they do get written comments, but often they are very short, and backed up with verbal discussion.

Errors don't accumulate
The most significant thing for me with RAG123 is I can see clearly if a class, a group of students or an individual is working at the level I expect for them. If not I can do something about it. However with RAG123 I do something about it next lesson, not in 2 weeks when the misconception has compounded to make bigger errors, or when the student has completely forgotten the thought process that took them there. Really importantly it's before we move off a topic, so I can re-teach or revisit aspects before leaving them.

For example I could draw the following graph for understanding over time with "traditional" marking - by this I mean big detailed feedback done once every 2-3 weeks.

Before you say it, yes I know learning is non-linear, but let's think of this as a rhetorical picture!

The important part is that with traditional marking, corrections must be big if the student drifts from the intended understanding. I may be exaggerating a bit and it may well be that the traditional marking does close any gaps perfectly between intended and acquired understanding, but equally it may not fully close the gap, simply because it's not timely enough. It's also harder to then check again that the gap has closed - does that happen in the next 2-3 week cycle? That may be too late.

By contrast a RAG123 graph might look like this


Obviously I'm idealising - I like RAG123! But doesn't it make sense that lots of little corrections stand much better chance of getting back or staying on target than fewer bigger ones?
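Purely to illustrate the "little and often" point, here is a toy simulation of a student whose understanding drifts a little each lesson and is partially pulled back whenever marking happens. The drift and correction numbers are invented; the shape of the result is what matters.

```python
import random

# Toy model: understanding drifts each lesson; marking closes most of the
# current gap when it happens. All numbers are invented for illustration.
def average_gap(lessons=30, mark_every=1, correction=0.7, seed=0):
    random.seed(seed)
    gap, total = 0.0, 0.0
    for lesson in range(1, lessons + 1):
        gap += random.uniform(0, 1)       # a little drift away from the intended path
        if lesson % mark_every == 0:
            gap *= (1 - correction)       # marking pulls the student back towards it
        total += gap
    return round(total / lessons, 2)

print("average gap, marked every lesson:    ", average_gap(mark_every=1))
print("average gap, marked every 10 lessons:", average_gap(mark_every=10))
```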

Will it work for any subject?
I honestly don't see why not. Most people I talk to about this respond initially with "well it might work for maths but we mark things differently." Actually I know people using it in science, RE, English, MFL - how much more different can you get?

If you think it might work then why not try it? If you think it won't work why not try it anyway - within a week you'll know your students better than ever!

Actually you don't even have to do it daily - every 2 lessons or every 3 lessons with self assessment every lesson would still be beneficial.

I should stop sounding quite so much like a salesman on this - and my next post will be on something other than RAG123 - I promise!!

However, seriously, if you don't like RAG123 please let me know - I'm keen to understand its limitations and put some balance to my sales pitch on this!

Sunday, 18 August 2013

Some pieces of paper are more important than others

I think I've started to write this post or something like it about 5 times, but I either scrap it or it becomes something else during writing. It's done it again to some extent, and the message I thought I'd sat down to write about was very different to the final post that I've just re-read. That's one of the fascinating things about writing a blog - you think you have this amazing, powerful argument for something groundbreaking and then you write it down to realise it's a bit of a generic rant. Hopefully this has turned into something more worthwhile in the writing!

An old assembly
After finishing my A-levels I did a year out in industry before going to University. As part of the training I did during this year I went back to one of my old schools to do an assembly for the students about to start their GCSEs.

The assembly went roughly like this:

I held up a £10 note and asked the students what I was holding - obviously they all said "ten pounds" (hardly a challenge). We then got into a bit of a discussion about the fact that it was actually just a piece of paper, but society had attached an extra value to the specific printing and format of that piece of paper. If I took another piece of paper and wrote £10 on it then the value doesn't follow it. The message I was trying to send from this is that some pieces of paper are more important than others as society attaches a particular value to them.

Having then quickly established that all of the students (and the watching staff) wanted to have lots of £10 notes in their futures, the students suggested that the best way to do this was to get a good job, and we linked that to good qualifications. From there I steered the discussion towards my newly received A-level certificates. Again I was emphasising that they are all just pieces of paper, but the information on them and what it means to society makes them more important than some other pieces of paper. These certificates (pieces of paper) had helped me to secure a university place, and my plan was that this in turn would allow me to get a good job and earn lots of £10 notes.

In summing up I tried to emphasise the need for the students to work hard in their new school to maximise the value of all of their bits of paper to give them the best chance of having lots of £10 notes.

Important bits of paper stand up to time
I'll fully acknowledge that the assembly I've just described was a bit clumsy and heavily contrived - I was only 19 years old after all! However at various points in my life, and even more so since I have been a teacher, I keep coming back to the fact that we build our society around certain pieces of paper.

Beyond the A-levels I was able to talk about when I was 19, I now have my degree certificates, chartered engineer accreditation, teaching qualifications, marriage certificate, birth certificates of my children, and I could go on. Each one is technically just a piece of paper, but in its own way has a value way beyond that. Critically these pieces of paper endure over time and stand as proof of certain events even if the detail of the events themselves fades from memory.

For example, I have a degree in mechanical engineering from the University of Bristol. To get this I studied topics such as thermodynamics, fluid mechanics, solid mechanics, systems and control, electronics, and several more that don't spring immediately to mind. I took exams in these topics at various points during the course, and I graduated in 1999 with a certificate stating that I achieved first class honours.

I understood the course well at the time and given my results it is clear that I was able to demonstrate this in an exam; however, if you now presented me with almost any of the questions I faced back then I would struggle. I can recall some small bits of information on some topics, but the vast majority has been forgotten because I stopped using it after finishing my degree (even though I immediately took up a job as an engineer).

Importantly though the qualification represented by my degree certificate (piece of paper) doesn't fade with time - it still has the same impact and value on every job or course application form I have ever completed. While my school qualifications, degree certificates and professional accreditations didn't actually secure any of my engineering or teaching jobs on their own, they were there as evidence when I completed the application forms (more bits of paper) that got me shortlisted.

What all of this demonstrates is that though my working knowledge of the subjects I've earned certificates for may have diminished, I have hard proof that I achieved a specified standard at some point in the past. This proof (piece of paper) has a value in society much greater than the detail of the topics needed to gain the certificates in the first place.

Hard evidence
What I'm getting at is that I think it is important to remember that when a student leaves our care the key thing they take with them as hard evidence of our impact on their life is the exam certificate we help them to achieve. This will be the case so long as society attaches a long term value to these pieces of paper.

Yes schools do so much more than just create exam results, and of course it's not always just about the grade, but grades are often the core currency of a student's future. Of course I hope that I encourage students to be interested in maths, and to be the best people they can be in the future. However if I've not also maximised their grade using whatever tools I have available to me then I've let them down in some way.

In 5 years' time it is unlikely that anyone could tell based on hard evidence if that particular person was taught to the test or studied a broad and rich curriculum. They may also have just scraped their grade or just missed out on the grade above. The past student may well have forgotten how to add fractions, use the cosine rule and have thrown away their scientific calculator. They might have discovered that the trick I showed them in an act of desperation to solve a particular type of exam question stops working when they apply it in other contexts. Or they might realise why I forced them to derive that formula so that they knew where it had come from, even though it wasn't part of the exam.

Whatever route was taken to the qualification (be it educationally awesome or awful), the piece of paper it gives to the student is the thing that lasts the longest; it could well be of more future value to them than all the skills they used in order to earn it.

Most importantly
I intend to celebrate successes and learn from any disappointments. Then I'll consider how I can best help next year's cohort - their pieces of paper are still blank!

As always all thoughts welcome!

Saturday, 22 June 2013

Marking & feedback - a journey over 18 months

February 2012 saw an internal school departmental review for the maths department. In that review the following observations were made:

  • Marking, though regular, too often failed to give constructive feedback identifying next steps to improve.
  • Use of formative and summative assessment to inform pupils of next steps for learning is inconsistent.
Evidence for these statements was drawn both from reviews of books and from pupil voice surveys, which were quite often negative towards maths.

A fairly poor situation really. I'd been HoD for 5 months at that point and had come to the same conclusion before the SLT review wrote these findings.


Time for a change
Since then, we've developed the standardised feedback and marking process I've already written about in this post. We have also developed and embedded the formative feedback from summative assessments discussed in this post. We've also done things like buy "Verbal Feedback" stamps to record when we've had a formative conversation with a student.

This wasn't plain sailing - in October 2012 a review of exercise books still showed massive variability in practice. However we've continued to push, developing the systems together and always looking for a better way to do it.

Departmental reviews through 2013 have shown a steady improvement in consistency.

The payoff
The school held a work scrutiny event 2 weeks ago where students from years 7, 8, 9 and 10 were selected to review their work with the SLT and a retired HMI acting as a consultant to the school.

In every year group maths came out as the most regularly, consistently and constructively marked. Pupil voice about the maths marking was unbelievably positive, there wasn't a single negative point directed towards maths.

Not showing off
I'm not looking to boast about this - this blog isn't about self promotion.

I'm quite simply over the moon that the hard work the department has put in to improving feedback and marking has had such a marked impact. It demonstrates that improvements may take some time to embed in a teaching environment, but by working out a system that helps everyone to be at least "good" it is possible to make a real difference.

I want to say thank you to my department - they have come such a long way and put in loads of work to achieve this result. I also want to thank the students who have responded so positively.

For me this is the kind of thing that makes being a HoD so rewarding. I'd love to hear if you have had a similar journey...

Monday, 3 June 2013

Better assessments & better use of assessments

This has been sparked by a recent review & rescheduling of assessments in our schemes of work and also some excellent work on better assessments for maths kicked off recently by @mjfenton. More details of his excellent initiative can be found here. It's a great move and one that I really want to get more involved with over the coming weeks.

I've just read back through this before posting and realised it's a bit of a rant - didn't intend it to be, but it's important - I hope you agree. Please let me know your thoughts, make comments, share any ideas and suggestions you have.

Please note for the purposes of this post when I'm talking about assessments I really mean some kind of test or exam. I'm not talking about other types of assessment for learning that are used more dynamically as part of a lesson (e.g. targeted questioning, mini-whiteboard feedback, etc). The vast majority of schools use some kind of summative tests under controlled or exam type conditions at various points during the year, and these are what this post is referring to...

What are these assessments really for?
Unless we're talking about a FINAL assessment that results directly in the exam certificate, diploma or whatever's appropriate to the end of the course then ALL assessment should be used FOR LEARNING.

To clarify this let's just consider what other possible uses there could be for doing an assessment...
  • Assessment for reporting (to parents, leadership, whoever)
  • Assessment for categorisation/sorting/ranking of students
  • Assessment for teacher performance management
  • Assessment for occupying the student's time
Alright the last one is a bit tongue in cheek, but let's face it, if there is no formative or learning outcome then none of these look like a good reason to take time away from lessons to complete an assessment (where learning progress may be measured, but is not made).

What's really important is that the assessment allows you to learn something about the student that you didn't previously know - otherwise why spend the time doing it?

To put it another way: If you don't know how the output of an assessment will contribute directly to the learning of your students then you should ask yourself why you are asking them to do the assessment.

How can well intended assessments turn bad?
In my view the following are the six key ways that assessments may end up being bad. Unfortunately I can lay my hands on too many that fall foul of one or more of these - we're working to try and improve them, but it's a big job...
1) Questions set at a level that is inappropriate to the students
Either too hard or too easy. What does an assessment really tell you if an entire class scores upwards of 90%, or another class scores less than 10%? Does this kind of result help you or the students to identify how to improve in ways that couldn't have been identified without the test?
2) Questions do not have an appropriate gradient of difficulty
Too big a change in difficulty will hide misconceptions in the gap between the levels. For more complex questions there should be a way for students to display partial knowledge that can be built on later through effective feedback.
3) Too many questions testing the same skill
Does the second, third, fourth question on a given topic give you any more information than the first about what the student knows? e.g. Do you need more than one question that uses Pythagoras's theorem?
4) Test is too long/short
Too long can overly penalise slow workers in terms of time required to complete it. Students can succumb to boredom and give up, meaning that the results and therefore feedback will be inaccurate.
Too short can have all of the features of (1) and (2) - vitally, does it give you or the student any information that you didn't have before the test?
5) Questions are too similar to a method or example given in class.
Does the question test true knowledge or simply the ability to reproduce something they've seen? (though sometimes assessing the ability to simply follow a method is important - this isn't always bad, but needs to be done deliberately rather than accidentally)
6) Viewing assessments as fixed part of a scheme of work/programme of study
Don't get me wrong - as a department leader I am keen that assessments are done as part of a planned, coordinated activity so that they can be used for departmental management as well as for learning. However we shouldn't get caught thinking that we have to do the same assessment at the same time every year because that's what we did before. All SoW or PoS should be live documents, being developed year on year to maximise learning. If improved learning means scrapping an old assessment or inventing/finding a new & better one then tradition shouldn't get in the way.

But the biggest problem of all... (and this can apply even if an assessment is "good" in all other aspects)
Only giving a grade, level, score or percentage as feedback
Basically this doesn't contribute to learning, and gives the student nothing constructive. It rewards the high achievers, it scolds the low achievers and doesn't tell them how to improve. As a result the assessment is a complete waste of your and the student's time, which could have been much better spent actively learning something.

As such there must be a formative aspect to all assessments and a feedback process planned that shapes both student activity and teacher planning for the future.

Students need the opportunity to reflect following assessments so that they can learn from them. This reflection needs to be diagnostic and give guidance on topics/aspects that need to be improved, so that if they were to attempt a similar test again in a few weeks they would be able to demonstrate progress (I'm not suggesting that this similar test needs to actually happen).

This opportunity for reflection should be given adequate time during lessons, and be structured and guided, especially for weaker students. I've talked about methods we've used for doing this in an earlier post here. We're constantly developing this, and I see it as a key tool for future successes.

Another similar approach from @justmaths is here and this also includes some really good next steps sheets that can be used to rectify gaps in understanding (something that I'm planning on adopting in the near future).

So are there Good or Bad assessments?
I'll stick my neck out a little here and suggest that there isn't really any such thing as a good or a bad assessment. Even an apparently dry and dull page full of repetitive tasks could be educationally useful for a certain student or groups of students if targeted appropriately and reviewed/responded to in an effective way.

What really makes an assessment good or bad is how it's implemented and how the information it gives is used to improve learning. Both of these are down to skilled teaching and having the right routines and feedback structures in place to support it.

Basically if the assessment is "bad" then don't blame the test paper - it's the teacher that has set it that needs a rethink. In some cases a re-write of the test paper may well be needed for that student or group, however in others it is simply about how the test paper is used. Similarly an apparently "good" assessment could be bad and therefore worthless if given to the wrong group of students, or if it's not used constructively.

The key point I'm trying to make:
Let's make better assessments, but more importantly let's make better USE OF assessments.

@ListerKev

Saturday, 11 May 2013

Things we've done to help year 11

This post has nothing to do with managing variability, other than that these are actions that have been taken across the department during the past year or so, and as such are shared core practice within our team... I just wanted to record all of the things we've done with our current year 11 that have contributed to a substantial improvement in results this year.

Some may seem obvious, others less so, but all have made a contribution.

1) Establish a progress measure and make it visible to staff and students
We started mock exams with full papers early in the academic year, assigning grades and sharing them with the students and staff. We then did these regularly throughout the year, progressing through different past papers. Early scores were low due to bits of missing content, but we could also identify gaps in knowledge of content we had already covered. All mocks were accompanied by formative feedback & self assessment as discussed in my earlier post. This is in addition to feedback given on book work, which is also the subject of an earlier post.

Tracking this data centrally made the department staff aware of where work was required with particular students. You can see the progress made at a headline level during the year below, but the data was held at an individual student and class level. We could cut it to look specifically at SEN, FSM or other raise groups as well as general progress:
(You may be thinking that these results don't look that impressive?
Yes, I know we're still below the target line, but we've not finished the year yet - we've got more students within reach of a C taking exams in June, which should close the gap and take us up to, and hopefully even past, the target line. I'm also keen to point out that it's not all about C grades either - we had a target of just 2 A* grades this year but have already recorded 10 so far.


I'll agree that we're not yet posting results at a level that would put us at the top of league tables, however the school has been posting results in the mid 60% range for the last 6 years, with similar target levels to this year! Therefore to take a step into the mid-high 70%, or even 80% range by the time the summer results are in will be a big improvement, and the best the school has ever delivered.)


We had actually done this regular mock process with this year group during year 10 in preparation for their earlier unit exams, so they were used to the idea of seeing their progress develop during the year.

As well as nicely sloping graphs we posted visual summaries of individual student performance vs target grades in classrooms and talked about a path to improvement. Note that because some students were concerned about publicly displaying low targets or low grades we didn't talk about actual grades, just position vs their personal target via colour coding. This is an example of what they look like:
By keeping these sheets visible in every classroom and doing the regular mocks we were emphasising that the important thing to see during the year is progress towards targets, not necessarily step changes to target. Students were always keen to see how their colours developed as the mocks progressed.

I acknowledge it's not neat & tidy and doesn't show a flawless progression from red to green/blue, but real data often isn't. However it is clear for the students to see the progress they and their peers have made during the year, which is substantial.
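For anyone curious how the colour coding could work mechanically, here is a hedged sketch. The grade ladder, thresholds and student names are invented; the idea is simply that the display shows position relative to each student's personal target rather than the raw grade.

```python
# Illustrative colour coding of latest mock grade vs personal target.
# Grade ladder, thresholds and names are invented for this sketch.
GRADES = ["U", "G", "F", "E", "D", "C", "B", "A", "A*"]

def colour_vs_target(latest, target):
    gap = GRADES.index(latest) - GRADES.index(target)
    if gap >= 1:
        return "blue"    # above personal target
    if gap == 0:
        return "green"   # on target
    if gap == -1:
        return "amber"   # one grade below
    return "red"         # two or more grades below

for name, latest, target in [("Student 1", "D", "C"),
                             ("Student 2", "C", "C"),
                             ("Student 3", "A", "B")]:
    print(name, colour_vs_target(latest, target))
```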

2) Carefully targeted revision support
Where groups of students shared a common need in terms of revision we re-grouped them to maximise focus on those areas of weakness - this happened within classes as part of differentiation, and also across classes where students were grouped with a particular teacher for a short time according to need.

We also selected some students for 1:1 withdrawal during lessons (including selected extraction from other subjects for those most in need), and also selected some for short 1:1 sessions during morning registration.

3) Be clear about what is required to reach or exceed targets
We used analysis of past grade boundaries and conversions to recommend minimum marks required to achieve both their target grade and the grade above. The students responded really well to knowing that they needed a particular score, and again this helped them to judge progress in mocks. e.g. if Johnny needed at least 55 for an overall grade B, and scored 34 then 45 in successive mocks he could see progression towards his personal target in a clearer way than two grade Cs would have.

4) Parental involvement
I've already mentioned parental involvement in homeworks in an earlier post, and this did help to push the visibility of maths as a subject at home and in class.

Communications home in addition to the homework information included notification of after school and half term/holiday revision sessions, early details of exam dates and expected equipment.

Ahead of exams revision packs of questions were sent home with students, but the answers were e-mailed to parents to help the parents help the students.

Parents have been really positive about the level and types of information that has been sent home.

5) Maximise access to revision materials
We offered revision guides and revision DVDs for sale at a reduced cost via the school at various points during the year. Approximately 65% of the year took us up on this. We also regularly shared revision website information with the students.

6)  Use a range of revision lessons
This is still under development as we think of and find more ideas (and it will be the subject of a further, more detailed post), but once we get to revision time it is important to give students a varied diet of activities. Things we've used are:



Is this just teaching to the test?
No, it's not all about teaching to the test. However there comes a time when we have to seek to maximise the results that the students can deliver. In the long term it is both in their interest and in the interest of the school. 

Anything else?
In amongst this were other actions such as selected re-takes and some students changing from modular to linear exams, but with the changes to the exam structure in England this will not be possible in future years so it's not really worth discussing in detail.

What about next year and the switch to linear exams?
Our approach is intended to be very similar. We have already started using full GCSE papers with our current year 10 and will use the same tools for sharing the data to show progress through the year. Early indications suggest we're starting in a similar place on the curve as the current year 11. We might selectively enter some in November if we think they will benefit from it, but we'll hold off if there is a chance that it means they might not achieve their full potential in the end.

Any thoughts?
I'm keen to know if you've done anything similar, or different that has a beneficial effect to your students. Any suggestions for revision lessons? How are you managing the change to the linear specification?

Saturday, 20 April 2013

Giving summative tests formative impact

Not quite so directly related to managing variability this one, but it is a useful approach and it certainly does help if the whole department is doing it!

The problem
Summative tests such as the completion of mock exam papers just result in a grade. The grade is really useful for departmental level tracking, but the grade alone doesn't give the students an indication of what they need to do to improve.

This is the classic formative vs summative conflict that has been discussed at length in various forums - I'm certainly not the first to encounter it. There is a wealth and breadth of work on this, such as "Inside the Black Box" and other work by Dylan Wiliam, among many others.

So the challenge becomes to balance the need to do the summative tests to help predict grades and assess the performance of students, while at the same time making the result more meaningful than just a grade or level so that the students get something useful out of the tests as well.

The approach
I aimed to give the department and students a framework to help them diagnose where their performance  was strong or weak in a summative test so that it can be used in a formative way.

Perhaps it's due to my background but Excel came to the rescue again!

I created a sheet that allows student performance on each question or sub-question on a test to be entered. (The sheet is a bit clunky and may not be the most elegantly presented, but it works.) This is a relatively coarse measure - full marks gets a green, some marks gets an amber, no marks gets a red - and it's subjectively assigned by the teacher (or possibly by the student). This can take a little time, but data for a full class of 32 can normally be entered in less than 20 minutes once you get into the swing of it. Once filled in it looks like this... (I've chosen one with plenty of colour on it!)


Firstly this is quite a visual guide for the teacher. So in this example it is clear that Q5 & 16 were quite well completed but Q6, 11 and 13 were completely unsuccessful.

This can then be used to create summary graphs to help further analysis at a teacher level, like this...

However this still doesn't really give anything back to the students (useful for the teacher though). To make the leap to formative feedback the same sheet automatically creates a personalised feedback form for each student that looks like this:
The columns identifying "strong", "could improve" and "need to improve" correspond to the green, amber and red ratings respectively on the first sheet.

Now this becomes more useful as it clarifies the topic, sub topic or specific skill that the student needs to work on. Note that the descriptions assigned to each question are teacher generated & editable to give whatever level of detail needed.

In follow up lessons the students can then be guided to tasks that will help them to fill in knowledge in their weaker areas. Additionally the teacher has a clear record of where skills lie or are missing both at class and pupil level, and therefore has guidance for future planning.
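Our version lives in Excel, but for anyone who wants to see the structure of the idea in code, here is a hedged Python sketch of the same mechanism: per-question marks become a red/amber/green rating, which then feeds both a personalised feedback sheet and a class-level view of weak topics. All question numbers, topics and scores below are invented.

```python
# Sketch of the question-level RAG idea (our real version is an Excel sheet).
# All questions, topics and scores are invented for illustration.
from collections import defaultdict

QUESTIONS = [("Q1", "Adding fractions", 3),
             ("Q2", "Pythagoras", 4),
             ("Q3", "Solving equations", 5)]   # (question, topic, max marks)

SCORES = {"Student A": {"Q1": 3, "Q2": 2, "Q3": 0},
          "Student B": {"Q1": 1, "Q2": 4, "Q3": 5}}

def rag(score, max_marks):
    if score == max_marks:
        return "green"   # full marks -> strong
    if score > 0:
        return "amber"   # some marks -> could improve
    return "red"         # no marks   -> need to improve

def student_feedback(name):
    """Personalised sheet: topics sorted into strong / could improve / need to improve."""
    sheet = {"strong": [], "could improve": [], "need to improve": []}
    labels = {"green": "strong", "amber": "could improve", "red": "need to improve"}
    for question, topic, max_marks in QUESTIONS:
        sheet[labels[rag(SCORES[name][question], max_marks)]].append(topic)
    return sheet

def class_red_count():
    """Class-level view: how many students scored nothing on each topic."""
    reds = defaultdict(int)
    for name in SCORES:
        for question, topic, max_marks in QUESTIONS:
            if rag(SCORES[name][question], max_marks) == "red":
                reds[topic] += 1
    return dict(reds)

print(student_feedback("Student A"))
print(class_red_count())
```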

A further development of this general idea to give a greater emphasis on self assessment involves giving students time to review tests alongside a structure to help them to reflect and identify strengths and weaknesses. We use the following sheets as a template for the students to complete as a review, and then encourage them to identify 2 strengths and 2 weaknesses.
Depending on the group this sheet can be used alongside or instead of the red/amber/green one.

We've been using all of the above across Key Stages 3 and 4, and also in selected applications in Key Stage 5.

Other uses
I've also used the red/amber/green sheet to give feedback on written assignments and on presentations - the objectives & assessment criteria for the assignments can replace the question topics.

Nothing new
I'm fully aware that this isn't a massive innovation, many teachers review tests at a question by question level. What I'm not so sure of though is how many then use this information to form the basis of specific and personalised feedback to students.

I should also observe that I am aware that there are a great many schools where this type of thing isn't done at all for their summative testing, and as such they are missing an opportunity for some really useful feedback to both students and teachers.

The usefulness of these sheets and structures is in the fact that it is relatively easy to create good feedback that can be reflected and acted upon in follow up lessons.

Benefits - is it worthwhile?
This is one of many strategies we have been using in my department over the last 18 months.

At a basic level it has provoked useful and informed discussions with students about areas to improve. As well as being used for guidance in class, we have had students specifically take copies of these sheets home to use during independent study. Fundamentally the students like them and tell me and the department that they find them useful to shape their studies. If I saw no other benefit then this positive student message would be enough to encourage me that it was worthwhile. However we have a more tangible indication that the approach is working...

As part of a programme of regular mock exams with year 11 this feedback structure has allowed us to prepare almost the whole year group for early completion of their GCSEs in the March exams. Yes I know there are mixed views on early entry but our students were ready for these exams and the results they delivered prove this...

The results were published this week, and have already set a new school record for maths in terms of A*-C. Compared to other schools we scored 17 percentage points higher than the average of similar schools with that exam board, and 25 percentage points higher than the average of all schools with them. With the remaining students completing exams in June, along with some students now looking to improve, we are likely to deliver in the region of a 5-10 percentage point improvement in headline results compared to last year.

I'll not claim that this feedback approach made all of the difference, but it was a contributing factor in amongst everything else.

Any thoughts?
I'd be keen to hear if anyone has another way to crack this nut, or if you have any comments or questions. Leave a comment or come and find me on twitter: @ListerKev.