Thoughts on managing variability: March 2014

Saturday 29 March 2014

A RAG123 epiphany

Our PGCE student had a real lightbulb moment with RAG123 this week... It emphasised to me that it is way more than a quick way to get books marked...

(Out of respect for anonymity I will refer to the PGCE student as "PG" for the rest of this post.)

First lesson blues
It was PG's first lesson teaching a particular group, having only observed with them up to now. The group has a few characters in it who will always test out a new face in front of them. I wasn't observing the lesson myself but afterwards PG came to see me, clearly quite disappointed in what had happened.

In PG's view the group had not bought into the topic that had been taught. It was intended to be a lesson that picked up from earlier knowledge and then moved them into more advanced ideas. However PG felt that the students had just decided that their earlier knowledge was enough and they had rejected the more advanced method, and really not engaged with the lesson at all. This view was based on the verbal responses of a number of students during the lesson which shaped the general feel of the group (this "feel" was also backed up by the teacher who was observing PG teach).

I talked with PG about why they felt this way and what they could do to make sure that the next lesson was more successful. Now obviously part of this is a planning/experience thing for PG - at times like this it is important that the students can appreciate the need for the more advanced technique, so the earlier plan probably needed to include a way to demonstrate a flaw with the existing knowledge, creating a conflict that can be resolved with the new methods. We discussed that this might be where to start the next lesson. However I also encouraged PG to go and have a look at the books to get a wider view than just those more vocal students. (This group have been using RAG123 regularly since November so I was hopeful they would have given some comments/reflections.) 20 minutes later PG came back to see me...

Perception from the few, reality from the many
Having reviewed the whole class's books and RAG123 self assessments, PG's view of the class, and of the lesson, was vastly changed. It was clear from looking at the work actually done by the class that the majority of the students had engaged with the lesson much more successfully than either PG or the observer had thought.

I emphasise - the observer was an experienced teacher who had been present for the full lesson, and they had also formed the perception that the majority of the class had failed to engage with the learning. However the work done by the students proved completely the reverse - the majority HAD engaged; the perception was dominated by the vocal few who had been more resistant.

Vitally with this information PG is now able to plan activities for the next lesson that align with where the class actually ended the last one. This starting point is MUCH more advanced than would have been the case if the lesson plan had been based solely on the perceptions they ended the lesson with. It also allows a more effective demonstration of the gap between the students who were resistant and those who actually engaged but were less vocal.

Fundamentally PG has moved from feeling really disappointed about the progress of the whole class to being much more focussed on taking action to secure better engagement and progress for the relatively few individuals who shaped the overall perception. It's gone from a whole class issue to an issue involving only a few students.

More than marking, RAG123 is TEACHING
I've seen occasional sceptical comments about RAG123 (from those not using it), suggesting that teachers should use more than just looking in books to judge pupils' understanding or detect misconceptions. I completely agree with this perspective - I've never suggested that RAG123 should be used to the exclusion of other assessment methodologies. Adopting any one method to the exclusion of others is rarely a good strategy for anything.

What this tale really highlights though is that the perception left at the end of a lesson for both the teacher and the observer was VASTLY different to the reality as shown by the students' work. Clearly there are questions about use of AFL strategies during the lesson - perhaps the right plenary or assessment method would have shown this up; then again it might not. However there is no doubt that without RAG123 the next lesson for this class would have been MUCH less effective.

This is a demonstration that RAG123 isn't just marking. The strongest bit is that it gives a way to connect with the class that is based on fact (i.e. what have they actually DONE) rather than perception (i.e. how did you think the lesson went?). Shaping planning in this way means differentiation is more effective and less time is lost in either catching up or waiting for the perceived progress of the class to reach the desired level.

I know that PG is now completely sold on RAG123 as a way to make sure planning is effective....

All comments welcome as always

Saturday 22 March 2014

Why does RAG123 work?

I've been pondering for some time why RAG123 has such impact. I know I like it because I get a better understanding of my students, and can plan more accurately as a result. But to have such a big impact on progress (e.g. to have an effect size of 0.6 or better), there must be a number of reasons that it works. Having thought for a while I think it hits many fronts...

The cognitive science view
In his book "Why don't students like school?" Daniel Willingham (@DTWillingham) says:

"You want to encourage your students to think of their intelligence as under their control, and especially that they can develop their intelligence through hard work. Therefore you should praise process rather than ability." (page 183)

(I should point out that Willingham was referring to "slow learners" in that passage, but there is no logical reason that this would only apply to those below average - it's just that they are less likely to be praised for ability so have more need of an alternative.)

The effort gradings in RAG123 do exactly this - we acknowledge effort, even if the learning itself isn't successful. If the student is applying a good level of effort then it is the responsibility of the professional teacher to ensure that the learning experiences the student encounters are effective. Conversely without effort from the student even an otherwise perfect series of lessons will result in ineffective learning.

As such using RAG123 to raise the visibility/prominence of effort in the student's mind will encourage them to improve. They have full control over effort, they may not have full control over learning.

Similarly as a teacher, if you become clearly aware of a student that is trying hard but not making much progress you will be prompted to make a change; it forces you to be more reflective.

The visible learning view

I mentioned effect sizes above and this is most closely linked to the work of John Hattie, e.g. here.

If you look at the top end of Hattie's ordered list of effect sizes a further possible explanation of the power of RAG123 starts to emerge...

Source: http://visible-learning.org/hattie-ranking-influences-effect-sizes-learning-achievement/ (highlighting added by me)

The highlighted line items, for me, have some kind of link to what RAG123 does. Self reporting grades is relevant because the students are self rating their understanding of each topic/lesson. Formative evaluation is done as the RAG123 rating shapes what both the student and teacher do in the next lesson, that also links to Feedback. RAG123 also impacts on Teacher-student relationships as it gives a much closer and immediate dialogue on the work done, the level of understanding shown, or the barriers to learning that have been found. Meta-cognitive strategies are about students having an appreciation of how they are learning, and therefore about what makes them successful learners. With the lesson by lesson feedback enabling students to link actions from a particular lesson directly to the learning outcome RAG123 encourages this link. Finally the feedback that the teacher gets from RAG123 means that there is a clearer link between what students have done and the teaching strategies they have used. As such it increases the chance of selecting more effective teaching strategies in the future.

Of course I have simplified this a fair bit - just taking the headings from the Hattie rankings and making links to RAG123, which may be a little tenuous at times, but I think the basic logic is sound.

The teacher reflection view (i.e. it forces you to be a better teacher)

Using RAG123 forces a teacher to be more reflective. If a student is persistently putting in poor effort it becomes more obvious if you record that every day - it will prompt some kind of action. Similarly if a student simply isn't progressing despite effort then it's the teacher's job to fix it. If a lesson was really successful you have immediate feedback on it and can use that to roll into the next lesson, and the converse is true.

Just recently I have been trying to rate my own teaching with RAG123 (RAG for quality of planning - note this is distinct from quantity! 123 for success of the lesson). I'm not sure I've got the criteria nailed yet, and I do struggle to be 100% reliable in the assessment of planning quality. It's also hard to separate planning quality from lesson success. However the reflection itself is powerful.

I've not given myself any R1 ratings, but I have occasionally given G3, which then prompts me to consider if G3 is actually possible as surely a high quality plan must be successful? Otherwise it's not a high quality plan, or is it something else? I've also rated myself a few A1s, and again I start to wonder if the planning was really that average if the lesson was a success, but then how much better might it have been with better planning? Overall doing this has spurred me on to do more G level planning rather than A, simply because it prompts me to be more conscious of it.

Still no negatives

Finally I just want to observe that I've still NEVER found anyone who has tried RAG123 that doesn't like it and doesn't want to continue. But I do find loads of people who haven't tried it that are unsure that they could make it work. If you've never tried it then what's stopping you?

All feedback & comments welcome.

Saturday 15 March 2014

What can education learn from Quality Management?

Bit of a big post this - it's been evolving in my drafts file for a while now...

I learnt a lot about management and quality during 10 years working in the automotive industry. Due to the complexity of the products this is an industry that has been at the forefront of quality management since the 1950s. Cars are the most complicated consumer product in the world. They are built in vast numbers, require tens of thousands of components to work together when operated by relatively untrained drivers in a massive array of conditions. What's more failures have the potential to be both catastrophic and fatal. Also they are almost all built to a very tight budget, meaning that waste needs to be eliminated from all parts of the business to allow a car company to turn a profit.

The reliability of modern cars is truly gobsmacking, and it is due to the fact that the automotive industry are global leaders in the field of quality management. Versions of the approaches pioneered in the automotive sector are now deployed across manufacturing, and are even being used as models for business management in many non-manufacturing settings. However even the most forward thinking of these alternative applications are usually about 5-10 years behind the latest in the automotive world.

I'm sure having got this far you're now thinking "Kev's lost it, he's gibbering about cars, I thought he was a teacher and this was a blog about education?" Well you might be right! But I really do think there is much to learn about the idea of quality in education.

Don't panic, I'm not about to insist that schools are like production lines, propose time & motion studies or some hideous Fordian uniformity to classrooms; I guess I need to explain where this is coming from in terms of how quality management evolved... (but if you want to cut to the chase then scroll down to the "lessons for education" heading)

An evolution of practice

All quality assurance or management systems currently in place in any industry today have a lineage that traces back to the automotive sector in post war Japan. Before this point it was all about "quality control" - essentially inspection at the end of the production line to check that things were correct. This was ok, but it was only partially effective at catching all potential problems. You physically can't check for everything. As a result some things slipped through the net and caused issues in the field.

From control to assurance

To improve on the partially effective "control" system the emphasis shifted to "quality assurance". The premise was that rather than inspecting at the end of the line where tests were limited and rectifying errors was expensive and took a long time, why not do that inspection earlier in the process? Perhaps even before the parts are fitted? Or even delivered? The automotive firms pushed chunks of their inspection processes back up the production line all the way into their suppliers' factories. The idea being that if all the parts arrived certified as "good" then the resulting car must be "assured" to be good. They still inspected a smaller number at the end of the line, but quality had improved as faulty parts were often detected before they were fitted.

However there was still a level of variability inherent in the design. Humans build it for a start, and we make mistakes! Even certified parts have to be supplied to geometric tolerances that cause variations, and other physical, chemical or even biological variations can creep in when you make large numbers of components or large numbers of vehicles. (Biological - really? Well for example loads of car components are made from rubber, which is a natural product. It literally grows on trees! As such, until relatively recently when it has become better understood and controlled, rubber components on cars were subject to variability depending on the season in which the natural rubber was harvested! As another example the quality of some car paint finishes can be affected by the type and quantity of deodorant used by the operators working in the paintshop!)

Error proofing

The next development in quality was to start to manage it from the outset. To do things with the design that prevented errors, or make the performance of the completed vehicle tolerant to variability of its components. Japanese terms like "Poka-yoke" are now commonplace in car design - it means "mistake proofing" and helps to remove human errors on the production line. For example, if an operator has to connect 3 different electrical plugs in the same area of the car each plug should be designed such that it only connects to its correct socket, leaving no room for human error.

Assurance becomes management

By taking the focus away from inspection/control, wherever it is in the process, and looking in more detail to the systems and processes quality becomes managed rather than assured. This means designing OUT variability, designing IN error proofing, planning for quality from the very start of the process rather than applying it as an inspection at some point.

Continual improvement requires empowerment

However perhaps the most powerful thing to come out of the Japanese automotive industry was the concept of continuous improvement, and with it the empowerment of everyone in the business to make suggestions to improve the product. This is often referred to by the Japanese term "Kaizen" (literal translation "improvement" or "act of making bad points better"). Toyota used this word to brand improvement activities in its factories and visiting western engineers and managers adopted the word.

At the heart of Kaizen is a philosophy that improvement must be led from the top, but not directed from the top. Every worker in the factory has their part to play in the quality of the end product, and as such every worker has the right to make suggestions about how to improve it.

Vitally this includes the idea that the person that fits the brakes all day becomes an expert in fitting brakes. As such this person is very well placed to make suggestions about how to minimise errors when fitting brakes. This applies across the whole vehicle and as a result the shop floor assembly workers have a big voice in improving designs and optimising the processes.

LESSONS FOR EDUCATION
Firstly, we're currently broadly applying the "quality control" type of management. We inspect at the end of the process, both by assessing students' progress through final high stakes exams, and by the use of increasingly high stakes observations/assessments for teachers (I say increasingly high stakes due to imminent explicit linkage to pay structures in the UK), and high stakes inspections for schools.

The best schools will use more of a quality assurance model. "Good" practice will be embedded in school and departmental policies to reduce variability in practice between staff. However too often these are policed and enforced through inspection (e.g. observations, work scrutiny, learning walks). To take this on to the next level the structures need to be put in place to make good practice, and therefore success of the students, inevitable.

Developing systems in which good performance becomes inevitable can only come if the people doing the processes are inspecting them themselves. It becomes less about doing it well because you are being watched, and much more about not needing to be watched because you are watching yourself.

Ok that last paragraph sounds like a load of idealism, but if we pick up the Kaizen model in education and truly empower teachers to improve their practice they will feel a much greater ownership of it. By encouraging this ownership we make it much more likely that they will do it.

For example, who is best placed to formulate a marking policy that is workable for a teacher with a full mainscale timetable? It certainly isn't best done as a decision by someone that only teaches a partial timetable. Setting out a basic framework that includes the key characteristics of good marking and then asking teams of all staff to develop a way that this can be done in a manageable way would create a policy much more likely to be adhered to.

Just this week I was told about someone who wants to try #RAG123 marking (not heard of RAG123? It's awesome - see here!). They aren't allowed because it doesn't conform to their school's policy. Under their school's policy this person recently spent over 2 hours marking just 7 books, and they are saying that the quality of their planning is suffering due to the volume of marking they have to do! Nobody on a full timetable could possibly sustain that type of marking alongside teaching, planning and having a life. This is a clear example of a solution being imposed on people without thought to actual delivery.

Similarly it is really important that there is a route and process for all staff to highlight where the school is not working efficiently. For example are there flaws in the school sanctions and rewards system meaning that a group of teachers are struggling to use them effectively? Feedback loops are important in industry, and should be in education too. Vitally though if feedback is sought and given there MUST then be action with support from the top to address the concerns and improve the situation.

Where is the value added?
It is recognised in the automotive industry that the only people actually adding value are those building the cars. They take the components and combine them into something that can be sold at a profit. Everyone else and every other process in the organisation is an overhead that chips away at profits. They may be absolutely needed as part of the long term business, but they still cost money. As such these other processes need to be as efficient as possible, and mustn't interfere with the effectiveness of the production line.

In schools it would be too simplistic to suggest that the only people that add value are the teachers, as it's the total experience at school that's important, and not always just the lessons or exam results. However systems in the school absolutely mustn't make the jobs of those that interact with students harder to do well.

Consider how easy it is to get accurate data about a specific student... Attainment, targets, behaviour, attendance, SEN, FSM, IEP, etc, etc. Is it all in one place or in lots of different places? Is it easy to download a class list of information in a usable format? I know I've worked in schools where each bit of data is stored in a different place, and in different formats.

I've heard regularly that schools do well by prioritising Learning and Teaching. The question "If it's not improving Learning and Teaching then why are we doing it?" pops up as part of this kind of thing. However how often is that really applied across ALL systems in a school? It may well be applied to guiding CPD, or some directed time activities and meeting agendas, but is the attendance system actually optimised to stop it interfering with learning and teaching? Is the behaviour system an add on administrative activity or an integral part of learning and teaching?

Loads of education practice, particularly on the administrative side, is based on finding a system that basically works, and then iterating as needs change. Sometimes this creates a real monster of a system with add on bits and extra files all over the place. For example all schools I've been in have slightly different ways of managing student data, sometimes this is based on the specific skills or preferences of the staff involved in creating them, or just on how it's been done for years. Sometimes these systems are brilliant, other times they are ineffective. New staff come in and have to learn the foibles of a particular system, and then all the various "work around" methods to get key bits of data in the right format.

It is incredibly rare to find a system that has been completely designed from the ground up to do its job in a way that is completely aligned with the needs of all of the users in the organisation. Process mapping and optimisation of processes are effectively alien terms in education, but they really shouldn't be.

In summary - we need to start actively managing quality
Basically what I'm trying to illustrate is that any push for improving "quality" within a school needs to aim far more at the quality management end, which is the cutting edge of quality practice; as opposed to the quality control end which is a blunt and inefficient instrument.

We need less direct inspection to enforce systems from the outside, and more design of systems to make good performance inevitable. We mustn't invent extra processes to fix problems; instead we should develop systems that simplify the job rather than making it more complicated.

Like choosing to walk across the grass or around the path, people only deviate from policy because there is a shortcut. We need to seek out and use the expertise of the people that will actually work with the policies the most to help redesign them to eliminate the shortcuts! If we make it hard to do the right thing then we can't be surprised if someone does it wrong. The purpose of good leadership and management must be to design an environment where we make it as easy as possible for our staff and students to do it right. That way success becomes inevitable.

As always I'd welcome any feedback and comments... :-)

Sunday 2 March 2014

My Tenets of Leadership (and Life)

I studied and trained in the Korean martial art of Tae Kwon Do for about 13 years, gaining a 2nd dan Black Belt along the way. Unfortunately a recurring knee injury alongside training times that are incompatible with my current job has basically stopped me doing this for the last few years. However there is an aspect of Tae Kwon Do that stays with me.

"What are your values?"

A reasonably common interview question. Or perhaps "what are the important qualities of a leader?"

When first contemplating a management interview in my engineering career I needed to think about these questions, and after a while (probably as part of a Tae Kwon Do training session) it occurred to me that I already knew 5 key terms that just about summed up my thoughts and key values. They are taught as the 5 tenets of Tae Kwon Do...

Courtesy, Integrity, Perseverance, Self control, Indomitable spirit.

No matter how much I think about it I have never been able to beat this as a set of 5 values for a leader, or for day to day life for that matter. I'd like to break them down and explain how I think this applies to leadership (though I'll admit that in many ways they're fairly self explanatory).

Courtesy

At all levels, in all contexts, treating people as I would like to be treated. Giving people the chance to do the right thing. This means allowing time for them to come round to proposals, not just writing them off if they don't respond positively straight away. Making sure I don't shoot the messenger if something goes wrong. It also means giving people a voice in shaping changes that involve them, and being polite, saying please and thank you. I try hard to avoid making unreasonable requests and try to remember that others may have different pressures and priorities in their lives than I do. Delegate genuinely, if I pass on responsibility then let them get on with it without interfering.

Integrity

Being true to myself. Never asking others to do something that I wouldn't be willing to do. Uphold high standards in all aspects of professional life. Don't compromise on my principles to get an easy ride. End each day knowing that I have tried my best, even if I've not achieved what I set out to do. Basically make sure I can look myself in the eye when I do up my tie in the morning.

Perseverance

Work hard, particularly when things are difficult. Mistakes will be made and problems will occur, but it's more about how I deal with these challenges than debating the causes of them. When faced with a setback I try to be the first to start planning for recovery. If that doesn't work, I have to try again, and again.

Self control

Don't micro manage; I need to be willing to step back and allow others to lead where they have greater knowledge or skill. I need to make sure I don't over estimate my own abilities, and not over commit myself or my team. Even when things get frantic I try to keep a level head. At least give the outward impression of remaining calm when under pressure - a team will use the responses of their leader as a cue. When all are under stress (e.g. observations/inspections) a good leader needs to set the tone for the department - if I'm clearly massively stressed then the team will be too. Conversely if at these times I can be outwardly calm and take time to offer help and support to others it is a real sign of strength and the team will perform better as a result.

Indomitable spirit

Being brave! Taking on a challenge. Not being intimidated by reputations or titles, or even by expectations or normal modes of working. Remembering that my views are just as valid as everyone else's. Take risks by following my instincts and arguing my case when I need to.

Overall I struggle to beat these 5 core values. I have come back to them at various points in my life and have always found it useful to consider whether a particular course of action fits with them - best fit is always the best decision. They might seem a bit cheesy or contrived, but this really is how I see it. Feel free to ignore or comment as you see fit...

Saturday 1 March 2014

Effect size and #RAG123

I first decided to do a trial of RAG123 in November, as described here. My thinking was that if it worked I'd share it more widely. The results were so positive that I've actually never stopped using this approach since - it's become a vital part of my marking and feedback approach, but also forms a massive part of my planning too. If you're new to it all and need a fuller description of RAG123 then see this post.

The result of just giving it a go for a week in November was that I have really struggled to find clean and clear objective data to compare a before and after to assess the impact of RAG123 on pupil progress. While I am confident that students like it, I like it, and it feels beneficial, it would be nice to have some proof that it's actually beneficial.

A prompt from Mr Benney...
I had been fairly content to just carry on with using RAG123 and accept that a clean data set was lost to me. However then Damian Benney (@benneypenyrheol) not only picked up RAG123 but then did this fantastic bit of work on a 5min Research plan for RAG123 (here), including crunching some data to work out an effect size of 0.73. (Yes, effect sizes need to be used with caution, but it's an indicator to go along with all the other feel good positives.)

With this great data and apparently massively positive outcome to inspire me I revisited our departmental spreadsheets to see if I could find some clean comparable data. With a bit of searching I found what I was looking for. 2 parallel year 8 classes, nominally of similar ability, one taught by a teacher who started to use RAG123 early, and another who started later. This meant there was one full assessment for which RAG123 was pretty much the only difference.

Results for comparison
The two groups have done 3 assessment tests so far this year - they sit identical tests during the same week of term. Scores are out of 50, and the mean scores for the two groups are as follows:

As can be seen, in tests 1 and 2 the groups achieved broadly similar scores at or around the low 30s, but in test 3 group 1 takes a step up to 37.1.

Same teacher with/without RAG123
Group 1 had "normal" feedback, marking and planning up to test 2, but their books were marked and lessons planned using RAG123 for the content of test 3. Taking the step change of +4.7 marks from the average of tests 1 and 2 to the test 3 level, and dividing by the standard deviation calculated across all 3 tests (8.1) we get an effect size of +0.58. A really pleasing figure.
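As a rough check of the arithmetic above, here is a minimal sketch in Python. It assumes the simple definition used in this post (change in mean score divided by a standard deviation). The function name effect_size is made up for illustration, and the inputs are the rounded figures quoted above rather than the raw class data.

```python
def effect_size(mean_change, standard_deviation):
    """Change in mean score divided by a standard deviation (simple definition assumed here)."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return mean_change / standard_deviation


# Rounded figures quoted in the post for group 1
step_change = 4.7    # test 3 mean minus the average of tests 1 and 2
sd_all_tests = 8.1   # standard deviation calculated across all 3 tests

group1_effect = effect_size(step_change, sd_all_tests)
print(round(group1_effect, 2))  # 0.58
```

Because the inputs are already rounded, the result only agrees with the quoted figure to two decimal places.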

It's worth also noting that the standard deviation for test 3 on its own is just 6.4, compared with 7.4 and 9.3 for tests 1 and 2 respectively. As such the average score is better, and it is also less variable. In fact the lowest score on test 3 is 5 marks higher than on test 1, and 10 higher than on test 2! This lower variability in scores is echoed in the interquartile range, which is 9 marks for test 3 compared to 12.5 and 11.75 for tests 1 and 2.

In summary the results for test 3 suggest that RAG123 has had the following effects when group 1's performance is compared with its own prior results:

Raised the average mark by almost 5 marks, so almost 10% of the 50 marks available (note median mark rose by almost 6 marks)
Made the group's marks more consistent as demonstrated by a reduced standard deviation and interquartile range.
Had an estimated effect size of +0.58

Different teacher with parallel groups with/without RAG123

This one is probably slightly sketchier, as there are more variables at play given 2 different teachers and different classes, however I'll have a go...

Group 2 was in receipt of "normal" feedback, marking and planning through all tests 1, 2 and 3.

For group 1 the mean score for the first 2 tests is 32.4, with a standard deviation of 8.4. For group 2 the equivalent mean for the first 2 tests is 31.8, with a standard deviation of 6.6. As such without RAG123 I'm broadly taking these two classes as performing at a comparable level. (note I fully acknowledge that there is a relatively large discrepancy between test 1 and 2 for group 2 - which could make this bit of analysis a bit uncertain - what can I say, it's real data and all I've got for this analysis. If you want to ignore this data then feel free! I believe what may have happened is that group 2 were shocked by their poor scores in test 1, put more effort in test 2 and took their foot back off the gas in test 3.)

Group 1 scored on average 4.1 marks higher than Group 2 in Test 3. The standard deviation for both groups combined in test 3 was 6.9, giving an apparent effect size of 0.59.
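As a rough sketch of how this between-group figure could be worked out from the raw marks: take the difference in mean marks and divide by the standard deviation of both groups' marks combined. The function name `effect_size_between` and the marks in the example are made up for illustration (they are NOT the real class data), and using the sample standard deviation is an assumption on my part.

```python
from statistics import mean, stdev


def effect_size_between(group_a, group_b):
    """Difference in mean marks divided by the SD of both groups' marks combined."""
    combined = list(group_a) + list(group_b)
    # Sample standard deviation is an assumption; the population form
    # (statistics.pstdev) would give a slightly larger effect size.
    return (mean(group_a) - mean(group_b)) / stdev(combined)


# Placeholder marks, invented purely to show the call; NOT the real class data.
group_1_marks = [38, 41, 35, 44, 30]
group_2_marks = [33, 36, 29, 40, 27]
print(effect_size_between(group_1_marks, group_2_marks))

# Check against the summary figures quoted above: 4.1 marks / 6.9 SD.
print(round(4.1 / 6.9, 2))  # 0.59
```

Note this uses the spread of all the test 3 marks lumped together, as described above, rather than a pooled within-group standard deviation, so it will tend to give a slightly more conservative figure when the two group means differ.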

In summary the results suggest that RAG123 may have contributed to the following effects when group 1's results are compared with those of group 2, which is nominally a parallel set, and has comparable prior results:

Raised mean mark by about 4 marks, approx 8% (note median mark was actually 7 marks higher)
Given an estimated effect size of +0.59
However, given the large swing in group 2's marks between tests 1 and 2, there is a level of uncertainty in this comparison.

Visible progress

Perhaps more powerful than all of these cold hard stats though are the colour progress charts we use for these groups (colours indicate progress to target). I've tweeted this kind of thing before but it's quite stark when you compare the two groups in this particular study...

Group 1 looks like this: (red, yellow & orange are below target, green, blue and purple are on or above)

Group 2 looks like this: (red, yellow & orange are below target, green, blue and purple are on or above)

In terms of performance to target it is clear that group 1 are doing MUCH better in test 3 than either group has ever done.

Overall effects

As an overview, all the data analysed so far supports the feeling that RAG123 is effective. I'm sure the data has holes in it, but ALL data collected in education has them because it deals with individual students and individual teachers, and it is impossible to completely eliminate other factors from the experiment.

The data shown here undoubtedly supports the view that RAG123 improves progress, both when the teacher is kept constant and the marking changed, and when the test & type of class is kept constant. Taking Mr Benney's study into account that's 3 analyses that show an effect size a good step bigger than 0.5. Whether the effect size is really 0.58, 0.73 or another value is probably beside the point...

The bottom line is it's demonstrably beneficial AND it's better for overall workload!!

Once you get started with RAG123 it just feels right. The students respond really positively and REALLY value the little comments and dialogue that develops with the teacher. We've even had one student choose to make a child protection related disclosure by writing it down in their maths book. In later discussions it became clear that they chose this route because they knew their maths teacher reviewed their books daily and would respond to it! Clearly this is a sad situation for the child, but they're now getting the help they need and it's a powerful endorsement of the relationship that RAG123 helps to build.

As a further example of the kind of thing RAG123 can do, here are one student's self assessment comments from Thursday this week:

He knew he'd not worked hard enough and was honest enough to both say so and apologise. (In truth though he had done more than quite a few others, and I'd have probably given him an amber for effort, but I very rarely raise a pupil's self assessed effort level - if they think they could have worked harder then who am I to argue!)

On Friday he arrived in lesson with a completely different attitude, completed masses of work and left me this comment:

I didn't give him a sanction for lack of effort - he self assessed and knew he needed to improve on yesterday... He then delivered on it! This kind of self reflection and reaction is commonplace for RAG123 marking.

What's stopping you?

Frankly if you're still skeptical that RAG123 is worthwhile then I have to wonder what more evidence is needed!!

If you've not tried it yet I strongly urge you to give it a go. 4 months since the #RAG123 hashtag was created and I'm losing track of the number of people using this approach. However I've still NEVER found anyone who has tried it that says they want to stop. In fact I've NEVER had anyone who has tried it say it is anything other than beneficial on all levels!!

For those of you that have tried it - if you've got before/after data that could be analysed it would be really interesting to see that too!

As always all thoughts & comments appreciated....