Taking Precise Measurements

salvēte, amīcī et sodālēs! Before we go on to another set of exercises and quizzes, or even another story from the Tres Columnae project, I wanted to take some time to think through a very important, but often unexamined, issue in teaching languages, and especially in teaching Latin and Greek. It’s an issue of measurement and assessment – a critical one, in fact: how do we know that our measurements (the quizzes and tests we give our students) are actually measuring what we want them to measure? Statisticians and experts in assessment refer to this idea as validity; it’s closely linked with a related concept called reliability, which has to do with how close a learner’s scores would be if he/she took the test or quiz more than once. The closer the scores, the more reliable; the more the instrument measures what it’s supposed to measure, the more valid.
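As a rough numerical illustration (the scores and the approach below are my own sketch, not part of the original post), test-retest reliability is often estimated as the correlation between two administrations of the same quiz: the closer the correlation is to 1.0, the more reliable the instrument.

```python
# Hypothetical sketch: test-retest reliability as the Pearson correlation
# between two administrations of the same quiz. All scores are invented.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

first_attempt = [72, 85, 90, 64, 78]    # invented scores, first sitting
second_attempt = [70, 88, 92, 60, 80]   # invented scores, second sitting

reliability = pearson(first_attempt, second_attempt)
```

A quiz whose two sittings correlate this strongly would count as highly reliable; whether it is *valid* – measuring what we meant it to measure – is a separate question.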

When I closed yesterday’s post, I made this point about translation, both as an instructional tool and as an assessment:

I also think it’s a tool that can easily be overused … or even used when it’s not the best tool for the job. Not even a Swiss Army knife is the perfect tool for every job; for example, it would be hard to use one to light up a darkened room! 🙂

My primary concern with over-using translation as an instructional tool is mainly that it keeps our language learners focused on their first language rather than on the language they’re learning. After all, if the only thing you do with a Latin passage is to translate it into, say, English, that would seem to imply a couple of things. First, it implies that English (or whichever language you’re translating into) is the “real” or “primary” language, while Latin is simply a complicated code from which you have to extract the “English meaning.” Second, and consequently, it implies that English (or whichever language you’re translating into) is superior and Latin is inferior. I’ve run into too many advanced Latin students (not mine, usually) who think the Romans actually thought in English but translated their thoughts into Latin! 😦 Of course, that’s a common belief among learners of any language, but it needs to be dispelled, not encouraged. My fear is that an over-use of translation in instruction actually confirms this belief, and my hope is that regular communicative interactions in the language (even the simple multiple-choice responses we’ve looked at in this series of posts) will help learners overcome this and other false preconceptions about the relationships between languages. In keeping with our tool metaphor, translation would be a useful but specialized tool for instruction – more like a set of metric sockets than a Swiss Army knife. (You don’t need them every day, but as I was reminded recently, when I had to replace the battery in a Volvo, when you need them, you really need them!)

So much for the overuse of translation in instruction. My larger concern is the overuse of translation in assessment, which is why I’ve taken such pains in this series of posts to demonstrate other ways (including a bunch of Latin-only ways) to assess both reading comprehension and grammatical analysis without using translation. My biggest concern with translation as an assessment tool – whether for comprehension or for analytical work with the grammar of the language – is that translation is too complicated a task to satisfy anyone’s criteria for validity or reliability. Specifically, I think there are too many variables, both in the learner’s task and in the assessor’s, and the criteria for an acceptable performance are often too vague. (I think of the plaintive questions about “how to grade translations” – usually asked after the translations have been assigned – on the Latinteach listserv over the years, and the perennial questions about “is this translation acceptable” on the AP-Latin listserv.)

For example, consider this sentence from the Tres Columnae story we’ve focused on since Friday:

haec tamen pauca tibi et sorōrī explicāre possum.

  • What criteria for accuracy of translation would you establish for this sentence?
  • How would you communicate them to a learner, in advance, without “giving away” the translation of the sentence to them?
  • What kinds of feedback would you give for “translation errors” produced by a student?

And how would you convert the student’s response into a numeric score?

In the context of Lectiō Octāva, the “new things” to be tested are the datives (tibi and sorōrī). The relatively new things that might still cause trouble for learners are the complementary infinitive explicāre and the meanings of the words haec and possibly possum. In the Tres Columnae system, we’d ask direct questions about these specific items, if they were what we wanted to measure. For example, to test grammatical analysis, we might ask:

  1. cuius modī est explicāre?
    1. indicātīvī
    2. coniunctīvī
    3. imperātīvī
    4. infinītīvī
  2. cuius cāsūs est sorōrī?
    1. nōminātīvī
    2. genitīvī
    3. datīvī
    4. accūsātīvī

To test comprehension, we might ask

  1. quid Impigra facere vult?
    1. rem nārrāre
    2. rem audīre
    3. līberōs laudāre
    4. līberōs pūnīre
  2. quis hanc periodum audit?
    1. Rapidus
    2. Rapida
    3. et Rapidus et Rapida
    4. nec Rapidus nec Rapida

Depending on the learner’s patterns of correct and incorrect responses (which would be tracked, of course, in the Tres Columnae Moodle course), it would be easy for the teacher – and the learner herself – to see patterns of errors and to determine the logical next area of focus for the learner.  It would also be fairly easy to assess the reliability and validity of any given question by comparing it with others that, ostensibly, measure the same skill.

But how, exactly, do you “test” these things with a translation? And how do you give useful feedback?

For example, suppose the student, assigned to translate this sentence, says or writes,

“These few things are possible to be explained to you and your sister.”

It’s a “wrong translation” because of how it handles explicāre and possum and how it doesn’t handle tamen. And yet the student apparently has grasped the function of the two datives; has some idea that explicāre is an infinitive; has correctly determined that haec and pauca go together; and has a general idea of what Impigra is saying to Rapidus and Rapida.

Even if the teacher used a rubric for grading translations – and if that rubric had been shared with the learners – scoring might be a bit problematic. But what if the teacher uses “points” or marks rather than a rubric? How would you convert those problems into a grade – or into meaningful feedback?

Some teachers might choose the “point per word” method. But does that give credit for haec … pauca (accusative in the original, but the subject of this sentence)? And what about explicāre, which is almost, but not quite, “to be explained”? Depending on the teacher, this sentence might end up with a score of 3/8 (for tibi et sorōrī), 4.5/8 (half credit for explicāre, haec, and pauca), or even 5/8 (additional half credit for possum) … or anywhere from 37.5% to 62.5% credit. That’s a big range of scores … and a very low set of scores, too, given that the learner did, in fact, understand what was going on with the sentence.
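The arithmetic here can be made concrete. A sketch of the point-per-word method follows; which words earn full or half credit is the grader’s judgment call, and the three policies below simply reproduce the scores (3/8, 4.5/8, 5/8) discussed above.

```python
# Point-per-word scoring for "haec tamen pauca tibi et sorōrī explicāre
# possum" (8 words). The partial-credit policies are illustrative: each
# grader decides which words deserve full or half credit.

WORDS = 8

def score(full_credit, half_credit):
    """Total points: one per fully correct word, half per partly correct."""
    return full_credit + 0.5 * half_credit

strict = score(3, 0)      # only tibi, et, sorōrī correct        -> 3/8
moderate = score(3, 3)    # half credit: explicāre, haec, pauca  -> 4.5/8
lenient = score(3, 4)     # also half credit for possum          -> 5/8

percentages = [s / WORDS * 100 for s in (strict, moderate, lenient)]
# A 25-point spread (37.5% to 62.5%) for the very same translation.
```

The point is not the particular numbers but the spread: three defensible grading policies yield three quite different scores.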

Other teachers might choose a segment-scored or chunk-scored method like the one used by the Advanced Placement Program. In that case, the segments would probably be

  1. haec pauca
  2. tamen
  3. tibi et sorōrī
  4. explicāre possum.

Again, the student gets credit for one segment (tibi et sorōrī), for a score of ¼, or 25%. Or, if the teacher is “kind” and gives partial credit for partly-correct segments, the score might be 1.5/4 (half credit for haec pauca) or even 2/4 (half credit for explicāre possum). A wide, but very different, range of scores – and still quite low, given that the student did, in fact, understand the point of the sentence!
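The segment method can be sketched the same way, using the four segments listed above; as before, the partial-credit policy is the grader’s call, and the figures below just restate the scores mentioned in the text.

```python
# AP-style segment ("chunk") scoring for the same sentence, divided into
# the four segments listed above. Partial-credit policy drives the score.

SEGMENTS = 4

strict = 1 / SEGMENTS     # only "tibi et sorōrī" correct          -> 25%
partial = 1.5 / SEGMENTS  # half credit for "haec pauca"           -> 37.5%
kind = 2 / SEGMENTS       # also half credit for "explicāre possum" -> 50%
```

Note that the segment method’s range (25%–50%) doesn’t even overlap neatly with the point-per-word method’s range for the identical response.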

Unfortunately, when translation is used as the only assessment tool for comprehension and grammatical analysis, it’s very difficult for teachers (or other assessors) to be consistent in their scoring … and this tends to make test designers, who are worried about validity and reliability, very nervous. That’s one reason why so many test designers and publishers, especially in the current U.S. climate, use multiple-choice responses so heavily: they may not be perfect, but at least the machine scoring the responses will do so with consistency. Assessors can also be trained to apply a rubric pretty consistently – the fewer levels in the rubric, the more reliable it will be – but non-rubric-scored, non-forced-choice responses will always raise some validity or reliability concerns.

quid respondētis, amīcī?

  • What do you think of my concerns about the validity and reliability of translation?
  • Or do they just make you angry because “we’ve always done it that way” and I seem to be upsetting the apple cart?
  • Do you see ways to make translation-type assessments more valid and more reliable?
  • What do you think of our alternatives to translation?
  • What concerns about validity or reliability do you have in their regard?

Tune in next time, when we’ll respond to your concerns, share some more questions, and preview the next series of posts. intereā, grātiās maximās omnibus iam legentibus et respondentibus.


Exercises and Quizzes, and Lessons, Too

salvēte, amīcī et sodālēs! Today we’ll look at another exercise – or series of exercises – in the Tres Columnae system, one constructed with a Moodle feature known as the Lesson or Branching Lesson. We’ve talked about these in the abstract in previous posts like this one, but today we’ll see a real, interactive one. Just as a reminder, here’s what you’ll need to do:

  1. Choose the “Tres Columnae Semi-Public Sample.”
  2. Choose the “Login as a Guest” button. Moodle will ask you for the “enrolment key” (sorry about the spelling; that’s how the folks at Moodle spell the word). Until June 1, the end of the free trial period, that key is Caeliola79 – so type that in and press the Enrol me in this course button.
  3. Then choose the Lesson called Lectio Secunda Noun Forms. Again, since you’re not a current student, you’ll be able to Preview the lesson (though you don’t have to choose a Preview link), but you won’t be able to have your scores recorded.

We’ll talk more about the design and construction of these Lessons in tomorrow’s post. intereā, quid respondētis?

I’m especially interested in your response to

  • the sequence of tasks;
  • the feedback for correct and incorrect answers;
  • places where you think students might need more or less practice; and
  • whether you feel I’ve achieved the tight feedback loop I described in yesterday’s post.

 Tune in next time for your questions, our answers, and a more detailed description of Lesson construction. Like exercises and quizzes, whole Lessons can also be contributed by Tres Columnae participants … or you might choose to modify an existing Lesson to make it work better. grātiās maximās omnibus iam legentibus et respondentibus!

Published on April 8, 2010 at 4:08 pm

Exercises and Quizzes, III

salvēte, amīcī et sodālēs! We continue today with our series of posts about Exercises and Quizzes in the Tres Columnae system. As you may recall, on Monday we looked at the (overlapping) definitions of the terms Exercise and Quiz, and we thought a bit about the purposes and goals of such activities, whether in the Tres Columnae system or in a physical classroom. Yesterday, we looked at some specific examples – and in case you missed the directions, here are the simple steps so that you, too, can take a “live” look at them.

  1. Go to www.trescolumnae.com/moodle
  2. Choose the “Tres Columnae Semi-Public Sample.”
  3. Choose the “Login as a Guest” button. Moodle will ask you for the “enrolment key” (sorry about the spelling; that’s how the folks at Moodle spell the word). Until June 1, the end of the free trial period, that key is Caeliola79 – so type that in and press the Enrol me in this course button.
  4. Check out the exercises and quizzes in Week 1, and see what you think. As a guest, you can’t take the quiz for a recorded score, but you can click the Preview link (at the top of the page) to see the questions and the possible answers.

Today we’ll consider two big, related issues: feedback loops and lesson design. First, we’ll consider how – and why – we might create a “tight feedback loop” that helps all learners achieve mastery of a new concept quickly, without the opportunity to practice errors to the point of mastery. Then we’ll consider how to incorporate such “tight feedback loops” in the context of a whole lesson, whether it’s delivered by a system like Tres Columnae or in a more conventional, teacher-led classroom. Tomorrow we’ll see an actual example of a “branching lesson” in the Tres Columnae system.

When writers on education use the term feedback loop, they are usually referring to a process in which

  • a student (or group of students) practices a concept;
  • someone (usually the teacher) checks the accuracy of their attempt; and
  • depending on how they did, they either do more of the same kinds of practice,
  • move to a higher level of practice, or
  • move on to something else because they have completely mastered the concept.
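The branching logic of that loop can be sketched in code. This is only a schematic: the function name and the accuracy thresholds are my own illustrative assumptions, not part of Moodle or any real gradebook.

```python
# Schematic of the feedback loop described above. Thresholds are
# illustrative assumptions, not taken from any real system.

MASTERY = 0.9   # fraction correct that counts as full mastery
PROMOTE = 0.75  # fraction correct that earns harder practice

def feedback_loop(correct, total):
    """Decide the learner's next step from their latest practice set."""
    accuracy = correct / total
    if accuracy >= MASTERY:
        return "move on to the next concept"
    elif accuracy >= PROMOTE:
        return "move to a higher level of practice"
    else:
        return "do more of the same kind of practice"
```

The essential feature is that the check happens after every practice set, not once a week on “quiz day.”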

It’s a very different model from the “typical” factory-model classroom, in which no one checks on the student’s mastery of the concept until “quiz day” or “test day.” Even on those days (and you know that “test day” is Friday, right?), the focus is not on helping the learner get better at the concept; instead, the purpose of the quiz or test is to reward students who have mastered it, punish those who have not, and accumulate some grades for report cards or to show that the class is rigorous.

In his remarkable book Disrupting Class, to which I’ve referred in several previous blog posts, Clayton Christensen points out that this factory-model approach is very similar to the way that the (former) U.S. Big Three automakers handled their production lines – and the training of new employees – but it’s utterly different from the way that Japanese automakers handle quality control and employee training. He tells the story of Steven J. Spear, who now teaches at MIT and has written a book called Chasing the Rabbit (which I haven’t yet read). As a graduate student, Dr. Spear took jobs with a Big Three automaker and a Japanese auto company, installing seats in vehicles. At the American company, the “training” consisted of a demonstration, after which he was completely unable to install the seats correctly. At the Japanese company, by contrast, the task was broken into several steps, and Spear was able to practice Step 1 until he had thoroughly mastered it. Only then was he able to practice Step 2, and so forth. By the time he was on the assembly line, he could install the seats perfectly.

As a result, the Japanese company spent considerably less money on post-production quality control; they knew that most of the products would be correctly assembled the first time. By contrast, the American company had a small army of quality inspectors and anticipated a substantial rate of rejections and reworks. And then, no doubt, its managers wondered why the Japanese company was able to produce a better-quality product for the same (or slightly lower) price!

Sadly, American schools for too long have emulated American automakers’ approaches to quality assurance – or is it that American manufacturers borrowed their training systems from American schools? Either way, the emphasis was not on producing a defect-free product the first time. That, of course, has been changing rapidly – and by necessity – in both manufacturing and education, but old habits can be hard to break … especially when teachers and school administrators aren’t consciously aware that they are breaking old habits.

By contrast, Tres Columnae is designed “from the ground up,” so to speak, around the principles of immediate feedback and zero defects. It’s our assumption – and I’ve implemented something like this in my own face-to-face classrooms over the years – that you can and should have such expectations, and that students can and should rise to them, as a rule. In previous posts, I’ve talked about both the why and the how of creating such an environment; I won’t re-hash that today except to say that the keys are to

  • consider each step of a multi-step process;
  • demonstrate that step to the learners;
  • check for comfort and understanding;
  • practice it with them;
  • check for comfort and understanding again;
  • watch them practice;
  • continuously check for comfort and understanding; and
  • move on to the next step (or the next process or concept) as soon as the learners are both comfortable and proficient.

I wish I could claim credit for the idea, but it’s hardly original to me. Fred Jones, in his remarkable book Tools for Teaching, explains it much more fully, and it’s the cornerstone of educational approaches from Mastery Learning to Madeline Hunter’s Direct Instruction system. (Sadly, while the idea is quite prevalent, its implementation in many schools is quite another story!) But how do you transfer it to the Tres Columnae system, or to any other independent learning system, for that matter?

Let’s take a look at the sequence of activities in Lectiō Prīma at www.trescolumnae.com/moodle and see what we notice:

  1. We begin by reading, hearing, and seeing Fabella Prīma, which introduces some simple Latin words with pictures and context clues.
  2. We check our understanding with a self-assessment (a Moodle “choice” activity where we rank our comfort on a scale from 1-5).
  3. Since the focus was on vocabulary, we check learners’ ability to recognize the new words in isolation through the vocabulary-matching assignment. Like all Tres Columnae exercises and quizzes, it gives the learner immediate feedback on his or her performance.
  4. If you, the learner, continue to have difficulty, we’ll send you to the Vocabulary Practice Lesson we describe (and demonstrate) tomorrow. Otherwise, you go on to Fabella Secunda, which introduces some more words. The cycle repeats, gradually adding more complex concepts and skills; as soon as possible, you’re able to move from rote practice to creating your own original stories (and other forms of content) that use the new concept or skill.
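The branch at the heart of that cycle might be sketched like this; the activity names follow the post, but the pass threshold and the control flow are illustrative assumptions of mine, not the actual Moodle configuration.

```python
# Sketch of the remediate-or-advance branch in the Lectiō Prīma cycle
# described above. The 0.8 pass threshold is an invented example value.

def next_activity(matching_score, passing=0.8):
    """After the vocabulary-matching exercise, branch on the result."""
    if matching_score < passing:
        return "Vocabulary Practice Lesson"  # remediation branch
    return "Fabella Secunda"                 # advance to the next story
```

For example, a learner scoring 50% on the matching exercise would be routed to the practice lesson, while one scoring 90% would go straight on to the next fabella.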

quid respondētis, amīcī?

  • How does this system of feedback and skill-building compare to what you typically do in your classroom?
  • How does it compare to what your teachers typically did?
  • Do you think this is a good approach to teaching and learning, or is it too mechanical? Or, for that matter, does it give students too much responsibility for monitoring their own progress?
  • This system is built on an assumption that students, in general, want to succeed and will do so, given the proper tools and guidance. Do you agree?
  • This system is also built on an assumption that students ought to master the concepts and skills that are being presented to them. Do you agree?
  • We also have a core assumption that students should be actively engaged in learning new concepts and skills, and that active engagement will normally lead to mastery. What do you think? Do you think students have a “right to fail” or a “right to sit passively and observe,” or not?
  • And do you want to see a specific example – perhaps with something a bit more complicated than vocabulary?

If you’re yearning for specific examples, just hold on for another day or so :-), because we’ll look at such an example in tomorrow’s post. It’s created with a Moodle feature called the “branching lesson,” which I think is one of the best presentational features of the Moodle system.

Until then, grātiās maximās omnibus iam legentibus et respondentibus! There’s still time to sign up for a Free Trial Subscription to the Tres Columnae project if you’re interested.