
Data mining

Gloria Rogers and others have made a strong case for portfolio-based assessment, i.e., looking at individual students and seeing what they've done in order to make a judgement about our confidence that they've acquired a certain level of a given targeted competence (a.k.a. ``outcome''). We can, of course, integrate competence level against confidence level to obtain a single number, i.e., the student's expected level of the given competence. We can then average these expectations over all students to determine how well we are doing with respect to this targeted competence, not just at graduation but at all intermediate stages, e.g., at the end of each course.
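
To make that computation concrete, here is a minimal sketch in Python, assuming confidence is represented as a probability distribution over a handful of discrete competence levels (the levels and numbers below are hypothetical):

    def expected_level(confidence):
        # `confidence` maps each discrete competence level to our
        # confidence that the student has achieved it; the values
        # are assumed to sum to 1.
        return sum(level * conf for level, conf in confidence.items())

    # One hypothetical confidence distribution per student, for a
    # single targeted competence.
    students = [
        {1: 0.1, 2: 0.3, 3: 0.6},  # probably at level 3
        {1: 0.5, 2: 0.4, 3: 0.1},  # probably at level 1
    ]

    # Averaging the expectations over all students tells us how we
    # are doing with respect to this competence.
    average = sum(expected_level(c) for c in students) / len(students)
    print(average)  # 2.05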

As a cornerstone of our ABET-mandated continuous-improvement process, we will implement a database that keeps track of how students are doing down to the problem/question level -- graders and TAs will organize their spreadsheets so that there is a page per assignment/exam with a column (of per-student scores) for each problem/question. The data in these spreadsheets represents a sparse floating-point array of a thousand students by a few thousand problems/questions per quarter, i.e., a few million potential entries.
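
One simple way to hold such a sparse array is a dictionary keyed by (student, problem) pairs, as in the sketch below; the identifiers are hypothetical:

    # Sparse scores array: absent keys are cells for which no data
    # exists, so a thousand-by-few-thousand array stays small.
    scores = {}

    # Hypothetical entries, as a grader's spreadsheet page for one
    # assignment might be read in (one column per problem/question).
    scores[("s0042", "hw3-p1")] = 8.5
    scores[("s0042", "hw3-p2")] = 10.0
    scores[("s0117", "hw3-p1")] = 6.0

    def score(student, problem):
        # Returns None where no score was recorded.
        return scores.get((student, problem))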

We will have a list of between 10 and 100 competencies that we wish students to achieve by the time they graduate, and we'll also have a sparse array of the ``relevance'' of each problem/question to each competency. For our purposes, ``relevance'' means the degree to which success (or lack thereof) on that problem/question indicates achievement (or non-achievement) of that particular competence.
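
The relevance array can be stored the same way; a sketch, with hypothetical problem and competency identifiers and made-up weights:

    # Sparse relevance array: how strongly success (or lack thereof)
    # on a problem/question indicates achievement (or non-achievement)
    # of a competence.
    relevance = {
        ("hw3-p1", "algorithm-design"):  0.8,
        ("hw3-p2", "algorithm-design"):  0.3,
        ("hw3-p2", "technical-writing"): 0.6,
    }

    def problems_relevant_to(competency):
        # All (problem, weight) pairs bearing on one competency.
        return [(prob, w) for (prob, comp), w in relevance.items()
                if comp == competency]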

What's needed is an algorithm (or heuristic) for determining, from that data, a given student's expected level with regard to a given competence. I don't expect these algorithms to be linear, i.e., based only on matrix multiplication. For instance, more recent work is more relevant than older work. (Note that not all students take their courses in the same order, so the weight we give to a particular score should be student dependent.)
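
By way of illustration only, one such heuristic would weight each relevant score by an exponential recency factor computed per student, using the quarter in which that student attempted the problem. The half-life, the `when_taken` table, and the helper functions from the sketches above are all assumptions:

    def estimated_level(student, competency, when_taken, now,
                        half_life=4.0):
        # Recency-weighted average of this student's relevant scores.
        # `when_taken[(student, problem)]` is the quarter in which
        # that student attempted the problem, so the weighting is
        # student dependent, as required when students take their
        # courses in different orders.
        num = den = 0.0
        for prob, rel in problems_relevant_to(competency):
            s = score(student, prob)
            if s is None:  # no data for this cell
                continue
            age = now - when_taken[(student, prob)]  # in quarters
            w = rel * 0.5 ** (age / half_life)  # halves every 4 quarters
            num += w * s
            den += w
        return num / den if den > 0 else None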

Also, not all scores are equally significant. For instance, a zero on a problem for a student who missed the exam means something different than a zero on that same problem for a student who took the exam. (Note that a student may skip a quiz simply because the instructor has promised to throw out the lowest quiz score.) It is therefore important that we have a no-answer score that is different from a wrong-answer score. Fortunately, spreadsheets support that distinction.
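
When the spreadsheets are read in, a blank cell should therefore become a no-answer marker rather than a zero. A sketch, assuming the cells arrive as strings from a CSV export:

    def parse_cell(cell):
        # Distinguish no answer (blank cell) from a wrong answer (0).
        cell = cell.strip()
        return None if cell == "" else float(cell)

    row = ["7.5", "0", "", "9"]  # blank cell: the student skipped it
    print([parse_cell(c) for c in row])  # [7.5, 0.0, None, 9.0]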


Tom Payne 2003-09-04