Idea: extract features from a data dependency graph in which each node is annotated with the control structure the dependency occurs in (inside a loop, inside a nested loop, from a parent scope, etc.).
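A minimal sketch of how such annotations might become a feature vector. It assumes the dependency graph has already been built and flattened into (source, target, context) triples. The context labels, the triple format, and the function name are all hypothetical; the notes do not say how the paper encodes its features.

```python
from collections import Counter

# Hypothetical context labels; the notes list only a few examples.
CONTEXTS = ("top_level", "in_loop", "in_nested_loop", "from_parent_scope")

def context_features(edges):
    """Count dependency edges per control-structure context.

    `edges` is an iterable of (source_var, target_var, context) triples.
    Returns a fixed-order list of counts, one per label in CONTEXTS.
    """
    counts = Counter(context for _, _, context in edges)
    # Counter returns 0 for labels that never occur.
    return [counts[c] for c in CONTEXTS]

# Invented example: dependencies in a loop that shifts each character.
edges = [
    ("s", "ch", "in_loop"),
    ("i", "shifted", "in_loop"),
    ("ch", "shifted", "in_loop"),
    ("shifted", "result", "from_parent_scope"),
]
features = context_features(edges)
print(features)
```

A fixed label order keeps the vectors comparable across submissions, which is what a downstream regressor or classifier would need.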
Two expert graders each graded about 90 instances of each of 2 problems (“encrypt” a string by adding a position-dependent number to each character; sort a collection and return every other element in sorted order) on a 5-point scale: 5 = correct, passes tests, uses “correct” abstractions and data structures; 3 = significant errors in data structures and/or control flow; 1 = gibberish. When the experts disagreed, they discussed the instance until they reached agreement.
An SVM and a ridge regression model were trained on 2/3 of the data and evaluated on the remaining 1/3. (Coefficients, penalties, etc. were chosen empirically to minimize RMS error in 3-fold cross-validation.) The best results were selected for presentation.
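The train/validate procedure can be sketched as follows. This is a simplified stand-in, not the paper's setup: it uses single-feature ridge regression without an intercept (no SVM), toy data invented for illustration, a hypothetical penalty grid, and an ordered rather than shuffled split.

```python
import math

# Toy data, invented for illustration: one feature value and one 1-5 score each.
XS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
YS = [1, 1, 2, 2, 3, 3, 4, 5, 5]
PENALTIES = [0.01, 0.1, 1.0, 10.0]  # hypothetical candidate grid

def fit_ridge_1d(xs, ys, penalty):
    """Closed-form ridge slope for one feature, no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

def rmse(xs, ys, slope):
    """Root-mean-square error of the predictions slope * x."""
    return math.sqrt(sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def cv_rmse(xs, ys, penalty, k=3):
    """Mean held-out RMS error over k folds (fold f holds out items with i % k == f)."""
    errors = []
    for fold in range(k):
        pairs = list(enumerate(zip(xs, ys)))
        train = [xy for i, xy in pairs if i % k != fold]
        held = [xy for i, xy in pairs if i % k == fold]
        slope = fit_ridge_1d(*zip(*train), penalty)
        errors.append(rmse(*zip(*held), slope))
    return sum(errors) / k

# 2/3 for training, 1/3 held out (a real split would shuffle first).
split = len(XS) * 2 // 3
train_x, train_y = XS[:split], YS[:split]
test_x, test_y = XS[split:], YS[split:]

# Pick the penalty with the lowest cross-validated RMS error on the training part only.
best_penalty = min(PENALTIES, key=lambda p: cv_rmse(train_x, train_y, p))
slope = fit_ridge_1d(train_x, train_y, best_penalty)
test_error = rmse(test_x, test_y, slope)
print(best_penalty, test_error)
```

Tuning the penalty only on the training portion keeps the held-out third untouched until the final evaluation.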
In the final confusion matrix (predicted scores on ~30 examples), about half of the predictions agreed with the raters, and most of the rest were off by one category.
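The two summary figures (exact agreement and off-by-one agreement) can be computed directly from paired scores. A minimal sketch follows; the scores are invented for illustration and are not the paper's data, and the function name is hypothetical.

```python
def agreement_rates(predicted, actual):
    """Return (exact agreement, agreement within one category) as fractions."""
    pairs = list(zip(predicted, actual))
    exact = sum(p == a for p, a in pairs) / len(pairs)
    within_one = sum(abs(p - a) <= 1 for p, a in pairs) / len(pairs)
    return exact, within_one

# Invented scores on the 5-point scale.
actual = [5, 4, 3, 3, 1, 5, 2, 4]
predicted = [5, 3, 3, 4, 1, 5, 4, 4]

exact, within_one = agreement_rates(predicted, actual)
print(exact, within_one)  # 0.625 0.875
```

Note that `within_one` includes the exact matches, so the share that is off by exactly one category is `within_one - exact`.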
They also tried one-class modeling (instead of supervised learning) and got correlations between .5 and .7.