Monday, February 25, 2008

Residues and Threads

I'm slowly changing all my algorithms to work on either raw data or residue data. Because raw data are integer values in the range of 1 to 5, and thus enabling some performance optimizations and shortcuts, but residue data are essentially floats, all my algorithms took a performance hit.

At the same time, I'm adding multi-threading support to speed things up. The improvement is significant, but I'm already seeing diminishing returns: on a quad-core systems, matrix factorization using 4 cores runs at only 3.27x the speed of using one core.

Monday, January 28, 2008

Back to Basics

New year, now I'm playing with the most basic stuff: global effects. This time I'll try to be more patient and take a more systematic approach, instead of the "hack anything that looks promising" approach I've been taking.

Right now I've a nice number for my qualifying RMSE: 0.8888. Hey, it takes real skill to hit such a nice number by chance ! I'll try to resist the urge to submit better results until I'm sure I've made a big improvement.