MovieLens 100k benchmark results
Zeno Gantner
Latest revision as of 07:10, 2 February 2013
This page is a first example of what a benchmark page in RecSysWiki could look like. It is a work in progress. Please contribute and comment.
Rationale: This is primarily meant to be a comparison between methods, not between tools. This is why we sort by method. At the same time, we state the version number and all input arguments for maximum reproducibility.
If there are two lines for one method, the first line shows the results with the random seed set to 1; the second line (or the only line, if there is just one) shows the average results over 5 runs with random initialization.
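The two-line reporting convention can be sketched as follows. This is only an illustration: `run_experiment` is a hypothetical stand-in for a real training and evaluation run, and the number it returns is a placeholder, not a benchmark result.

```python
import random
import statistics


def run_experiment(seed=None):
    """Hypothetical stand-in for one training/evaluation run.

    A real run would train a recommender and return its error on the
    held-out data. Here we only return a placeholder value that depends
    on the random seed, so that the reporting logic can be shown.
    """
    rng = random.Random(seed)  # seed=None means random initialization
    return 1.0 + rng.uniform(-0.01, 0.01)


def report(n_random_runs=5):
    """Return (result with seed 1, average over randomly initialized runs)."""
    fixed_seed_result = run_experiment(seed=1)
    averaged_result = statistics.mean(
        run_experiment(seed=None) for _ in range(n_random_runs)
    )
    return fixed_seed_result, averaged_result


fixed_seed_result, averaged_result = report()
```

The fixed-seed line is reproducible exactly, while the averaged line gives an idea of how much the result depends on initialization.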
Baseline Methods
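| Software | Method | 5-fold CV | all-but-10 | References |
|---|---|---|---|---|
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | GlobalAverage | 1.1256 | 1.1238 | |
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | UserAverage | 1.0437 | 1.0518 | |
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | ItemAverage | 1.0246 | 1.0453 | |
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | UserItemBaseline | 0.9413 | 0.9656 | |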
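For readers unfamiliar with the three average-based baselines, here is a minimal sketch of what they compute. It is not MyMediaLite's code. It assumes that the error measure is RMSE (the page does not name the metric) and that unseen users or items fall back to the global average; the toy ratings are made up for illustration. UserItemBaseline is not covered here.

```python
from collections import defaultdict
from math import sqrt


def fit_averages(train):
    """Compute global, per-user, and per-item rating averages.

    train: list of (user, item, rating) triples.
    """
    user_sum, user_cnt = defaultdict(float), defaultdict(int)
    item_sum, item_cnt = defaultdict(float), defaultdict(int)
    total = 0.0
    for user, item, rating in train:
        total += rating
        user_sum[user] += rating
        user_cnt[user] += 1
        item_sum[item] += rating
        item_cnt[item] += 1
    global_avg = total / len(train)
    user_avg = {u: user_sum[u] / user_cnt[u] for u in user_sum}
    item_avg = {i: item_sum[i] / item_cnt[i] for i in item_sum}
    return global_avg, user_avg, item_avg


def rmse(predict, test):
    """Root mean squared error of predict(user, item) on test triples."""
    squared_error = sum((predict(u, i) - r) ** 2 for u, i, r in test)
    return sqrt(squared_error / len(test))


# Toy data, made up for illustration (not taken from MovieLens).
train = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4), ("u2", "i3", 2)]
test = [("u1", "i3", 4), ("u3", "i1", 5)]

g, user_avg, item_avg = fit_averages(train)

# Unseen users/items fall back to the global average (an assumption).
global_rmse = rmse(lambda u, i: g, test)
user_rmse = rmse(lambda u, i: user_avg.get(u, g), test)
item_rmse = rmse(lambda u, i: item_avg.get(i, g), test)
```

Each baseline ignores either the user, the item, or both, which is why they serve as a lower bar for the methods in the following sections.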
kNN-based Collaborative Filtering
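| Software | Method | 5-fold CV | all-but-10 | References |
|---|---|---|---|---|
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | UserKNN | 0.9283 | 0.9572 | |
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | ItemKNN | 0.9182 | 0.9445 | |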
Matrix Factorization
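| Software | Method | 5-fold CV | all-but-10 | References |
|---|---|---|---|---|
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | BiasedMatrixFactorization | 0.9220 | 0.9475 | |
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | SVDPlusPlus | 0.9112 | 0.9409 | |
| [MyMediaLite 3.07](https://github.com/zenogantner/recsys-benchmark/ml-100k/mymedialite.pl) | SigmoidUserAsymmetricFactorModel | 0.8939 | 0.9232 | |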
Attribute-Aware Methods
Other Methods
Disclaimers
- The results presented here come with no warranty whatsoever. Use at your own risk.
- Most if not all results are self-reported by the implementations, which may contain bugs in their evaluation routines.
- The results are not necessarily fair towards the compared methods and implementations. There could be hyper-parameter overfitting, or considerably better results might be achievable with better tuning.
- MovieLens 100k is one of the oldest existing collaborative filtering datasets, and it dominated the literature for years because it was one of the few datasets available. Methods developed in that period may therefore be biased towards this dataset. The dataset is also quite small by today's standards.