Reworded some stuff in Slope One post

This commit is contained in:
Chris Hodapp 2018-01-30 23:34:04 -05:00
parent c52687dd40
commit c7695799e6


#+DATE: January 30, 2018
#+TAGS: technobabble, machine learning
Suppose you have a large number of users, and a large number of
movies. Users have watched movies, and they've provided ratings for
some of them (perhaps just simple numerical ratings, 1 to 10 stars).
However, they've all watched different movies, and for any given user,
it's only a tiny fraction of the total movies.
Now, you want to predict how some user will rate some movie they
haven't rated, based on what they (and other users) have.
That's a common problem, especially when generalized from 'movies' to
anything else, and one with many approaches.
Slope One Predictors are one such method, described in the paper [[https://arxiv.org/pdf/cs/0702144v1.pdf][Slope
One Predictors for Online Rating-Based Collaborative Filtering]].
Despite the complex-sounding name, they are wonderfully simple to
understand and implement, and very fast.
Consider a user Bob. Bob has rather simplistic tastes: he mostly just
watches Clint Eastwood movies. In fact, he's watched and rated nearly
all of them, and basically nothing else.
Now, suppose we want to predict how much Bob will like something
completely different and unheard of (to him at least), like... I don't
know... /Citizen Kane/.
First, find the users who rated both /Citizen Kane/ *and* any of the
Clint Eastwood movies that Bob rated.

Now, for each movie that comes up above, compute a *deviation* which
tells us: On average, how differently (i.e. how much higher or lower)
did users rate /Citizen Kane/ compared to this movie? (For instance,
we'll have a number for how /Citizen Kane/ was rated compared to
/Dirty Harry/, and perhaps it's +0.6 - meaning that on average, users
who rated both movies rated /Citizen Kane/ about 0.6 stars above
/Dirty Harry/. We'd have another deviation for /Citizen Kane/
compared to /Gran Torino/, another for /Citizen Kane/ compared to /The
Good, the Bad and the Ugly/, and so on - for every movie that Bob
rated, provided that other users who rated /Citizen Kane/ also rated
the movie.)

If that deviation between /Citizen Kane/ and /Dirty Harry/ was +0.6,
it's reasonable that adding 0.6 to Bob's rating of /Dirty Harry/
would give one prediction of how Bob might rate /Citizen Kane/. We
can then generate more predictions based on the ratings he gave the
other movies - anything for which we could compute a deviation.

To turn this into a single answer, we could just average those
predictions together.
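The deviation-and-prediction scheme above can be sketched in a few
lines of Python. (The ratings data here is entirely made up for
illustration; a real system would of course have far more users and
movies.)

```python
# Hypothetical ratings: user -> {movie: stars}. A tiny made-up sample.
ratings = {
    "alice": {"Citizen Kane": 9, "Dirty Harry": 8, "Gran Torino": 7},
    "bob":   {"Dirty Harry": 9, "Gran Torino": 8},
    "carol": {"Citizen Kane": 8, "Dirty Harry": 8},
}

def deviation(target, other, ratings):
    """Average of (target - other) over users who rated both, else None."""
    diffs = [r[target] - r[other]
             for r in ratings.values()
             if target in r and other in r]
    return sum(diffs) / len(diffs) if diffs else None

def predict(user, target, ratings):
    """Slope One: average the per-movie predictions (rating + deviation)."""
    preds = []
    for movie, stars in ratings[user].items():
        dev = deviation(target, movie, ratings)
        if dev is not None:
            preds.append(stars + dev)
    return sum(preds) / len(preds) if preds else None
```

With this sample data, `predict("bob", "Citizen Kane", ratings)`
corrects Bob's /Dirty Harry/ and /Gran Torino/ ratings by their
respective deviations and averages the two results.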
That's the Slope One algorithm in a nutshell - and also the Weighted
Slope One algorithm. The only difference is in how we average those
predictions. In Slope One, every deviation counts equally, no matter
how many users' ratings were averaged together to produce it. In
Weighted Slope One, deviations that came from larger numbers of users
count for more (because, presumably, they are better estimates).
Or, in other words: If only one person rated both /Citizen Kane/ and
the lesser-known Eastwood classic /Revenge of the Creature/, and they
happened to think that /Revenge of the Creature/ deserved at least 3
more stars, then with Slope One, this deviation of +3 would carry
exactly as much weight as thousands of people rating /Citizen Kane/
about 0.5 stars below /The Good, the Bad and the Ugly/. In Weighted
Slope One, that latter deviation would count for thousands of times as
much. The example makes it sound a bit more drastic than it is.
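The weighted variant changes only the final averaging step: each
per-movie prediction is weighted by the number of users behind its
deviation. A minimal sketch, again with made-up data (and not the
paper's exact notation):

```python
# Hypothetical ratings as before: user -> {movie: stars}.
ratings = {
    "alice": {"Citizen Kane": 9, "Dirty Harry": 8, "Gran Torino": 7},
    "bob":   {"Dirty Harry": 9, "Gran Torino": 8},
    "carol": {"Citizen Kane": 8, "Dirty Harry": 8},
}

def deviation_and_count(target, other, ratings):
    """(average deviation, number of co-raters) for a pair of movies."""
    diffs = [r[target] - r[other]
             for r in ratings.values()
             if target in r and other in r]
    if not diffs:
        return None, 0
    return sum(diffs) / len(diffs), len(diffs)

def predict_weighted(user, target, ratings):
    """Weighted Slope One: weight each prediction by its co-rater count."""
    num = den = 0
    for movie, stars in ratings[user].items():
        dev, count = deviation_and_count(target, movie, ratings)
        if count:
            num += (stars + dev) * count
            den += count
    return num / den if den else None
```

Here the /Dirty Harry/-based prediction has two co-raters behind it
and the /Gran Torino/-based one only has one, so the former counts
twice as much toward Bob's predicted rating.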