Minimizing MSE with Line Search

Here is the result after 1 round when using a fixed-increment line search to minimize mse with respect to the LRATE used (columns a-t are the UserFeatureVector, columns 1-8 the MovieFeatureVector):

a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p    q    r    s    t    1    2    3    4    5    6    7    8    LRATE  MSE
.52  1.47 .52  1    1.47 1    1.31 .68  1    .52  1.47 1    1.15 1.63 .52  1    .84  1.15 1.63 1.15 3.04 3.02 3.06 3.07 2.95 3.00 3.05 3.04 .0525  0.4470620

Without line search, using Funk's LRATE=.001, it takes 81 rounds to arrive at ~ the same mse (and a nearly identical feature vector):

.76  1.38 .61  .99  1.38 .99  1.34 .74  0.99 .61  1.38 .99  1.16 1.50 .61  .99  .82  1.12 1.50 1.13 3.04 3.01 3.04 3.06 2.92 2.98 3.02 3    .001   0.44721854

Going from the round 1 result (LRATE=.0525), we do a second round and again do fixed-increment line search:

a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p    q    r    s    t    1    2    3    4    5    6    7    8    LRATE  MSE
.52  1.47 .52  1    1.47 1    1.31 .68  1    .52  1.47 1    1.15 1.63 .52  1    .84  1.15 1.63 1.15 3.04 3.02 3.06 3.07 2.95 3.00 3.05 3.04 .0525  0.447062
.92  1.48 .50  .99  1.47 .98  1.46 .66  .98  0.50 1.47 .98  1.22 1.63 .50  0.98 .75  1.15 1.63 1.17 3.07 3.03 3.07 3.09 2.92 2.99 3.04 3.03 .050   0.387166
.84  1.47 .50  .99  1.47 .99  1.43 .66  .99  0.50 1.47 .99  1.21 1.63 .50  0.98 .77  1.15 1.63 1.17 3.06 3.03 3.07 3.08 2.93 3.00 3.04 3.03 .040   0.371007
.76  1.47 .51  .99  1.47 .99  1.40 .67  .99  0.51 1.47 .99  1.19 1.63 .51  0.99 .79  1.15 1.63 1.16 3.06 3.03 3.07 3.08 2.93 3.00 3.04 3.03 .030   0.368960
.76  1.47 .51  .99  1.47 .99  1.40 .67  .99  0.51 1.47 .99  1.19 1.63 .51  0.99 .79  1.15 1.63 1.16 3.06 3.03 3.07 3.08 2.93 3.00 3.04 3.03 .020   0.380975
.75  1.47 .50  .99  1.47 .98  1.44 .66  .99  0.51 1.47 .99  1.21 1.63 .50  0.98 .76  1.15 1.63 1.17 3.06 3.03 3.07 3.08 2.92 2.99 3.04 3.03 .020   0.351217
.75  1.47 .50  .99  1.47 .99  1.42 .66  .99  0.51 1.47 .99  1.20 1.63 .50  0.99 .77  1.15 1.63 1.17 3.06 3.03 3.07 3.08 2.92 3.00 3.04 3.03 .010   0.362428
.74  1.47 .50  .99  1.47 .98  1.46 .66  .98  0.50 1.47 .98  1.22 1.63 .50  0.98 .75  1.15 1.63 1.17 3.07 3.04 3.07 3.09 2.91 2.99 3.04 3.02 .010   0.351899

We came up with an approximately minimized mse at LRATE=.030 (the 0.368960 row). Going from that line-search result, we do another round. Going from the line-search result at LRATE=.02 (the 0.351217 row), we do the same for the next round.

LRATE=.02 stable, near-optimal? (No further line search.) After 200 rounds at LRATE=.02 (note that it took ~2000 rounds without line search and ~219 with line search):

.83  1.39 .48  .91  1.40 .86  1.52 .61  .98  0.60 1.51 .98  1.15 1.74 .48  0.99 .64  1.10 1.62 1.45 3.28 3.28 3.04 3.48 1.69 2.98 3.12 2.65 .020   0.199358

Comparing this feature vector to the one we got with ~2000 rounds at LRATE=.001 (without line search), we see that we arrive at a very different feature vector:

a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p    q    r    s    t    1    2    3    4    5    6    7    8    LRATE
1.48 2.54 .90  1.6  2.6  1.55 2.68 1.15 1.73 1.08 2.67 1.73 2.06 3.08 .90  1.76 1.16 2.07 2.90 2.75 1.86 1.74 1.71 1.94 .87  1.61 1.7  1.5  .001, no ls
.83  1.39 .48  .91  1.40 .86  1.52 .61  .98  .60  1.51 .98  1.15 1.74 .48  0.99 .64  1.10 1.62 1.45 3.28 3.28 3.04 3.48 1.69 2.98 3.12 2.65 .020, w ls

However, the UserFeatureVector portions differ by a constant multiplier, and the MovieFeatureVector portions differ by a different constant. If we divide the LR=.001 vector by the LR=.020 vector, we get the following multiplier vector. (One is not a dilation of the other, but if we split the user portion from the movie portion, they are!!! What does that mean!?!?!?!)

1.77 1.81 1.85 1.75 1.84 1.79 1.76 1.86 1.75 1.78 1.76 1.75 1.79 1.76 1.85 1.76 1.81 1.86 1.78 1.89 .56  .53  .56  .55  .51  .54  .54  .56  ".001/.020"

User portion: 1.80 avg, 0.04 std. Movie portion: 0.54 avg, 0.01 std.

Another interesting observation is that 1 / 1.8 ~ .55, that is, 1 / AVGufv ~ AVGmfv. They are reciprocals of one another!!! This makes sense, since it means that if you double the ufv you have to halve the mfv to get the same predictions.
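The reciprocal-scaling observation can be checked numerically. This is only a minimal sketch, assuming a single-feature model in which the prediction for user u and movie m is ufv[u] * mfv[m]. The helper names (predict, rescale, max_abs_diff) are made up for illustration, and the numbers are just the first few entries of the LRATE=.020 row above.

```python
def predict(ufv, mfv):
    """Single-feature predictions: one row per movie, one column per user."""
    return [[m * u for u in ufv] for m in mfv]


def rescale(ufv, mfv, c):
    """Multiply the user features by c and the movie features by 1/c."""
    return [c * u for u in ufv], [m / c for m in mfv]


def max_abs_diff(p, q):
    """Largest entry-wise difference between two prediction matrices."""
    return max(abs(a - b) for rp, rq in zip(p, q) for a, b in zip(rp, rq))


# First three user entries and first two movie entries of the LRATE=.020 row.
ufv = [0.83, 1.39, 0.48]
mfv = [3.28, 3.28]

c = 1.80  # assumed multiplier, roughly the observed user-side average
scaled_ufv, scaled_mfv = rescale(ufv, mfv, c)
diff = max_abs_diff(predict(ufv, mfv), predict(scaled_ufv, scaled_mfv))
print(diff < 1e-12)
```

Any nonzero c gives the same predictions, which is why the near-minimizers form a one-parameter family rather than a single point.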
The bottom line is that the predictions are the same! What is the nature of the set of vectors that [nearly] minimize the mse? It is not a subspace (it is not closed under scalar multiplication), but it is clearly closed under "reciprocal scalar multiplication" (multiplying the mfv's by the reciprocal of the ufv's multiplier). What else can we say about it?

So we get an order of magnitude speedup from line search. It may be more than that, since we may be able to do all the LRATE calculations in parallel (without recalculating the error matrix or feature vectors????). Or there may be a better search mechanism than fixed-increment search. A binary-type search? Others?
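One round of fixed-increment line search, as described above, might look like the sketch below. It assumes the same single-feature model (prediction = ufv[u] * mfv[m]) and a plain gradient step in which both feature vectors are updated from the same error matrix, so every trial LRATE lies on one line. The function names and the toy ratings are illustrative assumptions, not the author's code. The defaults .001, .005 and .00001 mirror the values in the spreadsheet macro later in these notes.

```python
def mse(ratings, ufv, mfv):
    """Mean squared error over the known (user, movie) ratings."""
    return sum((r - ufv[u] * mfv[m]) ** 2
               for (u, m), r in ratings.items()) / len(ratings)


def direction(ratings, ufv, mfv):
    """Descent direction: error-weighted sums, as in a plain gradient step."""
    du = [0.0] * len(ufv)
    dm = [0.0] * len(mfv)
    for (u, m), r in ratings.items():
        err = r - ufv[u] * mfv[m]
        du[u] += err * mfv[m]
        dm[m] += err * ufv[u]
    return du, dm


def move(ufv, mfv, du, dm, lrate):
    """Point reached by stepping lrate along the fixed direction (du, dm)."""
    return ([f + lrate * d for f, d in zip(ufv, du)],
            [f + lrate * d for f, d in zip(mfv, dm)])


def line_search_round(ratings, ufv, mfv, start=0.001, inc=0.005, tol=1e-5):
    """Raise LRATE by inc while the mse keeps dropping by more than tol."""
    du, dm = direction(ratings, ufv, mfv)
    lrate = start
    best = move(ufv, mfv, du, dm, lrate)
    best_mse = mse(ratings, *best)
    while True:
        trial = move(ufv, mfv, du, dm, lrate + inc)
        trial_mse = mse(ratings, *trial)
        if trial_mse >= best_mse - tol:
            break
        lrate, best, best_mse = lrate + inc, trial, trial_mse
    return best[0], best[1], lrate, best_mse


# Toy data (hypothetical): keys are (user, movie) index pairs.
ratings = {(0, 0): 3, (0, 1): 2, (1, 0): 5, (1, 1): 4, (2, 1): 1}
ufv, mfv = [1.0, 1.0, 1.0], [3.0, 3.0]
start_mse = mse(ratings, ufv, mfv)
for _ in range(5):
    ufv, mfv, lrate, round_mse = line_search_round(ratings, ufv, mfv)
    print(lrate, round_mse)
```

The direction is computed once per round, so each extra trial LRATE costs only one mse evaluation. That is also what would make the trials easy to run in parallel.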

Spreadsheet layout: columns A..T are users a..t, rows 2..9 are movies 1..8, column U holds the movie features, and row 10 holds the feature vector fv. The macro \a starts at Z2.

Ratings (rows 2..9; only the filled cells in A..T are listed, in column order) and movie features (column U):

row  ratings       U
2    3 3 5 2 5 3   3
3    2 5 1 2 3 5   3
4    3 3 3 5 5     2
5    5 3 4         3
6    2 1 2         1
7    4 1 1 4       3
8    4 3 2 5       3
9    1 4 5 3       2

row  user features (A..T)                     movie features (U..AB)  LRATE  omse
10   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3 1 3 3 2         0.001  0.1952090   fv

Macro \a:

Z2: /rvnfv~fv~{goto}L~{edit}+.005~/XImse<omse-.00001~/xg\a~
Z3: .001~{goto}se~/rvfv~{end}{down}{down}~
Z4: /xg\a~

What each piece of the macro does:

/rvnfv~fv~                          copies nfv into fv, converted to values
{goto}L~{edit}+.005~                increments L by .005
/XImse<omse-.00001~/xg\a~           IF mse is still decreasing, recalc mse with the new L
.001~                               resets L=.001 for the next round
{goto}se~/rvfv~{end}{down}{down}~   "value copy" fv to the output list
/xg\a~                              starts over with the next round

Cell formulas:

A22:  +A2-A$10*$U2                                             /* error for u=a, m=1 */
A30:  +A10+$L*(A$22*$U$2+A$24*$U$4+A$26*$U$6+A$29*$U$9)        /* updates f(u=a) */
U29:  +U9+$L*(($A29*$A$30+$K29*$K$30+$N29*$N$30+$P29*$P$30)/4) /* updates f(m=8) */
AB30: +U29                                                     /* copies the f(m=8) feature update into the new feature vector, nfv */
W22:  @COUNT(A22..T22)                                         /* counts the number of actual ratings (users) for m=1 */
X22:  [W3] @SUM(W22..W29)                                      /* adds ratings counts for all 8 movies = training count */
AD30: [W9] @SUM(SE)/X22                                        /* averages the se's, giving the mse */
A52:  +A22^2                                                   /* squares all the individual errors */

Rows 21..30, working error and new feature vector (nfv). Only the filled error cells in A..T are listed, in column order:

row  errors          U  W  X
22   0 0 0 ** 0 **   3  6  35
23   0 0 ** 0 ** 0   3  6
24   0 0 0 ** 0      2  5
25   0 ** **         3  3
26   0 0 **          1  3
27   ** ** ** 0      3  4
28   ** 1 0 **       3  4
29   ** ** 0 0       2  4

row  user features (A..T)                     movie features (U..AB)  L      mse
30   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.1952063   nfv

Rows 52..59, square errors (SE), columns A..T:

52   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
53   0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
54   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
55   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
56   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
57   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
58   0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
59   1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Rows 61..72, output list (one line per round):

row  user features (A..T)                     movie features (U..AB)  L      mse
61   0 1 0 0 1 0 1 0 0 0 1 0 1 1 0 0 0 1 1 1  3 3 3 3. 2 2 3 2        0.125  0.225073
62   0 1 0 0 1 0 1 0 0 0 1 0 1 1 0 1 0 1 1 1  3 3 3 3. 1 2 3 2        0.141  0.200424
63   0 1 0 0 1 0 1 0 0 0 1 0 1 1 0 1 0 1 1 1  3 3 3 3. 1 3 3 2        0.151  0.197564
64   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.151  0.196165
65   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.151  0.195222
66   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195232
67   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195228
68   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195224
69   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195221
70   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195218
71   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195214
72   0 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1  3 3 2 3. 1 3 3 2        0.001  0.195211

Notes: In 2 rounds mse is as low as Funk gets it in 2000 rounds. After 5 rounds mse is lower than ever before (and appears to be bottoming out). I know I shouldn't hardcode parameters! Experiments should be done to optimize this line search (e.g., with some binary search for a low mse).

Since we have the resulting individual square_errors for each training pair, we could run this, then mask the pairs with se(u,m) > Threshold and do it again on just those pairs (i.e., after masking out those that have already achieved a low se). But what do I do with the two resulting feature vectors? Do I treat it like a two-feature SVD, or do I use some linear combo of the resulting predictions of the two (or it could be more than two)? We need to test out which works best (or other modifications) on Netflix data. Maybe on those test pairs for which the training row and column have some high errors, we apply the second feature vector instead of the first? Maybe we invoke CkNN for test pairs in this case (or use all 3 and a linear combo?). This is powerful! We need to optimize the calculations using pTrees!!!
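The notes above suggest replacing the fixed-increment search with "some binary search for a low mse". One standard candidate is golden-section search, sketched here under the assumption that mse is unimodal in LRATE on the bracket searched. The function name, the bracket [0.001, 0.1] and the toy objective are all hypothetical stand-ins, not taken from the spreadsheet.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-5):
    """Locate the minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]: shrink from the right, reuse f(c).
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]: shrink from the left, reuse f(d).
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_mse(lrate):
    """Stand-in for 'mse after one step at this LRATE' (made-up curve)."""
    return (lrate - 0.03) ** 2 + 0.35


best_lrate = golden_section_min(toy_mse, 0.001, 0.1)
print(round(best_lrate, 4))
```

Each iteration needs only one new mse evaluation and shrinks the bracket by a constant factor, so far fewer trials are needed than with .005 steps. Whether the real mse curve is unimodal in LRATE would have to be checked.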

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AAABACADAEAFAGAHAIAJ 1 3 4 3 1 3 3 3 2 3 2 2 5 3 3 3 3 5 1 4 4 4 1 1 3 4 5 5 2 3 3 2 6 3 1 3 4 1 7 3 2 1 3 1 1 3 4 8 4 5 5 9 2 2 2 3 3 1 10 3 1 1 2 3 2 11 4 4 2 3 5 3 3 12 3 13 5 3 3 5 3 3 3 1 14 2 3 3 2 5 15 3 3 2 16 1 1 5 3 1 5 1 3 17 3 2 3 1 2 2 18 2 3 3 1 3 3 19 1 3 4 2 4 3 3 1 4 1 20 1 2 5 3 1 4 4 2 3 AKALAMANAOAPAQARASATAUAVAWAXAY BA 5 2 5 3 1 1 2 3 5 1 3 3 5 5 1 3 4 1 1 2 1 1 4 1 3 4 2 5 1 3 5 1 1 5 1 5 5 5 2 1 1 3 3 2 5 1 4 4 1 1 3 1 5 2 1 4 4 3 5 1 5 1 3 2 5 1 4 1 4 1 5 2 4 5 3 4 1 O P Q R S T U V W X Y Z AA AB 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.89 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.88 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.88 0.66 0.97 0.72 0.89 0.97 0.97 0.97 0.56 0.72 0.53 0.80 1.62 0.97 0.88 0.66 0.99 0.98 0.99 0.99 0.99 0.99 0.97 0.98 0.98 0.99 1.01 0.99 0.99 0.99 0.99 0.77 0.92 0.99 0.99 0.99 0.61 0.77 0.62 0.84 1.45 0.99 0.92 0.85 BB BC BD BE BF BG BH BI BJ BK BL BM BN BO 1.68 1.68 1.69 1.67 1.68 1.68 1.68 1.68 1.68 1.67 1.68 1.67 1.70 1.68 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 1.71 2.81 2.80 2.83 2.80 2.81 2.77 2.81 2.80 2.81 2.80 
2.81 2.80 2.83 2.82 2.84 2.84 2.84 2.84 2.84 2.83 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 2.84 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.01 3.01 3.02 3.01 3.01 3.01 3.01 3.01 3.01 3.01 3.01 3.01 3.02 3.01 AC AD AE AF AG AH AI AJ AK AL AM 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.97 0.64 0.56 0.97 0.97 0.86 0.97 1.13 1.07 1.59 1.13 0.99 0.99 0.97 0.99 0.99 0.99 0.99 1.00 1.00 1.00 1.01 0.99 0.76 0.61 0.99 0.99 0.90 0.99 1.14 1.08 1.28 1.15 BP BQ BR BS BT Lrate MSE 3.09 3.09 3.09 3.09 3.09 0.0079 1.252787373 3.09 3.09 3.09 3.09 3.09 0.0001 1.252778817 3.09 3.09 3.09 3.09 3.09 0.0001 1.252777738 3.09 3.09 3.09 3.09 3.09 0.0001 1.252777438 3.09 3.09 3.09 3.09 3.09 0.0001 1.252777289 3.09 3.09 3.09 3.09 3.09 0.0001 1.252777139 3.09 3.09 3.09 3.09 3.09 0.0001 1.252776991 3.09 3.09 3.09 3.09 3.09 0.0001 1.252776843 3.09 3.09 3.09 3.09 3.09 0.0001 1.252776695 3.09 3.09 3.09 3.09 3.09 0.0001 1.252776548 3.00 3.00 3.00 3.00 0.0005 1.749577428 3.01 3.02 3.01 3.01 0.0035 1.278489789 A B C D E F G H I J K L M N 102 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29 103 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29 104 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29 105 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29 106 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29 107 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 
0.97 1.29
108 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29
109 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29
110 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29
111 0.76 0.97 0.97 0.75 0.72 1.29 0.88 0.86 0.97 1.18 0.86 0.72 0.97 1.29
    0.97 0.99 0.99 0.99 0.98 1.00 0.99 0.99 0.99 1.00 0.99 0.98 0.99 1.01
    0.78 0.99 0.99 0.81 0.77 1.22 0.92 0.90 0.99 1.18 0.90 0.77 0.99 1.27

AN   AO   AP   AQ   AR   AS   AT   AU   AV   AW   AX   AY   AZ   BA
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.97 0.48 0.97 1.29 0.80 1.07 0.80 1.29 0.97 0.89 1.53 1.42 3.09
0.99 0.98 0.99 1.00 0.99 1.00 0.99 1.01 0.99 0.99 1.03 1.03 3.00 3.00
0.99 0.65 0.99 1.22 0.84 1.08 0.84 1.29 0.99 0.92 1.52 1.43 3.02 3.01

A larger example: 20 movies, 51 users (same as last time, except that I found errors in my code, which I corrected).

2 2 2 1 0 1 2 1 2 1 2 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 21 1 1 1 1 1 2 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 2

The last two red lines (the two unnumbered rows at the end of each block) are printouts of the two steps in the initial line search (on the way to the first result line at MSE=1.252787373). The two vectors should be co-linear (generate the same line), or else I am not doing line search!! They are clearly not co-linear. Thus I have one more code mistake. This is why a C# version is desperately needed!! How is that coming?
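The co-linearity check can be automated. This is a rough sketch under one stated assumption: "co-linear" is taken to mean that both trial vectors lie on one line through the starting feature vector, so the two displacements from the start must be parallel. The start vector is not printed in the tables above, so the vectors below are made-up placeholders and the function names are hypothetical.

```python
import math


def displacement(start, point):
    """Vector from the starting feature vector to a trial feature vector."""
    return [p - s for s, p in zip(start, point)]


def cosine(a, b):
    """Cosine of the angle between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def on_same_line(start, trial1, trial2, tol=1e-6):
    """True if both trial points lie on one line through start."""
    d1 = displacement(start, trial1)
    d2 = displacement(start, trial2)
    return abs(abs(cosine(d1, d2)) - 1.0) <= tol


start = [1.0, 1.0, 3.0]    # placeholder starting vector
step_a = [1.1, 0.9, 3.2]   # placeholder trial at a small LRATE
step_b = [1.2, 0.8, 3.4]   # placeholder trial at twice that LRATE
step_c = [1.2, 0.9, 3.4]   # placeholder trial that has drifted off the line

print(on_same_line(start, step_a, step_b))
print(on_same_line(start, step_a, step_c))
```

One thing worth checking in the spreadsheet version: if the movie-feature update reads the already-updated user features (as the U29 formula in the small example does), the trial point is no longer a linear function of L, and the trials would not be expected to stay on one line.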

Where are we now wrt PSVD?

Clearly line search is a good idea. How good? (Speedup? Accuracy comparisons?)

What about 2nd [3rd?, 4th?, ...] feature vector training? How do we generate those? (Probably just a matter of understanding Funk's code.)

Which "retraining under mask" steps are breakthroughs? Do they improve accuracy markedly? Do they improve speed markedly?

What speedup shortcuts can we [as mindless engineers ;-) ] come up with, including shortcuts for executing Md's PTreeSet Algebra procedures? These speedups can be "mindless" or "magic"; we'll take them anyway! By "mindless" I mean only that trial and error is probably the best way to find lucky speedups, unless you can fully understand the mathematics ;-) Maybe Dr. Ubhaya can do the math for us? I will suggest the following: "The more the mathematics is understood, the better the mindless engineering tricks work!"

In RECOMMENDERs, we have people (users, customers, websearchers...) and things (products, movies, items, documents, webpages or?). We also often have text (product descriptions, movie features, item descriptions, document contents, webpage contents...), which can be handled as entity description columns or by introducing a third entity, terms (content terms, stems of content terms, ...).

So we have three entities and three relationships in a cyclic 2-hop rolodex structure (or what we called a BUP, "Bi-partite, Uni-partite on Part", structure). A lifetime of fruitful research lurks in this arena. We can use one relationship to restrict (mask entity instances in) an adjacent relationship. I firmly believe pTree structuring is the way to do this. We can also add a people-to-people relationship (a la Facebook friends) and enrich the information content significantly.

We should add tweets to this somehow. Since I don't tweet, I'm probably not the one to suggest how this should fit in, but I will anyway ;-) Tweets (seem to be) mini-documents describing documents, or mini-documents describing people, or possibly even mini-documents describing terms (e.g., if a buzzword becomes hot in the media, people tweet about it????).

Let's call this research arena the VERTICAL RECOMMENDER arena. It's hot! Who's going to be the Master Chef in this Hell's Kitchen?
