
How to Rush a Contest in 24 Hours



Presentation Transcript


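  1. How to Rush a Contest in 24 Hours
     Institute of Automation, Chinese Academy of Sciences: 李勇保, 高珩
     Institute of Computing Technology, Chinese Academy of Sciences: 孔东营 (avail..)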

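  2. Essential tools
     • A bowl of instant noodles: a 大食桶 tub
     • Two bottles of Red Bull: one white, one red
     • A few toolkits: sklearn, libfm, pmf, omf
     • A BT machine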

  3. Libfm (Factorization Machine Library)
     • Steffen Rendle: http://libfm.org
     • Classification or regression
     • Learning methods:
       • SGD
       • ALS
       • MCMC: easy to handle (no learning rate, no regularization)
     • Input: libsvm format
     • Not many iterations are needed
     • Try different parameters to find the best setting
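The libsvm input format mentioned above can be illustrated with a minimal sketch. It assumes each rating is one-hot encoded as a user feature plus a movie feature, with movie indices offset past the user indices. The function name, the toy ratings, and the 0-based index convention are assumptions for illustration, not the authors' actual preprocessing.

```python
def to_libsvm_line(score, user_index, movie_index, num_users):
    """One-hot encode a (user, movie) pair as one libsvm-format line.

    Users occupy feature indices [0, num_users); movie indices are
    shifted by num_users so the two ranges do not collide.
    """
    user_feature = user_index
    movie_feature = num_users + movie_index
    return f"{score} {user_feature}:1 {movie_feature}:1"


# Hypothetical toy ratings: (user_index, movie_index, score)
ratings = [(0, 0, 5), (0, 2, 3), (1, 1, 4)]
num_users = 2

lines = [to_libsvm_line(s, u, m, num_users) for u, m, s in ratings]
print("\n".join(lines))
# 5 0:1 2:1
# 3 0:1 4:1
# 4 1:1 3:1
```

Each line is `label index:value index:value ...`. Extra features (tags, counts) would simply add more `index:value` pairs to the same line.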

  4. Features (3,000,000+ features)
     • training_set.txt (user_id, movie_id, score)
       • score of user and movie: max, min, average, score_rate, never score
     • movie_tag.txt (movie_id, movie_tag)
       • tag_num of movie
       • score of tags
       • score of movie (based on tag)
       • tags of movie (low weight)
     • user_history.txt (user_id, movie_id, session_id)
       • show times of movie
       • tags of user (filtered or weighted)
       • match of user_tag
     • user_social.txt: num of followers
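As a rough illustration of the per-user score features listed above (max, min, average), here is a minimal sketch. It assumes the training set has already been parsed into (user_id, movie_id, score) tuples; the function name and toy data are hypothetical. The slide does not define score_rate or never score, so they are left out.

```python
from collections import defaultdict


def user_score_stats(ratings):
    """Aggregate max, min, and average score for every user."""
    scores_by_user = defaultdict(list)
    for user_id, _movie_id, score in ratings:
        scores_by_user[user_id].append(score)
    return {
        user_id: {
            "max": max(scores),
            "min": min(scores),
            "average": sum(scores) / len(scores),
        }
        for user_id, scores in scores_by_user.items()
    }


# Hypothetical toy rows in (user_id, movie_id, score) form
ratings = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4)]
stats = user_score_stats(ratings)
print(stats["u1"])
```

The same grouping, keyed on movie_id instead of user_id, would give the per-movie statistics.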

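  5. Result of Libfm
     • Using only user_id and movie_id, board: 0.622
     • libfm single model, board: 0.608
     • Getting results from multiple models:
       • Use different feature sets
       • Vary the number of latent factors
       • Change the learning method (MCMC, SGD, ALS)
     • The number of iterations should not be too high
     • With too many latent factors (>60), training is very slow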
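One way to organize the "multiple models" step above is to generate one libFM command line per combination of learning method and latent dimension. The sketch below only builds the argument lists and does not run anything. The file names, the grid values, and the iteration count are assumptions; the flags (`-task`, `-train`, `-test`, `-method`, `-dim`, `-iter`, `-out`) are those described in the libFM manual.

```python
from itertools import product


def libfm_commands(train_path, test_path, methods, latent_dims, iterations):
    """Build one libFM argument list per (method, latent dimension) pair."""
    commands = []
    for method, k in product(methods, latent_dims):
        out_path = f"pred_{method}_k{k}.txt"  # hypothetical output name
        commands.append([
            "libFM", "-task", "r",           # r = regression
            "-train", train_path,
            "-test", test_path,
            "-method", method,
            "-dim", f"1,1,{k}",              # bias, 1-way terms, k latent factors
            "-iter", str(iterations),
            "-out", out_path,
        ])
    return commands


commands = libfm_commands(
    "train.libfm", "test.libfm",             # hypothetical file names
    methods=["mcmc", "sgd", "als"],
    latent_dims=[8, 16, 32],
    iterations=100,
)
for command in commands:
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` once the libFM binary is on the path. Note that SGD and ALS would additionally need learning-rate or regularization settings tuned, which MCMC avoids.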

  6. SVDFeature
     • http://svdfeature.apexlab.org/
     • (Apex Lab, Shanghai Jiao Tong University)
     • Learning method:
     • Input: a format similar to libsvm
     • Feature-based extensibility
     • Added a global feature: movie-tag
     • 3 results: 0.619, 0.620, 0.621
     高珩

  7. PMF (Probabilistic Matrix Factorization)
     • Salakhutdinov et al.: http://www.cs.utoronto.ca/~amnih/papers/pmf.pdf
     • Probabilistic graphical model representation

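  8. PMF (Probabilistic Matrix Factorization)
     • Solve the objective function
     • Learning method: Gradient Descent in U and V
     • Model input: the training-set rating matrix and an initialized test-set rating matrix
     • Model output: the predicted test-set rating matrix
     • Choose the latent dimension
     • 2 results: 0.618, 0.619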
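The slide's objective function did not survive in the transcript. The sketch below therefore assumes the usual regularized squared-error form from the cited PMF paper: half the sum of squared errors over observed ratings, plus an L2 penalty on U and V. It minimizes that with full-batch gradient descent in U and V. The hyperparameters and toy ratings are illustrative assumptions, not the authors' settings.

```python
import random


def train_pmf(ratings, num_users, num_items, k=2, lr=0.01, reg=0.02,
              epochs=2000, seed=0):
    """Full-batch gradient descent on the regularized squared error."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(num_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(num_items)]
    for _ in range(epochs):
        # Start each gradient with the L2 penalty term.
        grad_U = [[reg * x for x in row] for row in U]
        grad_V = [[reg * x for x in row] for row in V]
        for i, j, r in ratings:
            err = sum(U[i][f] * V[j][f] for f in range(k)) - r
            for f in range(k):
                grad_U[i][f] += err * V[j][f]
                grad_V[j][f] += err * U[i][f]
        for i in range(num_users):
            for f in range(k):
                U[i][f] -= lr * grad_U[i][f]
        for j in range(num_items):
            for f in range(k):
                V[j][f] -= lr * grad_V[j][f]
    return U, V


def predict(U, V, i, j):
    """Predicted rating is the inner product of the latent vectors."""
    return sum(a * b for a, b in zip(U[i], V[j]))


# Hypothetical toy ratings: (user_index, item_index, score)
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 1, 1)]
U, V = train_pmf(ratings, num_users=2, num_items=2)
train_rmse = (sum((predict(U, V, i, j) - r) ** 2 for i, j, r in ratings)
              / len(ratings)) ** 0.5
print(round(train_rmse, 3))
```

The latent dimension `k` plays the role of the "latent dimension" choice on the slide. Running the same code with different `k` values is one way to obtain several PMF results for the later ensemble.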

  9. Ordinal Matrix Factorization
     • Matrix Factorization
     • Ordinal Regression
     孔东营

  10. Ordinal Matrix Factorization
     • Probabilistic Ordinal Regression
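The slide's formulas are not preserved, so the following is only a sketch of one common form of probabilistic ordinal regression: a cumulative-logit model, where a real-valued score (for example a matrix-factorization inner product) is cut into rating levels by ordered thresholds. The logistic link, the threshold values, and the function names are all assumptions, not necessarily the formulation the authors used.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordinal_probabilities(score, thresholds):
    """Return P(rating = level) for each level under a cumulative-logit model.

    thresholds must be increasing; K - 1 thresholds give K rating levels,
    with P(rating <= level) = sigmoid(threshold[level] - score).
    """
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical thresholds for a 5-level rating scale
thresholds = [-1.5, -0.5, 0.5, 1.5]
probs = ordinal_probabilities(0.3, thresholds)
print([round(p, 3) for p in probs])
```

A higher score shifts probability mass toward the higher rating levels, which is what lets the factorization treat ratings as ordered categories instead of plain real numbers.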

  11. Topic features
     • PLSA
     • Solved with the EM algorithm:
       • E-step
       • M-step
     • PLSA topics are computed on user social and movie_user and used as features
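The E-step and M-step formulas are missing from the transcript. The sketch below uses the standard asymmetric PLSA updates: the E-step computes P(z | d, w) proportional to P(z | d) P(w | z), and the M-step re-estimates P(z | d) and P(w | z) from count-weighted responsibilities. The count matrix, the topic count, and the names are illustrative assumptions. Rows could stand for users and columns for movies or followed accounts.

```python
import random


def normalized(row):
    """Scale a row to sum to 1 (uniform if the row is all zeros)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, num_topics, iterations=50, seed=0):
    """Fit PLSA with EM; counts[d][w] is a co-occurrence count."""
    rng = random.Random(seed)
    num_docs, num_words = len(counts), len(counts[0])
    p_z_d = [normalized([rng.random() + 1e-3 for _ in range(num_topics)])
             for _ in range(num_docs)]
    p_w_z = [normalized([rng.random() + 1e-3 for _ in range(num_words)])
             for _ in range(num_topics)]
    for _ in range(iterations):
        new_z_d = [[0.0] * num_topics for _ in range(num_docs)]
        new_w_z = [[0.0] * num_words for _ in range(num_topics)]
        for d in range(num_docs):
            for w in range(num_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: responsibility of each topic for the pair (d, w)
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(num_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(num_topics):
                    weight = n * joint[z] / total
                    new_z_d[d][z] += weight
                    new_w_z[z][w] += weight
        # M-step: renormalize the accumulated expected counts
        p_z_d = [normalized(row) for row in new_z_d]
        p_w_z = [normalized(row) for row in new_w_z]
    return p_z_d, p_w_z


# Hypothetical toy count matrix with two obvious blocks
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
p_z_d, p_w_z = plsa(counts, num_topics=2)
for row in p_z_d:
    print([round(p, 3) for p in row])
```

Each row of `p_z_d` is a topic distribution for one row of the count matrix, which is the kind of vector that could be appended to the feature set.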

  12. Ensemble
     Choose different feature sets and numbers of latent factors:
     • libfm: 10 results, 0.608-0.622
     • PMF: 2 results, 0.618, 0.619 (user_id, movie_id)
     • SVD: 3 results, 0.619-0.621
     • OMF: 3 results, 0.614, 0.618, 0.616
     • PLSA: 1 result, 0.621
     • LR: 2 results, 0.642, 0.627
     • SGD: 1 result, 0.633
     Ridge Regression
     Random 10-fold cross-validation
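The slide names ridge regression and random 10-fold cross-validation without showing how they were wired together. One plausible reading, sketched here under that assumption, is to learn blending weights over the individual models' predictions with ridge regression and estimate the blend's error with random K-fold splits. The synthetic data, the penalty value, and all function names are hypothetical.

```python
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return w minimizing ||Xw - y||^2 + lam * ||w||^2."""
    m = len(X[0])
    A = [[sum(row[a] * row[b] for row in X) + (lam if a == b else 0.0)
          for b in range(m)] for a in range(m)]
    rhs = [sum(row[a] * t for row, t in zip(X, y)) for a in range(m)]
    return solve_linear_system(A, rhs)


def rmse(X, y, w):
    errors = [sum(wi * xi for wi, xi in zip(w, row)) - t
              for row, t in zip(X, y)]
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


def cross_validated_rmse(X, y, lam, folds=10, seed=0):
    """Average held-out RMSE over random folds."""
    indices = list(range(len(y)))
    random.Random(seed).shuffle(indices)
    scores = []
    for f in range(folds):
        held_out = indices[f::folds]
        held_set = set(held_out)
        train = [i for i in indices if i not in held_set]
        w = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        scores.append(rmse([X[i] for i in held_out],
                           [y[i] for i in held_out], w))
    return sum(scores) / folds


# Synthetic stand-ins for two models' predictions on a validation set
rng = random.Random(1)
truth = [rng.uniform(1, 5) for _ in range(200)]
model_a = [t + rng.gauss(0, 0.6) for t in truth]
model_b = [t + rng.gauss(0, 0.8) for t in truth]
X = [[a, b] for a, b in zip(model_a, model_b)]
blend_rmse = cross_validated_rmse(X, truth, lam=1.0)
print(round(blend_rmse, 3))
```

With real data, each column of `X` would be one model's predictions (libfm, PMF, SVD, OMF, and so on), and the fitted weights would be applied to the same models' test-set predictions.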
