# Reinforcement Learning


* [DRN-A-Deep-Reinforcement-Learning-Framework-for-News-Recommendation](/machine-learning-notes/advanced-knowledge/reinforcement-learning/drn-a-deep-reinforcement-learning-framework-for-news-recommendation.md)

## Why apply reinforcement learning to recommender systems

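As a practitioner working with data at the **hundred-billion scale**, here are what I consider the **most important points** in recommender systems. They may differ somewhat from the other answers.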

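1. **Engineering architecture at different scales:** the number of features can range from **hundreds** to **millions** to **tens of billions**, and the engineering architecture differs enormously from one scale to the next.
2. **Choosing the objective:** how you choose the objective determines how you build user profiles and features. Changing an objective is highly disruptive, and there is no clear way to tell whether an objective was set scientifically.
3. **Learning long-term objectives:** a short-term objective can be one-hop (the cost the user pays in a single interaction, such as a payment or a purchase), but the long-term objective must be the cost the user pays over the long run (long-term consumption, user stickiness). Learning that is very hard. Many companies and universities are researching it (1, 2, 3), and their work is worth consulting.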

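These points are hard to sidestep, and over the next few years they will become the differentiators among companies' recommender systems. Frankly, everyone already knows the core techniques well, and Wide & Deep is very widely used. The remaining core problems come down to who can solve them fastest and get furthest ahead.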
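
To make the contrast in point 3 concrete, here is a minimal sketch that compares a one-step reward with a discounted long-term return, the quantity reinforcement learning usually optimizes. The reward sequences, the strategy names, and the discount factor `gamma = 0.9` are illustrative assumptions, not values from the original answer.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Return sum_t gamma**t * rewards[t], accumulated from the last step back."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-step rewards for two recommendation strategies.
clickbait = [1.0, 0.2, 0.0, 0.0]  # strong first click, then the user leaves
engaging = [0.6, 0.6, 0.6, 0.6]   # weaker first click, steady engagement

# Short-term objective: only the immediate (one-hop) reward counts.
short_term = {"clickbait": clickbait[0], "engaging": engaging[0]}

# Long-term objective: the discounted sum over the whole session.
long_term = {
    "clickbait": discounted_return(clickbait),
    "engaging": discounted_return(engaging),
}

print("short-term winner:", max(short_term, key=short_term.get))
print("long-term winner:", max(long_term, key=long_term.get))
```

Under these assumed numbers the one-step objective prefers the first strategy, while the discounted return prefers the second. That gap is why a model trained only on immediate feedback can work against long-term engagement.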

## References

* [What are the pitfalls of recommender systems? - Geek An](https://www.zhihu.com/question/28247353/answer/399162539)

The section on why reinforcement learning is applied to recommender systems draws on this answer.

\===

[What are the latest advances in reinforcement learning for recommender systems?](https://www.zhihu.com/question/57388498/answer/570874226)

\[1] Dulac-Arnold G, Evans R, van Hasselt H, et al. Deep reinforcement learning in large discrete action spaces\[J]. arXiv preprint arXiv:1512.07679, 2015.

\[2] Liebman E, Saar-Tsechansky M, Stone P. Dj-mc: A reinforcement-learning agent for music playlist recommendation\[C]//Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems, 2015: 591-599.

\[3] Zheng G, Zhang F, Zheng Z, et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation\[C]//Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2018: 167-176.

\[4] Zou L, Xia L, Ding Z, Song J, Liu W, Yin D. [Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems](http://export.arxiv.org/abs/1902.05570)\[C]//KDD 2019.

FeedRec, a new reinforcement learning framework published at KDD 2019 by Tsinghua University and JD.com.

\[5] YouTube RL recommendation: Top-K Off-Policy Correction for a REINFORCE Recommender System. Google, WSDM 2019.


