arxiv:2403.13684
whj363636
AI & ML interests
None yet
Recent Activity
upvoted a paper 27 days ago
Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe
upvoted a paper 3 months ago
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning