DolbyUUU

yuwang91 DolbyUUU

🚀 Machine Learning Engineer | NLP & LLM
📈 Economist | Empirical & Behavioral
🎓 PhD | Decision Science & Managerial Economics

2 followers · 10 following

Pinned

Logic-RL-Lite Public

Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".

Python 3
DeepEnlighten Public

Post-trains base models for social reasoning using pure RL, without SFT. A lightweight replication of DeepSeek-R1-Zero on the Social IQa dataset.

Python 1