Yanfu Ren DaFuCoding
Stars
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
MiroThinker is an open source deep research agent optimized for research and prediction. It achieves an 80.8% Avg@8 score on the challenging GAIA benchmark.
MiniMax-M2, a model built for Max coding & agentic workflows.
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Post-training with Tinker
Salesforce Enterprise Deep Research
Toolkit for linearizing PDFs for LLM datasets/training
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
Scalable data preprocessing and curation toolkit for LLMs
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
RewardAnything: Generalizable Principle-Following Reward Models
Tools for OpenDataArena: Fair, Open, and Transparent Arena for Data
Bash is all You need - Write a nano Claude Code 0 - 1
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Kimi K2 is the large language model series developed by Moonshot AI team
Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".
A live reading list for LLM data synthesis (Updated to July, 2025).
🦛 CHONK docs with Chonkie ✨: The lightweight ingestion library for fast, efficient and robust RAG pipelines
A powerful tool for creating datasets for LLM fine-tuning, RAG and Eval
Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo...