Stars
The agent that grows with you
A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models
Robust Speech Recognition via Large-Scale Weak Supervision
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, ...
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Fast and memory-efficient exact attention
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Data processing for and with foundation models!
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Solve Visual Understanding with Reinforced VLMs
Pioneering Automated GUI Interaction with Native Agents
Fully open reproduction of DeepSeek-R1
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
[IJCV-2021] FairMOT: On the Fairness of Detection and Re-Identification in Multi-Object Tracking
A high-throughput and memory-efficient inference and serving engine for LLMs
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
BoxMOT: Pluggable Python and C++ SOTA multi-object tracking modules with support for axis-aligned and oriented bounding boxes
Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1st-place solution for LSVOS)
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8...