lan wavelet2008
Stars
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Community maintained hardware plugin for vLLM on Ascend
State-of-the-art 2D and 3D Face Analysis Project
InspireFace is a cross-platform face recognition SDK developed in C/C++, supporting multiple operating systems and various backend types for inference, such as CPU, GPU, and NPU.
Easy to use device for connecting "old" measuring units (water, power, gas, ...) to the digital world
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, ...
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
[IROS 2025 Best Paper Award Finalist & IEEE TRO 2026] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation built on local LLM, embedding, and reranker models, with GraphRAG support and no third-party agent library required.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
[ECCV 2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Real-time face swap and one-click video deepfake with only a single image
A model that achieves dual detection (Infrared + RGB) with rotation
Quick exploration into fine-tuning Florence-2
YOLOv10 on-board C++ deployment with Rockchip RKNN, targeting the RK3588 platform.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Strong and Open Vision Language Assistant for Mobile Devices
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.