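MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.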

4,211 289 Updated May 23, 2026

chaoswork / sft_datasets

A curated collection of open-source SFT datasets, updated as new ones appear

581 41 Updated Jun 2, 2023

lonePatient / awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Python 5,574 513 Updated May 30, 2026

EwingYangs / awesome-open-gpt

Curated collection of Open Source Projects Related to GPT 🚀🔥🔥

Python 6,015 546 Updated May 16, 2025

artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,926 875 Updated Jun 10, 2024

mosaicml / llm-foundry

LLM training code for Databricks foundation models

Python 4,407 589 Updated Mar 25, 2026

shadowsocks / go-shadowsocks2

Forked from riobard/go-shadowsocks2

Modern Shadowsocks in Go

Go 4,725 1,493 Updated Oct 20, 2024

tom-snow / wechat-windows-versions

Archive of historical WeChat versions

Shell 3,134 474 Updated Dec 20, 2025

FranxYao / chain-of-thought-hub

Benchmarking large language models' complex reasoning ability with chain-of-thought prompting

Jupyter Notebook 2,773 144 Updated Aug 4, 2024

OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University

Python 12,138 1,133 Updated May 27, 2026

nebuly-ai / optimate

A collection of libraries to optimise AI model performances

Python 8,338 620 Updated Jul 22, 2024

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

Python 41,395 4,511 Updated May 25, 2026

CVI-SZU / Linly

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction-tuning datasets

Python 3,050 225 Updated Apr 14, 2024

microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,590 908 Updated Dec 17, 2024

CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,750 484 Updated Jan 8, 2024

MorvanZhou / Reinforcement-learning-with-tensorflow

Simple Reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI teaching

Python 9,459 4,983 Updated Mar 31, 2024

facebookresearch / metaseq

Repo for external large-scale work

Python 6,547 718 Updated Apr 27, 2024

EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 7,441 1,117 Updated Jun 11, 2026

TsinghuaAI / CPM-2-Pretrain

Code for CPM-2 Pre-Train

Python 157 25 Updated Mar 18, 2023

alibaba / ChatUI

The UI design language and React library for Conversational UI

TypeScript 4,404 424 Updated Mar 30, 2026

brightmart / nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

9,901 1,554 Updated Feb 6, 2026

dennybritz / reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Jupyter Notebook 22,036 6,134 Updated Jul 13, 2023

horovod / horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,688 2,244 Updated Dec 1, 2025

asyml / texar

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Python 2,392 368 Updated Aug 26, 2021

jma127 / pyltr

Python learning to rank (LTR) toolkit

Python 465 105 Updated Dec 27, 2025

CLUEbenchmark / CLUEDatasetSearch

Search all Chinese NLP datasets, with commonly used English NLP datasets included

Python 4,455 626 Updated Nov 21, 2022

thunlp / PromptPapers

Must-read papers on prompt-based tuning for pre-trained language models.

4,315 391 Updated Jul 17, 2023

frida / frida

Clone this repo to build Frida

Meson 20,974 2,133 Updated Jun 11, 2026

fatedier / frp

A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.

Go 107,314 15,062 Updated Jun 3, 2026

