Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
#

direct-preference-optimization

Here are 18 public repositories matching this topic...

The Rap Music Generator project is an innovative LLM-based tool designed to create rap lyrics. It offers multiple fine-tuning approaches to accommodate diverse rap generation techniques, providing users with a versatile platform for generating unique and stylistically varied content.

  • Updated May 18, 2025
  • Jupyter Notebook

Enhancing paraphrase-type generation using Direct Preference Optimization (DPO) and Reinforcement Learning from Human Feedback (RLHF), with large-scale HPC support. This project aligns model outputs to human-ranked data for robust, safety-focused NLP.

  • Updated Jun 4, 2025
  • Jupyter Notebook

Improve this page

Add a description, image, and links to the direct-preference-optimization topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the direct-preference-optimization topic, visit your repo's landing page and select "manage topics."

Learn more

AltStyle によって変換されたページ (->オリジナル) /