Fine-tune language models to automatically generate documentation for different programming languages (Python, Java, Go, etc.)


fastbatchai/docstring-generation


🚀 AutoDoc Course


Learn how to fine-tune language models to automatically generate high-quality docstrings across multiple programming languages.

🎯 What You'll Learn

  • Multi-task Fine-tuning: Train models to generate docstrings across multiple programming languages simultaneously
  • LLM Fine-tuning Techniques: Instruction fine-tuning and RL fine-tuning using GRPO
  • Hands-on Experience: Work with different fine-tuning libraries (PEFT, TRL, Unsloth)
  • Cloud Infrastructure: Deploy scalable training with Modal
  • Performance Evaluation: Compare models using automated metrics and evaluation frameworks
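As a rough illustration of the multi-task, instruction-style setup described above, the sketch below turns (code, docstring) pairs from several languages into prompt/completion records that share one template. The `PROMPT_TEMPLATE` text, the `Sample` class, and the function names are hypothetical and chosen only for this example; the course's actual data format may differ.

```python
from dataclasses import dataclass

# Hypothetical prompt template (an assumption for illustration only).
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Write a docstring for the following {language} function.\n\n"
    "### Code:\n{code}\n\n"
    "### Docstring:\n"
)


@dataclass
class Sample:
    """One training pair: source code and its reference docstring."""
    language: str
    code: str
    docstring: str


def to_sft_example(sample: Sample) -> dict[str, str]:
    """Turn one (code, docstring) pair into a prompt/completion record."""
    prompt = PROMPT_TEMPLATE.format(
        language=sample.language, code=sample.code.strip()
    )
    return {"prompt": prompt, "completion": sample.docstring.strip()}


def build_multitask_dataset(samples: list[Sample]) -> list[dict[str, str]]:
    """Mix samples from all languages into a single training list."""
    return [to_sft_example(s) for s in samples]


samples = [
    Sample("Python", "def add(a, b):\n    return a + b",
           "Return the sum of a and b."),
    Sample("Go", "func Add(a, b int) int { return a + b }",
           "Add returns the sum of a and b."),
]
dataset = build_multitask_dataset(samples)
print(dataset[1]["prompt"])
```

Naming the language in the prompt is one simple way to let a single model serve every language; records in this shape can then be handed to whichever fine-tuning library is in use.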

🚀 Quick Start

# Clone and install
git clone https://github.com/fastbatchai/docstring-generation.git
cd docstring-generation
uv pip install -e .
# Setup Modal
modal setup
# Run training
modal run -i -m autoDoc.train --training-type sft --use-unsloth

📖 Course Lessons

📊 Results

Fine-tuning Performance: CodeGemma vs CodeGemma+LoRA

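| Language   | CodeGemma | CodeGemma+LoRA | Improvement |
|------------|-----------|----------------|-------------|
| Python     | 0.47      | 0.52           | +11%        |
| Java       | 0.57      | 0.55           | -4%         |
| JavaScript | 0.43      | 0.48           | +12%        |
| Go         | 0.49      | 0.54           | +10%        |
| PHP        | 0.42      | 0.63           | +50%        |
| Ruby       | 0.52      | 0.60           | +15%        |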

NOTE: These are preliminary results based on training with a small subset (1K samples for each programming language).
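The Improvement column appears to be the relative change from the base score to the LoRA score, rounded to the nearest percent. The README does not state the formula, so this is an assumption, but it reproduces every row of the table. A minimal sketch (the `SCORES` dictionary and `relative_improvement` function are illustrative names, not part of the repository):

```python
# Scores copied from the results table: (CodeGemma, CodeGemma+LoRA).
SCORES = {
    "Python": (0.47, 0.52),
    "Java": (0.57, 0.55),
    "JavaScript": (0.43, 0.48),
    "Go": (0.49, 0.54),
    "PHP": (0.42, 0.63),
    "Ruby": (0.52, 0.60),
}


def relative_improvement(base: float, tuned: float) -> int:
    """Relative change from base to tuned, as a whole-number percentage.

    Assumed formula: (tuned - base) / base * 100, rounded to nearest integer.
    """
    return round((tuned - base) / base * 100)


for language, (base, tuned) in SCORES.items():
    change = relative_improvement(base, tuned)
    print(f"{language:<10} {base:.2f} -> {tuned:.2f}  {change:+d}%")
```

Because the change is measured relative to the base score, PHP's gain of 0.21 points on a base of 0.42 shows up as +50%, while Java's drop of 0.02 on a base of 0.57 shows up as -4%.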

Instruction fine-tuning results

Model Comparison Across Different Base Models
LoRA Configuration Impact on Performance

More results are available in Lesson 5: Evaluation and Comparison

🤝 Community

📄 License

MIT License - see LICENSE file for details.


⭐ Star this repository if you found it helpful!
