Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
#

dataset-generator

Here are 55 public repositories matching this topic...

This repository hosts a comprehensive suite for graph-based entity summarization dataset generating from user-selected Wikipedia pages. Utilizing a series of interconnected modules, it leverages Wikidata and Wikipedia dumps to construct a dataset, alongside auto-generated ground truths.

  • Updated Jun 24, 2024
  • Python
ImageFromTextGenerator

IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, apply over 10 built-in noise effects, and customize fonts and layouts. IFTG supports all languages and offers endless noise combinations, including custom noise creation.

  • Updated Apr 3, 2025
  • Python

Sudoku4LLM is a Sudoku dataset generator for training and evaluating reasoning in Large Language Models (LLMs). It offers customizable puzzles, difficulty levels, and 11 serialization formats to support structured data reasoning and Chain of Thought (CoT) experiments.

  • Updated Apr 28, 2025
  • Python

Improve this page

Add a description, image, and links to the dataset-generator topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the dataset-generator topic, visit your repo's landing page and select "manage topics."

Learn more

AltStyle によって変換されたページ (->オリジナル) /