Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Releases: CodaCipher/iterabeast

IteraBeast v0.4

23 Mar 03:49
@CodaCipher CodaCipher

Choose a tag to compare

Typing SVG IteraBeast Main Demo


Synth DataGen Engine
High-performance, UI-driven synthetic data generation for AI fine-tuning.

Status Backend Frontend Output


📡 OVERVIEW

IteraBeast is a modern, cyber-aesthetic web application designed to simplify and accelerate the process of generating large-scale synthetic datasets (.jsonl) for training and fine-tuning LLMs.

Engineered for AI Researchers, Data Scientists, and LLM Fine-tuners, this tool bridges the gap between raw prompt engineering and production-grade dataset creation. It transforms the chaotic task of data synthesis into a streamlined, visually immersive operation—perfect for building RAG pipelines, fine-tuning adapters (LoRA/QLoRA), or generating evaluation benchmarks.

By leveraging a dual-node architecture (FastAPI backend + React frontend), it allows developers to batch-generate diverse, context-aware conversational data across multiple LLM providers simultaneously.


⚡ CORE CAPABILITIES

🧬 Multi-Provider Node Matrix

Seamlessly integrates local (Ollama) and cloud nodes (Groq, OpenRouter, DeepInfra) into a unified generation grid. Switch providers instantly without breaking the workflow.

🔄 Advanced Distribution Routing

Features intelligent workload balancing algorithms including Sequential, Round-Robin, and Hybrid strategies to maximize throughput and minimize API rate limits.

🛡️ Strict JSONL & Schema Enforcement

Implements a rigorous Post-Generation Validation Layer that guarantees 100% valid JSONL syntax. The engine automatically sanitizes output and escapes forbidden characters, ensuring zero-fail ingestion for training pipelines.

🗃️ Direct Stream Architecture

Bypasses memory bottlenecks by streaming generated .jsonl chunks directly to your local SSD via the FileSystem Access API. Capable of handling massive datasets with zero latency.

🧠 Semantic Variation Injection

Prevents dataset overfitting by using MiniLM embeddings to analyze and inject dynamic context. The system autonomously alters sentence structures to ensure high-entropy, semantically diverse data distribution.

🎨 UNSTABLE_CORE Interface

A reactive, hardware-accelerated UI with real-time cost/token telemetry and interchangeable themes (MAGI / UNSTABLE_CORE), designed for high-velocity data operations.


IteraBeast Feature 1

Multi-Provider Integration & Node Configuration IteraBeast Feature 3

Semantic Variation System & Distribution Routing

🛠️ QUICK START

1. Backend Service (FastAPI)

cd backend
python -m venv .venv
# Activate virtual environment:
.venv\Scripts\activate # Windows (Command Prompt)
# .\.venv\Scripts\Activate.ps1 # Windows (PowerShell)
# source .venv/bin/activate # Linux/Mac
pip install -r requirements.txt
python main.py

API runs on http://localhost:8000

2. Frontend Client (React)

cd frontend
npm install
npm run dev

Interface accessible at http://localhost:5173


📁 ARCHITECTURE

IteraBeast/
├── backend/ # Async Server Node
│ ├── main.py # API Endpoints & Generators
│ └── requirements.txt # Dependencies
├── frontend/ # Client UI Node
│ ├── src/
│ │ ├── components/ # Interface Elements & Terminal
│ │ ├── App.jsx # State & Execution Logic
│ │ └── index.css # Styling & Animations
│ └── package.json
└── README.md

⚙️ REQUIREMENTS

  • Python 3.9+
  • Node.js 18+
  • Chromium-based browser (Chrome/Edge) recommended for full FileSystemWritableFileStream support.


[ SYSTEM_STATUS: OPERATIONAL ] | [ CAPACITY: OPTIMAL ]

CodaCipher

END_OF_LINE_SEQUENCE

Assets 2
Loading

AltStyle によって変換されたページ (->オリジナル) /