Paper2Any

English | 中文

✨ Focus on paper multimodal workflows: from paper PDFs/screenshots/text to one-click generation of model diagrams, technical roadmaps, experimental plots, and slide decks ✨

🔥 News

Tip

🆕 2026-05-30 · Editable PPT Workflow Consolidation
The latest frontend editable PPT workflow is now integrated into Paper2Any, combining outline-assisted generation, canvas editing, gallery review, and paper / AI image insertion into editable decks.
HTML-based editable PPTX export and ONLYOFFICE online editing remain available as optional layers on top of the main editable export workflow.

Tip

🆕 2026-04-24 · Image Model Playground Upgrade
Added a new Image Model Playground page for managed image generation across Nano Banana 2 / Nano Banana Pro / Image 2 / Image 2 All.
The workflow now supports language control, model-specific generation options, batch generation (1 / 2 / 4 / 8 / 16), compressed thumbnail previews, and one-click batch download.

Tip

🆕 2026-04-15 · 2026 Paper Updates
Two Paper2Any-related papers are now listed in the 2026 cycle:
Paper2SysArch: Structure-Constrained System Architecture Generation from Scientific Papers · CVPR 2026 Findings
SciFlow-Bench: Evaluating Structure-Aware Scientific Diagram Generation via Inverse Parsing · ACL 2026 Main

BibTeX

@article{guo2025paper2sysarch,
 title = {Paper2SysArch: Structure-Constrained System Architecture Generation from Scientific Papers},
 author = {Guo, Ziyi and Liu, Zhou and Zhang, Wentao},
 journal = {arXiv preprint arXiv:2511.18036},
 year = {2025},
 note = {CVPR 2026 Findings}
}
@article{zhang2026sciflowbench,
 title = {SciFlow-Bench: Evaluating Structure-Aware Scientific Diagram Generation via Inverse Parsing},
 author = {Zhang, Tong and Lin, Honglin and Liu, Zhou and Chen, Chong and Zhang, Wentao},
 journal = {arXiv preprint arXiv:2602.09809},
 year = {2026},
 note = {ACL 2026 Main}
}

Tip

🆕 2026-03-28 · Editable PPT Showcase Refresh
Added two new editable PPT showcase screenshots for the frontend-deck workflow:
a generated multi-slide gallery view and the canvas editing workspace with deck theme lock.

Earlier updates

Tip

🆕 2026-03-26 · Workflow Showcase Update
Added showcase coverage for Paper2Video, Paper2Poster, and Paper2Citation.
The README now includes a compressed video demo plus refreshed English/Chinese workflow previews.

Tip

🆕 2026-02-02 · Paper2Rebuttal
Added rebuttal drafting support with structured response guidance and image-aware revision prompts.

Tip

🆕 2026-01-28 · Drawio Update
Added Drawio support for visual diagram creation and showcase-ready outputs in the workflow.
Knowledge Base updates: multi-file PPT generation with document conversion and merging, optional image injection, and embedding-assisted retrieval.

Tip

🆕 2026-01-25 · New Features
Added AI-assisted outline editing, a three-layer model configuration system for flexible model selection, and user points management with daily quota allocation.
🌐 Online Demo: http://dcai-paper2any.nas.cpolar.cn/

Tip

🆕 2026-01-20 · Bug Fixes
Fixed bugs in experimental plot generation (image/text) and resolved the missing historical files issue.
🌐 Online Demo: http://dcai-paper2any.nas.cpolar.cn/

Tip

🆕 2026-01-14 · Feature Updates & Backend Architecture Upgrade

Feature Updates: Added Image2PPT, optimized Paper2Figure interaction, and improved PDF2PPT effects.

Standardized API: Refactored backend interfaces around a RESTful /api/v1/ structure and removed obsolete endpoints for better maintainability.

Dynamic Configuration: Added dynamic model selection (e.g., GPT-4o, Qwen-VL) via API parameters, eliminating hardcoded model dependencies.
🌐 Online Demo: http://dcai-paper2any.nas.cpolar.cn/

2025-12-12 · Paper2Figure Web public beta is live
2025-10-01 · Released the first version 0.1.0

✨ Core Features

From paper PDFs / images / text to editable scientific figures, slide decks, video scripts, academic posters, and other multimodal content in one click.

Paper2Any currently includes the following sub-capabilities:

📊 Paper2Figure - Editable Scientific Figures: Model architecture diagrams, technical roadmaps (PPT + SVG), and experimental plots with editable PPTX output.
🧩 Paper2Diagram / Image2Drawio - Editable Diagrams: Generate draw.io diagrams from paper/text or images, with drawio/png/svg export and chat-based edits.
🎬 Paper2PPT - Editable Slide Decks: Paper/text/topic to PPT, long-doc support, and built-in table/figure extraction.
📝 Paper2Rebuttal: Draft structured rebuttals and revision responses with claims-to-evidence grounding.
🖼️ PDF2PPT - Layout-Preserving Conversion: Accurate layout retention for PDF → editable PPTX.
🖼️ Image2PPT - Image to Slides: Convert images or screenshots into structured slides.
🔥 Image Model Playground: Directly call backend-managed image models with prompt templates, language control, batch generation, compressed previews, and zip download.
🎨 PPTPolish - Smart Beautification: AI-based layout optimization and style transfer.
🎬 Paper2Video: Generate video scripts and narration assets.
🖼️ Paper2Poster - Academic Poster: Turn paper PDFs into poster-ready layouts with configurable sections, logos, and export assets.
🔎 Paper2Citation - Citation Explorer: Track citing authors, institutions, and notable downstream works from author names or DOI/paper URLs.
📝 Paper2Technical: Produce technical reports and method summaries.
📚 Knowledge Base (KB): Ingest/embedding, semantic search, and KB-driven PPT/podcast/mindmap generation.

📸 Showcase

🧩 Drawio

✨ Upload a paper figure or screenshot as the starting point
✨ Keep the source structure visible before conversion
✨ Convert the image into an editable DrawIO canvas

✨ Generate a model or system diagram directly inside the DrawIO workbench
✨ Refine the generated architecture with chat editing and export-ready layout

📝 Paper2Rebuttal: Rebuttal Drafting

✨ Rebuttal drafting and revision support

📊 Paper2Figure: Scientific Figure Generation

✨ Model Architecture Diagram Generation

✨ Technical roadmap workbench: choose route type, input source, model config, and visual template
✨ Generated technical roadmap figure with structured dual-column layout

✨ Experimental Plot Generation (Multiple Styles)

🎬 Paper2PPT: Paper to Presentation

✨ End-to-end PPT generation demo
✨ Paper / text / topic to polished slide deck

✨ Edit slide text directly on canvas while keeping the deck theme locked
✨ Review the generated multi-page gallery before export

✨ AI-assisted outline refinement with targeted rewrite prompts
✨ Structured outline editing down to section and bullet detail

✨ Long document support for 40+ slides · Intelligent table extraction and insertion · Version history and iterative deck management

🎬 Paper2Video: PPT to Narrated Video

✨ PPT / PDF to narrated video with script confirmation, Aliyun TTS voices, and downloadable output

🖼️ Paper2Poster: Paper to Poster

PNG poster result
PPT poster result

✨ Paper PDF to academic poster with configurable layout, editable poster output, and one-click export

🔎 Paper2Citation: Citation Explorer

✨ Search authors or papers to inspect citation candidates, institutions, and downstream citation context

🎨 PPT Smart Beautification

✨ AI-based Layout Optimization

✨ AI-based Layout Optimization & Style Transfer

🖼️ PDF2PPT: Layout-Preserving Conversion

✨ Intelligent Cutout & Layout Preservation

✨ Image2PPT

🚀 Quick Start

Requirements

Python and pip (the installation guides below use Python 3.11 on Linux and Python 3.12 on Windows).

`.env` Modes

Paper2Any now supports two configuration styles:

Simple mode: use *.env.simple.example. Recommended for most self-hosted users.
Advanced mode: use *.env.example. Use this only when you need workflow-specific model/provider overrides.

Quick choice:

cp fastapi_app/.env.simple.example fastapi_app/.env
cp frontend-workflow/.env.simple.example frontend-workflow/.env

If you need fine-grained workflow overrides instead:

cp fastapi_app/.env.example fastapi_app/.env
cp frontend-workflow/.env.example frontend-workflow/.env
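
The two modes differ only in which example file is copied to `.env`. As a minimal sketch of that choice, assuming it runs from the repository root, the same copies can be scripted in Python. The helper names `env_copy_plan` and `apply_plan` are hypothetical and not part of Paper2Any:

```python
import shutil
from pathlib import Path

# Components that each carry their own .env file (paths taken from the commands above).
COMPONENTS = ("fastapi_app", "frontend-workflow")

# Example-file name for each configuration mode.
EXAMPLE_FILES = {"simple": ".env.simple.example", "advanced": ".env.example"}


def env_copy_plan(mode: str) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs for the chosen mode (hypothetical helper)."""
    if mode not in EXAMPLE_FILES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(EXAMPLE_FILES)}")
    return [(Path(c) / EXAMPLE_FILES[mode], Path(c) / ".env") for c in COMPONENTS]


def apply_plan(plan: list[tuple[Path, Path]], overwrite: bool = False) -> None:
    """Copy each example file, leaving an existing .env untouched unless overwrite is set."""
    for src, dst in plan:
        if dst.exists() and not overwrite:
            print(f"skip {dst} (already exists)")
            continue
        shutil.copyfile(src, dst)
        print(f"copied {src} -> {dst}")
```

Calling `apply_plan(env_copy_plan("simple"))` mirrors the two `cp` commands for simple mode. The `overwrite` guard is a design choice of this sketch so that an edited `.env` is not silently replaced.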

🐳 Docker (Recommended) — Deployment & Updates

# 1. Clone
git clone https://github.com/OpenDCAI/Paper2Any.git
cd Paper2Any
# 2. Configure environment variables
cp fastapi_app/.env.simple.example fastapi_app/.env
cp frontend-workflow/.env.simple.example frontend-workflow/.env
cp deploy/docker.env.example deploy/docker.env

Required configuration:

fastapi_app/.env (backend):

# Internal API auth key. Must match frontend VITE_API_KEY.
BACKEND_API_KEY=your-backend-api-key
# Recommended: let backend own all workflow model choices
APP_BILLING_MODE=free
PAPER2ANY_CONFIG_MODE=simple
# Required: unified text entry
SIMPLE_TEXT_API_URL=https://your-text-gateway/v1
SIMPLE_TEXT_API_KEY=your_text_key
# Optional but recommended: unified image entry
SIMPLE_IMAGE_API_URL=https://your-image-gateway
SIMPLE_IMAGE_API_KEY=your_image_key
# Optional: DrawIO OCR / VLM service
SIMPLE_OCR_API_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
SIMPLE_OCR_API_KEY=your_dashscope_key
# Optional: MinerU official remote API
MINERU_API_BASE_URL=https://mineru.net/api/v4
MINERU_API_KEY=your_mineru_api_key
# Optional: SAM3 segmentation service for PDF2PPT / Image2PPT / Image2Drawio
# SAM3_SERVER_URLS=http://GPU_MACHINE_IP:8001
# SAM3_SERVER_URLS=http://GPU1:8021,http://GPU2:8022
# Optional: Supabase (skip for no auth — core features still work)
# SUPABASE_URL=https://your-project-id.supabase.co
# SUPABASE_ANON_KEY=your_supabase_anon_key

frontend-workflow/.env (frontend):

# Must match BACKEND_API_KEY in fastapi_app/.env
VITE_API_KEY=your-backend-api-key
# Usually keep VITE_API_BASE_URL empty in Docker, because nginx proxies /api and /outputs
VITE_API_BASE_URL=
# Frontend display defaults only
VITE_DEFAULT_LLM_API_URL=https://your-text-gateway/v1
VITE_DEFAULT_LLM_MODEL=gpt-4o
# Optional: Supabase (keep consistent with backend)
# VITE_SUPABASE_URL=https://your-project-id.supabase.co
# VITE_SUPABASE_ANON_KEY=your_supabase_anon_key
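
Because `BACKEND_API_KEY` and `VITE_API_KEY` must match, a quick consistency check before building can save a rebuild. The following is a rough sketch only: `parse_env` is a simplified, hypothetical parser that handles plain `KEY=VALUE` lines and full-line comments, not every dotenv feature.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and lines starting with '#'."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes; inline comments are not handled here.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_keys_match(backend_env: str, frontend_env: str) -> bool:
    """True when BACKEND_API_KEY is set and equals the frontend's VITE_API_KEY."""
    backend_key = parse_env(backend_env).get("BACKEND_API_KEY", "")
    frontend_key = parse_env(frontend_env).get("VITE_API_KEY", "")
    return bool(backend_key) and backend_key == frontend_key
```

In practice the two arguments would be the contents of `fastapi_app/.env` and `frontend-workflow/.env`, for example read with `pathlib.Path(...).read_text()`.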

deploy/docker.env (compose overrides):

BACKEND_PORT=8000
FRONTEND_PORT=3000
DOCKER_APP_WORKERS=1
# Optional: enable local SAM3 container by running DOCKER_WITH_SAM3=1 bash deploy/docker-up.sh
SAM3_PORT=8021
SAM3_SERVER_URLS=

# 3. Build + run
bash deploy/docker-up.sh

Open:

Frontend: http://localhost:3000
Backend health: http://localhost:8000/health
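
To confirm the backend is up without opening a browser, the health URL can be probed from Python. This sketch assumes only that the endpoint answers with HTTP 200 when healthy; the response body format is not documented here, so it is not inspected.

```python
import http.client
import urllib.error
import urllib.request


def health_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the backend health URL listed above."""
    return f"http://{host}:{port}/health"


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False
```

For example, `is_healthy(health_url())` returns `False` while the containers are still starting and `True` once the backend responds.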

Optional ONLYOFFICE editable PPTX support:

Paper2PPT can export HTML-based editable PPTX files and open them in ONLYOFFICE for online editing. Start a local ONLYOFFICE Document Server first:

# Optional: load a pre-downloaded image tar if your deployment ships one
docker load -i /path/to/onlyoffice-documentserver-latest.tar
docker run -d --name paper2any-onlyoffice \
 -p 8082:80 \
 --add-host=host.docker.internal:host-gateway \
 -e JWT_ENABLED=false \
 -e ALLOW_PRIVATE_IP_ADDRESS=true \
 onlyoffice/documentserver:latest

Then configure backend environment variables:

ONLYOFFICE_DOCUMENT_SERVER_URL=/onlyoffice
ONLYOFFICE_THINKFLOW_PUBLIC_URL=http://host.docker.internal:8000
ONLYOFFICE_DOCUMENT_DOWNLOAD_BASE_URL=http://host.docker.internal:8000
ONLYOFFICE_SERVER_DOWNLOAD_URL_BASE=http://127.0.0.1:8082
ONLYOFFICE_JWT_SECRET=

For local Vite development, /onlyoffice is proxied to http://localhost:8082. If you access the frontend through an SSH forwarded port, set Document Server storage.externalHost to the browser-facing origin, for example http://localhost:13000/onlyoffice. See ONLYOFFICE editable PPTX deployment for the full setup, troubleshooting, and production notes.

GPU services note: Docker starts backend + frontend by default.
Paper2PPT, Paper2Figure, Knowledge Base, etc. only need LLM APIs and work out of the box.

PDF2PPT, Image2PPT, Image2Drawio require SAM3 segmentation.
You can either point backend .env to an external SAM3 service with SAM3_SERVER_URLS=..., or start the optional local SAM3 compose profile:
DOCKER_WITH_SAM3=1 bash deploy/docker-up.sh
See the "Advanced Configuration: Local Model Service Load Balancing" section below for details.

Modify & update:

After changing code or .env, rebuild: bash deploy/docker-up.sh
Pull latest code and rebuild:
- git pull
- bash deploy/docker-up.sh

Common commands:

View logs: bash deploy/docker-logs.sh
Stop services: bash deploy/docker-down.sh
Build only: bash deploy/docker-build.sh

Notes:

The first build may take a while (system deps + Python deps).
Frontend env is baked at build time. If you change frontend-workflow/.env or deploy/docker.env, rebuild with bash deploy/docker-up.sh.
Outputs/models are mounted to the host (./outputs, ./models) for persistence.

🐧 Linux Installation

We recommend using Conda to create an isolated environment (Python 3.11).

1. Create Environment & Install Base Dependencies

# 0. Create and activate a conda environment
conda create -n paper2any python=3.11 -y
conda activate paper2any
# 1. Clone repository
git clone https://github.com/OpenDCAI/Paper2Any.git
cd Paper2Any
# 2. Install base dependencies
pip install -r requirements-base.txt
# 3. Install in editable (dev) mode
pip install -e .

2. Install Paper2Any-specific Dependencies (Required)

Paper2Any involves LaTeX rendering, vector graphics processing, and PPT/PDF conversion, all of which require extra dependencies.

The dependency boundary is now:

requirements-base.txt: shared cross-platform Python runtime
requirements-paper.txt: paper / PDF / figure extras
requirements-cu12.txt: NVIDIA CUDA 12 Linux GPU extras
requirements-system-ubuntu.txt: Ubuntu/Debian system packages, not Python packages

# 1. Paper / PDF / figure Python extras
pip install -r requirements-paper.txt
# 2. NVIDIA GPU runtime extras (Linux + CUDA 12 only)
pip install -r requirements-cu12.txt
# 3. LaTeX engine (tectonic) - recommended via conda
conda install -c conda-forge tectonic -y
# 4. Resolve doclayout_yolo dependency conflicts (Important)
pip install doclayout_yolo --no-deps
# 5. System dependencies (Ubuntu example; full list is mirrored in requirements-system-ubuntu.txt)
sudo apt-get update
sudo apt-get install -y ffmpeg inkscape libreoffice poppler-utils wkhtmltopdf

Important

ffmpeg, libreoffice/soffice, inkscape, poppler-utils, wkhtmltopdf, and tectonic are external system tools. They are not installed by pip, and deploy/start*.sh does not auto-install them.
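
Since none of these tools is installed by pip, it can help to verify that they are on `PATH` before launching a workflow. A minimal sketch using the standard library follows; the executable names are assumptions about typical installs (`pdftoppm` ships with poppler-utils, and LibreOffice is usually exposed as `soffice` or `libreoffice`), and `missing_tools` is a hypothetical helper.

```python
import shutil
from typing import Callable, Optional

# Tool name -> executables whose presence indicates the tool is installed.
REQUIRED_TOOLS = {
    "ffmpeg": ("ffmpeg",),
    "libreoffice": ("soffice", "libreoffice"),
    "inkscape": ("inkscape",),
    "poppler-utils": ("pdftoppm",),
    "wkhtmltopdf": ("wkhtmltopdf",),
    "tectonic": ("tectonic",),
}


def missing_tools(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the tools for which none of the candidate executables is on PATH."""
    return [
        tool
        for tool, candidates in REQUIRED_TOOLS.items()
        if not any(which(name) for name in candidates)
    ]
```

An empty list from `missing_tools()` means every external tool was found; otherwise the returned names show what still needs to be installed through apt or conda.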

3. Environment Variables

export DF_API_KEY=your_api_key_here
export DF_API_URL=xxx # Optional: if you need a third-party API gateway
export MINERU_DEVICES="0,1,2,3" # Optional: MinerU task GPU resource pool

Tip

📚 For step-by-step instructions on configuring models, setting environment variables, and starting services, see the Configuration Guide.

4. Configure Environment Files (Optional)

📝 Detailed .env Configuration Guide

Paper2Any uses two .env files for configuration. Both are optional - you can run the application without them using default settings.

Step 1: Copy Example Files

# Copy backend environment file
cp fastapi_app/.env.example fastapi_app/.env
# Copy frontend environment file
cp frontend-workflow/.env.example frontend-workflow/.env

Step 2: Backend Configuration (`fastapi_app/.env`)

Supabase (Optional) - Only needed if you want user authentication and cloud storage:

SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_ANON_KEY=your_supabase_anon_key

Model Configuration - Customize which models to use for different workflows:

# Default LLM API URL
DEFAULT_LLM_API_URL=http://123.129.219.111:3000/v1/
# Workflow-level defaults
PAPER2PPT_DEFAULT_MODEL=gpt-5.1
PAPER2PPT_DEFAULT_IMAGE_MODEL=gemini-3-pro-image-preview
PDF2PPT_DEFAULT_MODEL=gpt-4o
# ... see .env.example for full list

Service Integration Configuration - External or local services used by image/PDF workflows:

# DrawIO OCR / VLM
PAPER2DRAWIO_OCR_API_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
PAPER2DRAWIO_OCR_API_KEY=your_dashscope_key
# MinerU official remote API; if MINERU_API_KEY is empty, backend falls back to local MINERU_PORT
MINERU_API_BASE_URL=https://mineru.net/api/v4
MINERU_API_KEY=your_mineru_api_key
MINERU_API_MODEL_VERSION=vlm
# SAM3 segmentation service for PDF2PPT / Image2PPT / Image2Drawio
# One endpoint:
SAM3_SERVER_URLS=http://127.0.0.1:8001
# Or multiple endpoints for load balancing:
# SAM3_SERVER_URLS=http://127.0.0.1:8021,http://127.0.0.1:8022
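
`SAM3_SERVER_URLS` accepts either one endpoint or a comma-separated list. How the backend distributes requests across the list is not described here, so the sketch below uses plain round-robin purely as an illustration; both function names are hypothetical.

```python
import itertools
from typing import Iterator


def parse_server_urls(value: str) -> list[str]:
    """Split a comma-separated SAM3_SERVER_URLS value into clean base URLs."""
    return [url.strip().rstrip("/") for url in value.split(",") if url.strip()]


def round_robin(urls: list[str]) -> Iterator[str]:
    """Cycle through the endpoints in order (illustrative strategy, an assumption)."""
    if not urls:
        raise ValueError("SAM3_SERVER_URLS is empty")
    return itertools.cycle(urls)
```

With the two-endpoint value above, successive `next()` calls on the iterator alternate between port 8021 and port 8022.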

Step 3: Frontend Configuration (`frontend-workflow/.env`)

LLM Provider Configuration - Controls the API endpoint dropdown in the UI:

# Default API URL shown in the UI
VITE_DEFAULT_LLM_API_URL=https://api.apiyi.com/v1
# Available API URLs in the dropdown (comma-separated)
VITE_LLM_API_URLS=https://api.apiyi.com/v1,http://b.apiyi.com:16888/v1,http://123.129.219.111:3000/v1

What happens when you modify VITE_LLM_API_URLS:

The frontend will display a dropdown menu with all URLs you specify
Users can select different API endpoints without manually typing URLs
Useful for switching between OpenAI, local models, or custom API gateways

Supabase (Optional) - Uncomment these lines if you want user authentication:

VITE_SUPABASE_URL=https://your-project.supabase.co
VITE_SUPABASE_ANON_KEY=your-anon-key

Running Without Supabase

If you skip Supabase configuration:

✅ All core features work normally
✅ CLI scripts do not require Supabase
❌ No user authentication
❌ No cloud account features such as points, redeem, invite, and history
❌ No cloud file storage
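
The list above amounts to a feature gate keyed on the two Supabase variables. As a hedged sketch of that idea, assuming that both `SUPABASE_URL` and `SUPABASE_ANON_KEY` must be non-empty for account features to be enabled (the backend's real detection logic may differ):

```python
from typing import Mapping

# Account features that the list above ties to Supabase.
ACCOUNT_FEATURES = (
    "authentication",
    "points",
    "redeem",
    "invite",
    "history",
    "cloud file storage",
)


def supabase_configured(env: Mapping[str, str]) -> bool:
    """True only when both Supabase variables are present and non-empty."""
    return all(env.get(name, "").strip() for name in ("SUPABASE_URL", "SUPABASE_ANON_KEY"))


def available_account_features(env: Mapping[str, str]) -> tuple[str, ...]:
    """Account features are available only with Supabase; core workflows are unaffected."""
    return ACCOUNT_FEATURES if supabase_configured(env) else ()
```

Passing `os.environ` as `env` reports which account features the current shell configuration would enable.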

Note

Quick Start: You can skip the .env configuration entirely and use CLI scripts directly with --api-key parameter. See CLI Scripts section below.

Advanced Configuration: Local Model Service Load Balancing

If you are deploying in a high-concurrency local environment, you can use script/start_model_servers.sh to start a local model service cluster (MinerU / SAM / OCR).

Script location: script/start_model_servers.sh (relative to the repository root)

Main configuration items:

MinerU (PDF Parsing)
- MINERU_MODEL_PATH: Model path (default models/MinerU2.5-2509-1.2B)
- MINERU_GPU_UTIL: GPU memory utilization (default 0.85)
- Instance configuration: By default, one instance is started on each configured GPU, ports 8011-8013.
- Load Balancer: Port 8010, automatically dispatches requests.
SAM3 (Segment Anything Model 3)
- Instance configuration: By default, one instance per configured GPU, ports 8021-8022.
- Model assets: default paths are ./models/sam3/sam3.pt and ./models/sam3/bpe_simple_vocab_16e6.txt.gz.
- Load Balancer: Port 8020.
OCR (PaddleOCR)
- Config: Runs on CPU, uses uvicorn's worker mechanism (4 workers by default).
- Port: 8003.

Before using, please modify gpu_id and the number of instances in the script according to your actual GPU count and memory.
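
After starting the cluster, a quick port scan shows which services are actually listening. The sketch below hard-codes the default ports listed above; the service labels such as `mineru-1` are hypothetical names chosen for readability, and the map should be adjusted to match any changes made in the script.

```python
import socket

# Default ports from the list above; labels are illustrative only.
DEFAULT_PORTS = {
    "mineru-lb": 8010,
    "mineru-1": 8011,
    "mineru-2": 8012,
    "mineru-3": 8013,
    "sam3-lb": 8020,
    "sam3-1": 8021,
    "sam3-2": 8022,
    "ocr": 8003,
}


def port_open(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def service_status(ports: dict[str, int] = DEFAULT_PORTS) -> dict[str, bool]:
    """Map each service label to whether its port accepts connections."""
    return {name: port_open(port) for name, port in ports.items()}
```

An open port only shows that something is listening; it does not confirm that the model behind it has finished loading.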

For a one-command local development test on a single GPU (SAM3 + backend + frontend), run:

bash script/start_local_sam3_dev.sh

🪟 Windows Installation

Note

We currently recommend trying Paper2Any on Linux / WSL. If you need to deploy on native Windows, please follow the steps below.

1. Create Environment & Install Base Dependencies

# 0. Create and activate a conda environment
conda create -n paper2any python=3.12 -y
conda activate paper2any
# 1. Clone repository
git clone https://github.com/OpenDCAI/Paper2Any.git
cd Paper2Any
# 2. Install base dependencies
pip install -r requirements-win-base.txt
# 3. Install in editable (dev) mode
pip install -e .

2. Install Paper2Any-specific Dependencies (Recommended)

Paper2Any involves LaTeX rendering and vector graphics processing, which require extra dependencies:

# Python dependencies
pip install -r requirements-paper.txt
# NVIDIA GPU runtime extras (Linux only; skip on Windows)
# pip install -r requirements-cu12.txt
# tectonic: LaTeX engine (recommended via conda)
conda install -c conda-forge tectonic -y

🎨 Install Inkscape (SVG/Vector Graphics Processing | Recommended/Required)

Download and install (Windows 64-bit MSI): Inkscape Download
Add the Inkscape executable directory to the system environment variable Path (example): C:\Program Files\Inkscape\bin\

Tip

After configuring the Path, it is recommended to reopen the terminal (or restart VS Code / PowerShell) to ensure the environment variables take effect.

⚡ Install Windows Build of vLLM (Optional | For Local Inference Acceleration)

Release page: vllm-windows releases
Recommended version: 0.11.0

pip install vllm-0.11.0+cu124-cp312-cp312-win_amd64.whl

Important

Please make sure the .whl matches your current environment:

Python: cp312 (Python 3.12)
Platform: win_amd64
CUDA: cu124 (must match your local CUDA / driver)
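
The three checks above can be read directly off the wheel filename, which follows the standard `name-version-python-abi-platform.whl` layout. A small sketch that splits the filename and compares the Python tag with the running interpreter follows; the helper names are hypothetical, and the CUDA label still has to be compared with the local driver by hand.

```python
import sys
from typing import NamedTuple


class WheelTags(NamedTuple):
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel filename into its version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return WheelTags(parts[1], parts[-3], parts[-2], parts[-1])


def current_python_tag() -> str:
    """CPython tag of the running interpreter, e.g. cp312 on Python 3.12."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def local_version(version: str) -> str:
    """Return the local version label after '+', e.g. cu124, or '' if absent."""
    return version.partition("+")[2]
```

If `parse_wheel_name(...).python_tag` differs from `current_python_tag()`, pip will reject the wheel, so this check is worth running before downloading a large file.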

Launch Application

Paper2Any - Paper Workflow Web Frontend (Recommended)

# Recommended one-click entrypoint on NVIDIA machines
bash deploy/start_nv.sh

Default local addresses:

Frontend dev server: http://localhost:3000
Backend health: http://127.0.0.1:8000/health

Useful local deploy commands:

Start full stack (recommended): bash deploy/start_nv.sh
Start backend only after loading a deploy profile: set -a && source deploy/profiles/nv.env && set +a && bash deploy/start.sh
Stop backend: ./deploy/stop.sh
Restart backend: ./deploy/restart.sh

Notes:

deploy/start.sh reads deploy/app_config.sh, but it does not load deploy/profiles/*.env by itself.
deploy/start_nv.sh is the safe one-click entrypoint because it loads deploy/profiles/nv.env, prepares local models, starts model servers, then starts backend and frontend.
If you change APP_PORT, update the frontend proxy target in frontend-workflow/vite.config.ts as well.

Configure Frontend Proxy

Modify server.proxy in frontend-workflow/vite.config.ts:

export default defineConfig({
 plugins: [react()],
 server: {
 port: 3000,
 open: true,
 allowedHosts: true,
 proxy: {
 '/api': {
 target: 'http://127.0.0.1:8000', // FastAPI backend address
 changeOrigin: true,
 },
 '/outputs': {
 target: 'http://127.0.0.1:8000',
 changeOrigin: true,
 },
 },
 },
})

Visit http://localhost:3000.

Windows: Load MinerU Pre-trained Model

# Start in PowerShell
vllm serve opendatalab/MinerU2.5-2509-1.2B `
 --host 127.0.0.1 `
 --port 8010 `
 --logits-processors mineru_vl_utils:MinerULogitsProcessor `
 --gpu-memory-utilization 0.6 `
 --trust-remote-code `
 --enforce-eager

Launch Application

🎨 Web Frontend (Recommended)

# Recommended one-click entrypoint on NVIDIA machines
bash deploy/start_nv.sh

Visit http://localhost:3000. Backend health is available at http://127.0.0.1:8000/health by default.

🖥️ CLI Scripts (Command-Line Interface)

Paper2Any provides standalone CLI scripts that accept command-line parameters for direct workflow execution without requiring the web frontend/backend.

Environment Variables

Configure API access via environment variables (optional):

export DF_API_URL=https://api.openai.com/v1 # LLM API URL
export DF_API_KEY=sk-xxx # API key
export DF_MODEL=gpt-4o # Default model

Available CLI Scripts

1. Paper2Figure CLI - Generate scientific figures (3 types)

# Generate model architecture diagram from PDF
python script/run_paper2figure_cli.py \
 --input paper.pdf \
 --graph-type model_arch \
 --api-key sk-xxx
# Generate technical roadmap from text
python script/run_paper2figure_cli.py \
 --input "Transformer architecture with attention mechanism" \
 --input-type TEXT \
 --graph-type tech_route
# Generate experimental data visualization
python script/run_paper2figure_cli.py \
 --input paper.pdf \
 --graph-type exp_data

Graph types: model_arch (model architecture), tech_route (technical roadmap), exp_data (experimental plots)
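
When the CLI is driven from another Python program, building the argument list in one place avoids typos in the graph type. This sketch uses only the flags shown in the examples above; `build_paper2figure_cmd` is a hypothetical wrapper, not part of the repository.

```python
import sys
from typing import Optional

# Graph types accepted by --graph-type, as listed above.
GRAPH_TYPES = ("model_arch", "tech_route", "exp_data")


def build_paper2figure_cmd(
    input_value: str,
    graph_type: str,
    input_type: Optional[str] = None,
    api_key: Optional[str] = None,
) -> list[str]:
    """Assemble the argument list for script/run_paper2figure_cli.py."""
    if graph_type not in GRAPH_TYPES:
        raise ValueError(f"graph_type must be one of {GRAPH_TYPES}, got {graph_type!r}")
    cmd = [
        sys.executable,
        "script/run_paper2figure_cli.py",
        "--input", input_value,
        "--graph-type", graph_type,
    ]
    if input_type:
        cmd += ["--input-type", input_type]
    if api_key:
        cmd += ["--api-key", api_key]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list rather than a shell string keeps text inputs with spaces intact.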

2. Paper2PPT CLI - Convert papers to PPT presentations

# Basic usage
python script/run_paper2ppt_cli.py \
 --input paper.pdf \
 --api-key sk-xxx \
 --page-count 15
# With custom style
python script/run_paper2ppt_cli.py \
 --input paper.pdf \
 --style "Academic style; English; Modern design" \
 --language en

3. PDF2PPT CLI - One-click PDF to editable PPT

# Basic conversion (no AI enhancement)
python script/run_pdf2ppt_cli.py --input slides.pdf
# With AI enhancement
python script/run_pdf2ppt_cli.py \
 --input slides.pdf \
 --use-ai-edit \
 --api-key sk-xxx

4. Image2PPT CLI - Convert images to editable PPT

# Basic conversion
python script/run_image2ppt_cli.py --input screenshot.png
# With AI enhancement
python script/run_image2ppt_cli.py \
 --input diagram.jpg \
 --use-ai-edit \
 --api-key sk-xxx

5. PPT2Polish CLI - Beautify existing PPT files

# Basic beautification
python script/run_ppt2polish_cli.py \
 --input old_presentation.pptx \
 --style "Academic style, clean and elegant" \
 --api-key sk-xxx
# With reference image for style consistency
python script/run_ppt2polish_cli.py \
 --input old_presentation.pptx \
 --style "Modern minimalist style" \
 --ref-img reference_style.png \
 --api-key sk-xxx

Note

System Requirements for PPT2Polish:

LibreOffice: sudo apt-get install libreoffice (Ubuntu/Debian)
pdf2image: pip install pdf2image
poppler-utils: sudo apt-get install poppler-utils

Common Options

All CLI scripts support these common options:

--api-url URL - LLM API URL (default: from DF_API_URL env var)
--api-key KEY - API key (default: from DF_API_KEY env var)
--model NAME - Text model name (default: varies by script)
--output-dir DIR - Custom output directory (default: outputs/cli/{script_name}/{timestamp})
--help - Show detailed help message
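
The "default: from env var" behaviour described for `--api-url` and `--api-key` is a common argparse pattern. The sketch below illustrates that pattern only; it is not the scripts' actual source, and `build_parser` is a hypothetical name.

```python
import argparse
import os
from typing import Mapping


def build_parser(env: Mapping[str, str] = os.environ) -> argparse.ArgumentParser:
    """Create a parser whose defaults fall back to DF_API_URL / DF_API_KEY."""
    parser = argparse.ArgumentParser(description="Sketch of env-backed CLI defaults")
    parser.add_argument(
        "--api-url",
        default=env.get("DF_API_URL"),
        help="LLM API URL (default: DF_API_URL env var)",
    )
    parser.add_argument(
        "--api-key",
        default=env.get("DF_API_KEY"),
        help="API key (default: DF_API_KEY env var)",
    )
    return parser
```

With this pattern an explicit flag always overrides the environment variable, and the environment variable is used only when the flag is omitted.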

For complete parameter documentation, run any script with --help:

python script/run_paper2figure_cli.py --help

📂 Project Structure

Paper2Any/
├── dataflow_agent/ # Core codebase
│ ├── agentroles/ # Agent definitions
│ │ └── paper2any_agents/ # Paper2Any-specific agents
│ ├── workflow/ # Workflow definitions
│ ├── promptstemplates/ # Prompt templates
│ └── toolkits/ # Toolkits (drawing, PPT generation, etc.)
├── fastapi_app/ # Backend API service
├── frontend-workflow/ # Frontend web interface
├── static/ # Static assets
├── script/ # Script tools
└── tests/ # Test cases

🗺️ Roadmap

Feature	Status	Sub-features
📊 Paper2Figure (Editable Scientific Figures)	85%	4 of 4 done
🧩 Paper2Diagram (Drawio Diagrams)	80%	4 of 4 done
🎬 Paper2PPT (Editable Slide Decks)	70%	6 of 6 done
🖼️ PDF2PPT (Layout-Preserving Conversion)	90%	3 of 3 done
🖼️ Image2PPT (Image to Slides)	85%	2 of 2 done
🎨 PPTPolish (Smart Beautification)	60%	1 of 3 done, 2 in progress
📚 Knowledge Base (KB Workflows)	75%	3 of 3 done
🎬 Paper2Video (Video Script Generation)	40%	2 in progress

🤝 Contributing

We welcome all forms of contribution!

Open an Issue, start a Discussion, or submit a PR.

📄 License

This project is licensed under the Apache License 2.0.

If this project helps you, please give us a ⭐️ Star!

DataFlow-Agent WeChat Community
Scan the QR code to join the community WeChat group

Made with ❤️ by the OpenDCAI Team
