|
| 1 | +# Issue #11 Implementation: Simplified mdxai |
| 2 | + |
| 3 | +**Date:** 2025年10月03日 |
| 4 | +**Issue:** https://github.com/dot-do/.do/issues/11 |
| 5 | +**Status:** Implemented - Ready for Review |
| 6 | + |
| 7 | +## Summary |
| 8 | + |
| 9 | +Implemented a simplified mdxai CLI with natural language prompts, frontmatter-to-Zod schema mapping, parallel generation with p-queue, and mdxdb data import capabilities for Zapier and O*NET. |
| 10 | + |
| 11 | +## Changes Made |
| 12 | + |
| 13 | +### 1. New Simplified mdxai CLI (`mdxai/src/cli-simple.ts`) |
| 14 | + |
| 15 | +Created a natural language-driven CLI that supports: |
| 16 | + |
| 17 | +**Natural Language Patterns:** |
| 18 | +```bash |
| 19 | +# Data-driven generation |
| 20 | +mdxai for every occupation, write a blog post about how AI will transform it |
| 21 | +mdxai for each service, create documentation |
| 22 | +mdxai for all technologies, generate use case examples |
| 23 | + |
| 24 | +# Single generation |
| 25 | +mdxai write a blog post about AI in healthcare |
| 26 | +``` |
| 27 | + |
| 28 | +**Key Features:** |
| 29 | +- Parses natural language to extract collection, task, and context |
| 30 | +- Queries local MDX files as data sources |
| 31 | +- Loads README.md as generation instructions |
| 32 | +- Loads _template.mdx for frontmatter schema |
| 33 | +- Parallel generation with p-queue (default: 25 concurrent) |
| 34 | +- Frontmatter-to-Zod schema mapping |
| 35 | +- **Only generates fields that fail Zod validation** (as requested) |
| 36 | + |
| 37 | +### 2. Smart Zod Validation |
| 38 | + |
| 39 | +The implementation now: |
| 40 | +1. Validates existing item data against Zod schema from frontmatter |
| 41 | +2. Identifies which fields are missing or invalid |
| 42 | +3. Only generates those specific fields via `generateObject()` |
| 43 | +4. Merges valid fields with newly generated fields |
| 44 | +5. Writes complete MDX with validated frontmatter |
| 45 | + |
| 46 | +This prevents unnecessary regeneration and preserves valid existing data. |
| 47 | + |
| 48 | +### 3. mdxdb Importers Package (`@mdxdb/importers`) |
| 49 | + |
| 50 | +Created data importers for external sources: |
| 51 | + |
| 52 | +**Zapier Importer:** |
| 53 | +- Fetches apps from https://zapier.com/api/v4/apps/ |
| 54 | +- Converts to MDX with frontmatter |
| 55 | +- Includes categories, API docs, images |
| 56 | +- Expresses relationships (App → Categories, Triggers, Actions) |
| 57 | + |
| 58 | +**O*NET Importer:** |
| 59 | +- Downloads occupation data from onetcenter.org |
| 60 | +- Fetches task statements and skills |
| 61 | +- Converts to MDX with relationships |
| 62 | +- Links occupations to tasks and skills via SOC codes |
| 63 | + |
| 64 | +**CLI Usage:** |
| 65 | +```bash |
| 66 | +pnpm mdxdb-import zapier --output ./zapier-apps |
| 67 | +pnpm mdxdb-import onet --output ./occupations |
| 68 | +``` |
| 69 | + |
| 70 | +### 4. Updated Package Configuration |
| 71 | + |
| 72 | +**mdxai/package.json:** |
| 73 | +- Added `@anthropic-ai/claude-agent-sdk` dependency |
| 74 | +- Added `@mdxdb/core`, `@mdxdb/fs`, `@mdxdb/importers` workspace dependencies |
| 75 | +- Updated bin to use `cli-simple.js` as default |
| 76 | +- Legacy CLI available as `mdxai-legacy` |
| 77 | + |
| 78 | +**tsup.config.ts:** |
| 79 | +- Added `cli-simple` entry point for new CLI |
| 80 | + |
| 81 | +### 5. Documentation |
| 82 | + |
| 83 | +Created comprehensive documentation: |
| 84 | + |
| 85 | +**README-SIMPLIFIED.md** - User guide for simplified mdxai: |
| 86 | +- Quick start examples |
| 87 | +- Natural language patterns |
| 88 | +- Data source setup |
| 89 | +- Template usage |
| 90 | +- Integration with mdxdb |
| 91 | +- Migration guide from legacy CLI |
| 92 | + |
| 93 | +**@mdxdb/importers/README.md** - Importer documentation: |
| 94 | +- Installation and CLI usage |
| 95 | +- Programmatic API |
| 96 | +- Output format specs |
| 97 | +- Relationship mappings |
| 98 | +- Adding new importers |
| 99 | + |
| 100 | +## Usage Examples |
| 101 | + |
| 102 | +### Example 1: Generate from O*NET Occupations |
| 103 | + |
| 104 | +```bash |
| 105 | +# 1. Import O*NET data |
| 106 | +pnpm mdxdb-import onet --output ./occupations |
| 107 | + |
| 108 | +# 2. Create output directory with instructions |
| 109 | +mkdir -p ./blog-posts |
| 110 | +cat > ./blog-posts/README.md << 'EOF' |
| 111 | +Write engaging blog posts about how AI is transforming different occupations. |
| 112 | +Use a conversational tone and include real-world examples. |
| 113 | +EOF |
| 114 | + |
| 115 | +# 3. Create template for structured output |
| 116 | +cat > ./blog-posts/_template.mdx << 'EOF' |
| 117 | +--- |
| 118 | +title: Blog post title |
| 119 | +occupation: Original occupation name |
| 120 | +socCode: O*NET SOC code |
| 121 | +tags: ['ai', 'career', 'transformation'] |
| 122 | +readingTime: 8 minutes |
| 123 | +author: AI Content Generator |
| 124 | +--- |
| 125 | +EOF |
| 126 | + |
| 127 | +# 4. Generate content |
| 128 | +mdxai --dir ./blog-posts for every occupation, write a blog post about how AI will transform it |
| 129 | +``` |
| 130 | + |
| 131 | +### Example 2: Generate from Zapier Apps |
| 132 | + |
| 133 | +```bash |
| 134 | +# 1. Import Zapier apps |
| 135 | +pnpm mdxdb-import zapier --output ./zapier-apps |
| 136 | + |
| 137 | +# 2. Generate integration guides |
| 138 | +mdxai --dir ./integration-guides for every app, create an integration guide |
| 139 | +``` |
| 140 | + |
| 141 | +### Example 3: Custom Model and Concurrency |
| 142 | + |
| 143 | +```bash |
| 144 | +# Use GPT-4 with 50 concurrent generations |
| 145 | +mdxai --model gpt-4 --concurrency 50 for every service, create API documentation |
| 146 | +``` |
| 147 | + |
| 148 | +## Architecture |
| 149 | + |
| 150 | +``` |
| 151 | +Natural Language Prompt |
| 152 | + ↓ |
| 153 | +Parse (extract collection, task, context) |
| 154 | + ↓ |
| 155 | +Query Collection (local MDX files) |
| 156 | + ↓ |
| 157 | +Load Instructions (README.md) |
| 158 | + ↓ |
| 159 | +Load Template (frontmatter schema → Zod) |
| 160 | + ↓ |
| 161 | +For Each Item: |
| 162 | + ├─ Validate with Zod |
| 163 | + ├─ Identify invalid/missing fields |
| 164 | + ├─ Generate only those fields |
| 165 | + └─ Merge with valid fields |
| 166 | + ↓ |
| 167 | +Parallel Generation (p-queue, 25 concurrent) |
| 168 | + ↓ |
| 169 | +Write Output (validated MDX files) |
| 170 | +``` |
| 171 | + |
| 172 | +## Key Design Decisions |
| 173 | + |
| 174 | +### 1. Natural Language First |
| 175 | +Instead of explicit commands (`generate`, `list`, `research`), use natural language patterns that feel more intuitive. |
| 176 | + |
| 177 | +### 2. Template-Driven Generation |
| 178 | +README provides instructions, _template.mdx defines schema. This separation keeps generation logic decoupled from the CLI. |
| 179 | + |
| 180 | +### 3. Smart Validation |
| 181 | +Only generate what's needed. If data already has valid `title` and `description`, don't regenerate them—just fill in missing fields. |
| 182 | + |
| 183 | +### 4. Workspace Architecture |
| 184 | +Keep importers separate as `@mdxdb/importers` but integrated via workspace. This allows independent versioning and reuse. |
| 185 | + |
| 186 | +### 5. Parallel by Default |
| 187 | +Modern APIs can handle concurrent requests. Default to 25 parallel workers, customizable via `--concurrency`. |
| 188 | + |
| 189 | +## Future Enhancements (Per User Feedback) |
| 190 | + |
| 191 | +### Import Syntax in Frontmatter |
| 192 | +Instead of separate importer CLI, support import configuration directly in MDX frontmatter: |
| 193 | + |
| 194 | +```mdx |
| 195 | +--- |
| 196 | +title: Occupations Collection |
| 197 | +import: |
| 198 | + source: onet |
| 199 | + type: occupations |
| 200 | + version: "30_0" |
| 201 | + includes: [tasks, skills] |
| 202 | +--- |
| 203 | +``` |
| 204 | + |
| 205 | +Or via code blocks: |
| 206 | + |
| 207 | +````mdx |
| 208 | +--- |
| 209 | +title: Custom Data Collection |
| 210 | +--- |
| 211 | + |
| 212 | +```typescript import |
| 213 | +import { fetchZapierApps } from '@mdxdb/importers' |
| 214 | + |
| 215 | +export default async function importData() { |
| 216 | + const apps = await fetchZapierApps(250, 10) |
| 217 | + return apps.map(app => ({ |
| 218 | + id: app.id, |
| 219 | + title: app.title, |
| 220 | + description: app.description, |
| 221 | + })) |
| 222 | +} |
| 223 | +``` |
| 224 | +```` |
| 225 | + |
| 226 | +This would: |
| 227 | +1. Keep import config with collection definition |
| 228 | +2. Support both declarative (YAML) and imperative (code) imports |
| 229 | +3. Allow Zod schema validation on imported data |
| 230 | +4. Enable version control for import configurations |
| 231 | + |
| 232 | +## Next Steps |
| 233 | + |
| 234 | +1. **Test the implementation:** |
| 235 | + ```bash |
| 236 | + cd mdx |
| 237 | + pnpm build:packages |
| 238 | + pnpm test --filter mdxai |
| 239 | + ``` |
| 240 | + |
| 241 | +2. **Try the simplified CLI:** |
| 242 | + ```bash |
| 243 | + pnpm mdxdb-import onet --output ./occupations |
| 244 | + mdxai for every occupation, create a summary |
| 245 | + ``` |
| 246 | + |
| 247 | +3. **Implement frontmatter import syntax** (per user feedback) |
| 248 | + |
| 249 | +4. **Add Claude Agents SDK integration** for advanced agent capabilities |
| 250 | + |
| 251 | +5. **Create examples directory** with complete working demos |
| 252 | + |
| 253 | +## Files Modified |
| 254 | + |
| 255 | +### New Files: |
| 256 | +- `packages/mdxai/src/cli-simple.ts` - Simplified CLI implementation |
| 257 | +- `packages/mdxai/README-SIMPLIFIED.md` - User documentation |
| 258 | +- `packages/mdxdb/importers/src/zapier.ts` - Zapier importer |
| 259 | +- `packages/mdxdb/importers/src/onet.ts` - O*NET importer |
| 260 | +- `packages/mdxdb/importers/src/cli.ts` - Importer CLI |
| 261 | +- `packages/mdxdb/importers/src/index.ts` - Importer exports |
| 262 | +- `packages/mdxdb/importers/package.json` - Package config |
| 263 | +- `packages/mdxdb/importers/tsconfig.json` - TypeScript config |
| 264 | +- `packages/mdxdb/importers/README.md` - Importer docs |
| 265 | + |
| 266 | +### Modified Files: |
| 267 | +- `packages/mdxai/package.json` - Added dependencies and bin |
| 268 | +- `packages/mdxai/tsup.config.ts` - Added cli-simple entry point |
| 269 | + |
| 270 | +## Related Issues |
| 271 | + |
| 272 | +- Issue #11: Simplify mdxai |
| 273 | +- Implements: Natural language prompts, Zod schema mapping, parallel generation, data importers |
| 274 | +
|
| 275 | +## Notes |
| 276 | +
|
| 277 | +- Legacy CLI preserved as `mdxai-legacy` for backward compatibility |
| 278 | +- All new functionality is additive—no breaking changes |
| 279 | +- Documentation includes migration guide for existing users |
| 280 | +- Importers package follows mdxdb pattern (core, fs, sqlite, payload, velite, render, **importers**) |
| 281 | + |
| 282 | +--- |
| 283 | + |
| 284 | +**Implementation by:** Claude Code |
| 285 | +**Review Status:** Awaiting user feedback on frontmatter import syntax |
0 commit comments