GitHub - Code-Glider/markdown-splitter: A Python tool that intelligently splits Markdown files based on header levels

# Markdown File Splitter
A Python tool that intelligently splits Markdown files into separate files based on header levels (# and ##) while maintaining a comprehensive table of contents that includes all header levels (# through #####). Built with LangChain for enhanced metadata handling.
## Features
- **Smart Splitting**: Splits markdown files at # and ## headers into separate files
- **Comprehensive TOC**: Generates table of contents including all header levels (# to #####)
- **Clean Filenames**: Creates URL-friendly filenames from headers using underscores
- **Metadata Preservation**: Uses LangChain for enhanced metadata handling
- **Hierarchy Maintenance**: Preserves document structure and header relationships
- **Flexible Output**: Customizable output directory for split files
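As an illustration of the "Clean Filenames" feature, here is a minimal sketch of how a header could be turned into an underscore-separated filename. The function name `header_to_filename` and the `section` fallback are assumptions made for this example and are not taken from the project's code.

```python
import re


def header_to_filename(header: str, extension: str = ".md") -> str:
    """Turn a header such as '## Getting Started' into 'getting_started.md'.

    Hypothetical helper for illustration; the project's own naming rules may differ.
    """
    # Drop the leading '#' markers and any surrounding whitespace.
    title = header.lstrip("#").strip()
    # Lowercase, then collapse each run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    # Fall back to a generic name when the header has no usable characters.
    return (slug or "section") + extension


print(header_to_filename("## Getting Started"))
```

Collapsing runs of punctuation and spaces into a single underscore keeps the names predictable and safe to use in links and on all three supported operating systems.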
## Installation
```bash
# Clone the repository
git clone https://github.com/yourusername/markdown-file-splitter.git
# Navigate to the directory
cd 
# Install required packages
pip install -r requirements.txt
```
## Usage
### Basic Usage
```python
from markdown_splitter import process_markdown

# Process a markdown file
input_file = "your_markdown_file.md"
output_directory = "split_markdown"
process_markdown(input_file, output_directory)
```
### Output Structure
```
output_directory/
├── table_of_contents.md
├── introduction.md
├── getting_started.md
└── advanced_features.md
```
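
The `table_of_contents.md` file in the tree above could be produced along the following lines. This is a sketch under stated assumptions: the function name `build_toc` and the nested-bullet layout are hypothetical, and the project itself may format its table of contents differently. It collects headers from `#` through `#####`, as described in this README, and ignores `#` lines inside fenced code blocks.

```python
import re

# One to five '#' characters followed by whitespace and the header title.
HEADER_RE = re.compile(r"^(#{1,5})\s+(.+?)\s*$")


def build_toc(markdown_text: str) -> str:
    """Return a nested bullet list of every header from level 1 to level 5."""
    lines = ["# Table of Contents", ""]
    in_fence = False
    for line in markdown_text.splitlines():
        # Toggle fence state so '#' comments inside code blocks are skipped.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADER_RE.match(line)
        if match:
            level = len(match.group(1))
            # Indent two spaces per nesting level below the top-level header.
            lines.append("  " * (level - 1) + "- " + match.group(2))
    return "\n".join(lines) + "\n"
```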

## Requirements
- Python 3.7+
- LangChain library
- Operating System: Windows, macOS, or Linux

## How It Works
1. **File Reading**: Reads the input markdown file
2. **Header Processing**: Identifies all header levels (# through #####)
3. **Content Splitting**: Splits content at # and ## headers
4. **Metadata Extraction**: Uses LangChain to extract and preserve metadata
5. **File Generation**: Creates separate files for each section
6. **TOC Creation**: Generates a comprehensive table of contents
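
The content-splitting step above can be sketched with the standard library alone. This is not the project's implementation, which according to this README relies on LangChain; the function name `split_sections` and the `preamble` label for text before the first header are assumptions for illustration. A new section starts at each `#` or `##` header, deeper headers stay inside their parent section, and lines inside fenced code blocks are never treated as headers.

```python
import re
from typing import List, Optional, Tuple

# Only level-1 and level-2 headers start a new section.
SPLIT_RE = re.compile(r"^#{1,2}\s+(.+?)\s*$")


def split_sections(markdown_text: str) -> List[Tuple[str, str]]:
    """Return (title, content) pairs, one per # or ## header."""
    sections: List[Tuple[str, str]] = []
    title: Optional[str] = None
    body: List[str] = []
    in_fence = False

    def flush() -> None:
        # Store the current section unless it is an empty preamble.
        if title is not None or any(ln.strip() for ln in body):
            sections.append((title or "preamble", "\n".join(body).strip() + "\n"))

    for line in markdown_text.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else SPLIT_RE.match(line)
        if match:
            flush()
            # Start a new section; keep the header line as its first line.
            title, body = match.group(1), [line]
        else:
            body.append(line)
    flush()
    return sections
```

Each returned pair could then be written to its own file in the output directory, with the title converted to a filename.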

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## License
This project is licensed under the MIT License - see the LICENSE file for details.

## Author
Your Name - your@email.com

## Acknowledgments
- LangChain for metadata handling capabilities
- Markdown community for inspiration and best practices

## Built With
- Python - Primary programming language
- LangChain - For metadata extraction and handling
