
namedivider-python🦒


NameDivider is a tool that divides Japanese full names into family and given names.

🚀 Try Live Demo • 📖 Documentation (日本語) • 🐳 Docker API • ⚡ Rust Version

💡 Why NameDivider?

Japanese full names like "菅義偉" are typically stored as single strings with no clear boundary between family and given names. NameDivider finds that boundary automatically, with a reported accuracy of up to 99.91%.

Unlike cloud-based AI solutions, NameDivider processes all data locally: no external API calls, no data transmission, and full privacy control.

# Before
person_name = "菅義偉" # How do you know where to divide?
# After 
from namedivider import BasicNameDivider
divider = BasicNameDivider()
result = divider.divide_name("菅義偉")
print(f"Family: {result.family}, Given: {result.given}")
# Family: 菅, Given: 義偉

✨ Key Features

🎯 99.91% accuracy - Tested on real-world Japanese names
⚡ Multiple algorithms - Choose between speed (Basic) or accuracy (GBDT)
🔐 Privacy-first - Local-only processing, ideal for sensitive data
🔧 Production ready - CLI, Python library, and Docker support
🎨 Interactive demo - Try it live with Streamlit
📊 Confidence scoring - Know when to trust the results
🛠️ Customizable rules - Add domain-specific patterns

🚀 Quick Start

Installation

pip install namedivider-python

Basic Usage

from namedivider import BasicNameDivider, GBDTNameDivider
# Faster, with good accuracy (99.3%)
basic_divider = BasicNameDivider()
result = basic_divider.divide_name("菅義偉")
print(result) # 菅 義偉
# Slower, with the best accuracy (99.9%)
gbdt_divider = GBDTNameDivider()
result = gbdt_divider.divide_name("菅義偉")
print(result.to_dict())
# {
# 'algorithm': 'gbdt',
# 'family': '菅',
# 'given': '義偉',
# 'score': 0.7300634880343344,
# 'separator': ' '
# }
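
The `score` field can be used to decide which results deserve a manual check. The sketch below works on plain dictionaries shaped like the `to_dict()` output above, so it runs without the library installed. The helper name `needs_review` and the 0.5 threshold are assumptions made for illustration and are not part of NameDivider; scores from different algorithms may not be directly comparable, so any cutoff should be tuned on your own data.

```python
def needs_review(divided: dict, threshold: float = 0.5) -> bool:
    """Return True when the score is below the (hypothetical) threshold."""
    return divided["score"] < threshold


# Dictionaries shaped like DividedName.to_dict(); the values are copied
# from the outputs shown elsewhere in this README.
results = [
    {"algorithm": "gbdt", "family": "菅", "given": "義偉",
     "score": 0.7300634880343344, "separator": " "},
    {"algorithm": "kanji_feature", "family": "竈門", "given": "炭治郎",
     "score": 0.3004587452426102, "separator": " "},
]

# Collect the names that fall below the threshold for manual inspection.
flagged = [
    r["family"] + r["separator"] + r["given"]
    for r in results
    if needs_review(r)
]
print(flagged)  # ['竈門 炭治郎']
```

A low score does not mean the split is wrong (the flagged name here is divided correctly); it only marks results that may be worth a second look.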

🔧 Multiple Interfaces

🖥️ Command Line Interface

Perfect for batch processing and automation:

# Single name
$ nmdiv name 菅義偉
菅 義偉
# Process file with progress bar
$ nmdiv file customer_names.txt
100%|██████████| 1000/1000 [00:02<00:00, 431.2it/s]
# Check accuracy on labeled data
$ nmdiv accuracy test_data.txt
Accuracy: 99.1%
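
An accuracy check like the one above comes down to comparing predicted splits against labeled ones. Here is a minimal sketch of that comparison, assuming each labeled line is a correctly divided name with a single space between family and given name; the input format that `nmdiv accuracy` actually expects may differ. `divide` is any callable that returns a `family given` string, and `first_char_divider` is a stand-in used only so the example runs without the library.

```python
from typing import Callable, Iterable


def accuracy(divide: Callable[[str], str], labeled: Iterable[str]) -> float:
    """Fraction of labeled names that `divide` splits at the same position."""
    total = 0
    correct = 0
    for line in labeled:
        expected = line.strip()
        if not expected:
            continue  # skip blank lines
        undivided = expected.replace(" ", "")
        total += 1
        if divide(undivided) == expected:
            correct += 1
    return correct / total if total else 0.0


def first_char_divider(name: str) -> str:
    """Stand-in divider: always treats the first character as the family name."""
    return f"{name[:1]} {name[1:]}"


print(accuracy(first_char_divider, ["菅 義偉", "竈門 炭治郎"]))  # 0.5
```

With the library installed, something like `lambda s: str(divider.divide_name(s))` could be passed as `divide`, since printing a result yields the `family given` form shown in Basic Usage.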

🐳 REST API (Docker)

For environments where Python cannot be used, we provide a containerized REST API:

# Run the API server
docker run -d -p 8000:8000 rskmoi/namedivider-api
# Send batch requests
curl -X POST localhost:8000/divide \
 -H "Content-Type: application/json" \
 -d '{"names": ["竈門炭治郎", "竈門禰豆子"]}'

Response:

{
 "divided_names": [
 {"family": "竈門", "given": "炭治郎", "separator": " ", "score": 0.3004587452426102, "algorithm": "kanji_feature"},
 {"family": "竈門", "given": "禰豆子", "separator": " ", "score": 0.30480429696983175, "algorithm": "kanji_feature"}
 ]
}
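
The same request can be sent from Python using only the standard library. This is a sketch that assumes the endpoint and JSON shapes shown in the curl example and response above; the function names are illustrative. `divide_names` needs a running container, so the demonstration at the bottom only parses the sample response offline.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/divide"  # same endpoint as the curl example


def build_request(names: list[str], url: str = API_URL) -> urllib.request.Request:
    """Build the POST request carrying {"names": [...]} as JSON."""
    body = json.dumps({"names": names}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: bytes) -> list[tuple[str, str]]:
    """Extract (family, given) pairs from the response body."""
    payload = json.loads(raw.decode("utf-8"))
    return [(d["family"], d["given"]) for d in payload["divided_names"]]


def divide_names(names: list[str]) -> list[tuple[str, str]]:
    """Send names to a running API container and return the parsed pairs."""
    with urllib.request.urlopen(build_request(names), timeout=10) as resp:
        return parse_response(resp.read())


# Offline check against the sample response shown above.
sample = json.dumps({
    "divided_names": [
        {"family": "竈門", "given": "炭治郎", "separator": " ",
         "score": 0.3004587452426102, "algorithm": "kanji_feature"},
        {"family": "竈門", "given": "禰豆子", "separator": " ",
         "score": 0.30480429696983175, "algorithm": "kanji_feature"},
    ]
}).encode("utf-8")
print(parse_response(sample))
```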

🎯 Interactive Web Demo

Try NameDivider instantly in your browser: Live Demo →

Run locally:

cd examples/demo
pip install -r requirements.txt
streamlit run example_streamlit.py

📊 Performance & Benchmarks

Algorithm	Accuracy	Speed (names/sec)	Use Case
BasicNameDivider / backend=python	99.3%	4152.8	Stable & compatible
BasicNameDivider / backend=rust	99.3%	18597.7	Max performance (if available)
GBDTNameDivider / backend=python	99.9%	1143.3	Best accuracy, guaranteed
GBDTNameDivider / backend=rust	99.9%	6277.4	Fast + accurate (if available)
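
To put these figures in perspective, the following snippet derives the Rust-over-Python speedup and a rough processing time from the table. The throughput numbers are copied from the table above; the helper names are illustrative, and actual timings will vary with hardware and data.

```python
# Throughput in names/sec, copied from the benchmark table.
throughput = {
    ("Basic", "python"): 4152.8,
    ("Basic", "rust"): 18597.7,
    ("GBDT", "python"): 1143.3,
    ("GBDT", "rust"): 6277.4,
}


def rust_speedup(algorithm: str) -> float:
    """Ratio of Rust-backend to Python-backend throughput."""
    return throughput[(algorithm, "rust")] / throughput[(algorithm, "python")]


def seconds_for(n_names: int, algorithm: str, backend: str) -> float:
    """Estimated wall time to process n_names at the tabulated throughput."""
    return n_names / throughput[(algorithm, backend)]


for algo in ("Basic", "GBDT"):
    print(f"{algo}: rust backend is {rust_speedup(algo):.2f}x faster")

print(f"1M names, GBDT/python: {seconds_for(1_000_000, 'GBDT', 'python'):.0f} s")
```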

Run your own benchmarks:

bash scripts/benchmark_sample.sh

🛠️ Advanced Features

Custom Rules

Handle domain-specific names with custom patterns:

from namedivider import BasicNameDivider, BasicNameDividerConfig
from namedivider import SpecificFamilyNameRule
config = BasicNameDividerConfig(
 custom_rules=[
 SpecificFamilyNameRule(family_names=["竜胆"]), # Rare family names
 ]
)
divider = BasicNameDivider(config=config)
result = divider.divide_name("竜胆尊")
# DividedName(family='竜胆', given='尊', separator=' ', score=1.0, algorithm='rule_specific_family')
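
Conceptually, a rule of this kind acts as a prefix match against a list of known family names. The sketch below illustrates that idea in plain Python; it is not the library's implementation of `SpecificFamilyNameRule`, and the function name `split_by_known_family` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def split_by_known_family(
    name: str, family_names: Sequence[str]
) -> Optional[Tuple[str, str]]:
    """Split `name` if it starts with a listed family name.

    Returns (family, given), or None when no listed name matches or the
    match would leave an empty given name.
    """
    # Try longer family names first so a longer match wins over its own prefix.
    for family in sorted(family_names, key=len, reverse=True):
        if family and name.startswith(family) and len(name) > len(family):
            return family, name[len(family):]
    return None  # no rule matched; a real divider would fall back to its algorithm


print(split_by_known_family("竜胆尊", ["竜胆"]))  # ('竜胆', '尊')
print(split_by_known_family("菅義偉", ["竜胆"]))  # None
```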

Speed Up

For high-volume processing, NameDivider offers several optimization options:

from namedivider import BasicNameDivider, BasicNameDividerConfig
# Load your names
with open("names.txt", "r", encoding="utf-8") as f:
 names = [line.strip() for line in f]
# Option 1: Enable caching (faster repeated processing)
config = BasicNameDividerConfig(cache_mask=True)
divider = BasicNameDivider(config=config)
results = [divider.divide_name(name) for name in names]
# Option 2: (beta) Use Rust backend (roughly 4-5x faster in the benchmark table above)
# First install: pip install namedivider-core
config = BasicNameDividerConfig(backend="rust")
divider = BasicNameDivider(config=config)
results = [divider.divide_name(name) for name in names]
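
When the input contains many repeated names, a complementary application-level approach is to divide each distinct name only once. This is a generic sketch, separate from the `cache_mask` option, whose internals are not described here; `divide_unique` is a hypothetical helper that accepts any callable, for example `divider.divide_name`.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def divide_unique(divide: Callable[[str], T], names: Iterable[str]) -> List[T]:
    """Call `divide` once per distinct name, preserving the input order."""
    cache: Dict[str, T] = {}
    results: List[T] = []
    for name in names:
        if name not in cache:
            cache[name] = divide(name)
        results.append(cache[name])
    return results


# Stand-in divider that records its calls, so the saving is visible.
calls: List[str] = []


def fake_divide(name: str) -> str:
    calls.append(name)
    return f"{name[:1]} {name[1:]}"


results = divide_unique(fake_divide, ["菅義偉", "菅義偉", "竈門炭治郎"])
print(len(results), len(calls))  # 3 2
```

With the library, the call would look like `divide_unique(divider.divide_name, names)`. The benefit depends entirely on how many duplicates the dataset contains.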

🏢 Typical Use Cases

Customer Data Processing - Clean and standardize name databases
Form Validation - Real-time name splitting in web applications
Analytics & Reports - Generate family name statistics
Data Migration - Convert legacy systems with combined name fields
Government & Municipal - Process citizen registration data
Security-sensitive Environments - Process names without sending data to external APIs
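
As an illustration of the analytics use case, family name frequencies can be tallied from divided results with the standard library. The sketch assumes dictionaries with a `family` key, as produced by `to_dict()`; the names are taken from the examples in this README, and `family_name_counts` is an illustrative helper.

```python
from collections import Counter
from typing import Iterable, Mapping


def family_name_counts(divided: Iterable[Mapping[str, str]]) -> Counter:
    """Count how often each family name occurs in the divided results."""
    return Counter(d["family"] for d in divided)


divided = [
    {"family": "菅", "given": "義偉"},
    {"family": "竈門", "given": "炭治郎"},
    {"family": "竈門", "given": "禰豆子"},
]

print(family_name_counts(divided).most_common())  # [('竈門', 2), ('菅', 1)]
```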

📚 Examples & Tutorials

🌐 Use REST API with minimal client samples - Integration examples (7 languages available in namedivider-rs)
⚡ Performance Optimization - Handle large datasets efficiently
🔧 Custom Rules Examples - Domain-specific configurations

📄 License

Source code and gbdt_model_v1.txt

MIT License

bert_katakana_v0_3_0.pt

CC BY-SA 4.0

family_name_repository.pickle

English

(1) Purpose of use

family_name_repository.pickle is available for commercial and non-commercial use if you use this software to divide names and to develop algorithms for dividing names.

Any other use of family_name_repository.pickle is prohibited.

(2) Liability

The author or copyright holder assumes no responsibility for the software.

Japanese / 日本語

(1) 利用目的

このソフトウェアを用いて姓名分割、および姓名分割アルゴリズムの開発をする場合、family_name_repository.pickleは商用/非商用問わず利用可能です。

それ以外の目的でのfamily_name_repository.pickleの利用を禁じます。

(2) 責任

作者または著作権者は、family_name_repository.pickleに関して一切の責任を負いません。

The family name data used in family_name_repository.pickle is provided by Myoji-Yurai.net (名字由来net).

🔗 Related Projects

⚡ namedivider-rs - High-performance Rust implementation
🧠 BERT Katakana Divider - Deep learning approach for katakana names


Made with ❤️ by @rskmoi • Contact @rskmoi
