Python 3.8+ PyPI version License: MIT
Modern face detection, recognition & analysis in 3 lines of code
VisionFace is a state-of-the-art, open-source framework for comprehensive face analysis, built with PyTorch. It provides a unified interface for face detection, recognition, landmark detection, and visualization with support for multiple cutting-edge models.
- Detect faces in images with 12+ models (YOLO, MediaPipe, MTCNN...)
- Recognize faces with vector search and embedding models
- Extract landmarks (68-point, 468-point face mesh)
- Batch process thousands of images efficiently
- Production-ready with Docker support and REST API
pip install visionface
The Face Detection module is your gateway to identifying faces in any image. Built for both beginners and experts, it provides a unified interface to 12+ cutting-edge detection models.
✨ Key Features:
- Multiple Input Sources: Image Files, URLs, PIL images, NumPy arrays
- Flexible Processing: Single-image or batch processing of thousands of images
- 12+ State-of-the-Art Models: From ultra-fast mobile models to high-precision detectors
- One-Line Detection: Get results with just detector.detect_faces(image)
- Rich Outputs: Bounding boxes, confidence scores, cropped faces ready to use
📝 Quick Example:
    import cv2
    from visionface import FaceDetection, FaceAnnotators

    # 1. Initialize detector
    detector = FaceDetection(detector_backbone="yolo-small")

    # 2. Detect faces
    image = cv2.imread("your_image.jpg")
    faces = detector.detect_faces(image)

    # 3. Visualize results
    result = FaceAnnotators.box_annotator(image, faces[0])
    cv2.imwrite("detected.jpg", result)
The Face Recognition module identifies individuals by generating embeddings and comparing them in a vector database. The process includes three stages: detecting faces, creating embeddings with the chosen model, and searching the database to find the closest matches.
✨ Key Features:
- Multi-model support: Choose from high-accuracy embedding backbones such as FaceNet-VGG, FaceNet-CASIA, and Dlib.
- Vector DB Integration: Store and query embeddings using Qdrant, Milvus, or local file-based storage.
- Scalable Search: Match a query against thousands or millions of enrolled faces using vector search.
- Flexible Enrollment: Add faces one-by-one or in batches with associated labels.
- Threshold & Ranking: Control similarity thresholds and retrieve top-k matches for robust recognition results.
    from visionface import FaceRecognition

    # 1. Setup recognition system
    fr = FaceRecognition(
        detector_backbone="yolo-small",
        embedding_backbone="FaceNet-VGG",
        db_backend="qdrant",
    )

    # 2. Add known faces
    fr.upsert_faces(
        images=["john.jpg", "jane.jpg", "bob.jpg"],
        labels=["John", "Jane", "Bob"],
        collection_name="employees",
    )

    # 3. Search for matches
    matches = fr.search_faces(
        "query_face_image.jpg",
        collection_name="employees",
        score_threshold=0.7,
        top_k=3,
    )

    for match in matches:
        print(f"Found: {match['face_name']} (confidence: {match['score']:.2f})")
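To make the roles of score_threshold and top_k concrete, here is a minimal brute-force sketch of thresholded top-k matching. It is not VisionFace's or Qdrant's implementation: the names cosine_similarity and search_gallery are hypothetical, cosine similarity is an assumed metric, and a real vector database would typically use an approximate index rather than scoring every entry.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_gallery(query, gallery, score_threshold=0.7, top_k=3):
    """Return up to top_k (label, score) pairs with score >= score_threshold.

    gallery is a list of (label, embedding) pairs (hypothetical layout).
    Results are sorted from best to worst match.
    """
    scored = [(label, cosine_similarity(query, emb)) for label, emb in gallery]
    kept = [item for item in scored if item[1] >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


# Toy 3-D vectors stand in for real 128-D or 512-D face embeddings.
gallery = [
    ("John", [1.0, 0.0, 0.0]),
    ("Jane", [0.0, 1.0, 0.0]),
    ("Bob", [0.7, 0.7, 0.0]),
]
matches = search_gallery([0.9, 0.1, 0.0], gallery, score_threshold=0.7, top_k=3)
for label, score in matches:
    print(f"Found: {label} (score: {score:.2f})")
```

Raising score_threshold trades missed matches for fewer false positives, while top_k only caps how many of the surviving candidates are returned.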
The Face Embeddings module transforms each detected face into a high-dimensional numeric vector (embedding) that captures its unique features.
These embeddings can be used for:
- Face verification: Check if two faces belong to the same person
- Recognition: Match against a database of known faces
- Clustering: Group similar faces automatically
- Advanced analytics
✨ Supported Embedding Models:
FaceNet-VGG, FaceNet-CASIA, Dlib
📝 Quick Example:
    from visionface import FaceEmbedder

    # 1. Initialize embedder
    embedder = FaceEmbedder(embedding_backbone="FaceNet-VGG")

    # 2. Generate embeddings for face images
    embeddings = embedder.embed_faces(
        face_imgs=["face1.jpg", "face2.jpg"],
        normalize_embeddings=True,  # L2 normalization
    )

    # 3. Use embeddings
    for i, embedding in enumerate(embeddings):
        print(f"Face {i+1} embedding shape: {embedding.shape}")  # (512,)
        # Use for: face verification, clustering, custom databases
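As an illustration of the face verification use case, the sketch below L2-normalizes two embeddings and compares them with a dot product, which equals cosine similarity for unit vectors. The helper names and the 0.7 threshold are assumptions for illustration only; a usable threshold depends on the embedding model and should be tuned on your own data.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def is_same_person(emb_a, emb_b, threshold=0.7):
    """Return (decision, similarity) for two embeddings.

    threshold is a hypothetical value, not a VisionFace default.
    """
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)
    # For unit-length vectors the dot product is the cosine similarity.
    similarity = sum(x * y for x, y in zip(a, b))
    return similarity >= threshold, similarity


# Toy 2-D vectors stand in for real embeddings.
same, sim_same = is_same_person([3.0, 4.0], [6.0, 8.0])    # same direction
diff, sim_diff = is_same_person([3.0, 4.0], [-4.0, 3.0])   # orthogonal
print(same, round(sim_same, 3))
print(diff, round(sim_diff, 3))
```

If the embeddings were already produced with normalize_embeddings=True, the normalization step here is redundant but harmless.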
The Landmarks module locates key facial features, from eye positions to lip contours, giving you detailed facial geometry for advanced applications.
✨ Key Features:
- Multiple Input Sources: Image Files, URLs, PIL images, NumPy arrays
- Flexible Processing: Single-image or batch processing of thousands of images
- 2D & 3D Support: Standard 2D points or full 3D face mesh
- Rich Annotations: Built-in visualization with customizable styling
- Multiple Backends: MediaPipe (468 points) or Dlib (68 points)
📝 Quick Example:
    import cv2
    from visionface import LandmarkDetection
    from visionface.annotators.landmark import MediaPipeFaceMeshAnnotator

    landmark_detector = LandmarkDetection(detector_backbone="mediapipe")
    image = cv2.imread("your_image.jpg")

    # Get 468 facial landmarks
    landmarks = landmark_detector.detect_3d_landmarks(image)

    # Visualize with connections
    visualizer = MediaPipeFaceMeshAnnotator(thickness=2, circle_radius=3)
    result = visualizer.annotate(
        image,
        landmarks[0],
        connections=True,
    )
    cv2.imwrite("detected_landmarks.jpg", result)
🎯 Real-time Face Detection
    import cv2
    from visionface import FaceDetection, FaceAnnotators

    detector = FaceDetection(detector_backbone="yolo-nano")  # Fastest model
    cap = cv2.VideoCapture(0)

    while True:
        ret, frame = cap.read()
        if not ret:  # Camera unavailable or stream ended
            break

        faces = detector.detect_faces(frame)
        annotated = FaceAnnotators.box_annotator(frame, faces)

        cv2.imshow('Face Detection', annotated)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()
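When comparing backbones for real-time use, it helps to measure the frame rate you actually get on your hardware. The following is a small sketch of a rolling FPS counter; FpsMeter is a hypothetical helper, not part of VisionFace, and the optional now argument exists only so the example can be run with fixed timestamps.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate.

        Returns 0.0 until at least two frames have been recorded.
        """
        self.timestamps.append(time.perf_counter() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed if elapsed > 0 else 0.0


# Deterministic demo with explicit timestamps (seconds).
meter = FpsMeter(window=3)
readings = [meter.tick(now=t) for t in (0.0, 0.5, 1.0, 1.25)]
print(readings)
```

In a capture loop you would call meter.tick() once per processed frame, after detection and annotation, and overlay or log the returned value.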
📊 Batch Processing
    import glob
    import os

    import cv2
    from visionface import FaceDetection

    detector = FaceDetection(detector_backbone="yolo-medium")

    # Process entire folder
    image_paths = glob.glob("photos/*.jpg")
    images = [cv2.imread(path) for path in image_paths]

    # Detect all faces at once
    all_detections = detector.detect_faces(images)

    # Save cropped faces (create the output folder first; cv2.imwrite will not)
    os.makedirs("faces", exist_ok=True)
    for i, detections in enumerate(all_detections):
        for j, face in enumerate(detections):
            if face.cropped_face is not None:
                cv2.imwrite(f"faces/image_{i}_face_{j}.jpg", face.cropped_face)
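Loading an entire folder into memory at once can become a problem with thousands of images. One common workaround, sketched here with a hypothetical chunked helper, is to split the path list into fixed-size batches and process one batch at a time. The batch size of 4 and the placeholder paths are illustrative only.

```python
from itertools import islice


def chunked(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Placeholder paths; in practice these would come from glob.glob(...).
image_paths = [f"photos/img_{i:03d}.jpg" for i in range(10)]
batches = list(chunked(image_paths, batch_size=4))
print([len(batch) for batch in batches])
```

Inside the loop over batches you would read each batch with cv2.imread and pass the resulting list to detector.detect_faces, so that only one batch of decoded images is held in memory at a time.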
🏢 Employee Recognition System
    import os

    from visionface import FaceRecognition

    # Initialize system
    fr = FaceRecognition(db_backend="qdrant")

    # Auto-enroll from employee photos folder
    def enroll_employees(folder_path):
        for filename in os.listdir(folder_path):
            if filename.endswith(('.jpg', '.png')):
                name = os.path.splitext(filename)[0]  # Use filename as name
                image_path = os.path.join(folder_path, filename)

                fr.upsert_faces(
                    images=[image_path],
                    labels=[name],
                    collection_name="company_employees",
                )
                print(f"Enrolled: {name}")

    # Enroll all employees
    enroll_employees("employee_photos/")

    # Check security camera feed
    def identify_person(camera_image):
        results = fr.search_faces(
            camera_image,
            collection_name="company_employees",
            score_threshold=0.8,
            top_k=1,
        )

        # Guard against an empty result list before indexing into it
        if results and results[0]:  # If match found
            return results[0][0]['face_name']
        return "Unknown person"
Choose the right model for your use case:
| Use Case | Speed | Accuracy | Recommended Model |
|---|---|---|---|
| 🚀 Real-time apps | ⚡⚡⚡ | ⭐⭐ | yolo-nano, mediapipe |
| 🎯 General purpose | ⚡⚡ | ⭐⭐⭐ | yolo-small (default) |
| 🔍 High accuracy | ⚡ | ⭐⭐⭐⭐ | yolo-large, mtcnn |
| 📱 Mobile/Edge | ⚡⚡⚡ | ⭐⭐ | mediapipe, yolo-nano |
| 🎭 Landmarks needed | ⚡⚡ | ⭐⭐⭐ | mediapipe, dlib |
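If you want to select a backbone programmatically, the table above can be encoded as a simple lookup. This is only a sketch: the use-case keys and the pick_model helper are hypothetical names, and the model strings are copied from the table rather than checked against the library. Note that the landmarks row lists landmark backends, not detection backbones.

```python
# Use-case keys are hypothetical; model names are taken from the table above.
RECOMMENDED_MODELS = {
    "real-time": ["yolo-nano", "mediapipe"],
    "general": ["yolo-small"],
    "high-accuracy": ["yolo-large", "mtcnn"],
    "mobile": ["mediapipe", "yolo-nano"],
    "landmarks": ["mediapipe", "dlib"],
}


def pick_model(use_case, default="yolo-small"):
    """Return the first recommended model for a use case, or the default."""
    return RECOMMENDED_MODELS.get(use_case, [default])[0]


print(pick_model("real-time"))
print(pick_model("something-else"))
```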
📋 Complete Model List
Detection Models:
- yolo-nano, yolo-small, yolo-medium, yolo-large
- yoloe-small, yoloe-medium, yoloe-large (prompt-based)
- yolow-small, yolow-medium, yolow-large, yolow-xlarge (open-vocabulary)
- mediapipe, mtcnn, opencv
Embedding Models:
- FaceNet-VGG (512D) - Balanced accuracy/speed
- FaceNet-CASIA (512D) - High precision
- Dlib (128D) - Lightweight
Landmark Models:
- mediapipe - 468 points + 3D mesh
- dlib - 68 points, robust
We welcome contributions! See our Contributing Guide.
Quick ways to help:
- ⭐ Star the repo
- 🐛 Report bugs
- 💡 Request features
- 📝 Improve docs
- 🔧 Submit PRs
MIT License - see LICENSE file.
    @software{VisionFace2025,
      title = {VisionFace: Modern Face Detection & Recognition Framework},
      author = {VisionFace Team},
      year = {2025},
      url = {https://github.com/miladfa7/visionface}
    }
Made with ❤️ by the VisionFace team