EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
Updated Nov 27, 2024 · Python
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
New generation of CLIP with fine-grained discrimination capability, ICML 2025
The back end of a cross-modal retrieval system, which will contain services such as semantic location, etc.
Toward Universal Multimodal Embedding
PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k
[ACMMM'25] Referring Expression Instance Retrieval and A Strong End-to-End Baseline
Image-Centered Pseudo Label Generation for Weakly Supervised Text-based Person Re-Identification, PRCV 2024
The LLM-Powered Video Search System is an advanced multimodal video search solution that leverages Large Language Models (LLMs) to enhance video retrieval through text, image, and metadata queries.
PIMA - A Novel Approach for Pill-Prescription Matching with GNN Assistance and Contrastive Learning
A search engine built on the OpenAI CLIP model that retrieves images matching textual queries.
VisAlign: Aligning Visual Representations with Textual Semantics for Image Similarity and Retrieval
Semantic image search engine powered by OpenAI's CLIP and ChromaDB. Search your image collection using natural language queries with CLI, REST API, and web interface.
PICTOPEDIA – An interactive word-search chatbot powered by the Wikipedia API. Search terms, get instant info, and chat in real time.
Digimon Dataset for MultiModal Machine Learning