Monday, May 15, 2023
Optimizing FastAPI for Concurrent Users when Running Hugging Face ML Models
To serve multiple concurrent users accessing FastAPI endpoint running Hugging Face API, you must start the FastAPI app with several workers. It will ensure current user requests will not be blocked if another request is already running. I show and describe it in this video.
[フレーム]
Labels:
FastAPI,
Hugging Face,
Python
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment