A modern, state-based WhatsApp bot library with OpenAI GPT integration, built on top of whatsapp-chatbot-python and GREEN-API.
- OpenAI GPT model integration for intelligent responses
- Support for multiple GPT models (GPT-3.5, GPT-4, GPT-4o, o1)
- Multimodal capabilities with image processing support
- Voice message transcription using Whisper API
- Comprehensive message handling for various WhatsApp media types
- Middleware architecture for customizing message and response processing
- Built-in conversation history management
- State-based conversation flow inherited from base library
- Python type hints and comprehensive docstrings
```bash
pip install whatsapp-chatgpt-python
```
The dependencies (openai, whatsapp-chatbot-python, and requests) will be installed automatically.
Before using the bot, you'll need:
- A registered account with GREEN-API
- An instance ID and API token from your GREEN-API account
- An OpenAI API key for GPT access
```python
from whatsapp_chatgpt_python import WhatsappGptBot

# Initialize the bot
bot = WhatsappGptBot(
    id_instance="your-instance-id",
    api_token_instance="your-api-token",
    openai_api_key="your-openai-api-key",
    model="gpt-4o",
    system_message="You are a helpful assistant."
)

# Start the bot
bot.run_forever()
```
The bot inherits all configuration options from the base GreenAPIBot, including the ability to customize GREEN-API instance settings.
Complete configuration options for the WhatsappGptBot:
```python
bot = WhatsappGptBot(
    # Required parameters
    id_instance="your-instance-id",
    api_token_instance="your-api-token",
    openai_api_key="your-openai-api-key",

    # Optional GPT-specific parameters
    model="gpt-4o",  # Default model
    max_history_length=10,  # Maximum messages in conversation history
    system_message="You are a helpful assistant.",  # System message to set behavior
    temperature=0.5,  # Temperature for response generation
    error_message="Sorry, I couldn't process your request. Please try again.",
    session_timeout=1800,  # Session timeout in seconds (30 minutes)

    # Optional parameters from the base bot
    bot_debug_mode=False,  # Enable debug logs
    debug_mode=False,  # Enable API debug mode
    raise_errors=True,  # Whether to raise API errors
    settings={  # GREEN-API instance settings
        "webhookUrl": "",  # Custom webhook URL
        "webhookUrlToken": "",  # Webhook security token
        "delaySendMessagesMilliseconds": 500,  # Delay between messages
        "markIncomingMessagesReaded": "yes",  # Mark messages as read
        "incomingWebhook": "yes",  # Enable incoming webhooks
        "keepOnlineStatus": "yes",  # Keep WhatsApp online status
        "pollMessageWebhook": "yes",  # Enable poll message webhooks
    },
)
```
Main class for creating and managing your OpenAI-powered WhatsApp bot:
```python
from whatsapp_chatgpt_python import WhatsappGptBot

bot = WhatsappGptBot(
    id_instance="your-instance-id",
    api_token_instance="your-api-token",
    openai_api_key="your-openai-api-key",
    model="gpt-4o",
    system_message="You are a helpful assistant specializing in customer support.",
    max_history_length=15,
    temperature=0.7,
    settings={
        "webhookUrl": "",
        "markIncomingMessagesReaded": "yes",
        "keepOnlineStatus": "yes",
        "delaySendMessagesMilliseconds": 500,
    },
)

# Start the bot
bot.run_forever()
```
The bot automatically handles different types of WhatsApp messages and converts them into a format understood by OpenAI's models.
- Text: Regular text messages
- Image: Photos with optional captions (supported in vision-capable models)
- Audio: Voice messages with automatic transcription
- Video: Video messages with captions
- Document: File attachments
- Poll: Poll messages and poll updates
- Location: Location sharing
- Contact: Contact sharing
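As a rough illustration of the conversion step, an image message with a caption is typically mapped to OpenAI's multimodal content format. The sketch below is illustrative only: the `build_image_content` helper and the URL are hypothetical, and the real mapping is performed internally by the library's message handlers.

```python
def build_image_content(image_url, caption=None):
    """Hypothetical helper: map a WhatsApp image message to
    OpenAI's multimodal content format (a list of content parts)."""
    return [
        # The caption (or a default instruction) becomes the text part
        {"type": "text", "text": caption or "Analyzing this image"},
        # The image itself is referenced by URL
        {"type": "image_url", "image_url": {"url": image_url}},
    ]

content = build_image_content("https://example.com/photo.jpg", "What is this?")
# content[0] carries the caption text, content[1] the image reference
```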
The bot uses a registry of message handlers to process different message types:
```python
from whatsapp_chatgpt_python import MessageHandler, TextMessageHandler

# Access the registry
registry = bot.message_handlers


# Create a custom message handler
class CustomMessageHandler(MessageHandler):
    def can_handle(self, notification):
        return notification.get_message_type() == "custom-type"

    async def process_message(self, notification, openai_client=None, model=None):
        # Process the message
        return "Processed content"


# Register the custom handler
bot.register_message_handler(CustomMessageHandler())

# Replace an existing handler
bot.replace_handler(TextMessageHandler, CustomTextHandler())
```
The middleware system lets you customize message processing before content is sent to GPT and response processing before replies are sent back to the user.
```python
# Process messages before sending to GPT
def custom_message_middleware(notification, message_content, messages, session_data):
    # Add custom context to the conversation
    if notification.get_message_type() == "textMessage" and notification.chat.endswith("@c.us"):
        # Add context to the message
        enhanced_content = f"[User message] {message_content}"
        return {
            "message_content": enhanced_content,
            "messages": messages
        }
    return {
        "message_content": message_content,
        "messages": messages
    }


# Add the middleware
bot.add_message_middleware(custom_message_middleware)
```
```python
# Process GPT responses before sending to the user
def custom_response_middleware(response, messages, session_data):
    # Format or modify the response
    formatted_response = response.replace("GPT", "Assistant").strip()

    # You can also modify the messages that will be saved in history
    return {
        "response": formatted_response,
        "messages": messages
    }


# Add the middleware
bot.add_response_middleware(custom_response_middleware)
```
The GPT bot extends the base session data with conversation-specific information:
```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GPTSessionData:
    """Session data for GPT conversations"""
    messages: List[Dict[str, Any]] = field(default_factory=list)
    last_activity: int = field(default_factory=lambda: int(time.time()))
    user_data: Dict[str, Any] = field(default_factory=dict)
    context: Dict[str, Any] = field(default_factory=dict)
```
You can access and modify this data in your middleware:
```python
import time


def message_middleware(notification, content, messages, session_data):
    # Set context variables
    if "variables" not in session_data.context:
        session_data.context["variables"] = {}

    session_data.context["variables"]["last_interaction"] = int(time.time())
    return {"message_content": content, "messages": messages}
```
The library provides several utility functions for common tasks:
```python
import os

from openai import OpenAI
from whatsapp_chatgpt_python import Utils

openai_client = OpenAI(api_key="your-openai-api-key")

# Download media from a URL
temp_file = await Utils.download_media("https://example.com/image.jpg")

# Transcribe audio
transcript = await Utils.transcribe_audio("/path/to/audio.ogg", openai_client)

# Clean up after processing
os.unlink(temp_file)
```
```python
from whatsapp_chatgpt_python import Utils

# Trim conversation history
trimmed_messages = Utils.trim_conversation_history(
    messages,
    10,    # max messages
    True   # preserve the system message
)

# Estimate token usage
estimated_tokens = Utils.estimate_tokens(messages)
```
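Conceptually, history trimming keeps the system message (when requested) plus the most recent messages. The following is a simplified sketch of that behavior, not the library's actual implementation:

```python
def trim_history(messages, max_messages, preserve_system=True):
    """Simplified sketch: keep the system message plus the most recent entries."""
    system = [m for m in messages if m.get("role") == "system"] if preserve_system else []
    rest = [m for m in messages if m.get("role") != "system"]
    keep = max_messages - len(system)
    # Take only the newest `keep` non-system messages
    return system + rest[-keep:] if keep > 0 else system

history = [{"role": "system", "content": "Be helpful."}] + [
    {"role": "user", "content": f"msg {i}"} for i in range(10)
]
trimmed = trim_history(history, 4)
# keeps the system message plus the 3 most recent user messages
```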
The library supports a variety of OpenAI models:
- gpt-4
- gpt-4-turbo
- gpt-4-turbo-preview
- gpt-4-1106-preview
- gpt-4-0125-preview
- gpt-4-32k
- gpt-4o (default)
- gpt-4o-mini
- gpt-4o-2024-05-13
- gpt-3.5-turbo
- gpt-3.5-turbo-16k
- gpt-3.5-turbo-1106
- gpt-3.5-turbo-0125
- o1
- o1-mini
- o1-preview
The following models can process images:
- gpt-4o
- gpt-4o-mini
- gpt-4-vision-preview
- gpt-4-turbo
- gpt-4-turbo-preview
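The check above can be expressed as a simple lookup; the `is_vision_model` helper below is illustrative only (in practice the library exposes this via `bot.supports_images()`):

```python
# Vision-capable base models, per the list above
VISION_MODELS = {
    "gpt-4o", "gpt-4o-mini", "gpt-4-vision-preview",
    "gpt-4-turbo", "gpt-4-turbo-preview",
}

def is_vision_model(model):
    """Illustrative check: exact match, or a dated snapshot of a vision model."""
    return model in VISION_MODELS or any(
        model.startswith(base + "-") for base in VISION_MODELS
    )

print(is_vision_model("gpt-4o"))         # True
print(is_vision_model("gpt-3.5-turbo"))  # False
```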
Since the library is built on whatsapp-chatbot-python, you can use all the command/filter features of the base library:
```python
@bot.router.message(command="help")
def help_handler(notification):
    help_text = (
        "🤖 *WhatsApp GPT Bot* 🤖\n\n"
        "Available commands:\n"
        "• */help* - Show this help message\n"
        "• */clear* - Clear conversation history\n"
        "• */info* - Show bot information"
    )
    notification.answer(help_text)


# Clear conversation history command
@bot.router.message(command="clear")
def clear_history_handler(notification):
    chat_id = notification.chat

    # Get session data
    session_data = bot.get_session_data(chat_id)

    # Find the system message if it exists
    system_message = None
    for msg in session_data.messages:
        if msg.get("role") == "system":
            system_message = msg
            break

    # Reset messages but keep the system message
    if system_message:
        session_data.messages = [system_message]
    else:
        session_data.messages = []

    # Update the session
    bot.update_session_data(chat_id, session_data)
    notification.answer("🗑️ Conversation history cleared! Let's start fresh.")
```
You can create handlers that don't process with GPT:
```python
@bot.router.message(command="weather")
def weather_handler(notification):
    notification.answer(
        "🌤️ This is a placeholder weather response from a custom handler.\n\n"
        "In a real bot, this would fetch actual weather data from an API.\n\n"
        "This handler demonstrates skipping GPT processing."
    )
```
You can explicitly request GPT processing after handling a message:
```python
@bot.router.message(text_message="recommend")
def recommend_handler(notification):
    # Add a prefix message
    notification.answer("I'll give you a recommendation. Let me think...")

    # Request GPT processing as well
    notification.process_with_gpt()
```
You can also modify the message before sending it to GPT using the custom_message parameter:
```python
# Echo handler that forwards a modified message to GPT
@bot.router.message(command="echo")
def echo_handler(notification):
    # Get the rest of the message after the command
    message_text = notification.message_text
    command_parts = message_text.split(maxsplit=1)

    if len(command_parts) > 1:
        echo_text = command_parts[1]
        notification.answer(f"You said: {echo_text}\n\nI'll ask GPT for more insights...")

        # Process with GPT, but pass only the actual message (without the command)
        notification.process_with_gpt(custom_message=echo_text)
    else:
        notification.answer("Please provide text after the /echo command.")
```
This is useful when you want to preprocess the message before it's sent to GPT, such as removing command prefixes, formatting the input, or adding context.
```python
# Check whether the current model supports images
if bot.supports_images():
    # Handle an image-based workflow
    pass
```
Here's a complete example of a WhatsApp GPT bot with custom handlers and middleware:
```python
import os
import time
import logging

from dotenv import load_dotenv
from whatsapp_chatgpt_python import (
    WhatsappGptBot,
    ImageMessageHandler,
    TextMessageHandler,
)

# Load environment variables from .env file
load_dotenv()

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('whatsapp_chatgpt_python')

# Get environment variables
ID_INSTANCE = os.environ.get("INSTANCE_ID")
API_TOKEN = os.environ.get("INSTANCE_TOKEN")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

# Initialize the bot
bot = WhatsappGptBot(
    id_instance=ID_INSTANCE,
    api_token_instance=API_TOKEN,
    openai_api_key=OPENAI_API_KEY,
    model="gpt-4o",  # Uses the GPT-4o model (which supports images)
    system_message="You are a helpful assistant. Be concise and friendly in your replies.",
    max_history_length=15,  # Keep the last 15 messages in conversation history
    temperature=0.7,  # Slightly more creative responses
    session_timeout=1800,  # Sessions expire after 30 minutes of inactivity
    error_message="Sorry, your message could not be processed.",  # Custom error message
)


# Custom image handler that provides enhanced instructions for images
class EnhancedImageHandler(ImageMessageHandler):
    async def process_message(self, notification, openai_client=None, model=None):
        # Call the parent class method to get the base result
        result = await super().process_message(notification, openai_client, model)

        # For text-only responses (non-vision models)
        if isinstance(result, str):
            return result.replace(
                "[The user sent an image",
                "[The user sent an image. Analyze what might be in it based on any caption"
            )

        # For vision-capable models, enhance the text instruction
        if isinstance(result, list) and len(result) > 0 and isinstance(result[0], dict):
            if result[0].get('type') == 'text':
                text = result[0].get('text', '')
                if text == "Analyzing this image":
                    result[0]['text'] = "Describe this image in detail and what you see in it."

        return result


# Example of a custom text handler
class EnhancedTextHandler(TextMessageHandler):
    async def process_message(self, notification, *args, **kwargs):
        # Get the text message using the parent method
        text = await super().process_message(notification, *args, **kwargs)
        if not text:
            return text

        lower_text = text.lower()
        if any(term in lower_text for term in ['code', 'function', 'script', 'program']):
            return (
                f"🧑‍💻 CODE REQUEST: {text}\n\n"
                "[I'll format my code response with proper syntax highlighting]"
            )
        elif text.endswith('?') or lower_text.startswith(
                ('what', 'why', 'how', 'when', 'where', 'who', 'can', 'could')):
            return f"❓ QUESTION: {text}\n\n[I'll provide a clear and comprehensive answer]"

        return text


# Replace the default handlers with our enhanced versions
bot.replace_handler(ImageMessageHandler, EnhancedImageHandler())
bot.replace_handler(TextMessageHandler, EnhancedTextHandler())


# Middleware to log all messages and add tracking information
def logging_middleware(notification, message_content, messages, session_data):
    user_id = notification.sender
    if isinstance(message_content, str):
        content_display = (
            message_content[:100] + "..." if len(message_content) > 100 else message_content
        )
    else:
        content_display = "complex content (likely contains media)"

    logger.info(f"Message from {user_id}: {content_display}")

    # Add tracking information to session context
    if not session_data.context.get("variables"):
        session_data.context["variables"] = {}
    session_data.context["variables"].update({
        "last_interaction": int(time.time()),
        "message_count": session_data.context.get("variables", {}).get("message_count", 0) + 1
    })

    # Return unchanged content and messages
    return {"message_content": message_content, "messages": messages}


# Middleware to format responses before sending to the user
def formatting_middleware(response, messages, session_data):
    # Format the response by adding a signature at the end of longer messages
    formatted_response = response.strip()

    # Don't add a signature to short responses
    if len(formatted_response) > 100 and not formatted_response.endswith("_"):
        message_count = session_data.context.get("variables", {}).get("message_count", 0)
        formatted_response += f"\n\n_Message #{message_count} • Powered by GPT_"

    return {"response": formatted_response, "messages": messages}


# Add the middleware
bot.add_message_middleware(logging_middleware)
bot.add_response_middleware(formatting_middleware)


# Command handler for /clear to reset conversation history
@bot.router.message(command="clear")
def clear_history_handler(notification):
    chat_id = notification.chat

    # Get session data
    session_data = bot.get_session_data(chat_id)

    # Find the system message if it exists
    system_message = None
    for msg in session_data.messages:
        if msg.get("role") == "system":
            system_message = msg
            break

    # Reset messages but keep the system message
    if system_message:
        session_data.messages = [system_message]
    else:
        session_data.messages = []

    # Update the session
    bot.update_session_data(chat_id, session_data)
    notification.answer("🗑️ Conversation history cleared! Let's start fresh.")


# Command handler for /help to show available commands
@bot.router.message(command="help")
def help_handler(notification):
    help_text = (
        "🤖 *WhatsApp GPT Bot* 🤖\n\n"
        "Available commands:\n"
        "• */help* - Show this help message\n"
        "• */clear* - Clear conversation history\n"
        "• */info* - Show bot information\n"
        "• */weather* - Example of a handler that skips GPT\n\n"
        "You can send text, images, audio, and more. I'll respond intelligently to your messages."
    )
    notification.answer(help_text)


# Add an info command
@bot.router.message(command="info")
def info_handler(notification):
    chat_id = notification.chat
    session_data = bot.get_session_data(chat_id)

    # Get session statistics
    message_count = len(session_data.messages) - 1  # Subtract the system message
    if message_count < 0:
        message_count = 0
    vision_capable = "Yes" if bot.supports_images() else "No"

    info_text = (
        "📊 *Bot Information* 📊\n\n"
        f"Model: {bot.get_model()}\n"
        f"Vision capable: {vision_capable}\n"
        f"Messages in current session: {message_count}\n"
        f"Max history length: {bot.max_history_length}\n"
        f"Session timeout: {bot.session_timeout} seconds\n\n"
        "To clear the current conversation, use */clear*"
    )
    notification.answer(info_text)


# Example weather handler that skips GPT processing
@bot.router.message(command="weather")
def weather_handler(notification):
    notification.answer(
        "🌤️ This is a placeholder weather response from a custom handler.\n\n"
        "In a real bot, this would fetch actual weather data from an API.\n\n"
        "This handler demonstrates skipping GPT processing."
    )


# Start the bot
if __name__ == "__main__":
    logger.info("Starting WhatsApp GPT Bot...")
    logger.info(f"Using model: {bot.get_model()}")
    bot.run_forever()
```
MIT
This library is built on top of:
- whatsapp-chatbot-python - The base WhatsApp bot library
- GREEN-API - WhatsApp API service for bot integration
- OpenAI API - For GPT model integration