Spooky Smart AI That Designs Your Halloween Look

gemini-2.5-flash: This was the primary model for all text and structured data generation. I used it with a strictly defined responseSchema to ensure the AI's output was always in a predictable JSON format. This model was responsible for:

Generating the costume's name, description, materials, and detailed text instructions.

Powering the search feature by creating five distinct costume concepts from a single prompt.

Handling the conversational "Refine" feature, where it would modify a costume based on follow-up user input.
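To illustrate the structured-output idea, here is a minimal sketch of what such a schema and a defensive check on the model's reply might look like. The field names mirror the list above, but the exact schema used in the app is not shown in this post, so `COSTUME_SCHEMA`, `validate`, and `parse_costume` are all hypothetical names, and the uppercase type strings follow the OpenAPI-style subset that `responseSchema` accepts as I understand it.

```python
import json

# Hypothetical schema (assumption): the app's real schema is not shown in the post.
COSTUME_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "name": {"type": "STRING"},
        "description": {"type": "STRING"},
        "materials": {"type": "ARRAY", "items": {"type": "STRING"}},
        "instructions": {"type": "ARRAY", "items": {"type": "STRING"}},
    },
    "required": ["name", "description", "materials", "instructions"],
}

# Maps the schema's type names to the Python types json.loads produces.
_PY_TYPES = {"STRING": str, "ARRAY": list, "OBJECT": dict}


def validate(value, schema, path="$"):
    """Return a list of problems; an empty list means the value matches."""
    if not isinstance(value, _PY_TYPES[schema["type"]]):
        return [f"{path}: expected {schema['type']}"]
    problems = []
    if schema["type"] == "OBJECT":
        for key in schema.get("required", []):
            if key not in value:
                problems.append(f"{path}.{key}: missing")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                problems += validate(value[key], sub, f"{path}.{key}")
    elif schema["type"] == "ARRAY":
        for i, item in enumerate(value):
            problems += validate(item, schema["items"], f"{path}[{i}]")
    return problems


def parse_costume(raw_json: str) -> dict:
    """Parse the model's JSON text and reject anything off-schema."""
    costume = json.loads(raw_json)
    problems = validate(costume, COSTUME_SCHEMA)
    if problems:
        raise ValueError("; ".join(problems))
    return costume
```

Even with a schema enforced on the model side, a client-side check like this is a cheap safeguard before the UI renders the result.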

imagen-4.0-generate-001: This image generation model created the first image for each set of instructions, establishing the visual foundation for the rest of the step-by-step guide.

gemini-2.5-flash-image-preview: This image editing model powers the app's most distinctive feature. It generates every subsequent instruction image by taking the previous step's image as input and adding the details described in the current step's text.

Multimodal Features

The app is built around two core multimodal functionalities that create a rich and intuitive user experience.

Vision Understanding: Image to Costume Idea

Users can upload an image and receive a costume idea based on it, which gives the model visual context that a text prompt alone cannot. A user can upload a picture of their pet, a favorite object, or a friend, and the AI interprets that image to generate a personalized and often unexpected costume concept. This makes brainstorming more personal and engaging.
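As a rough sketch of how an uploaded image and a text instruction can travel together in one request, the helper below builds a request body in the shape the Gemini REST `generateContent` endpoint documents for inline images, as far as I understand it (base64 data plus a MIME type next to a text part). The function name and the prompt wording are hypothetical, and no network call is made here.

```python
import base64


def build_image_prompt_request(image_bytes: bytes, mime_type: str, instruction: str) -> dict:
    """Combine an uploaded image and a text instruction into one request body."""
    return {
        "contents": [
            {
                "parts": [
                    # The image is sent inline as base64 text with its MIME type.
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                    # The text part tells the model what to do with the image.
                    {"text": instruction},
                ]
            }
        ]
    }


# Hypothetical usage: in the app, the bytes would come from the user's upload.
request_body = build_image_prompt_request(
    b"\x89PNG...", "image/png", "Suggest a Halloween costume inspired by this picture."
)
```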

Additive Image Generation: A Cohesive Visual Guide

The app's standout feature is its ability to create a set of instruction images that build upon one another. Instead of generating a new, disconnected image for each step, the system uses an iterative, multimodal process:

Step 1: Generate a base image from a text prompt.
Step 2+: Feed the image from the previous step plus the text for the current step into the image editing model (gemini-2.5-flash-image-preview).

[Image: Image steps]

This creates a coherent visual narrative, letting the user watch the costume come together from one image to the next. The instructions become far easier to follow than a series of isolated diagrams, and the app goes from a simple idea generator to a true step-by-step visual crafting guide.
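The additive process (Step 1, then Step 2+) can be sketched as a simple loop. This is only an illustration of the chaining logic, not the app's actual code: `generate_base` and `edit_image` are hypothetical callables standing in for the calls to the image generation model and the image editing model, and the stub functions at the bottom exist only to show how each image feeds into the next.

```python
from typing import Callable, List

# Hypothetical stand-ins for the two model calls (assumptions, not real API names).
GenerateFn = Callable[[str], bytes]       # text prompt -> image bytes
EditFn = Callable[[bytes, str], bytes]    # previous image + step text -> new image


def build_step_images(steps: List[str], generate_base: GenerateFn, edit_image: EditFn) -> List[bytes]:
    """Create one image per instruction step, each built on the previous one."""
    if not steps:
        return []
    # Step 1: generate the base image from the first step's text alone.
    images = [generate_base(steps[0])]
    # Step 2+: edit the previous image using the current step's text.
    for text in steps[1:]:
        images.append(edit_image(images[-1], text))
    return images


# Stubs that make the chaining visible without calling any model.
def fake_generate(prompt: str) -> bytes:
    return f"base({prompt})".encode()


def fake_edit(previous: bytes, text: str) -> bytes:
    return previous + f"+{text}".encode()


demo = build_step_images(["cape", "mask", "boots"], fake_generate, fake_edit)
```

Because every call depends on the previous result, the steps must run sequentially. That is the trade-off for visual consistency between images.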

Top comments (2)

graham

• Sep 19 '25

This is really creative. Love how it goes from idea to full DIY guide with step-by-step visuals. The additive image feature is a genius touch!

Umer Salman

• Oct 29 '25

Wow, this is really neat!! I could see this being super useful for other instruction-type things, like turning crochet pattern instructions into images or furniture assembly instructions. Thank you for sharing!