Building an Interactive AI Co-Drawing Tool
A comprehensive guide to AI-powered collaborative drawing and the implementation behind this interactive tool
1. Project Overview
The AI Co-Drawing application is an interactive collaborative drawing tool that combines human creativity with artificial intelligence. Built with React and powered by Google's Gemini 2.0 Flash model, it allows users to create drawings and have them enhanced, transformed, or evolved through AI assistance.
The Core Insight
AI Co-Drawing bridges the gap between human creativity and machine intelligence. By letting users draw and then enhance their creations with AI, it demonstrates how human and machine can collaborate in real time to create something neither could achieve alone.
Core Features
- Interactive Canvas: High-resolution drawing surface with touch and mouse support
- AI Enhancement: Transform drawings using natural language prompts
- Iterative Creation: Build upon AI-generated images with additional drawing
- Real-time Processing: Live drawing with immediate visual feedback
- Multi-modal AI: Combines image understanding with text and image generation capabilities
Workflow
- User draws on the canvas using customizable brush tools
- User provides a text prompt describing desired enhancements
- AI analyzes the drawing and prompt to generate an enhanced version
- Generated image becomes the new canvas background
- User can continue drawing on top of the AI-generated content
- Process repeats for iterative collaborative creation
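The loop above can be sketched as a small state model. This is only an illustration of the workflow, not the application's actual code: the names `CanvasState`, `WorkflowEvent`, and `step` are hypothetical, and the generated image is represented by a plain string.

```typescript
// Hypothetical model of the draw -> enhance -> draw loop (illustration only).
interface CanvasState {
  background: string | null; // most recent AI-generated image, if any
  hasStrokes: boolean;       // strokes drawn since the last generation
  rounds: number;            // completed draw-and-enhance cycles
}

type WorkflowEvent =
  | { type: "stroke" }                    // the user drew on the canvas
  | { type: "generated"; image: string }; // the AI returned an image

const initialState: CanvasState = { background: null, hasStrokes: false, rounds: 0 };

function step(state: CanvasState, event: WorkflowEvent): CanvasState {
  switch (event.type) {
    case "stroke":
      return { ...state, hasStrokes: true };
    case "generated":
      // The generated image replaces the canvas content and becomes the
      // background for the next round of drawing.
      return { background: event.image, hasStrokes: false, rounds: state.rounds + 1 };
  }
}

// Two full rounds: draw, enhance, draw again, enhance again.
const events: WorkflowEvent[] = [
  { type: "stroke" },
  { type: "generated", image: "img-1" },
  { type: "stroke" },
  { type: "generated", image: "img-2" },
];
const finalState = events.reduce(step, initialState);
console.log(finalState);
```

Modelling the loop this way makes the key property explicit: each generation folds the current strokes into a new background, so later rounds always build on the latest image.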
2. AI Integration
Gemini 2.0 Flash Model
The application uses Google's Gemini 2.0 Flash model with native image generation capabilities. This model can understand both visual input (drawings) and textual instructions to create coherent, contextual enhancements.
Model Configuration
- Model: gemini-2.0-flash-preview-image-generation
- Input Modalities: Text + Image
- Output Modalities: Text + Image
- Max Resolution: 1024x1024 pixels
Multi-modal Processing
The AI system processes two types of input simultaneously:
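```typescript
// Assumes the @google/genai SDK, whose API this call matches.
import { Modality, type Content } from "@google/genai";

// `ai` is the SDK client, `drawingData` the base64-encoded PNG of the canvas,
// and `prompt` the user's text; all three are defined elsewhere.
let contents: Content[] = [
  {
    role: "USER",
    parts: [{ inlineData: { data: drawingData, mimeType: "image/png" } }],
  },
  {
    role: "USER",
    parts: [{ text: `${prompt}. Keep the same minimal line doodle style.` }],
  },
];

// Ask for both text and image output so the model can return a drawing.
const response = await ai.models.generateContent({
  model: "gemini-2.0-flash-preview-image-generation",
  contents,
  config: {
    responseModalities: [Modality.TEXT, Modality.IMAGE],
  },
});
```
Style Preservation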
To maintain visual consistency, every prompt ends with the instruction "Keep the same minimal line doodle style", so that enhancements feel like natural extensions of the original drawing rather than complete replacements.
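Two small pieces surround the request: building the styled prompt and pulling the generated image out of the response. The following is a minimal sketch under stated assumptions, not the application's own code. `withStyleSuffix`, `firstImageDataUrl`, and the `PartLike` shape are hypothetical names introduced here; `PartLike` is a reduced stand-in for the response part type, which carries more fields in the real SDK.

```typescript
// Reduced stand-in for a response part (assumption: only these fields matter here).
interface PartLike {
  text?: string;
  inlineData?: { data?: string; mimeType?: string };
}

const STYLE_SUFFIX = "Keep the same minimal line doodle style.";

// Append the style instruction in the same way the request does.
function withStyleSuffix(prompt: string): string {
  return `${prompt}. ${STYLE_SUFFIX}`;
}

// Return the first image part as a data URL, or null if the reply is text only.
function firstImageDataUrl(parts: PartLike[]): string | null {
  for (const part of parts) {
    const data = part.inlineData?.data;
    if (data) {
      const mimeType = part.inlineData?.mimeType ?? "image/png";
      return `data:${mimeType};base64,${data}`;
    }
  }
  return null;
}

// Example with a hand-written response: one text part, one image part.
const sampleParts: PartLike[] = [
  { text: "Here is your drawing." },
  { inlineData: { data: "QUJD", mimeType: "image/png" } },
];
console.log(withStyleSuffix("Add a sun"));
console.log(firstImageDataUrl(sampleParts));
```

In the application, the parts would presumably come from something like `response.candidates?.[0]?.content?.parts ?? []`, and the resulting data URL is the kind of string that can be stored in state and loaded into an image element.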
3. Canvas Drawing System
Canvas Architecture
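The drawing system uses HTML5 Canvas and paints it in layers: a white fill first, then any AI-generated background image, then the user's strokes on top. Initialization sets the internal resolution and paints the white base.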
```typescript
const initializeCanvas = () => {
  const canvas = canvasRef.current;
  if (!canvas) return;
  const ctx = canvas.getContext("2d");
  if (!ctx) return;

  // Set high-resolution canvas size
  canvas.width = 960;
  canvas.height = 540;

  // Fill canvas with white background
  ctx.fillStyle = "#FFFFFF";
  ctx.fillRect(0, 0, canvas.width, canvas.height);
};
```
Coordinate System
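The canvas's internal resolution (960x540) usually differs from the size at which it is displayed, so pointer positions are converted from viewport coordinates to canvas coordinates by scaling with the ratio between the two: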
```typescript
const getCoordinates = (e: React.MouseEvent | React.TouchEvent) => {
  const canvas = canvasRef.current;
  if (!canvas) return { x: 0, y: 0 };

  // Ratio between the internal resolution and the displayed size
  const rect = canvas.getBoundingClientRect();
  const scaleX = canvas.width / rect.width;
  const scaleY = canvas.height / rect.height;

  // Touch events carry their position on the first touch point
  let clientX, clientY;
  if ("touches" in e) {
    clientX = e.touches[0]?.clientX || 0;
    clientY = e.touches[0]?.clientY || 0;
  } else {
    clientX = e.clientX;
    clientY = e.clientY;
  }

  return {
    x: (clientX - rect.left) * scaleX,
    y: (clientY - rect.top) * scaleY,
  };
};
```
Drawing Engine
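Every point the drawing engine plots passes through the coordinate mapping shown above. As a sanity check on that arithmetic, here is a DOM-free sketch of the same formula with a worked example. `toCanvasPoint` and `RectLike` are hypothetical names used only for this illustration, and the display size and page offset are made-up values.

```typescript
// Stand-in for the DOMRect fields the mapping uses (hypothetical type).
interface RectLike {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Same formula as getCoordinates, with the DOM lookups passed in as arguments.
function toCanvasPoint(
  clientX: number,
  clientY: number,
  rect: RectLike,
  canvasWidth: number,
  canvasHeight: number,
): { x: number; y: number } {
  const scaleX = canvasWidth / rect.width;
  const scaleY = canvasHeight / rect.height;
  return {
    x: (clientX - rect.left) * scaleX,
    y: (clientY - rect.top) * scaleY,
  };
}

// A 960x540 canvas displayed at 480x270, offset by (20, 100) in the viewport.
const rect: RectLike = { left: 20, top: 100, width: 480, height: 270 };

// A pointer at viewport position (260, 235):
//   x = (260 - 20) * (960 / 480) = 480
//   y = (235 - 100) * (540 / 270) = 270
const p = toCanvasPoint(260, 235, rect, 960, 540);
console.log(p); // { x: 480, y: 270 }, the centre of the canvas
```

Because `scaleX` and `scaleY` are computed independently, the mapping still holds when the displayed box does not share the canvas's 16:9 aspect ratio.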
The drawing engine supports smooth line rendering with configurable brush properties:
```typescript
const draw = (e: React.MouseEvent | React.TouchEvent) => {
  if (!isDrawing) return;

  const canvas = canvasRef.current;
  if (!canvas) return;
  const ctx = canvas.getContext("2d");
  if (!ctx) return;

  const { x, y } = getCoordinates(e);

  // Extend the current path to the new point and stroke it.
  // Assumption: the path is started (beginPath/moveTo) in the
  // stroke-start handler, which is not shown here.
  ctx.lineWidth = brushSize;
  ctx.lineCap = "round";
  ctx.strokeStyle = penColor;
  ctx.lineTo(x, y);
  ctx.stroke();
};
```
4. Implementation Details
State Management
The application uses React hooks for comprehensive state management:
```typescript
import { useState } from "react";

// Inside the component:

// Drawing State
const [isDrawing, setIsDrawing] = useState(false);
const [penColor, setPenColor] = useState("#000000");
const [brushSize, setBrushSize] = useState(5);

// AI Generation State
const [prompt, setPrompt] = useState("");
const [generatedImage, setGeneratedImage] = useState<string | null>(null);
const [isLoading, setIsLoading] = useState(false);

// Error Handling
const [showErrorModal, setShowErrorModal] = useState(false);
const [errorMessage, setErrorMessage] = useState("");
```
Background Image System
AI-generated images are seamlessly integrated as canvas backgrounds for iterative drawing:
```typescript
import { useEffect } from "react";

// Inside the component; canvasRef and backgroundImageRef are refs defined elsewhere.

// Load each newly generated image, then paint it once it has decoded.
useEffect(() => {
  if (generatedImage && canvasRef.current) {
    const img = new window.Image();
    img.onload = () => {
      backgroundImageRef.current = img;
      drawImageToCanvas();
    };
    img.src = generatedImage;
  }
}, [generatedImage]);

const drawImageToCanvas = () => {
  if (!canvasRef.current || !backgroundImageRef.current) return;

  const canvas = canvasRef.current;
  const ctx = canvas.getContext("2d");
  if (!ctx) return;

  // Fill with white background first
  ctx.fillStyle = "#FFFFFF";
  ctx.fillRect(0, 0, canvas.width, canvas.height);

  // Draw the background image, stretched to the full canvas
  ctx.drawImage(backgroundImageRef.current, 0, 0, canvas.width, canvas.height);
};
```
Touch Support
On mobile devices, touch listeners are registered as non-passive so that preventDefault() can stop the page from scrolling or zooming while the user is drawing:
```typescript
useEffect(() => {
  // Block default touch gestures only while a stroke is in progress
  const preventTouchDefault = (e: TouchEvent) => {
    if (isDrawing) {
      e.preventDefault();
    }
  };

  const canvas = canvasRef.current;
  if (canvas) {
    canvas.addEventListener("touchstart", preventTouchDefault, { passive: false });
    canvas.addEventListener("touchmove", preventTouchDefault, { passive: false });
  }

  // Remove the listeners when the effect re-runs or the component unmounts
  return () => {
    if (canvas) {
      canvas.removeEventListener("touchstart", preventTouchDefault);
      canvas.removeEventListener("touchmove", preventTouchDefault);
    }
  };
}, [isDrawing]);
```
5. User Interface
Design System
The interface follows a consistent editorial design system with a light theme and clean typography:
Design Principles
- Editorial Typography: Serif headings, mono body text
- Light Theme: Clean white backgrounds with subtle borders
- Minimal Interface: Focus on the canvas and essential controls
- Responsive Layout: Adapts seamlessly from mobile to desktop
Component Structure
- Canvas Area: Primary drawing surface with full touch/mouse support
- Settings Panel: Tool controls and AI configuration options
- Drawing Tools: Brush size, color picker, and canvas management
- AI Enhancement: Prompt input and generation controls
- Error Modal: User-friendly error handling and display
Responsive Design
The interface adapts to different screen sizes using Tailwind CSS breakpoints:
```tsx
{/* Responsive grid layout */}
<div className="grid grid-cols-1 xl:grid-cols-12 gap-8">
  {/* Canvas takes full width on mobile, left side on desktop */}
  <motion.div className="xl:col-span-9 border border-black/10 bg-white p-6">
    <canvas
      className="w-full h-[300px] md:h-[400px] lg:h-[540px] bg-white cursor-crosshair touch-none"
    />
  </motion.div>

  {/* Settings panel stacks below on mobile, right side on desktop */}
  <motion.div className="xl:col-span-3">
    {/* Settings content */}
  </motion.div>
</div>
```
6. API Integration
Environment Configuration
The application requires proper environment setup for API access:
```bash
# Google Generative AI API Key
# Get your API key from: https://makersuite.google.com/app/apikey
NEXT_PUBLIC_GOOGLE_API_KEY=your_api_key_here
```
Error Handling
Comprehensive error handling with user-friendly feedback:
```typescript
// Extract a readable message from an API error string that embeds JSON.
function parseError(error: string) {
  const regex = /{"error":(.*)}/gm;
  const m = regex.exec(error);
  try {
    const e = m ? m[1] : error;
    const err = JSON.parse(e);
    return err.message || error;
  } catch (e) {
    // Not valid JSON: fall back to the raw string
    return error;
  }
}

// In the main component
try {
  // ... send the drawing and prompt to the model ...
} catch (error: any) {
  console.error("Error submitting drawing:", error);
  setErrorMessage(error.message || "An unexpected error occurred.");
  setShowErrorModal(true);
} finally {
  // Always clear the loading state, whether the request succeeded or failed
  setIsLoading(false);
}
```
Performance Considerations
- Image Compression: Canvas exports optimized PNG data for API transmission
- Async Processing: Non-blocking AI generation with loading states
- Memory Management: Proper cleanup of canvas contexts and event listeners
- Error Recovery: Graceful handling of API failures with retry options
- Client-side Rendering: All drawing operations run locally for immediate feedback
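To make the image-export point concrete, here is a minimal sketch of preparing a canvas export for transmission. It assumes the export is a base64 data URL of the kind returned by `canvas.toDataURL("image/png")`, and that the API expects the raw base64 payload without the `data:` prefix. `splitDataUrl` and `base64ByteLength` are hypothetical helpers, not part of the application as described.

```typescript
// Split a base64 data URL into its MIME type and raw base64 payload.
function splitDataUrl(dataUrl: string): { mimeType: string; data: string } {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Not a base64 data URL");
  }
  return { mimeType: match[1], data: match[2] };
}

// Decoded size in bytes of a padded base64 string:
// every 4 characters encode 3 bytes, minus one byte per "=" of padding.
function base64ByteLength(data: string): number {
  const padding = data.endsWith("==") ? 2 : data.endsWith("=") ? 1 : 0;
  return (data.length / 4) * 3 - padding;
}

// "QUJDRA==" is the base64 encoding of the four bytes "ABCD".
const exported = splitDataUrl("data:image/png;base64,QUJDRA==");
console.log(exported.mimeType, base64ByteLength(exported.data));
```

Checking the decoded size before sending is one inexpensive way to notice an unexpectedly large payload on the client, before it costs a network round trip.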
Security
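The API key is supplied through an environment variable. Note that the NEXT_PUBLIC_ prefix makes the variable available to client-side code, which means the key is bundled into the JavaScript sent to the browser and can be read by anyone who inspects it. This is acceptable for a local demo, but a public deployment should keep the key on the server and route requests through a server-side endpoint. The application also includes input validation and sanitization to guard against malicious prompts or canvas manipulations.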
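The validation logic itself is not shown here. As a rough sketch of what a prompt check might look like, the following uses a hypothetical `checkPrompt` helper and a hypothetical 500-character limit; neither comes from the application.

```typescript
const MAX_PROMPT_LENGTH = 500; // hypothetical limit, chosen for illustration

type PromptCheck =
  | { ok: true; prompt: string }
  | { ok: false; reason: string };

function checkPrompt(raw: string): PromptCheck {
  // Replace ASCII control characters with spaces, collapse whitespace runs,
  // and trim the ends.
  const cleaned = raw
    .replace(/[\u0000-\u001f\u007f]/g, " ")
    .replace(/\s+/g, " ")
    .trim();

  if (cleaned.length === 0) {
    return { ok: false, reason: "Prompt is empty." };
  }
  if (cleaned.length > MAX_PROMPT_LENGTH) {
    return { ok: false, reason: `Prompt exceeds ${MAX_PROMPT_LENGTH} characters.` };
  }
  return { ok: true, prompt: cleaned };
}

console.log(checkPrompt("  add\n a   sun "));
console.log(checkPrompt("   "));
```

A check like this only normalizes and bounds the input. It does not make a prompt safe by itself, so any server-side endpoint would still need to validate whatever it receives.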
Ready to Try It Yourself?
Experience the AI Co-Drawing tool firsthand. Draw your ideas, enhance them with AI, and explore the collaborative possibilities between human creativity and machine intelligence.
Building this AI Co-Drawing application demonstrates the power of combining human creativity with artificial intelligence. The intersection of interactive design, multi-modal AI, and real-time collaboration continues to be an exciting area to explore.