Drawing Recognition App Documentation

Overview

The Drawing Recognition App is a web-based application that enables users to create drawings on a canvas and receive AI-generated descriptions of their artwork. Powered by the Google Generative AI API, this app offers an interactive and innovative way to explore the intersection of human creativity and machine learning.

Key Features

  • Interactive drawing canvas with adjustable brush size and color
  • Automatic recognition of drawings with customizable delay
  • Manual recognition option for user control
  • Real-time status updates during the recognition process
  • Mobile-friendly design for drawing on touchscreen devices

How It Works

[Flowchart describing the app functionality]

Key Components

1. Canvas Setup

The app utilizes an HTML5 canvas for drawing, providing a responsive and interactive surface:

// Grab the canvas element and its 2D rendering context
const canvas = document.getElementById('drawingCanvas');
const ctx = canvas.getContext('2d');

// Fill the viewport, leaving 250px for the sidebar
canvas.width = window.innerWidth - 250;
canvas.height = window.innerHeight;
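Note that the canvas is sized once at load; resizing the window afterwards leaves it at the old dimensions, and re-assigning width or height clears the bitmap. A minimal sketch of one way to handle this, re-using the 250px sidebar offset from above (the resize handling is an assumption, not part of the original code):

// Sketch: preserve the current drawing across a window resize
// (assumes the same 250px sidebar offset used at load time)
window.addEventListener('resize', () => {
    const snapshot = ctx.getImageData(0, 0, canvas.width, canvas.height);
    canvas.width = window.innerWidth - 250;
    canvas.height = window.innerHeight;
    ctx.putImageData(snapshot, 0, 0);
});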

2. Drawing Functions

The drawing functionality is implemented using mouse and touch events, allowing for smooth and intuitive drawing experiences:

function startDrawing(e) {
    isDrawing = true;
    // Convert the pointer position to canvas-local coordinates
    lastX = e.clientX - canvas.offsetLeft;
    lastY = e.clientY - canvas.offsetTop;
    draw(e);
}

function draw(e) {
    if (!isDrawing) return;
    ctx.lineWidth = settings.brushSize;
    ctx.lineCap = 'round';
    ctx.strokeStyle = settings.brushColor;
    ctx.beginPath();
    ctx.moveTo(lastX, lastY);
    let currentX = e.clientX - canvas.offsetLeft;
    let currentY = e.clientY - canvas.offsetTop;
    ctx.lineTo(currentX, currentY);
    ctx.stroke();
    lastX = currentX;
    lastY = currentY;

    // Auto-recognize logic: restart the timer on every stroke so
    // recognition fires only after the user pauses for the configured delay
    clearTimeout(drawingTimeout);
    if (settings.autoRecognize) {
        drawingTimeout = setTimeout(generateDescription, settings.recognitionDelay);
    }
}
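The Mobile Support section below reuses a stopDrawing handler to end the current stroke. It does not appear in the original snippets, so the following is a minimal sketch, together with the mouse listener wiring the handlers above imply:

// Sketch: end the current stroke (referenced by the listeners below
// but not shown in the original snippets)
function stopDrawing() {
    isDrawing = false;
}

// Assumed mouse wiring for the handlers above
canvas.addEventListener('mousedown', startDrawing);
canvas.addEventListener('mousemove', draw);
canvas.addEventListener('mouseup', stopDrawing);
canvas.addEventListener('mouseout', stopDrawing);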

3. Recognition Function

The app leverages the Google Generative AI API to analyze and describe the drawings:

// Requires the Google Generative AI JavaScript SDK
import { GoogleGenerativeAI } from "@google/generative-ai";

async function generateDescription() {
    if (isGeneratingDescription) return;
    isGeneratingDescription = true;
    updateStatus("Generating description...");

    try {
        // Drop the "data:image/png;base64," prefix; the API expects raw base64
        const imageData = canvas.toDataURL('image/png').split(',')[1];
        const genAI = new GoogleGenerativeAI(API_KEY);
        const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

        const prompt = "Describe what is drawn in this image:";
        const imagePart = {
            inlineData: {
                data: imageData,
                mimeType: 'image/png'
            }
        };

        const result = await model.generateContent([prompt, imagePart]);
        const response = await result.response;
        const text = response.text();
        recognitionResult.textContent = text;
        updateStatus("Description generated");
    } catch (error) {
        console.error("Error generating description:", error);
        recognitionResult.textContent = "Error recognizing image.";
        updateStatus("Error in recognition");
    } finally {
        isGeneratingDescription = false;
    }
}
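The function above relies on an updateStatus helper and a recognitionResult element that the original snippets do not define. A plausible sketch, assuming hypothetical #recognitionResult and #status elements in the sidebar:

// Hypothetical helpers assumed by generateDescription(); the element IDs
// are illustrative, not taken from the original markup
const recognitionResult = document.getElementById('recognitionResult');
const statusElement = document.getElementById('status');

function updateStatus(message) {
    statusElement.textContent = message;
}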

User Interface

The app features an intuitive sidebar with controls for tuning the drawing experience (a wiring sketch follows the list):

  • Brush size adjustment
  • Color selection
  • Canvas clearing option
  • Auto-recognition toggle
  • Recognition delay setting
  • Manual recognition trigger
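
The markup for these controls is not shown, so the element IDs below are hypothetical; the sketch only illustrates how such controls could feed the shared settings object that the drawing code reads:

// Sketch with assumed element IDs: sidebar inputs write into the shared
// settings object that draw() and the auto-recognize logic read
const settings = {
    brushSize: 5,
    brushColor: '#ffffff',
    autoRecognize: true,
    recognitionDelay: 2000
};

document.getElementById('brushSize').addEventListener('input', (e) => {
    settings.brushSize = Number(e.target.value);
});
document.getElementById('brushColor').addEventListener('input', (e) => {
    settings.brushColor = e.target.value;
});
document.getElementById('autoRecognize').addEventListener('change', (e) => {
    settings.autoRecognize = e.target.checked;
});
document.getElementById('recognitionDelay').addEventListener('input', (e) => {
    settings.recognitionDelay = Number(e.target.value);
});
document.getElementById('clearCanvas').addEventListener('click', () => {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
});
document.getElementById('recognizeNow').addEventListener('click', generateDescription);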

Styling

The app employs custom CSS to create a visually appealing dark-themed interface. It features smooth transitions and a glowing effect during the recognition process, enhancing the user experience.
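
How the glow is driven is not shown in the original; one plausible approach (an assumption) is to toggle a CSS class on the canvas while a description is being generated, for example from the start and the finally branch of generateDescription:

// Sketch (assumption): toggle a hypothetical .recognizing class that the
// stylesheet styles with the glow effect
function setRecognizing(active) {
    canvas.classList.toggle('recognizing', active);
}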

Mobile Support

To ensure a seamless experience across devices, the app includes touch event listeners for mobile support:

canvas.addEventListener('touchstart', (e) => {
    e.preventDefault(); // stop the page from scrolling while drawing
    const touch = e.touches[0];
    // Reuse the mouse handlers by passing a mouse-like event object
    startDrawing({ clientX: touch.clientX, clientY: touch.clientY });
});

canvas.addEventListener('touchmove', (e) => {
    e.preventDefault();
    const touch = e.touches[0];
    draw({ clientX: touch.clientX, clientY: touch.clientY });
});

canvas.addEventListener('touchend', stopDrawing);