Voice AI Input Interaction

Published Aug 20, 2024

During a recent project, I noticed how frustrating it was to type long messages to AI assistants (ChatGPT, Claude). Sometimes I would literally say, "I wish I could just talk to it." That observation led me to experiment with voice input that felt natural and unintimidating.

The first iterations were quite clunky, with too many buttons, confusing states, and users who weren't sure when they were actually recording. I stripped everything back to its essence: one button, one purpose. The waveform visualization wasn't part of the original idea, but after several iterations I added it as visible confirmation that the user's voice was being heard.

What you see now is version 4; each iteration got progressively simpler. The subtle animation of the waveform bars gives just enough feedback without being distracting, and the minimal timer keeps users aware of their recording length without making them anxious about time limits.

Implementation Details

Why This Design

Minimalist Interface: Single button interaction reduces cognitive load
Clear Feedback: Visual waveform shows active recording
Adaptive Design: Works across different screen sizes and themes

"use client";
 
import { Mic } from "lucide-react";
import { useState, useEffect } from "react";
import { cn } from "@/lib/cn";
 
export default function VoiceAiInput() {
    const [submitted, setSubmitted] = useState(false);
    const [time, setTime] = useState(0);
    const [isClient, setIsClient] = useState(false);
    const [isDemo, setIsDemo] = useState(true);
 
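    // Flag the first client-side render so the random bar heights below
    // are only applied after hydration.
    useEffect(() => {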
        setIsClient(true);
    }, []);
 
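    // Tick the timer once per second while recording; reset it when stopped.
    useEffect(() => {
        // Typed as possibly undefined because the interval only exists while recording.
        let intervalId: ReturnType<typeof setInterval> | undefined;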
 
        if (submitted) {
            intervalId = setInterval(() => {
                setTime((t) => t + 1);
            }, 1000);
        } else {
            setTime(0);
        }
 
        return () => clearInterval(intervalId);
    }, [submitted]);
 
    const formatTime = (seconds: number) => {
        const mins = Math.floor(seconds / 60);
        const secs = seconds % 60;
        return `${mins.toString().padStart(2, "0")}:${secs
            .toString()
            .padStart(2, "0")}`;
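    };
 
    // The listing references handleClick without defining it. This is an
    // assumed minimal version: a click ends demo mode and toggles recording.
    const handleClick = () => {
        setIsDemo(false);
        setSubmitted((prev) => !prev);
    };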
 
    return (
        <div className="w-full py-4">
            <div className="relative max-w-xl w-full mx-auto flex items-center flex-col gap-2">
                <button
                    className={cn(
                        "group w-16 h-16 rounded-xl flex items-center justify-center transition-colors",
                        submitted
                            ? "bg-none"
                            : "bg-none hover:bg-gray-200 dark:hover:bg-white-10"
                    )}
                    type="button"
                    onClick={handleClick}
                >
                    {submitted ? (
                        <div
                            className="w-6 h-6 rounded-sm animate-spin bg-black dark:bg-white cursor-pointer pointer-events-auto"
                            style={{ animationDuration: "3s" }}
                        />
                    ) : (
                        <Mic className="w-6 h-6 text-black-a7 dark:text-white-a7" />
                    )}
                </button>
 
                <span
                    className={cn(
                        "font-mono text-sm transition-opacity duration-300",
                        submitted
                            ? "text-black-a7 dark:text-white-a7"
                            : "text-black-a3 dark:text-white-a3"
                    )}
                >
                    {formatTime(time)}
                </span>
 
                <div className="h-4 w-64 flex items-center justify-center gap-0.5">
                    {[...Array(48)].map((_, i) => (
                        <div
                            key={i}
                            className={cn(
                                "w-0.5 rounded-full transition-all duration-300",
                                submitted
                                    ? "bg-black-a5 dark:bg-white-a5 animate-pulse"
                                    : "bg-black-a10 dark:bg-white-a10 h-1"
                            )}
                            style={
                                submitted && isClient
                                    ? {
                                          height: `${20 + Math.random() * 80}%`,
                                          animationDelay: `${i * 0.05}s`,
                                      }
                                    : undefined
                            }
                        />
                    ))}
                </div>
 
                <p className="h-4 text-xs text-black-a8 dark:text-white-a8">
                    {submitted ? "Listening..." : "Click to speak"}
                </p>
            </div>
        </div>
    );
}
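
The component above draws its bars with Math.random(), so the waveform is decorative rather than a reading of the microphone. If you wanted the bars to follow real input, one possible approach is to reduce a buffer of amplitude samples to one height per bar. The sketch below is not part of the original component: the barHeights helper, its parameters, and the 20 to 100 percent range (borrowed from the random heights in the listing) are all assumptions made for illustration.

```typescript
// Hypothetical helper (not in the original component): turns a buffer of
// amplitude samples, nominally in the range -1..1, into one CSS height
// percentage per waveform bar.
function barHeights(
    samples: ArrayLike<number>,
    barCount = 48,
    minPercent = 20,
    maxPercent = 100
): number[] {
    if (barCount <= 0) return [];

    // With no audio data yet, keep every bar at the minimum height.
    if (samples.length === 0) {
        return Array.from({ length: barCount }, () => minPercent);
    }

    const heights: number[] = [];
    for (let i = 0; i < barCount; i++) {
        // Each bar covers a contiguous slice of the buffer (at least one sample).
        const start = Math.floor((i * samples.length) / barCount);
        const end = Math.max(
            start + 1,
            Math.floor(((i + 1) * samples.length) / barCount)
        );

        // Use the peak absolute amplitude in the slice as the bar's level.
        let peak = 0;
        for (let j = start; j < end && j < samples.length; j++) {
            peak = Math.max(peak, Math.abs(samples[j] ?? 0));
        }

        // Clamp to 1 so out-of-range samples cannot overflow the container.
        const level = Math.min(1, peak);
        heights.push(minPercent + level * (maxPercent - minPercent));
    }
    return heights;
}
```

In the component, the result could replace the random value, for example height: `${heights[i]}%`. The samples might come from a Web Audio AnalyserNode via getFloatTimeDomainData, though wiring up the microphone is outside what the original listing shows.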
