This basic AI chatbot experiment uses speech-to-text and some clever prompting of an OpenAI LLM to generate in-character responses, which are then piped to a text-to-speech engine with a lip-sync visualization.
I've found it an entertaining distraction to start conversations with the AI, change up characters, and build fake "podcasts" from the exported audio. Check the "Examples" tab for links to a few of them.
Grant the microphone permission when the page loads. Without it, there is no way to interact with the AI.
Use the Output tab to change the character and the output language. There is also an "Additional prompt" field where you can provide extra context for the conversation, such as character background notes or situational details.
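To illustrate how the Output tab settings could feed into the prompt, here is a minimal sketch. It assumes the app sends chat-style messages with a role and content to the LLM. The names OutputSettings, buildSystemMessage, and buildMessages, as well as the prompt wording, are hypothetical and not taken from the app's source.

```typescript
// Hypothetical shape of the Output tab settings; field names are assumptions.
interface OutputSettings {
  character: string; // e.g. "a grumpy pirate"
  outputLanguage: string; // e.g. "Spanish"
  additionalPrompt?: string; // free-form background or situational details
}

// Chat-style message with a role and text content.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the system message that sets the character and the output language.
function buildSystemMessage(settings: OutputSettings): ChatMessage {
  const parts = [
    `You are role-playing as ${settings.character}.`,
    `Always reply in ${settings.outputLanguage}.`,
  ];
  // Only mention the additional prompt when the field is not blank.
  const extra = settings.additionalPrompt?.trim();
  if (extra) {
    parts.push(`Additional context: ${extra}`);
  }
  return { role: "system", content: parts.join("\n") };
}

// Prepend the system message to the running conversation history.
function buildMessages(
  settings: OutputSettings,
  history: ChatMessage[],
): ChatMessage[] {
  return [buildSystemMessage(settings), ...history];
}
```

Rebuilding the system message on every request means a character or language change in the Output tab would take effect on the next reply without restarting the conversation.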
Check the Input tab to make sure your microphone and language settings are correct. The input language doesn't matter too much, but the setting is helpful in the context for which I originally wrote this demo: building conversation practice tools for learning foreign languages.
Click the "Start listening" button to begin recording speech. The app attempts to detect the pauses around your utterances, so you don't need to click "Stop listening" between prompts.
However, if you're in a noisy environment, or your speakers are loud enough that the microphone picks up the generated speech and treats it as your own, you can use the "Start/stop" button to pause recording when you're done talking. This avoids erroneous prompt recordings.
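One common way to find the pauses around an utterance is a simple energy threshold. The sketch below is not the app's actual detector. It assumes audio arrives as fixed-size frames of PCM samples in the range -1 to 1, and the names rms and createUtteranceDetector and the default threshold values are hypothetical.

```typescript
type DetectorEvent = "start" | "end" | null;

// Root-mean-square level of one frame of PCM samples.
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  const sumSquares = frame.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / frame.length);
}

// Returns a function that consumes one frame at a time and reports
// "start" when speech begins and "end" after enough consecutive quiet frames.
function createUtteranceDetector(
  threshold = 0.02, // assumed RMS level that counts as speech
  quietFramesToEnd = 25, // assumed run of quiet frames that closes an utterance
): (frame: Float32Array) => DetectorEvent {
  let speaking = false;
  let quietFrames = 0;

  return (frame: Float32Array): DetectorEvent => {
    const loud = rms(frame) >= threshold;
    if (loud) {
      quietFrames = 0; // any loud frame resets the silence counter
      if (!speaking) {
        speaking = true;
        return "start";
      }
      return null;
    }
    if (speaking) {
      quietFrames += 1;
      if (quietFrames >= quietFramesToEnd) {
        speaking = false;
        quietFrames = 0;
        return "end";
      }
    }
    return null;
  };
}
```

A detector like this also shows why noise is a problem: background sound or speaker bleed above the threshold keeps resetting the silence counter or triggers a false "start", which is when pausing recording manually helps.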
The "Reprompt" button will force another reply from the AI without requiring your verbal input.
The "Export" button concatenates all of the audio clips, both your own and the AI's generated ones, into a single audio file, which you can then use however you wish.
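As a rough sketch of what such a concatenation step could look like, the function below joins clips in conversation order. It assumes every clip has already been decoded to mono PCM at the same sample rate. The Clip type, the concatenateClips name, and the optional silent gap are hypothetical, and encoding the result to a downloadable file format is left out.

```typescript
// Hypothetical clip record; all clips are assumed to share one sample rate.
interface Clip {
  speaker: "user" | "ai";
  samples: Float32Array; // mono PCM samples
}

// Join clips in order, optionally leaving a short silence between them.
function concatenateClips(clips: Clip[], gapSamples = 0): Float32Array {
  const gapTotal = Math.max(clips.length - 1, 0) * gapSamples;
  const total = clips.reduce((n, c) => n + c.samples.length, 0) + gapTotal;
  const out = new Float32Array(total); // zero-filled, so gaps are silent

  let offset = 0;
  clips.forEach((clip, i) => {
    out.set(clip.samples, offset);
    offset += clip.samples.length;
    if (i < clips.length - 1) {
      offset += gapSamples; // skip over the silent gap
    }
  });
  return out;
}
```

For example, a gap of half a second at a 16 kHz sample rate would be gapSamples = 8000.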