another realtime api trick i stumbled upon 💡 to prevent the model from "jumping in" to speak whenever you pause, you can give it a stay_silent() function that does nothing. it'll call it when it knows you haven't finished your thought, even if you pause. full prompt below 👇
74,212 views • 1 year ago • via X (Twitter)
10 Comments

we're always working on making VAD better, so this is more of a stop-gap solution (and tbh haven't tested it too too much) but if it's hacky and it works...

(this is what i used, but it was like 2min of prompting. i'm sure someone can do way better) show me what you build!

prompt:
"""
You are an empathetic listener. You'll speak in very short sentences or single words, much like a human in a real conversation.

Also, the user may speak and have incomplete thoughts. In those cases, use the stay_silent() function to let them complete their thoughts before replying.

If the user says
- something inaudible
- an incomplete sentence
- an incomplete thought
OR if they are going on a bit of a monologue or extended, meandering thought, let them finish. be kind.

additionally, if it is a complete thought but it is ambiguous, stay silent and let them clarify before asking.
e.g. "can you show me" (let them specify what)
e.g. "No, yeah doing um" "maybe..." "I really don't want it." (let the user finish this thought) or just [inaudible]...
"Good, I need to ask you a couple things. First I want to ask you" (this is incomplete)
"So tell me something" (also incomplete)

for all of those you should wait. use this function liberally. the user appreciates when you wait
"""

function:
"""
{
  "name": "stay_silent",
  "description": "Use this function to give the user an opportunity to finish their thought.",
  "parameters": {
    "type": "object",
    "properties": {},
    "required": []
  }
}
"""
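for anyone wiring this up, here's a rough sketch of how the prompt and function above might be registered and handled. the event shapes (`session.update`, `response.done`, the `"type": "function"` wrapper, `tool_choice`) are my assumptions about the realtime api's JSON events, not something from the post, so check them against the current docs. the helper names `build_session_update` and `should_stay_silent` are made up for illustration. no network calls here, it only builds and inspects the payloads.

```python
import json

# Tool definition from the post, wrapped with "type": "function"
# (the wrapper is an assumption about the Realtime API tool format).
STAY_SILENT_TOOL = {
    "type": "function",
    "name": "stay_silent",
    "description": "Use this function to give the user an opportunity to finish their thought.",
    "parameters": {"type": "object", "properties": {}, "required": []},
}


def build_session_update(instructions: str) -> str:
    """Serialize a session.update event that registers the no-op tool."""
    event = {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "tools": [STAY_SILENT_TOOL],
            "tool_choice": "auto",
        },
    }
    return json.dumps(event)


def should_stay_silent(server_event: dict) -> bool:
    """Return True when a finished response contains a stay_silent call.

    Assumes the server reports completed responses as "response.done"
    events whose output items carry "type" and "name" fields.
    """
    if server_event.get("type") != "response.done":
        return False
    output = server_event.get("response", {}).get("output", [])
    return any(
        item.get("type") == "function_call" and item.get("name") == "stay_silent"
        for item in output
    )
```

the point of `should_stay_silent` is that the client does nothing when it returns True: no tool result that triggers another response, no follow-up request. the turn just goes back to the user.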

we do that too. we call it standby instead, and it has a parameter to pass the model's thoughts. it's an awesome way to keep the model thinking and reasoning about the context without interrupting.
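a guess at what that variant could look like, since the comment doesn't share the actual schema: the parameter name `thoughts`, the descriptions, and the helper `extract_thoughts` are all hypothetical. it assumes the function call arguments arrive as a JSON string, which is how function calling usually delivers them.

```python
import json

# Hypothetical "standby" tool: like stay_silent, but with one string
# parameter where the model can record its reasoning while it waits.
STANDBY_TOOL = {
    "type": "function",
    "name": "standby",
    "description": "Wait for the user to finish their thought without speaking.",
    "parameters": {
        "type": "object",
        "properties": {
            "thoughts": {
                "type": "string",
                "description": "Private notes on what the user seems to be getting at.",
            }
        },
        "required": ["thoughts"],
    },
}


def extract_thoughts(arguments_json: str) -> str:
    """Pull the model's notes out of a standby call's arguments string.

    Returns an empty string if the arguments are missing or malformed,
    so a bad payload never breaks the conversation loop.
    """
    try:
        args = json.loads(arguments_json or "{}")
    except json.JSONDecodeError:
        return ""
    if not isinstance(args, dict):
        return ""
    thoughts = args.get("thoughts", "")
    return thoughts if isinstance(thoughts, str) else ""
```

the notes never get spoken. they could be logged or left in the conversation history so the model has them when the user finishes.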

using params for silent reasoning here is clever, love it

love this

this is genius. the "jumping in" is the number one thing that causes friction for me currently

really clever bc it forces the turn back to the user rather than setting an arbitrary timeout, good stuff!!

This is really good!

pls add to chatgpt

Nice, this is a great idea! Excellent way to show off the benefits of a speech to speech model. We'll incorporate it into our demos.
