TeLeS scores confidence in ASR with word/time similarity, making speech AI more reliable and self-aware.
🎵 Fast, Smart, and Noise-Proof: Rethinking How AI Finds Sounds
A new model learns audio embeddings and balanced hash codes together, enabling faster, noise-resistant audio search.
🎶 Voices from the Delivery Room: A Hindi Speech Dataset Built for Life-Saving ASR
A domain-specific Hindi dataset helps voice models assist nurses during childbirth in noisy Indian hospital settings.
Making Sense of Sounds: A Smarter Way to Search Audio
A TF-IDF powered audio search method turns speech into tokens, making spoken queries faster and language-agnostic.
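To make the idea concrete, here is a minimal sketch of TF-IDF retrieval over tokenized audio. It assumes each clip has already been converted to a sequence of discrete acoustic tokens (the clip names and `tok*` strings are invented for illustration, not from the paper):

```python
import math
from collections import Counter

# Hypothetical corpus: each clip is a sequence of discrete acoustic tokens,
# as a quantizer or tokenizer might produce from raw speech.
corpus = {
    "clip_a": ["tok3", "tok7", "tok7", "tok1"],
    "clip_b": ["tok2", "tok3", "tok9"],
    "clip_c": ["tok7", "tok7", "tok7", "tok4"],
}

def tfidf_vectors(corpus):
    """TF-IDF weight per token, per clip, over the acoustic-token vocabulary."""
    n = len(corpus)
    df = Counter(t for toks in corpus.values() for t in set(toks))
    vecs = {}
    for clip, toks in corpus.items():
        tf = Counter(toks)
        vecs[clip] = {t: (c / len(toks)) * math.log(n / df[t])
                      for t, c in tf.items()}
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse TF-IDF vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf_vectors(corpus)
query = {"tok7": 1.0}  # a spoken query, tokenized the same way as the corpus
ranked = sorted(vecs, key=lambda c: cosine(query, vecs[c]), reverse=True)
```

Because matching happens on tokens rather than transcripts, the same pipeline works regardless of the query's language.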
Audio That Remembers: A Smarter Way to Find Similar Sounds with AudioNet
AudioNet uses deep hashing to create smart sound fingerprints for faster, high-precision retrieval of similar audio clips.
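The retrieval side of deep hashing is simple once the "fingerprints" exist: similarity search reduces to Hamming distance between short binary codes. A minimal sketch, assuming hypothetical 8-bit codes already emitted by a trained network (real systems typically use 32-64 bits; the clip names are invented):

```python
# Hypothetical binary hash codes for three audio clips.
database = {
    "door_slam":   0b10110010,
    "dog_bark":    0b10110011,
    "violin_solo": 0b01001100,
}

def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")

def nearest(query_code, db, k=2):
    """Rank clips by Hamming distance to the query's code."""
    return sorted(db, key=lambda name: hamming(query_code, db[name]))[:k]

hits = nearest(0b10110010, database)
```

XOR-and-popcount is a handful of CPU instructions per comparison, which is why hashing-based retrieval scales to very large audio collections.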
🐦 Teaching AI to Listen: Smarter Birdsong Detection with Balanced Deep CCA
Balanced Deep CCA learns from audio and body vibrations to detect bird sounds without tons of labeled data.
🗣️ Teaching AI to Understand Spoken Words (Even Without Knowing the Language)
An RNN learns fixed-length audio embeddings from speech, enabling ultra-fast word search without any text labels.
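The key property is that a recurrent network maps a variable-length sequence of acoustic frames to one fixed-size vector, so utterances of any duration can be compared directly. A toy sketch with random, untrained weights (in the actual work the weights would be trained so that utterances of the same word embed nearby; dimensions here are invented):

```python
import math
import random

random.seed(0)
IN, HID = 4, 8  # illustrative feature and hidden sizes

# Random weights for a minimal Elman RNN; a trained model would learn these.
W_in  = [[random.uniform(-0.5, 0.5) for _ in range(IN)]  for _ in range(HID)]
W_rec = [[random.uniform(-0.5, 0.5) for _ in range(HID)] for _ in range(HID)]

def embed(frames):
    """Fold a variable-length frame sequence into the final hidden state."""
    h = [0.0] * HID
    for x in frames:
        h = [math.tanh(sum(W_in[i][j] * x[j] for j in range(IN)) +
                       sum(W_rec[i][j] * h[j] for j in range(HID)))
             for i in range(HID)]
    return h  # always length HID, regardless of len(frames)

short = [[0.1, 0.2, 0.0, 0.3]] * 5
long_ = [[0.1, 0.2, 0.0, 0.3]] * 50
```

Because every utterance becomes the same-sized vector, word search is a nearest-neighbor lookup rather than a costly sequence alignment.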
🎶 Teaching AI to Tune In: Smarter Singing Melody Detection with Just a Few Notes
An interactive learning model adapts to new music genres with just a few annotated samples from users.
Smart Listening: Teaching AI to Detect Sounds with Human-Like Understanding
This model learns with ontology constraints, detecting audio events more accurately while respecting real-world sound hierarchies.