Streaming voice recognition in the browser

22 February 2019

Modern web applications are full of rich features. What about streaming voice recognition? No, there is no “HTMLvnext” feature for voice recognition, but there are services such as Amazon Transcribe, IBM Watson, and Google Speech-to-Text. All of them seem to be very good; at the very least, they handle my English with an East-Ukrainian accent well.

But one thing makes IBM Watson special: it supports WebSockets, so you can stream audio directly from the browser without routing it through your own server. This can bring benefits ranging from security to hosting costs.

You still need a server to perform authentication, but that is only one small call at the very beginning. After that, you can send a stream of audio from your browser to IBM Watson and get a stream of text messages back. The latency is about a second or less.
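To make that flow more concrete, here is a rough sketch of the browser side in TypeScript. It is not an official client. The names SocketLike, buildRecognizeUrl, extractTranscript and startRecognition are made up for this post, and the URL shape, the access_token query parameter and the start/stop message fields reflect my understanding of the Watson WebSocket interface, so please check them against the current IBM documentation before relying on them.

```typescript
/** The small part of the browser WebSocket API that this sketch uses. */
interface SocketLike {
  send(data: string | ArrayBuffer): void;
  onmessage: ((event: any) => void) | null;
}

/**
 * Build the recognize URL. Both arguments are assumed to come from your own
 * (hypothetical) auth endpoint: the one small server call made at the start.
 */
function buildRecognizeUrl(serviceUrl: string, accessToken: string): string {
  return `${serviceUrl}/v1/recognize?access_token=${encodeURIComponent(accessToken)}`;
}

/** Return the first transcript in a server message, or null for status messages. */
function extractTranscript(raw: string): { text: string; final: boolean } | null {
  const message = JSON.parse(raw);
  const result = message.results?.[0];
  const text = result?.alternatives?.[0]?.transcript;
  return typeof text === "string" ? { text, final: Boolean(result.final) } : null;
}

/**
 * Start a recognition session on an already open socket.
 * Sends the "start" message right away and returns helpers for sending
 * audio chunks and for telling the service that the audio has ended.
 */
function startRecognition(
  socket: SocketLike,
  onText: (text: string, final: boolean) => void,
) {
  // Text frames carry JSON: either a status update or recognition results.
  socket.onmessage = (event) => {
    if (typeof event.data !== "string") return;
    const transcript = extractTranscript(event.data);
    if (transcript) onText(transcript.text, transcript.final);
  };

  // Assumed audio format: WebM/Opus, which is what MediaRecorder produces in Chrome.
  socket.send(JSON.stringify({
    action: "start",
    "content-type": "audio/webm;codecs=opus",
    interim_results: true,
  }));

  return {
    sendAudio: (chunk: ArrayBuffer) => socket.send(chunk),
    stop: () => socket.send(JSON.stringify({ action: "stop" })),
  };
}
```

In a real page you would create a WebSocket with the URL from buildRecognizeUrl, wait for its open event, call startRecognition, and then pass each MediaRecorder chunk (converted with blob.arrayBuffer()) to sendAudio. The socket is typed as a small interface only so that the logic can be tried out without a browser.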

To make everything simpler, you can use this npm library: watson-speech

More examples coming soon…
