Streaming voice recognition in the browser
Modern web applications are full of rich features. What about streaming voice recognition? No, there is no “HTML vNext” feature for voice recognition, but there are different services like AWS Transcribe, IBM Watson, and Google Speech-to-Text. All of them seem to be very good; at least, they handle my English with an East-Ukrainian accent very well.
But one thing makes IBM Watson special: it supports WebSockets, so you can stream audio directly from the browser without routing it through your own server. This can bring a lot of benefits, from security to lower hosting costs.
You still need a server to perform authentication, but that is only one small call at the very beginning. After that, you can send a stream of audio from your browser to IBM Watson and get a stream of text messages back. The latency is about a second or less.
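The flow above can be sketched roughly like this. It is a minimal TypeScript sketch, not a definitive implementation: it assumes your own backend has already handed the browser an access token and the service URL, and the helper names (buildRecognizeUrl, startMessage, stopMessage, extractTranscripts) are made up for illustration. The message shapes follow the Speech to Text WebSocket interface as I understand it: a JSON "start" message, binary audio chunks, a JSON "stop" message, and JSON results coming back.

```typescript
// Shapes of the JSON messages the service sends back (only the fields used here).
interface WatsonAlternative { transcript: string; confidence?: number; }
interface WatsonResult { final: boolean; alternatives: WatsonAlternative[]; }
interface WatsonMessage { results?: WatsonResult[]; state?: string; error?: string; }

// Turn the https:// service URL into the wss:// recognize endpoint.
// Assumption: serviceUrl and accessToken come from your own backend call.
function buildRecognizeUrl(serviceUrl: string, accessToken: string, model?: string): string {
  const base = serviceUrl.replace(/^http/, "ws").replace(/\/+$/, "");
  let query = `access_token=${encodeURIComponent(accessToken)}`;
  if (model) {
    query += `&model=${encodeURIComponent(model)}`;
  }
  return `${base}/v1/recognize?${query}`;
}

// First text frame: tells the service what audio format will follow.
function startMessage(contentType: string): string {
  return JSON.stringify({
    action: "start",
    "content-type": contentType,
    interim_results: true, // ask for partial transcripts while the user is still speaking
  });
}

// Last text frame: signals that no more audio will be sent.
function stopMessage(): string {
  return JSON.stringify({ action: "stop" });
}

// Pull the best transcript out of each result in one incoming text frame.
// Frames without results (for example {"state": "listening"}) yield an empty list.
function extractTranscripts(raw: string): { text: string; final: boolean }[] {
  const message = JSON.parse(raw) as WatsonMessage;
  return (message.results ?? []).map((result) => ({
    text: (result.alternatives[0]?.transcript ?? "").trim(),
    final: result.final,
  }));
}
```

In the browser you would open `new WebSocket(buildRecognizeUrl(url, token))`, send `startMessage("audio/webm;codecs=opus")` once the socket opens, forward binary audio chunks (for example from a MediaRecorder), and pass every incoming text frame to `extractTranscripts`. The content type here is an assumption; it has to match whatever your recorder actually produces.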
To make everything simpler, you can use this library from npm: watson-speech
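Here is a rough idea of what using the library could look like. Treat it as a sketch under assumptions rather than the library's documented API: the option names (accessToken, url, objectMode) and the stream's on/stop methods reflect how I remember the watson-speech README, so check them against the version you install. The startTranscription helper and its types are hypothetical, and the recognizer is passed in as a parameter so the sketch does not depend on the package being present.

```typescript
// Options this sketch passes to the recognizer (assumed names, verify against the README).
interface RecognizeOptions {
  accessToken: string; // token obtained from your own backend
  url: string;         // service URL, also returned by your backend
  objectMode: boolean; // false: the stream emits plain text chunks
}

// The part of the returned stream that this sketch relies on (assumption).
interface TranscriptStream {
  on(event: string, listener: (payload: unknown) => void): unknown;
  stop(): void;
}

type Recognizer = (options: RecognizeOptions) => TranscriptStream;

// Hypothetical helper: starts recognition and returns a function that stops it.
function startTranscription(
  recognize: Recognizer,
  credentials: { accessToken: string; url: string },
  onText: (text: string) => void,
): () => void {
  const stream = recognize({ ...credentials, objectMode: false });
  // Chunks may arrive as Buffers or strings; String() covers both.
  stream.on("data", (chunk) => onText(String(chunk)));
  stream.on("error", (err) => console.error("speech stream error", err));
  return () => stream.stop();
}
```

In a real page you would pass in the library's microphone recognizer, which I believe can be imported from 'watson-speech/speech-to-text/recognize-microphone', together with the credentials fetched from your server, and call the returned function when the user clicks a stop button.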
More examples coming soon…