Baidu has announced the launch of SwiftScribe, a web application that uses artificial intelligence to transcribe voice recordings to text automatically.
Although it is currently in closed beta, some of its main features and its workflow are already known. To start using it, we only have to upload an audio file (for example, .mp3 or .wav); how long we wait for the system to return the result depends on the recording's duration.
The text the app returns will still need many corrections, for example to punctuation, spelling, and formatting. Keyboard shortcuts are available to help with this process.
Even so, it fulfills its objective, which is to save us a lot of time when we have to transcribe voice recordings word for word.
The tool is based on Baidu’s speech recognition system, Deep Speech 2, and is able to learn from the corrections that users make.
For the moment, it will only be tested by a small group of about 30 to 50 people, who are transcribers or work in industries where transcription is important. Meanwhile, we can see an example of its potential at this link, which illustrates the work done by the web application.
It is a tool that will undoubtedly become essential for those who transcribe often, from professional and freelance transcribers to journalists and students.
Using artificial intelligence to improve the recognition system is undoubtedly an innovative idea. It remains to be seen how well it works in practice, because we will not always have recordings free of background noise.
Source: Baidu Blog