Tag Archives: Custom Vocabulary

A.I. Speech-to-Text: How to make sure your data isn’t being used for training

We get a fair number of questions from Transcriptive users who are concerned that the A.I. is going to use their data for training.

First off, in the Transcriptive preferences, if you select ‘Delete transcription jobs from server’ your data is deleted immediately. This will delete everything from the A.I. service’s servers and from the Digital Anarchy servers. So that’s an easy way of making sure your data isn’t kept around and used for anything.

However, generally speaking, the A.I. services don’t get more accurate with user-submitted data, partly because they aren’t getting the ‘positive’ or corrected transcript.

When you edit your transcript, we aren’t sending the corrections back to the A.I. (Some services do this… e.g. if you correct YouTube’s captions, you’re training their A.I.)

So the audio by itself isn’t that useful. To learn, the A.I. needs the audio file, the original transcript AND the corrected transcript. So even if you don’t have the preference checked, it’s unlikely your audio file will be used for training.

This is great if you’re concerned about security BUT it’s less great if you really WANT the A.I. to learn. For example, I don’t know how many videos I’ve submitted over the last 3 years saying ‘Digital Anarchy’. And still to this day I get: Dugal Accusatorial (seriously), Digital Ariki, and other weird stuff. A.I. is great when it works, but sometimes… it definitely does not work. And people want to put this into self-driving cars? Crazy talk right there.

If you want to help the A.I. out, you can use the Speech-to-Text Glossary (click the link for a tutorial). This still won’t train the A.I., but if the A.I. is uncertain about a word, it’ll help it select the right one.

How does the Glossary work? The A.I. analyzes a word sound and then comes up with possible words for that sound. Each word gets a ‘confidence score’. The one with the highest score is the one you see in your transcript. In the case above, ‘Ariki’ might have had a confidence of .6 (on a scale of 0 to 1, so .6 is pretty low) and ‘Anarchy’ might have been .53. So my transcript showed Ariki. But if I’d put Anarchy into the Glossary, then the A.I. would have seen the low confidence score for Ariki and checked if the alternatives matched any Glossary terms.

So the Glossary can be very useful with proper names and the like.
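To make the confidence-score idea a bit more concrete, here’s a rough Python sketch of that lookup. This is NOT the actual code from Transcriptive or any of the A.I. services. The function name `pick_word`, the candidate list format, and the 0.7 ‘low confidence’ cutoff are all assumptions made up for illustration.

```python
# Hypothetical sketch only: not Transcriptive's or any A.I. service's real code.
LOW_CONFIDENCE = 0.7  # assumed cutoff; the real threshold isn't documented here


def pick_word(candidates, glossary, threshold=LOW_CONFIDENCE):
    """Choose one word from (word, confidence) candidates for a single word sound."""
    best_word, best_score = max(candidates, key=lambda c: c[1])
    if best_score >= threshold:
        # Confident enough: keep the top word and never consult the glossary.
        return best_word

    # Low confidence: see if any alternative matches a glossary term.
    glossary_lower = {term.lower() for term in glossary}
    matches = [c for c in candidates if c[0].lower() in glossary_lower]
    if matches:
        # Use the highest-scoring candidate that is also a glossary term.
        return max(matches, key=lambda c: c[1])[0]
    return best_word


candidates = [("Ariki", 0.6), ("Anarchy", 0.53)]
print(pick_word(candidates, glossary=[]))           # Ariki
print(pick_word(candidates, glossary=["Anarchy"]))  # Anarchy
```

With an empty glossary the top-scoring ‘Ariki’ wins. Once ‘Anarchy’ is in the glossary, the low score on ‘Ariki’ triggers the glossary check and ‘Anarchy’ is picked instead. If ‘Ariki’ had scored above the cutoff, the glossary would never be consulted.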

But, as mentioned, nothing you do in Transcriptive is training the A.I. The only thing we’re doing with your data is storing it and we’re not even doing that if you tell us not to.

It’s possible that we will add the option in the future to submit training data to help train the A.I. But that’ll be a specific feature and you’ll need to intentionally upload that data.

Improving Accuracy of A.I. Transcripts with Custom Vocabulary

The Glossary feature in Transcriptive is one way of increasing the accuracy of the transcripts generated by artificial intelligence services. The A.I. services can struggle with names of people or companies, and it’s a bit of a mixed bag with technical terms or industry jargon. If you have a video with names/words you think the A.I. will have a tough time with, you can enter them into the Glossary field to help the A.I. along.

For example, I grabbed this video of MLB’s top 30 draft picks in 2018:

Obviously there are a lot of names that need to be accurate, and since we know what they are, we can enter them into the Glossary.

Transcriptive's Glossary to add custom vocabulary

As the A.I. creates the transcript, words that sound similar to the names will usually be replaced with the Glossary terms. As always, the A.I. analyzes the sentence structure and makes a call on whether the word it initially came up with fits better in the sentence. So if the Glossary term is ‘Bohm’ and the sentence is ‘I was using a boom microphone’, it probably won’t replace the word. However, if the sentence is ‘The pick is Alex boom’, it will replace it, as the word ‘boom’ makes no sense in that sentence.

Here are the resulting transcripts as text files: Using the Glossary and Normal without Glossary

Here’s a short sample to give you an idea of the difference. Again, all we did was add in the last names to the Glossary (Mize, Bart, Bohm):

With the Glossary:

The Detroit Tigers select Casey Mize, a right handed pitcher. From Auburn University in Auburn, Alabama. With the second selection of the 2018 MLB draft, the San Francisco Giants select Joey Bart a catcher. A catcher from Georgia Tech in Atlanta, Georgia, with the third selection of a 2018 MLB draft. The Philadelphia Phillies select Alec Bohm, third baseman

Without the Glossary:

The Detroit Tigers select Casey Mys, a right handed pitcher. From Auburn University in Auburn, Alabama. With the second selection of the 2018 MLB draft, the San Francisco Giants select Joey Bahrke, a catcher. A catcher from Georgia Tech in Atlanta, Georgia, with the third selection of a 2018 MLB draft. The Philadelphia Phillies select Alec Bomb. A third baseman

As you can see it corrected the names it should have. If you have names or words that are repeated often in your video, the Glossary can really save you a lot of time fixing the transcript after you get it back. It can really improve the accuracy, so I recommend testing it out for yourself!

It’s also worth trying both Speechmatics and Transcriptive-A.I. Both are improved by the Glossary; however, Speechmatics seems to be a bit better with Glossary words. Since Transcriptive-A.I. is normally a bit more accurate, you’ll have to run a test or two to see which will work best for your video footage.

If you have any questions, feel free to hit us up at cs@nulldigitalanarchy.com!