Greetings, fellow Spiceheads. Does anyone have any recommendations for services that will transcribe in-person meetings and allow for speaker attribution, where we tell it at the end that a particular voice was so-and-so and it then updates the transcription everywhere? We still have a lot of meetings in our conference room, and most people just bring a notepad, so it’s not like everyone has their own laptop and joins a Teams meeting so we can record/transcribe that way. I know Microsoft Teams Rooms has Intelligent Speaker, which does nearly everything we want, but I don’t currently have a Microsoft Teams Rooms license or the equipment, and it seems a bit on the expensive side between license and equipment, so I was looking to see what other options might be out there. So far, my Google-fu is only pointing me to services for online meetings and not in-person ones.

6 Spice ups

I know Zoom has an AI transcription offering (not sure if it is in the unpaid version??). You could let it record the in-person meeting and transcribe it for you?

1 Spice up

I’ll have to look into this, but if it’s like Teams, everyone in the room just shows up as whoever actually joined the meeting, regardless of who’s actually talking. Just for testing purposes, we set up a “generic” user named CnfRm and assigned a Teams (standard, not Rooms) license to it, and if it joins the meeting, the transcription shows everyone in the room who speaks as “CnfRm” regardless of who’s talking. I do know Teams Rooms will specify the speakers as long as each one consents and has uploaded a voice profile, but I’m just wondering if there’s a less expensive option. I also saw that assigning Copilot licenses to users can allow for this as well, but that looked like (once again) each attendee would have to have a license and also join the meeting itself, so that’s a non-starter.

1 Spice up

Oh yeah, you’ll need to manually change who says what on the transcript, of course, as it’s just recording the meeting and converting speech to text.

Identifying individual speakers is going to be the biggest issue. It’s simple when they join the call, as the meeting recording will know who was speaking, but if it’s just a bunch of people in a room, then the difficulty goes way up, unless they have voice profiles. You could try to generate voice profiles from the recording itself, but audio quality isn’t guaranteed, and accuracy will suffer as a result. That’s why Teams wants a voice profile set up outside of a meeting.
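
To illustrate why voice profiles matter, here is a rough sketch of how matching a voice against enrolled profiles is commonly done: each voice is reduced to a numeric "embedding" vector, and a new audio segment is assigned to the closest profile. I don't know how Teams does this internally. The function names, the toy 3-number vectors, and the 0.75 threshold are all made up for illustration; a real system would get its embeddings from a speaker-recognition model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def identify_speaker(segment, profiles, threshold=0.75):
    """Pick the enrolled profile closest to the segment embedding.

    Returns "Unknown" when no profile is similar enough. The threshold
    is an illustrative assumption, not a recommended value.
    """
    best_name, best_score = None, -1.0
    for name, profile in profiles.items():
        score = cosine_similarity(segment, profile)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else "Unknown"


# Toy embeddings; real ones would come from a speaker-recognition model.
profiles = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}
print(identify_speaker([0.9, 0.1, 0.0], profiles))
print(identify_speaker([0.5, 0.5, 0.7], profiles))
```

A noisy conference-room recording produces noisier embeddings, which pushes scores toward the threshold. That is the accuracy problem with building profiles from the meeting audio itself.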

Sounds like you just need a straight-up transcription service that takes an audio recording and then gives you the transcript. After you get the transcript back, you’d probably have to be the ones to identify each speaker. Not sure if a transcription service would be willing to take on that part of your request - probably have to ask them.

2 Spice ups

Do you mean an outside service I would literally “ship” the recording off to, and they would send it back with a transcript? I didn’t even know such a thing existed. I’d be open to that, but was also wondering if there was some kind of AI out there that would do it automatically and “in-house”. Even if it just identified everyone as “Speaker 1”, “Speaker 2”, “Speaker 3”, etc., but allowed us to update it one time to say exactly who each speaker was, and it would auto-correct the names throughout the rest of the transcript, that would work, too.

1 Spice up

That was my initial thought.

After a bit of Googling, I believe the technical term you’re looking for is “Speaker Diarization”.

Google has “Cloud Speech-to-Text”, and Amazon has “Transcribe”. Both are cloud services.

For issues with machine diarization, see: Speaker Labels and Speaker Diarization Explained: How to Obtain and Use Them for Accurate Transcription. It also lists some diarization services that you can check out. Of the ones listed, I think I’ve only heard of Gladia before… or maybe I’m just confusing it with Stadia…

2 Spice ups

Microsoft AI Speech to Text can employ diarization to identify different speakers, but you would need to either train it or manually assign names to the Speaker1, Speaker2 labels in the text file. Just upload the recording of the meeting to it.
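
The manual relabeling step is easy to script once you know who each label is. Below is a minimal sketch, assuming the transcript is plain text where each line starts with a label such as "Speaker1:" or "Speaker 1:". That layout, the function name `rename_speakers`, and the sample names are assumptions for illustration, so check what your service actually exports.

```python
import re


def rename_speakers(transcript: str, names: dict[str, str]) -> str:
    """Replace generic speaker labels at the start of each line with real names."""
    # Matches "Speaker1:" or "Speaker 1:" only at the beginning of a line,
    # so the word "Speaker" inside someone's sentence is left alone.
    pattern = re.compile(r"^(Speaker ?\d+):", flags=re.MULTILINE)

    def swap(match: re.Match) -> str:
        label = match.group(1).replace(" ", "")  # normalize "Speaker 1" -> "Speaker1"
        # Labels without a mapping are kept unchanged.
        return f"{names.get(label, match.group(1))}:"

    return pattern.sub(swap, transcript)


transcript = (
    "Speaker1: Let's review the budget.\n"
    "Speaker2: I have the numbers.\n"
    "Speaker1: Great, go ahead.\n"
)
print(rename_speakers(transcript, {"Speaker1": "Alice", "Speaker2": "Bob"}))
```

You give it the mapping once and every occurrence of that label is updated throughout the transcript.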

2 Spice ups

We use Plaud AI. It’ll transcribe everything and also provide a summary of a meeting with action points and suggestions. We’ve been using it for a few months now and haven’t had any concerns with its outputs. Sometimes we have to amend the notes to name individuals if introductions weren’t completed at the start.

2 Spice ups

Fantastic!! I somehow never came across this and was focusing on “speaker attribution”, as that was the only term I’d discovered so far. This is giving me a plethora of new options to consider, and now I just have to wade through them all. Thanks so much!

2 Spice ups

This honestly looks really intriguing, especially when it comes to price! Teams Rooms almost seemed like overkill, since we’re talking maybe 2-3 meetings a month and don’t really care about the video part. I’ll be taking a close look at this; thanks!!

2 Spice ups

Well, this is yet another case where I’d love to see the return of “helpful answers” like the old platform had, but ultimately I can only assign a BA. @pgauldy, I greatly appreciate your suggestion, as it’s about the right price point for us right now. We might graduate to a more mature solution like Teams Rooms, but this will do nicely for now. Thanks to the rest who chimed in as well! I learned a little something from each of you!

2 Spice ups