I’ve found that MS Teams has the ability to create voice/face profiles - though it appears these are really only usable when in a room setup with Teams Rooms.

So Teams won’t work for an on-the-fly conference from, say, a laptop shared by multiple people, with Teams recognizing each speaker and attributing their words correctly?

The main goal is transcription of the meeting with correct attribution to the speaker.


As I understand it, if each attendee says their own name, Teams knows who is who. I’ve never tested it, however…

From what I understand, it works like this, and it makes sense to me now.

In a previous meeting, people briefly introduced and described themselves instead of being on camera. My suspicion is that this is enough to help the AI detect who is speaking (voice recognition) and transcribe the meeting well enough for everyone to either read it back or play back the audio.

Never used it myself, though.

Interesting - I’m going to have to test this… though I’m sure my docs won’t want to do a round-the-table introduction every time they have a meeting…

In this case - was everyone in the same room? Or was each person behind their own screen?

It wasn’t a meeting I was party to, but this is how it was explained to me, as the person in the meeting found it odd that people were describing themselves.

Is this any use?

Use Microsoft Teams Intelligent Speakers to identify in-room participants in a meeting transcription - Microsoft Support

  1. Speaker Attribution: Teams can attribute spoken words to specific participants using AI-driven speech recognition. This helps in identifying who is speaking during the meeting.
  2. Intelligent Speakers: Microsoft Teams supports intelligent speakers that can recognize and differentiate between up to 10 different voices in a room, enhancing transcription accuracy.
  3. Voice Signatures: For more advanced setups, you can create voice signatures for each participant using the Azure Speech service, allowing for even more precise identification and transcription.

It’s things like this that make me wonder if Teams Rooms is really required for the speaker recognition to function.

The voice and face profiles features of Microsoft Teams are primarily used in meeting rooms equipped with Teams Rooms devices. These profiles help improve audio quality and user experience, especially in meeting transcription and speaker identification.

For laptops shared by multiple people, Teams does not currently support dynamically identifying and tagging multiple speakers on the fly. Voice recognition and speaker tagging rely primarily on Intelligent Speaker devices in Teams Rooms.

If your organization’s Microsoft Teams Rooms are equipped with Intelligent Speakers, you can hold meetings where in-room participants can be identified in live transcription. During the meeting, all participants can easily see who’s saying what, and the post-meeting transcript identifies both remote and in-room attendees, except any who choose not to be identified. Without speaker recognition, audio will be attributed to the room in AI notes. For more details, see Use Microsoft Teams Intelligent Speakers to identify in-room participants in a meeting transcription.

If your primary goal is to transcribe the meeting and correctly attribute the speaker, you can consider manual speaker tagging: manually annotate each speaker’s speech during the transcription process. This requires more work, but ensures accuracy.
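
If it helps, here is a rough sketch of what manual tagging could look like after the meeting. It assumes you have the transcript as WebVTT-style text with `<v Name>` voice tags, and that unrecognized in-room audio is attributed to a single room label. The label "Conference Room A", the speaker names, and the `retag_speakers` helper are all made up for illustration; this is not a Teams feature or API.

```python
import re

# Matches a WebVTT voice tag such as "<v Conference Room A>Hello</v>".
# Assumption: each cue's text sits on one line inside a single voice tag.
VOICE_TAG = re.compile(r"<v ([^>]+)>(.*?)</v>")

def retag_speakers(vtt_text, room_label, corrections):
    """Replace room-attributed cues with manually chosen speaker names.

    corrections maps the 0-based index of each room-attributed cue
    (counted in order of appearance) to the real speaker's name.
    Room cues without a correction keep the room label.
    """
    room_cue_index = 0

    def replace(match):
        nonlocal room_cue_index
        speaker, text = match.group(1), match.group(2)
        if speaker == room_label:
            speaker = corrections.get(room_cue_index, room_label)
            room_cue_index += 1
        return f"<v {speaker}>{text}</v>"

    return VOICE_TAG.sub(replace, vtt_text)

# Hypothetical transcript: two cues attributed to the room, one to a remote user.
sample = """WEBVTT

00:00:01.000 --> 00:00:04.000
<v Conference Room A>Let's review the schedule.</v>

00:00:04.500 --> 00:00:07.000
<v Remote User>Sounds good.</v>

00:00:07.500 --> 00:00:10.000
<v Conference Room A>I have one concern.</v>
"""

# The reviewer listens back and decides who said each room-attributed cue.
retagged = retag_speakers(sample, "Conference Room A", {0: "Dr. Adams", 1: "Dr. Baker"})
print(retagged)
```

Someone still has to listen back and decide who said what, so this only saves the typing, not the review.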
