How Microsoft Teams uses AI and machine learning to improve calling and meetings

As schools and workplaces begin to resume in-person operations, we anticipate a permanent increase in the volume of online meetings and calls. And while communication and collaboration solutions have played a critical role in enabling continuity during these unprecedented times, early stress testing revealed opportunities to improve the quality of meetings and calls.

Disruptive echo effects, poor room acoustics, and choppy video are common issues that hamper effective online calls and meetings. Using artificial intelligence and machine learning, which have become fundamental to our continuous improvement strategy, we have identified and are now delivering innovative improvements in Microsoft Teams that address these audio and video challenges in a way that is both user-friendly and scalable across environments.

Today, we’re announcing the availability of new Teams features, including echo cancellation, sound tuning in poor acoustic environments, and the ability for users to speak and hear at the same time without interruption. These build on recently released AI-powered features such as extended background noise cancellation.

Voice quality improvements

Echo cancellation

During calls and meetings, when a participant has their microphone too close to their speaker, it’s common for sound to loop between input and output devices, causing an unwanted echo effect. Now Microsoft Teams uses AI to recognize the difference between sound from a speaker and the user’s voice, eliminating echo without suppressing speech or inhibiting the ability of multiple parties to speak at the same time.
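Teams’ echo cancellation is a proprietary AI model and is not published, but the classical baseline it builds on is the adaptive echo canceller: the far-end (loudspeaker) signal is passed through an adaptive filter that learns the echo path and subtracts its estimate from the microphone signal. The sketch below shows a normalized least-mean-squares (NLMS) canceller; the function name, filter length, and step size are illustrative, not Teams internals.

```python
import math
import random

def nlms_echo_cancel(far_end, mic, taps=32, mu=0.5, eps=1e-8):
    """Cancel far-end echo in the mic signal with an NLMS adaptive filter.

    far_end: loudspeaker (reference) samples
    mic:     microphone samples = echo(far_end) + near-end speech
    Returns the error signal, i.e. the mic signal with the echo removed.
    """
    w = [0.0] * taps    # adaptive filter coefficients (echo-path estimate)
    buf = [0.0] * taps  # most recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))  # echo estimate
        e = d - y                                   # echo-cancelled output
        norm = sum(xi * xi for xi in buf) + eps     # input power for NLMS step
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        out.append(e)
    return out

# Demo: the echo is the far-end signal delayed by 5 samples, attenuated by 0.6.
random.seed(0)
far = [random.uniform(-1, 1) for _ in range(4000)]
mic = [0.6 * far[n - 5] if n >= 5 else 0.0 for n in range(4000)]
res = nlms_echo_cancel(far, mic)

# Echo attenuation after the filter has converged (second half of the signal).
erle = sum(s * s for s in mic[2000:]) / (sum(s * s for s in res[2000:]) + 1e-12)
print(f"echo attenuation: {10 * math.log10(erle):.1f} dB")
```

The hard part, which the article’s AI angle addresses, is doing this without also suppressing the near-end talker when both parties speak at once.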

“De-reverb” adjusts for poor room acoustics

In certain environments, room acoustics can cause sound to bounce or resonate, making the user’s voice sound hollow, as if they were speaking in a cavern. For the first time, Microsoft Teams uses a machine learning model to convert captured audio to sound as if users were speaking into a short-range microphone.
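The underlying physics is well understood even though the de-reverb model itself is not public: reverberant audio is, to a good approximation, the dry voice convolved with the room’s impulse response, and the ML model learns to invert that effect. This toy sketch shows only the forward model, using an exponentially decaying impulse response as a crude stand-in for a real room; all parameter values are illustrative.

```python
import math

def reverberate(dry, rt60_s=0.6, sample_rate=8000, ir_len=2000):
    """Convolve a dry signal with a toy exponentially decaying impulse
    response -- a crude stand-in for a real room's acoustics.

    rt60_s is the reverberation time: how long the sound takes to
    decay by 60 dB after the source stops.
    """
    decay = math.log(10 ** (60 / 20)) / (rt60_s * sample_rate)
    ir = [math.exp(-decay * n) for n in range(ir_len)]
    wet = [0.0] * (len(dry) + ir_len - 1)
    for i, s in enumerate(dry):
        if s:  # sparse convolution: skip silent samples
            for j, h in enumerate(ir):
                wet[i + j] += s * h
    return wet

# A single clap (impulse) acquires a long decaying tail -- the "cavern" sound.
clap = [1.0] + [0.0] * 99
wet = reverberate(clap)
print(f"dry length: {len(clap)}, wet length: {len(wet)}")
```

A learned de-reverb model effectively estimates and removes that decaying tail, recovering something close to the dry, close-microphone signal.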

Interruptibility, for more natural conversations

A natural part of conversation is the ability to interrupt for clarification or validation. This is accomplished through full-duplex (two-way) audio transmission, allowing users to speak and hear others at the same time. When not using headphones, and especially when using devices where the speaker and microphone are very close to each other, it is difficult to suppress echo while retaining full-duplex audio. Microsoft Teams uses a model trained on 30,000 hours of speech samples to retain the voices you want while removing unwanted audio cues, resulting in smoother dialogue.
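The classical (pre-ML) way to preserve interruptibility is double-talk detection: the echo canceller freezes its adaptation, rather than suppressing audio, whenever the near-end talker is likely speaking over the far end. A well-known simple baseline is the Geigel detector, sketched below; it is not Teams’ trained model, and the threshold and window values are illustrative.

```python
def geigel_double_talk(far_end, mic, window=64, threshold=0.5):
    """Flag samples where near-end speech is likely present (double talk).

    Declares double talk when |mic| exceeds `threshold` times the peak
    far-end magnitude over the last `window` samples, so an echo canceller
    can freeze adaptation instead of suppressing the interrupting talker.
    """
    flags = []
    for n, d in enumerate(mic):
        recent = far_end[max(0, n - window + 1): n + 1]
        far_peak = max(abs(x) for x in recent)
        flags.append(abs(d) > threshold * far_peak if far_peak else abs(d) > 0)
    return flags

# Far end talks throughout; the near end interrupts halfway through.
far = [0.8] * 200
mic = [0.3] * 100 + [0.9] * 100  # echo only, then echo + near-end speech
flags = geigel_double_talk(far, mic)
print(f"double talk in first half:  {any(flags[:100])}")
print(f"double talk in second half: {all(flags[100:])}")
```

A trained model can make this decision far more robustly than a fixed level threshold, which is exactly where the 30,000 hours of speech data come in.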

Background noise suppression

Each of us has first-hand experience of a meeting disrupted by the unexpected sounds of a barking dog, a car alarm, or a slamming door. Over two years ago, we announced the release of AI-based noise cancellation in Microsoft Teams as an optional feature for Windows users. Since then, we have continued a cycle of iterative development, testing, and evaluation to further optimize our model. After seeing significant improvements in key user metrics, we’ve enabled machine learning-based noise cancellation by default for Teams clients on Windows (including Microsoft Teams Rooms), as well as for Mac and iOS users. A future release of this feature is planned for Teams Android and web clients.
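Teams’ noise cancellation is a learned model, but the idea it improves on can be shown with a minimal non-ML stand-in: a noise gate that tracks the quietest frame energy seen so far as a noise-floor estimate and attenuates frames close to that floor. Everything here (frame size, ratios, attenuation) is an illustrative assumption, not the Teams algorithm.

```python
def noise_gate(samples, frame=160, floor_ratio=2.0, atten=0.1):
    """Attenuate frames whose energy sits near the running noise floor.

    A crude, non-ML stand-in for learned noise suppression: the noise
    floor is tracked as the minimum frame energy seen so far, and frames
    below `floor_ratio` times that floor are scaled down by `atten`.
    """
    out = []
    noise_floor = None
    for start in range(0, len(samples), frame):
        chunk = samples[start:start + frame]
        energy = sum(s * s for s in chunk) / len(chunk)
        noise_floor = energy if noise_floor is None else min(noise_floor, energy)
        gain = atten if energy < floor_ratio * noise_floor else 1.0
        out.extend(s * gain for s in chunk)
    return out

# Quiet hiss for 10 frames, then loud speech-like frames.
hiss = [0.01] * 1600
speech = [0.5] * 1600
cleaned = noise_gate(hiss + speech)
print(f"hiss gain:   {cleaned[0] / 0.01:.2f}")
print(f"speech gain: {cleaned[-1] / 0.5:.2f}")
```

An energy gate like this fails on non-stationary sounds such as a barking dog, which is as loud as speech; distinguishing those from voices is precisely what the trained model adds.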

These AI-driven audio enhancements are rolling out and should be generally available in the coming months.

Video quality improvements

We also recently released breakthrough AI-powered video quality and screen sharing optimizations for Teams. From adjustments for low light to optimizations based on the type of content being shared, we now use AI to help you look your best.

Real-time screen optimization adapts to the content you share

The impact of presentations can often depend on an audience’s ability to read on-screen text or watch a shared video. But different types of shared content require varying approaches to ensure the best video quality, especially under bandwidth constraints. Teams now uses machine learning to detect and adjust the characteristics of the content presented in real time, optimizing the readability of documents or the fluidity of video playback.
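One simple signal such a detector can use is inter-frame motion: a slide deck or document barely changes between frames, while video playback changes constantly. The sketch below classifies shared content on that basis; the function, the 0.05 threshold, and the two-way split are illustrative assumptions, not the Teams model.

```python
import random

def classify_shared_content(frames, motion_threshold=0.05):
    """Guess whether shared content is static (text/slides) or motion video.

    Computes the mean absolute difference between consecutive frames
    (lists of grayscale pixel values in [0, 1]); heavy motion suggests
    video playback, little motion suggests a document or slide deck.
    """
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur))
    motion = sum(diffs) / len(diffs)
    return "video" if motion > motion_threshold else "text"

# A slide deck barely changes between frames; video changes constantly.
random.seed(1)
slides = [[0.9] * 64 for _ in range(10)]
video = [[random.random() for _ in range(64)] for _ in range(10)]
print(classify_shared_content(slides))  # text
print(classify_shared_content(video))   # video
```

Once the content type is known, the encoder can spend its bit budget differently: sharpness for text, frame rate for video.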

AI-powered optimization ensures your video looks great even under bandwidth constraints

Unexpected issues with network bandwidth can result in choppy video that quickly pulls attention away from your presentation. AI-driven optimizations in Teams help adjust playback under challenging bandwidth conditions, so presenters can use video and screen sharing without worry.
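A simplified way to picture this kind of adaptation is an encoding ladder: given an estimated bandwidth and the detected content type, pick the resolution/frame-rate rung that fits. The ladder values and function below are purely illustrative; Teams’ real adaptation is continuous and model-driven rather than a fixed table.

```python
def pick_encoding(bandwidth_kbps, content_type):
    """Choose a resolution/frame-rate rung for the estimated bandwidth.

    For static text, spend bits on resolution (legibility); for video,
    spend them on frame rate (smoothness). Ladder values are illustrative.
    """
    ladders = {
        "text":  [(300, "720p@5fps"), (800, "1080p@5fps"), (1500, "1080p@15fps")],
        "video": [(300, "360p@30fps"), (800, "720p@30fps"), (1500, "1080p@30fps")],
    }
    choice = ladders[content_type][0][1]  # fall back to the lowest rung
    for min_kbps, rung in ladders[content_type]:
        if bandwidth_kbps >= min_kbps:
            choice = rung
    return choice

print(pick_encoding(1000, "text"))   # 1080p@5fps: sharp text, low frame rate
print(pick_encoding(1000, "video"))  # 720p@30fps: smooth motion over sharpness
print(pick_encoding(100, "video"))   # 360p@30fps: lowest rung under pressure
```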

Brightness and focus filters that put you in the best light

While you can’t always control the surrounding lighting for your meetings, new AI-powered filters in Teams give you the ability to adjust brightness and add soft focus with a simple toggle in your device’s settings, to better adapt to low-light environments.
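At its simplest, a brightness filter is a per-pixel tone adjustment applied to each video frame. The sketch below brightens a grayscale frame with a linear gain plus gamma correction; the gain and gamma values are illustrative assumptions, and the real Teams filter is adaptive and operates on full-color frames.

```python
def adjust_brightness(pixels, gain=1.3, gamma=0.8):
    """Brighten a grayscale image (pixel values in [0, 1]) for low light.

    Applies a linear gain followed by gamma correction (gamma < 1 lifts
    dark tones) and clamps the result back into [0, 1].
    """
    return [min(1.0, max(0.0, (p * gain) ** gamma)) for p in pixels]

dark_frame = [0.05, 0.10, 0.20, 0.40]  # an underexposed row of pixels
bright = adjust_brightness(dark_frame)
print([round(p, 3) for p in bright])
```

Soft focus works similarly as a per-frame operation, typically a mild blur blended back into the image to smooth fine detail.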

Image: Adjusting brightness and focus settings in Microsoft Teams, with a split-screen preview showing the brightness filter toggled on and off.

Microsoft Teams: Designed for clearer sound and fewer distractions

The past two years have clearly shown how important communication and collaboration platforms such as Microsoft Teams are to maintaining safe, connected and productive operations. Along with bringing new features and capabilities to Teams, we’ll continue to explore new ways to use technology to make online calling and meetings more natural, resilient, and efficient.

Visit the Tech Community Teams blog for more technical details on how we’re leveraging AI and machine learning to improve audio quality as well as optimize video and screen sharing in Microsoft Teams.

Sherry J. Basler