Evaluating Standard Noise Reduction in Shure Microflex Advanced (MXA) Series Microphones

Optimal spaces for conferencing, lectures, and other speech-focused activities are designed and maintained with background noise levels not exceeding NC 25-30. However, real-world rooms often present much higher noise levels, degrading audio capture quality and clarity. Shure IntelliMix DSP incorporates noise reduction algorithms to combat this issue, while Microflex Advanced (MXA) array microphones use beamforming technology to further isolate speech from ambient noise sources.

These recordings aim to evaluate the relative noise reduction performance and speech quality capture across various Noise Reduction settings (including no noise reduction) under a range of background noise levels. Each background noise level was introduced into the space with an attempt to roughly replicate the corresponding Noise Criterion (NC) level spectra. It's worth noting that these noise "curves" may be more heavily weighted towards higher frequencies compared to real-world noise spectra, but they nevertheless provide a systematic approach for making comparative assessments.

This conference room at Richard Dean Associate's Newburyport office is typical of 10-person conference rooms, though on the acoustically dry and quiet side, which makes it a reasonable baseline for comparison testing. To further simulate typical corporate conditions, acoustic panels were removed to slightly increase reverberation time (RT) and worsen flutter echo and coloration.

Physical Properties

Length: 22'
Width: 11'
Height: 7'-4"
Acoustic Treatment: None, removed for testing
Walls: Painted gypsum (acoustical panels removed)
Ceiling: Mineral fiber tile, ~1' plenum
Floor: Thin commercial carpet
Furniture: 10-person conference table, 110" DVLED panel, open-front credenza, leather chairs

Acoustic Properties

Reverb: RT = 0.3 s broadband (T30), midband C50 = 11.7 dB
Noise: 27.1 dBA, just above NC20. Very quiet except for the network switch fan.
Comments: There is noticeable flutter echo at some locations with treatments removed.
Acoustic Treatment: None, removed for testing

Acoustic Measurement

Platform: Studio Six Digital AudioTools, RTA and Impulse Response modules, run on an M1 iPad Pro
Microphone: Studio Six Digital iPrecision Mic (ANSI Type 1 small-diaphragm, omnidirectional condenser measurement microphone)
Calibrator: Larson Davis CAL200
Impulse Response: Captured via 18" balloon pop
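For readers who want to see how figures such as T30 and C50 follow from an impulse response, here is a minimal sketch. It is not the AudioTools implementation. It assumes a broadband (unfiltered) impulse response that has already been trimmed to start at the direct sound, uses Schroeder backward integration with a straight-line fit between -5 dB and -35 dB, and demonstrates the result on a synthetic decay rather than on the measured room. All function names are illustrative.

```python
import numpy as np

def schroeder_decay_db(h):
    """Backward-integrated energy decay curve in dB, 0 dB at the first sample."""
    energy = np.asarray(h, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]
    edc = np.maximum(edc, 1e-30)  # guard against log10(0) at the tail
    return 10.0 * np.log10(edc / edc[0])

def t30_seconds(h, fs):
    """Reverberation time extrapolated to 60 dB from the -5 dB to -35 dB span."""
    decay = schroeder_decay_db(h)
    t = np.arange(decay.size) / fs
    span = (decay <= -5.0) & (decay >= -35.0)
    slope, _ = np.polyfit(t[span], decay[span], 1)  # dB per second, negative
    return -60.0 / slope

def c50_db(h, fs):
    """Clarity index: energy in the first 50 ms over the energy after 50 ms."""
    energy = np.asarray(h, dtype=float) ** 2
    split = int(round(0.050 * fs))
    return 10.0 * np.log10(energy[:split].sum() / energy[split:].sum())

def synthetic_ir(rt60, fs, seconds=1.0, seed=0):
    """Hypothetical test signal: white noise with an ideal exponential decay."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(seconds * fs)) / fs
    return rng.standard_normal(t.size) * np.exp(-3.0 * np.log(10.0) * t / rt60)

fs = 48_000
h = synthetic_ir(rt60=0.3, fs=fs)
print(f"T30 ~ {t30_seconds(h, fs):.2f} s, C50 ~ {c50_db(h, fs):.1f} dB")
```

For an ideal exponential decay, C50 depends only on the reverberation time: C50 = 10 log10(e^(13.8 x 0.05 / RT) - 1), which is about 9.5 dB at RT = 0.3 s. A real room with a strong direct sound and early reflections can measure higher than this idealized figure.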

Pink noise was generated in a digital audio mixer and played through two studio monitors in opposite diagonal corners of the room to create a diffuse field. The noise was shaped in the mixer to roughly match NC contours.

Equipment and Processing

Noise Source: Digital audio mixer, pink noise generator
Processing: 1/3-octave graphic EQ
Loudspeakers: 5" 2-way, self-powered studio monitors
Loudspeaker Locations: Floor, facing corner, diagonally opposed room corners
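The relationship between an octave-band spectrum, its NC rating, and its A-weighted level can be sketched as follows. This is an illustration only, not the procedure used for these recordings. The NC contour values are the commonly tabulated octave-band levels and should be verified against ANSI/ASA S12.2 before being relied on. The rating uses the simple tangency method in 5-point steps, and the A-weighting corrections are the standard octave-band values.

```python
import math

BANDS_HZ = (63, 125, 250, 500, 1000, 2000, 4000, 8000)

# Commonly tabulated NC contour levels in dB SPL per octave band.
# Assumption: verify against ANSI/ASA S12.2 before relying on these.
NC_CURVES = {
    20: (51, 40, 33, 26, 22, 19, 17, 16),
    25: (54, 44, 37, 31, 27, 24, 22, 21),
    30: (57, 48, 41, 35, 31, 29, 28, 27),
    35: (60, 52, 45, 40, 36, 34, 33, 32),
    40: (64, 56, 50, 45, 41, 39, 38, 37),
    45: (67, 60, 54, 49, 46, 44, 43, 42),
    50: (71, 64, 58, 54, 51, 49, 48, 47),
    55: (74, 67, 62, 58, 56, 54, 53, 52),
}

# Standard A-weighting corrections at the octave-band centre frequencies.
A_WEIGHT_DB = (-26.2, -16.1, -8.6, -3.2, 0.0, 1.2, 1.0, -1.1)

def nc_rating(levels_db):
    """Lowest tabulated NC curve that no band exceeds; None if above NC-55."""
    for nc in sorted(NC_CURVES):
        if all(level <= limit for level, limit in zip(levels_db, NC_CURVES[nc])):
            return nc
    return None

def a_weighted_level(levels_db):
    """Overall A-weighted level (dBA) from octave-band levels in dB SPL."""
    total = sum(10 ** ((level + w) / 10) for level, w in zip(levels_db, A_WEIGHT_DB))
    return 10 * math.log10(total)

# Hypothetical spectrum: the NC-30 contour raised by 1 dB in every band.
example = [level + 1 for level in NC_CURVES[30]]
print(nc_rating(example), round(a_weighted_level(example), 1))
```

Because the NC rating is set by whichever band touches the highest contour, two spectra with the same NC rating can differ in dBA. A spectrum that follows the contour in every band reads higher in dBA than one that touches it in a single band.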

Recorded, dry, female speech was played back through a small loudspeaker at normal seated height and normal speech level exactly 10' from the center axis of the microphone. The loudspeaker was chosen to approximate the sound directivity of a human head. The distance was both a normal talker location in the room and at a typical boundary of recommended coverage area for the array in its automatic tracking mode.

Equipment and Processing

Speech Script: "Harvard Sentences" script from "IEEE Recommended Practice for Speech Quality Measurements," IEEE Transactions on Audio and Electroacoustics, Vol. 17, Issue 3, 225-246, 1969. Appendix C: 1965 Revised List of Phonetically Balanced Sentences (Harvard Sentences). DOI 10.1109/IEEESTD.1969.7405210 (IEEE Standard 297-1969) and 10.1109/TAU.1969.1162058
Recordings: G. E. Henter, T. Merritt, M. Shannon, C. Mayo, and S. King, “Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech,” in Proc. Interspeech, 2014. Silent audio truncated.
Playback Source: VLC media player to USB audio interface
Loudspeaker: 5", coaxial 2-way passive "cube" loudspeaker, low-Z
Loudspeaker Position: 48" AFF, 10' o/c of the medial axis of the mic array. Aimed toward center of table, no tilt.
Amplification: 15-watt class-D tabletop amplifier, low-Z
Playback Level: 60 dBA @ 1 m, 1 s ("slow") averaging
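As a rough worked example, the playback level and geometry above can be combined with the background noise levels reported for each recording condition to estimate the speech-to-noise ratio of the direct sound at the array. This is a back-of-envelope sketch under loud assumptions: free-field inverse-square spreading from the 1 m reference, the 10' offset read as a horizontal distance from the array's axis, the array at the 7'-4" ceiling height, and no allowance for reverberant energy, array directivity, or AGC. The names are illustrative.

```python
import math

FT_TO_M = 0.3048
IN_TO_M = 0.0254

def direct_level_db(level_at_ref_db, distance_m, ref_distance_m=1.0):
    """Free-field inverse-square estimate of the direct sound level."""
    return level_at_ref_db - 20.0 * math.log10(distance_m / ref_distance_m)

# Geometry from the tables: loudspeaker at 48" AFF, array at 7'-4" (88"),
# 10' horizontal offset (assumed reading of "10' o/c of the medial axis").
horizontal_m = 10 * FT_TO_M
vertical_m = (88 - 48) * IN_TO_M
slant_m = math.hypot(horizontal_m, vertical_m)

speech_at_mic_dba = direct_level_db(60.0, slant_m)

# Background noise levels reported for each recording condition (dBA).
NOISE_DBA = {
    "~NC20 (baseline)": 27.1, "~NC25": 32.5, "~NC30": 36.4, "~NC35": 41.5,
    "~NC40": 46.2, "~NC45": 50.8, "~NC50": 56.1, "~NC55": 61.3,
}

snr_db = {label: speech_at_mic_dba - noise for label, noise in NOISE_DBA.items()}

print(f"Estimated direct speech level at array: {speech_at_mic_dba:.1f} dBA")
for label, snr in snr_db.items():
    print(f"{label:>17}: {snr:+.1f} dB")
```

Under these assumptions the estimated ratio falls to roughly 0 dB around the NC45 condition and turns negative above it.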

Recordings were captured using the Shure Microflex Advanced MXA920 ceiling microphone. The configuration detailed below was selected to offer a realistic and largely "default" set of circumstances.

Equipment and Processing

Microphone Mounting Height: 7'-4"
Array Coverage Configuration: Single dynamic auto-coverage area covering the entire room
Equalization (EQ): High-pass filter at 125 Hz, undefined slope (MXA920 onboard IntelliMix DSP)
Compression: None
Acoustic Echo Cancellation (AEC): Default (MXA920 onboard IntelliMix DSP)
Automatic Gain Control (AGC): On (MXA920 onboard IntelliMix DSP)
Noise Reduction (NR): Off, Low, Medium, and High settings as indicated per recording below (MXA920 onboard IntelliMix DSP)
Audio Signal Chain: MXA920 Dante out to Dante Virtual Soundcard on PC. Audio recorded in a multitrack digital audio workstation (DAW).
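The assessments in this document are made by listening. Readers who want a rough number to go with them could estimate the residual noise floor of each captured track from the pauses between sentences and compare NR settings against the No NR track. The sketch below is one hypothetical way to do that, not the method used here. The 20 ms frame length and the 10th-percentile noise-floor estimate are arbitrary assumptions, the demo signals are synthetic, and with AGC enabled the level in pauses may not track the level during speech.

```python
import numpy as np

def frame_levels_dbfs(x, fs, frame_ms=20.0):
    """RMS level of consecutive non-overlapping frames, in dB re full scale."""
    n = int(fs * frame_ms / 1000.0)
    n_frames = len(x) // n
    frames = np.asarray(x[: n_frames * n], dtype=float).reshape(n_frames, n)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return 20.0 * np.log10(np.maximum(rms, 1e-12))

def noise_floor_dbfs(x, fs, percentile=10.0):
    """Low percentile of the frame levels, used as a noise-floor estimate."""
    return float(np.percentile(frame_levels_dbfs(x, fs), percentile))

def residual_noise_reduction_db(no_nr, with_nr, fs):
    """Drop in estimated noise floor relative to the No NR track."""
    return noise_floor_dbfs(no_nr, fs) - noise_floor_dbfs(with_nr, fs)

# Synthetic demo: a tone burst standing in for speech during half of each
# second, over noise that the second track attenuates by a factor of ten.
fs = 16_000
t = np.arange(4 * fs) / fs
rng = np.random.default_rng(0)
noise = 0.01 * rng.standard_normal(t.size)
speech = 0.3 * np.sin(2 * np.pi * 220 * t) * ((t % 1.0) >= 0.5)
track_no_nr = speech + noise
track_with_nr = speech + 0.1 * noise

print(f"{residual_noise_reduction_db(track_no_nr, track_with_nr, fs):.1f} dB")
```

A figure like this only describes how much the noise floor drops. It says nothing about processing artifacts or intelligibility, which still have to be judged by ear.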

Recordings

    Baseline - Approx. NC20

    Room Baseline Background Noise: 27.1 dBA

    No Additional Noise Added

    At its quietest, the only notable noise source is the fan of a network switch, which could not be disabled for this experiment. The overall noise rating is just above NC20 at 27.1 dBA.

    The noise source is more present in the recording than is noticed in real life. This is at least partially due to the mono nature of the MXA920 array's output.

    "Low" is likely the ideal NR selection, and Medium nearly eliminates all noise without meaningful sound quality degradation.

    Intelligibility is excellent at all NR settings, and even with NR off. Processing artifacts are minimal, even at High NR.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC25

    Approximately NC25 Background Noise: 32.5 dBA

    The noise source is more present in the recording than is noticed in real life. This is at least partially due to the mono nature of the MXA920 array's output.

    Intelligibility is excellent at all NR settings, and even with NR off. Without NR, however, the background noise is an annoyance.

    "Low" is likely the ideal NR selection, and Medium nearly eliminates all noise without meaningful sound quality degradation.

    Processing artifacts are minimal up to Medium NR. They are increasingly noticeable at High NR but are easily tolerable in conferencing, capture, and streaming applications.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC30

    Approximately NC30 Background Noise: 36.4 dBA

    This noise level is beyond typically specified ideals for conferencing applications but is nonetheless common in real life.

    Intelligibility is excellent at all NR settings. Without NR, the background noise is a significant annoyance.

    "Medium" is likely the ideal NR selection; it nearly eliminates all noise without meaningful sound quality degradation.

    Processing artifacts are minimal at Low NR, but background noise remains very present. Processing artifacts are increasingly noticeable at Medium NR but are easily tolerable in conferencing, capture, and streaming applications.

    Background noise is all but eliminated at High NR, with slightly worse processing artifacts. High NR is still very usable if complete background noise removal is absolutely necessary.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC35

    Approximately NC35 Background Noise: 41.5 dBA

    This noise level is well beyond typically specified ideals for conferencing applications but is nonetheless common in real life.

    Intelligibility is acceptable at all NR settings, though it begins to suffer at High NR. Without NR, the background noise is a significant annoyance.

    "Medium" is likely the ideal NR selection as residual noise level and sound quality are balanced.

    Processing artifacts are minimal at Low NR, but background noise remains very present. Processing artifacts are increasingly noticeable at Medium NR but likely remain entirely tolerable in conferencing, capture, and streaming applications.

    Residual background noise is similar between Medium and High NR settings, but processing artifacts are more noticeable at High.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC40

    Approximately NC40 Background Noise: 46.2 dBA

    This noise level is significantly beyond typically specified ideals for conferencing applications. The raw noise level will compete with direct sound for many in-room participants.

    Speech is intelligible at all NR settings but suffers from the significant noise.

    "High" is likely the ideal NR selection as residual noise level and sound quality are balanced.

    Processing artifacts are noticeable at all NR levels but remain usable in conferencing, capture, and streaming applications.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC45

    Approximately NC45 Background Noise: 50.8 dBA

    This noise level is the absolute upper limit of usability for conferencing applications. The raw noise level will compete with direct sound for most in-room participants and is generally unacceptable for any budget or use case.

    Speech is intelligible up to the Medium NR setting but suffers from the significant noise and processing artifacts.

    Surprisingly, "Low" is likely the ideal NR selection as residual noise level and sound quality are most balanced. Medium may be better in some applications, however, depending on the circumstances of far-end listeners.

    Processing artifacts are noticeable at all NR levels. At High NR, intelligibility suffers unacceptably.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC50

    Approximately NC50 Background Noise: 56.1 dBA

    This noise level is beyond the usable limit for most if not all applications. The raw noise level will compete with direct sound for all in-room participants, rendering the space generally unusable.

    Many syllables and vowel sounds are buried in the noise floor and are therefore suppressed by the NR algorithm. Speech is probably most intelligible without NR.

    The Medium NR setting is unusable, and High NR may be the best choice if absolutely necessary for a conferencing application. Near-end participants need to be instructed to speak loudly. For archival applications, no NR may be the best choice.

    Audio recordings: No NR, Low NR, Medium NR, High NR

    ~ NC55

    Approximately NC55 Background Noise: 61.3 dBA

    This noise level is beyond the usable limit for conferencing applications. The raw noise level will overpower direct sound for all in-room participants, rendering the space generally unusable.

    Few words and syllables can be discerned at any NR level, as all speech is buried below the noise floor and is therefore further suppressed by the NR algorithm.

    Audio recordings: No NR, Low NR, Medium NR, High NR