Audibility of group delay at low frequencies

Can we hear phase changes and small delays in the bass range, caused by misaligned subwoofers or room acoustics? Let us find out. This controlled experiment shows the facts about group delay.

Group delay at low frequencies

Room acoustics and the loudspeaker system cause timing in the bass range to be far from perfect. The room can be difficult to fix, but we can do something about misaligned subwoofers. A good bass system with proper calibration can give very reasonable performance, by reducing most of the impact from the room and by time-aligning the main front speakers with the bass system.

Like the system in Room2, with 2x V110 subwoofers up front, and 2x V6030 in the back:

GD in Room2.

But is it audible? If we cannot hear it, the effort is wasted, and we could focus on just getting the frequency response smooth and flat, which is much easier.

So we set up a little experiment, which enables us to verify if group delay is actually audible.

Method

By applying signal processing to change phase and delay in a way similar to what is typically found in real-world systems, we can create sound samples with different group delay and compare those samples to the original. Then we can use ABX to verify audibility, to prove we actually hear what we think we hear.

I created two different signal processing filters, one with approximately 20ms group delay, one with more than 100ms.
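The exact filters behind these samples are not described here. As a rough illustration only, one common way to add group delay without changing the magnitude response is a second-order all-pass section. The sketch below assumes the widely used Audio EQ Cookbook all-pass form, a 48 kHz sample rate, a 40 Hz centre frequency and Q = 1.26. All of these are illustrative choices, not the values used for the 20ms or 100ms files.

```python
import cmath
import math


def allpass_biquad(f0, q, fs):
    """Second-order all-pass (Audio EQ Cookbook form); magnitude is 1 everywhere."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    c = -2.0 * math.cos(w0)
    b = (1.0 - alpha, c, 1.0 + alpha)  # numerator coefficients
    a = (1.0 + alpha, c, 1.0 - alpha)  # denominator = numerator reversed
    return b, a


def response(b, a, f, fs):
    """Complex frequency response of the biquad at frequency f (Hz)."""
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den


def group_delay_ms(b, a, f, fs, df=0.01):
    """Group delay = -d(phase)/d(omega), estimated with a central difference.

    The phase step is taken from the ratio of two responses, which avoids
    phase-wrapping problems around the centre frequency.
    """
    ratio = response(b, a, f + df, fs) / response(b, a, f - df, fs)
    return -cmath.phase(ratio) / (2.0 * math.pi * 2.0 * df) * 1000.0


FS = 48000.0  # assumed sample rate
F0 = 40.0     # assumed centre frequency of the delay peak
Q = 1.26      # assumed Q, chosen to give about 20 ms at F0

b, a = allpass_biquad(F0, Q, FS)
for f in (20, 30, 40, 60, 100, 200, 1000):
    h = response(b, a, f, FS)
    gd = group_delay_ms(b, a, f, FS)
    print(f"{f:5d} Hz  |H| = {abs(h):.4f}  GD = {gd:6.2f} ms")
```

With these assumed values the delay is about 20ms at 40 Hz and falls to well under 1ms by 200 Hz, while the magnitude stays flat. The delay at the centre frequency scales with Q, so a higher Q or several cascaded sections gives a larger peak.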

Here is the GD of the 20ms processing:

GD 20ms processing.

The frequency response is reasonably flat, with deviations considered to be below the threshold of audibility:

Frequency response and phase 20ms processing.

The 100ms GD:

GD 100ms processing.

The frequency response of the 100ms processing is borderline, but with a transient music signal the deviation is not likely to be audible:

Frequency response and phase 100ms processing.

It is obvious that it is the change in group delay across a limited frequency range that can cause audible changes to an audio signal. If the whole frequency range is delayed, it would not matter, because it only means the music starts a little later. Here I have placed the highest peak in GD down in the mid-to-low bass, sliding down to zero in the 100-200Hz range. This causes phase and timing changes to be focused in the important mid and upper bass range, where attack and tactile information from drums and other transient sounds originates.
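For reference, group delay is the negative slope of the phase response over angular frequency. The symbols below (phi for the phase response, T for a constant delay) are notation chosen for this note, not something taken from the measurements. A pure delay has a phase that falls linearly with frequency, so its group delay is the same at every frequency and the signal is only shifted in time:

```latex
\tau_g(\omega) = -\frac{d\varphi(\omega)}{d\omega},
\qquad
\varphi(\omega) = -\omega T \;\Rightarrow\; \tau_g(\omega) = T \ \text{for all } \omega
```

Only the part of the group delay that varies with frequency changes the shape of the waveform.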

Music samples

The music is the track Walking Irrevocable from The Flashbulb’s album Réunion. A 10-second excerpt of the original and of the 20ms and 100ms processed versions can be found in the test signals folder:

Walking Irrevocable 10sec original.flac

Walking Irrevocable 10sec GD20ms.flac

Walking Irrevocable 10sec GD100ms.flac

The delay processing causes visual changes to the waveforms. One channel, for each of the tracks – original, 20ms, 100ms:

Sample music waveforms – original, 20ms, 100ms.

We see that the delay processing smears the transients out across a larger time interval, and the initial peak amplitude is reduced. Luckily, we don’t hear waveforms.
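To get a feel for this smearing without the music files, a synthetic transient can be passed through an all-pass filter. The sketch assumes a hypothetical kick-like test signal (a 50 Hz sine with a 30ms decay), a 48 kHz sample rate and an Audio EQ Cookbook all-pass at 40 Hz with Q = 1.26. None of these come from the actual processing. Because an all-pass only shifts phase, the total energy stays the same, so any change in peak level comes from the energy being redistributed in time.

```python
import math


def allpass_biquad(f0, q, fs):
    """Second-order all-pass (Audio EQ Cookbook form); magnitude is 1 everywhere."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    c = -2.0 * math.cos(w0)
    return (1.0 - alpha, c, 1.0 + alpha), (1.0 + alpha, c, 1.0 - alpha)


def filter_biquad(b, a, x):
    """Run samples x through the biquad (direct form I), returning a new list."""
    b0, b1, b2 = b[0] / a[0], b[1] / a[0], b[2] / a[0]
    a1, a2 = a[1] / a[0], a[2] / a[0]
    x1 = x2 = y1 = y2 = 0.0
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


FS = 48000  # assumed sample rate

# Hypothetical kick-like transient: 50 Hz sine with a 30 ms exponential decay,
# followed by enough silence (1 s total) for the filter tail to die out.
x = [math.exp(-n / (0.030 * FS)) * math.sin(2.0 * math.pi * 50.0 * n / FS)
     for n in range(FS)]

b, a = allpass_biquad(40.0, 1.26, FS)  # assumed filter parameters
y = filter_biquad(b, a, x)

peak_in = max(abs(v) for v in x)
peak_out = max(abs(v) for v in y)
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in y)

print(f"peak   in/out: {peak_in:.3f} / {peak_out:.3f}")
print(f"energy out/in: {energy_out / energy_in:.6f}")
```

Comparing the two printed peak values shows how much the transient is flattened for this particular test signal, while the energy ratio confirms that nothing was removed from the signal.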

ABX test

I started with the 100ms sample, first casually listening for differences. It is clear that the differences are not huge if you do not know what to listen for. I suspect the ABX will be challenging; at low volume it is quite difficult. At -10dB, the differences are more obvious: transients sound more powerful on the original, and there is more realism.

I used the foobar2000 ABX component (foo_abx) to verify that I can actually hear a difference, and this is the result:

foo_abx 2.0.6c report
foobar2000 v1.3.14
2019-08-19 22:39:51

File A: The Flashbulb - Réunion - 08 Walking Irrevocable.flac
SHA1: 1b0e6132f7504b151ca75c62681828a0c3114775
File B: The Flashbulb - Réunion - 08 Walking Irrevocable ph4.flac
SHA1: 3776df8e24abdea604d264366efa591bd45fa4c5

Output:
DS : DENON-AVAMP (Intel SST Audio Device (WDM))
Crossfading: NO

22:39:51 : Test started.
22:47:46 : 01/01
22:48:53 : 02/02
22:49:46 : 03/03
22:51:18 : 04/04
22:52:27 : 05/05
22:53:33 : 06/06
22:55:38 : 07/07
22:58:06 : 08/08
22:59:55 : 09/09
23:00:47 : 10/10
23:00:47 : Test finished.

----------
Total: 10/10
p-value: 0.001 (0.1%)

-- signature --
32e562f14ad104061293122edb6a305755d287f1

The test shows I could hear the difference, with very high certainty.

I expected the 20ms to be very difficult, if not impossible, to hear. But there was an audible difference in the attack on the transients, and a quick 8-trial ABX shows it is very likely that I can hear this and that it is not just imagination:

foo_abx 2.0.6c report
foobar2000 v1.3.14
2019-08-19 23:12:43

File A: The Flashbulb - Réunion - 08 Walking Irrevocable.flac
SHA1: 1b0e6132f7504b151ca75c62681828a0c3114775
File B: The Flashbulb - Réunion - 08 Walking Irrevocable ph.flac
SHA1: d7ba9144061af2155059235ec6af121204387561

Output:
DS : DENON-AVAMP (Intel SST Audio Device (WDM))
Crossfading: NO

23:12:43 : Test started.
23:14:02 : 01/01
23:17:13 : 02/02
23:18:16 : 03/03
23:18:59 : 03/04
23:20:08 : 04/05
23:21:06 : 05/06
23:22:11 : 06/07
23:24:01 : 07/08
23:24:01 : Test finished.

----------
Total: 7/8
p-value: 0.0352 (3.52%)

-- signature --
d02a0431a802b778b4f4df6acfdb4ae77776545b
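The p-values in the two reports can be checked by hand. They are the probability of getting at least that many correct answers by pure guessing, which is a one-sided binomial test with a 50% chance per trial. A minimal sketch (the function name is made up for this example):

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of at least `correct` hits in `trials` when guessing (p = 0.5)."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials


print(f"10/10: {abx_p_value(10, 10):.4f}")  # 0.0010
print(f" 7/8 : {abx_p_value(7, 8):.4f}")    # 0.0352
```

Both values match the reports above: 1 in 1024 for 10/10, and 9 in 256 for 7/8.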

I listened to this in Room2, with master volume at -10dB. The group delay of this system is lower than that of the 20ms sample, and I can also play at significantly louder volumes, which may give even larger differences due to increased tactile feel. I did not test the samples on any system with poorer group delay performance, so it remains open whether the group delay of the playback system itself will mask the delay in the 20ms or 100ms samples. If you cannot hear any difference, it is possible that the performance of your playback system prevents you from hearing it.

Conclusion

The ABX test shows that group delay at low frequencies is audible, and that the threshold for GD audibility is below the 20ms tested here.

The difference between the 100ms and 20ms samples was not tested. This could provide information on whether it is the first 20ms that is most critical, or whether there is a more linear degradation with increasing delay.

—————————-

Update 20.08.2019:  Spectrograms and subjective impressions from Room2

Spectrograms

To get a better perspective on what we are dealing with here, we can look at spectrogram charts. They give a good impression of what is going on in time across the frequency range.

All charts use the same 100ms/40dB scale.

First, the system in Room2 for reference as an example of typical performance that is possible to achieve:

Spectrum Room2, 100ms/40dB scale.

20ms GD:

Spectrum 20ms, 100ms/40dB scale.

100ms GD:

Spectrum 100ms, 100ms/40dB scale.

We see that the 20ms processing will change timing significantly in the important mid and upper bass frequency range.

Subjective impression from Room2

The ABX test verifies that I can hear a difference. I have now listened in Room2 at different, louder listening levels, to observe and learn more about how group delay affects sound quality.

First, I compare original to 20ms.

At 0dB, there is a difference in attack and solidity on bass transients. On the 20ms sample, different instruments sound more similar and softer. Tonal balance, definition and soundstage sound similar.

At +6dB the 20ms definitely misses out on that addictive physical feel on the drums. The original gets better and more addictive as you increase volume, while the 20ms sounds just louder, kind of like a typical hifi-system, just louder.

The 100ms may be even a little softer and loses a bit more attack, but the destruction of the addictive feel was already present in the 20ms. The bass does not sound delayed or resonant or boomy, even with this huge delay present.

Comparing 20ms to 100ms, there is a difference, which indicates it may be worth it to try to reduce GD from hopelessly far-off down to more reasonable levels.

If you can compare instantly, switching between original and delayed, the difference is quite significant, especially at louder volumes. The original simply sounds so much more addictive, powerful and realistic in tactile feel.


Update 21.08.2019: New sample files

5sec sample files

It is difficult to hear any difference using headphones, or on loudspeakers at low volume. It can also be very difficult to hear on loudspeakers if the response in the most critical range for this test is a bit untidy.

To make it easier, I have made some new, short samples. I managed to hear a difference here, using headphones. I ended up listening for the fifth drum stroke, which sounds more solid and tight on the original.

Start by comparing original to 100ms; this is the easiest to hear.

Walking Irrevocable 5sec original

Walking Irrevocable 5sec GD 20ms

Walking Irrevocable 5sec GD 100ms