Can we hear phase changes and small delays in the bass range, caused by misaligned subwoofers or room acoustics? Let us find out. This controlled experiment shows the facts about group delay.
Group delay at low frequencies
Room acoustics and the loudspeaker system cause timing to be far from perfect in the bass range. The room can be difficult to fix, but misaligned subwoofers are something we can do something about. A good bass system with proper calibration can give very reasonable performance by reducing most of the impact of the room and by time-aligning the main front speakers with the bass system.
Like the system in Room2, with 2x V110 subwoofers up front, and 2x V6030 in the back:
But is it audible? If we cannot hear it, the effort is wasted, and we could focus on just getting the frequency response smooth and flat, which is much easier.
So we set up a little experiment, which enables us to verify if group delay is actually audible.
Method
By applying signal processing to change phase and delay in a way similar to what is typically found in real-world systems, we can create sound samples with different group delay and compare those samples to the original. Then we can use ABX to verify audibility, to prove we actually hear what we think we hear.
I created two different signal processing filters, one with approximately 20ms group delay, one with more than 100ms.
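The exact filters used are not specified here. As a rough illustration only, one common way to add frequency-dependent group delay without changing the magnitude response is a second-order all-pass filter. The sketch below assumes the widely used "Audio EQ Cookbook" all-pass biquad; the centre frequency (50Hz) and Q (1.57) are hypothetical values, picked so that the delay near the centre frequency comes out at roughly 20ms.

```python
import cmath
import math


def allpass_biquad(f0, q, fs):
    """Second-order all-pass coefficients (Audio EQ Cookbook form), a[0] == 1."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    a1 = -2 * math.cos(w0) / a0
    a2 = (1 - alpha) / a0
    # Numerator is the denominator reversed, which makes |H| = 1 everywhere.
    return [a2, a1, 1.0], [1.0, a1, a2]


def response(b, a, f, fs):
    """Complex frequency response of the biquad at frequency f (Hz)."""
    z1 = cmath.exp(-1j * 2 * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den


def group_delay_ms(b, a, f, fs, df=0.01):
    """Group delay in ms: negative phase slope, by central difference."""
    # Phase of the ratio gives the phase difference without unwrapping.
    dphi = cmath.phase(response(b, a, f + df, fs) / response(b, a, f - df, fs))
    return -dphi / (2 * math.pi * 2 * df) * 1000.0


if __name__ == "__main__":
    fs = 48000
    b, a = allpass_biquad(50.0, 1.57, fs)  # hypothetical example values
    for f in (20, 30, 50, 80, 100, 200):
        gain_db = 20 * math.log10(abs(response(b, a, f, fs)))
        print(f"{f:4d} Hz  gain {gain_db:+.3f} dB  GD {group_delay_ms(b, a, f, fs):6.2f} ms")
```

A single all-pass section like this is only a stand-in: it has a perfectly flat magnitude response, whereas the filters used in this experiment show small frequency response deviations, so they were evidently built differently.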
Here is the GD of the 20ms processing:
The frequency response is reasonably flat, with deviations considered below the threshold of audibility:
The 100ms GD:
The frequency response of the 100ms filter is borderline, but with a transient music signal the deviation is not likely to be audible:
What can cause audible changes to an audio signal is group delay that varies across a limited frequency range. If the whole frequency range is delayed by the same amount, it does not matter, because it only means the music starts a little later. Here I have placed the highest peak in GD down in the mid-to-low bass, sliding down to zero in the 100-200Hz range. This focuses the phase and timing changes in the important mid and upper bass range, where attack and tactile information from drums and other transient sounds originate.
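For reference, this is the standard definition of group delay, written with conventional symbols that are not taken from the measurements above: phase response φ, angular frequency ω, and a constant delay T.

```latex
\tau_g(\omega) = -\frac{d\varphi(\omega)}{d\omega},
\qquad
\varphi(\omega) = -\omega T \;\Rightarrow\; \tau_g(\omega) = T
```

A pure delay has a phase that falls linearly with frequency, so its group delay is the same constant T at every frequency and the waveform shape is preserved. Only when the group delay differs between frequency ranges do the components of a transient arrive at different times.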
Music samples
The music is from The Flashbulb’s Réunion album, track Walking Irrevocable. 10-second excerpts of the original and of the 20ms and 100ms processed files can be found in the test signals folder:
Walking Irrevocable 10sec original.flac
Walking Irrevocable 10sec GD20ms.flac
Walking Irrevocable 10sec GD100ms.flac
The delay processing causes visual changes to the waveforms. One channel, for each of the tracks – original, 20ms, 100ms:
We see that the delay processing causes transients to be smeared out across a larger time interval, and the initial peak amplitude is reduced. Luckily, we don’t hear waveforms.
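To illustrate what "smearing" means, here is a minimal sketch under stated assumptions: a synthetic "kick drum" (a 60Hz sine with a 20ms decay, a made-up test signal) is passed through a hypothetical all-pass filter with roughly 20ms of group delay around 50Hz. This is not the processing used for the sample files. An all-pass filter leaves the total energy unchanged and only redistributes it in time, so any change in the waveform is purely a timing effect. Whether the peak drops depends on the signal and the filter, so the sketch prints both peaks rather than assuming a result.

```python
import math


def allpass_biquad(f0, q, fs):
    """Second-order all-pass coefficients (Audio EQ Cookbook form), a[0] == 1."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    a1 = -2 * math.cos(w0) / a0
    a2 = (1 - alpha) / a0
    return [a2, a1, 1.0], [1.0, a1, a2]


def biquad(b, a, x):
    """Filter the sequence x with a direct-form I biquad."""
    x1 = x2 = y1 = y2 = 0.0
    y = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1, y2, y1 = x1, xn, y1, yn
        y.append(yn)
    return y


FS = 8000  # low sample rate keeps the pure-Python loop fast

# Hypothetical test transient: 60 Hz sine with a 20 ms decay, one second long.
kick = [math.exp(-n / (0.02 * FS)) * math.sin(2 * math.pi * 60 * n / FS)
        for n in range(FS)]

b, a = allpass_biquad(50.0, 1.57, FS)  # hypothetical example values
smeared = biquad(b, a, kick)

energy_in = sum(v * v for v in kick)
energy_out = sum(v * v for v in smeared)
peak_in = max(abs(v) for v in kick)
peak_out = max(abs(v) for v in smeared)

print(f"energy ratio out/in: {energy_out / energy_in:.6f}")
print(f"peak in: {peak_in:.3f}  peak out: {peak_out:.3f}")
```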
ABX test
I started with the 100ms sample, first listening casually for differences. The differences are not huge if you do not know what to listen for, and at low volume they are quite difficult to hear, so I suspected the ABX would be challenging. At -10dB the differences are more obvious: transients sound more powerful on the original, and there is more realism.
I used the foobar2000 ABX component to verify that I can actually hear a difference, and this is the result:
foo_abx 2.0.6c report
foobar2000 v1.3.14
2019-08-19 22:39:51
File A: The Flashbulb - Réunion - 08 Walking Irrevocable.flac
SHA1: 1b0e6132f7504b151ca75c62681828a0c3114775
File B: The Flashbulb - Réunion - 08 Walking Irrevocable ph4.flac
SHA1: 3776df8e24abdea604d264366efa591bd45fa4c5
Output: DS : DENON-AVAMP (Intel SST Audio Device (WDM))
Crossfading: NO
22:39:51 : Test started.
22:47:46 : 01/01
22:48:53 : 02/02
22:49:46 : 03/03
22:51:18 : 04/04
22:52:27 : 05/05
22:53:33 : 06/06
22:55:38 : 07/07
22:58:06 : 08/08
22:59:55 : 09/09
23:00:47 : 10/10
23:00:47 : Test finished.
----------
Total: 10/10
p-value: 0.001 (0.1%)
-- signature --
32e562f14ad104061293122edb6a305755d287f1
The test shows I could hear the difference, with very high certainty.
I expected the 20ms sample to be very difficult, if not impossible, to distinguish from the original. But there was an audible difference in the attack of the transients, and a quick 8-trial ABX shows that I can very likely hear this and that it is not just imagination:
foo_abx 2.0.6c report
foobar2000 v1.3.14
2019-08-19 23:12:43
File A: The Flashbulb - Réunion - 08 Walking Irrevocable.flac
SHA1: 1b0e6132f7504b151ca75c62681828a0c3114775
File B: The Flashbulb - Réunion - 08 Walking Irrevocable ph.flac
SHA1: d7ba9144061af2155059235ec6af121204387561
Output: DS : DENON-AVAMP (Intel SST Audio Device (WDM))
Crossfading: NO
23:12:43 : Test started.
23:14:02 : 01/01
23:17:13 : 02/02
23:18:16 : 03/03
23:18:59 : 03/04
23:20:08 : 04/05
23:21:06 : 05/06
23:22:11 : 06/07
23:24:01 : 07/08
23:24:01 : Test finished.
----------
Total: 7/8
p-value: 0.0352 (3.52%)
-- signature --
d02a0431a802b778b4f4df6acfdb4ae77776545b
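The p-values in the two reports can be checked by hand. They match a one-sided binomial test: the probability of getting at least that many trials right by pure guessing, with a 50% chance per trial. The sketch below is a cross-check under that assumption, and the function name is made up for this example.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` hits in `trials` when guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


print(round(abx_p_value(10, 10), 4))  # 100ms test: 1/1024   -> 0.001
print(round(abx_p_value(7, 8), 4))    # 20ms test:  9/256    -> 0.0352
```

The single miss in the 8-trial run raises the p-value from 1/256 (about 0.4%) to 9/256 (about 3.5%), which shows how sensitive a short test is to one wrong answer.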
I listened to this in Room2, with master volume at -10dB. The group delay of this system is lower than that of the 20ms sample, and it can also play at significantly louder volumes, which may give even larger differences due to increased tactile feel. I did not test the samples on any system with poorer group delay performance, so it is an open question whether the group delay of the playback system itself will mask the delay in the 20ms or 100ms samples. If you cannot hear any difference, it is possible that your playback system is preventing you from hearing it.
Conclusion
The ABX test shows that group delay at lower frequencies is audible, and that the threshold for GD audibility is no higher than the 20ms tested here.
The difference between the 100ms and 20ms samples was not tested. This could provide information on whether it is the first 20ms that is most critical, or whether there is a more linear degradation with increasing delay.
—————————-
Update 20.08.2019: Spectrograms and subjective impressions from Room2
Spectrograms
To get a better perspective on what we are dealing with here, we can look at spectrogram charts, which give a good impression of what is going on in time across the frequency range.
All charts are 100ms 40dB range.
First, the system in Room2 for reference as an example of typical performance that is possible to achieve:
20ms GD:
100ms GD:
We see that the 20ms processing will change timing significantly in the important mid and upper bass frequency range.
Subjective impression from Room2
The ABX test verifies that I can hear a difference. Now I have listened in Room2 at different, louder listening levels, to learn more about how group delay affects sound quality.
First, I compare original to 20ms.
At 0dB, there is a difference in attack and solidity on bass transients. On the 20ms sample, different instruments sound more similar and softer. Tonal balance, definition and soundstage sound similar.
At +6dB the 20ms definitely misses out on that addictive physical feel on the drums. The original gets better and more addictive as you increase volume, while the 20ms sounds just louder, kind of like a typical hifi-system, just louder.
The 100ms sample may be even a little softer and loses a bit more attack, but the destruction of the addictive feel was already present in the 20ms sample. The bass does not sound delayed or resonant or boomy, even with this huge delay present.
Comparing 20ms to 100ms, there is a difference, which indicates it may be worth it to try to reduce GD from hopelessly far-off down to more reasonable levels.
If you can compare instantly, switching between original and delayed, the difference is quite significant, especially at louder volumes. The original simply sounds so much more addictive, powerful and realistic in tactile feel.
Update 21.08.2019: New sample files
5sec sample files
It is difficult to hear any difference using headphones, or on loudspeakers at low volume. It can also be very difficult to hear on loudspeakers if the response in the most critical range for this test is a bit untidy.
To make it easier, I have made some new, short samples. I managed to hear a difference here, using headphones. I ended up listening for the fifth drum stroke, which sounds more solid and tight on the original.
Start by comparing the original to the 100ms sample, since this difference is the easiest to hear.
Walking Irrevocable 5sec original
Walking Irrevocable 5sec GD 20ms
Walking Irrevocable 5sec GD 100ms