volume control - linear - pro-audio style

Ever thought about volume controls (VC)? 

...how they work? 
...digital volume controls in particular?

or... 

...do you simply slide your VCs up'n down enjoying the sound level getting up'n down !?!? 


Introduction


You'll find digital or software based volume controls on pretty much every device and app
nowadays. A few clicks (or swipes) up and the sound level goes up. A few clicks down and you'll dive into silence.
  
Fun Fact. Pretty much nobody knows what's happening under the hood.

If you start looking into digital volume controls from a software designer perspective, you'll realize the vast amount of creativity that exists out there (under the hoods). 

First of all. There is no standard. 

No standard, no guidance - something like that usually ends up in chaos. 
Everybody can do whatever he/she wants.  

And ... as a matter of fact ... everybody out there does whatever he/she wants.

You'll find all kinds of
  • scales (shapes)
  • number of steps per scale
  • step-widths
  • min-max values
  • curve shapes
  • precision (bit-depth)
  • algorithms
  • rounding and approximations
  • errors and flaws
  • locations (server, applications, kernel (driver)/HW, OS sound layer, remote devices, ...)

And all that needs to work somehow on all kinds of devices and systems, preferably all in sync.

What you'll realize is there's usually no direct relation to the actual sound level.  "80%" on a volume control. What does it mean? It actually usually doesn't have anything to do with the absolute sound level. Usually it just means 80% of that 0-100 app scale. 

Neither the app nor the user can know what's actually going on further downstream though. And you bet, there's a lot more going on further downstream. 

When contacting Paul from the piCorePlayer team about a squeezelite related volume control subject, which actually triggered me to start this little project, his response was:  "...you don't want to open that can of worms...". 

I did open that can.

As usual the article takes the Logitech Media Server (LMS)/squeezelite/control app universe and its related digital volume controls as the base for discussion. 

However. Many of the generic findings do also apply to other environments. 

When bringing up the volume control related subject - it's actually been an issue - on the squeezelite main thread I had the impression that simply nobody was interested. 

I quickly ran into 3 issues (software flaws) while having a closer look. The two LMS volume-control related trouble tickets (1, 2) I issued months ago are still sitting idle.

Only Allo responded (after some weeks and a friendly reminder or two) after I reported my 3rd issue, a VC bug in the Katana driver, which I found while rewriting the squeezelite volume control. As a matter of fact Allo got that bug fixed and published not long after.

OK. A pretty challenging situation right at the beginning. But that happens all the time. 
If you dig a little further than just scratching the surface, you'll get yourself right into the mess.

Paul from PiCorePlayer wasn't wrong.

This article will not focus on analog vs. digital volume controls btw. These analog things are a dying art anyhow. (That'd be worth another article.)
 
I also won't focus on quantization errors - the number one weakness associated with digital volume controls - because most of today's 32bit or even 64bit volume controls feeding 32bit DACs exhibit very, very low losses in that area. Basically you can waste around 10bits ~ 60dB without touching the audible noisefloor limits on real world DAC HW out there. 
However. If you run a 16bit DAC, and I know some people still do, you'll face serious issues - quality loss - by using digital volume controls! Luckily pretty much all modern DACs support 32bit transfers.
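
A quick sanity check of that "10bits ~ 60dB" rule of thumb. This is just a minimal sketch; the function names are mine. Each bit doubles the amplitude range, which is worth about 6.02dB.

```python
import math

def bits_to_db(bits: float) -> float:
    """Amplitude range covered by `bits` bits, expressed in dB."""
    return 20.0 * math.log10(2.0 ** bits)

def db_to_bits(attenuation_db: float) -> float:
    """Number of bits of resolution given away by a digital attenuation."""
    return attenuation_db / bits_to_db(1)

print(round(bits_to_db(1), 2))   # 6.02
print(round(bits_to_db(10), 1))  # 60.2
print(round(db_to_bits(60), 2))  # 9.97
```

So 60dB of digital attenuation costs roughly 10 bits. On 32bit data that still leaves around 22 bits, on 16bit data only about 6.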


Keep in mind. That doesn't mean there'd be no need to properly match the output voltage on the DAC's analog side to the input voltage on the amp side. 
For one, a good DAC should offer a stepped adjustment of the analog output voltage, e.g. 1V, 2V, 4V (only very few do!). That'd give us even more headroom for the digital volume control. 
But that'd still be just the 2nd best option. 
We should actually keep the voltage towards the amp as high as possible or, to phrase it slightly differently, as close as possible to the maximum specified input voltage of the amp. 

This also means we preferably look for an amp with gain adjustment (not to be mixed up with a volume control pot!) on the amp's input stage.
Why is that? More and more people connect their DACs right to the amp. Pre-amps are pretty much obsolete for many people. There's a high risk associated with this. If you run your DAC at full speed into an amp, this can easily kill your speakers under certain conditions.

OK. That, a proper DAC <> amp integration, is actually another quite important subject. I won't get more into it at this stage.  I'm btw. running an amp that allows gain adjustment on its input stage.  That allowed me to dial in that DAC/Amp level nicely so that I can use my volume control scale close to its full range. Even at 100% or 0dBFS (FS = full scale) I won't blow up my speakers.


Now. The sound level is one thing. The shape of the volume control curve is another.


After all that initial digging and looking into the volume control abyss an idea developed. 

How great would it be to have a digital volume control where you exactly know what's going on. A certain value on your volume control apps would have a relevant meaning.

That's not such a wild dream btw. In the pro-audio arena it's a must. 

Bottom line. We look for a linear digital volume control. That's why we have to have a closer look at the DSP scales and the app-scale mappings in particular.


A Volume Control


What's actually expected of a simple volume control?  What do we actually want? 


Right. 

Full speed at 100% and Mute at 0. And Whatever in between.

That'll pretty much cover it. And that's what we actually get all over the place.

OK with that? 

Yep? ... Then that'll be the wrong project for you to read. ;)


Just kidding.


Let's take a pro-audio mixer as example. Professionals need to know and control exact signal levels to get the job done. And that's what pro-audio mixers and tools usually offer. 

Basically all sound levels are handled in dB. You tell the mixer to go 1dB down and the audio signal follows and goes exactly 1dB down.
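
For the record, here's what "1dB down" means for the signal itself. A minimal sketch using the standard amplitude relation gain = 10^(dB/20); the function names are just for illustration.

```python
import math

def db_to_gain(db: float) -> float:
    """Linear amplitude factor for a level change given in dB."""
    return 10.0 ** (db / 20.0)

def gain_to_db(gain: float) -> float:
    """Level change in dB for a linear amplitude factor (gain > 0)."""
    return 20.0 * math.log10(gain)

for db in (0, -1, -6, -20):
    print(db, "dB ->", round(db_to_gain(db), 4))
```

-1dB scales the amplitude by roughly 0.89, -6dB by roughly 0.5 and -20dB by exactly 0.1.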
 
To me that'd be my wish for a quality volume control in the consumer world.   


Let's discuss some basics first.

How is all this actually working!?!?

You move the volume control slider on your app up'n down. That slider has an upper and a lower limit, which define its range. Usually the app slider follows a linear scale like 100 >> 0, which makes a 101-step scale. 

Besides that, you can move through that range with a certain step-width. Mostly you'll find a step-width of 1. 

Once you're done sliding your volume control, you stop at a certain scale position with the control slider. 

That simple position value gets communicated to the next stage. What's done with that value 
depends on the environment. It differs all the time.

In our case the control app such as iPeng reports that value to LMS.

The next level (LMS) then takes that number and applies an algorithm to map that slider position to a certain sound-level curve. That sound-level curve can have any kind of shape. It basically depends on what the manufacturer/developer considers best for the application or device. That sound-level target curve could be linear or logarithmic. It could also have e.g. a limited range of 0 to -73dB and/or perhaps just 20 steps due to HW limitations. There are infinite scenarios out there. That's typically the black-hole part for the user.

This mapping task is actually the tricky part.
 
The curve designer needs to decide how to map e.g. the 101-step VC slider values to e.g. a 20-step scale on whatever curve shape. That's something you usually never get to see. 
Consequence: 10000 designers = 10000 solutions.
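
To make the mapping problem a bit more tangible, here's one out of those 10000 possible solutions. It's a purely hypothetical sketch: a 101-step slider mapped onto an assumed 20-step hardware scale covering 0 to -73dB in equal dB steps. Neither the nearest-step rounding nor the equal spacing is taken from any real device.

```python
def slider_to_hw_step(pos: int, hw_steps: int = 20) -> int:
    """Nearest hardware step (0..hw_steps-1) for a slider position 0..100."""
    # integer arithmetic with +50 rounds to the nearest step
    return (pos * (hw_steps - 1) + 50) // 100

def hw_step_to_db(step: int, hw_steps: int = 20, min_db: float = -73.0) -> float:
    """Level of a hardware step, assuming equal dB spacing (an assumption)."""
    return min_db * (1 - step / (hw_steps - 1))

for pos in (100, 99, 98, 97, 50, 0):
    step = slider_to_hw_step(pos)
    print(pos, "->", step, round(hw_step_to_db(step), 1), "dB")
```

Several slider positions collapse into the same hardware step. The user moves the slider and nothing happens for some clicks, then the level suddenly jumps by almost 4dB.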
 
Within the LMS environment once the mixer scale to volume curve mapping is done things become much easier.  

The "final" stage - squeezelite - receives a number and knows what to do with it. 
Usually that number (attenuation factor) gets multiplied with the actual audio sample. 
The results then get rounded to 16, 24 or 32 bit samples, depending on what your DAC/driver supports.
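
That final stage can be sketched in a few lines. This is not the actual squeezelite code, just an illustration under simple assumptions: samples come in as floats between -1.0 and 1.0, and the output is a signed integer of the chosen bit depth.

```python
def apply_gain(samples, gain: float, bits: int):
    """Multiply samples (floats in -1.0..1.0) by gain, round to signed ints."""
    full_scale = 2 ** (bits - 1) - 1
    out = []
    for s in samples:
        v = round(s * gain * full_scale)
        # clip to the valid integer range of the target format
        v = max(-full_scale - 1, min(full_scale, v))
        out.append(v)
    return out

# a full-scale sample attenuated by -60dB (gain 0.001)
print(apply_gain([1.0], 0.001, 16))  # [33]
print(apply_gain([1.0], 0.001, 32))  # [2147484]
```

At -60dB a 16bit output has only 33 levels left above zero for the loudest possible sample. The 32bit output still has more than 2 million.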

Above reflects the LMS/squeezelite setup. Other setups will look different of course.

Generic advice:
Make sure to use the highest precision towards your DAC that's being supported. 
If your DAC supports 32bit data transfer and processing - use and configure 32bit! That'll reduce digital VC losses to a minimum.


As you hopefully can imagine now, there's a lot going on. And the whole thing gets even more complex if you add more stuff such as e.g. replaygain to the game. Some volume controls even add dither to cope with the digital processing associated losses. Asf. Asf. 


Let's have a little closer look at the whole thing and limit the scope by looking at the LMS/squeezelite implementation in particular. 

LMS-Squeezelite VC - AS-IS


Once more.  Squeezelite, the actual player or client, does not define the actual volume curve.
squeezelite just runs the "final" stage as described earlier. It simply multiplies a value that's 
supplied by LMS with the audio sample.

The actual VC curve and scale gets defined, calculated and applied on the Logitechmediaserver. 
Squeezelite just takes the LMS supplied level (gain) information and applies it to the audio data. A simple multiplication.  
It's done as late as possible in the processing chain, because otherwise you'd face a huge delay from moving your VC slider to its actual response. 

Now. What does the actual volume curve offered by LMS for squeezelite look like?

Note: LMS offers different curves for different players! All curves are hardcoded!


For squeezelite the curve looks like this:



What you see is a special logarithmic curve with two different slopes. The intention is to have a higher sensitivity in the upper (louder) part.

The three spikes you see in above curve reflect the LMS VC curve flaws (swapped&ambiguous values - remember the earlier mentioned LMS trouble tickets - I simply marked them this way). 

This LMS VC curve does not show your actual volume level. It simply reflects a conversion rule.

Let's get a bit technical. How does it actually work?

The audio app sends in a simple value, e.g. something between 0 and 100. 
LMS takes that value and converts it according to the underlying algorithm as reflected in the above curve.
squeezelite then gets a gain value between 65536 and 0 from LMS (strictly speaking 65536 = 2^16 needs 17 bits, so it's not a plain 16-bit value). Only 101 numbers out of this range of 65537 values are used. 
These values follow the above curve shape. squeezelite converts these values into normalized logarithmic values. 
These values get distributed over a 101-click scale.
65536 becomes 0dB = 100%. 
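
If you want to know what level such an LMS number stands for, it can be converted to dB. A minimal sketch, assuming the value acts as a plain linear amplitude factor with 65536 as unity; the function name is made up.

```python
import math

UNITY = 65536  # the value that corresponds to 0dB = 100%

def lms_gain_to_db(value: int) -> float:
    """dB level for a gain value, assuming value / 65536 is a linear factor."""
    if value <= 0:
        return float("-inf")  # 0 means mute
    return 20.0 * math.log10(value / UNITY)

print(lms_gain_to_db(65536))            # 0.0
print(round(lms_gain_to_db(32768), 2))  # -6.02
```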

LMS offers different VC scales for a small number of different streaming clients. 
These "players" define all the parameters towards the streaming client. 
squeezelite makes use of one of these "players". The player that's being used for squeezelite is called "Squeezebox2". It originates from ancient slim-devices times.

Unfortunately squeezelite doesn't have its own "player"-module on LMS. The "player"-module related parameters - such as the curve and related algorithms incl. related flaws - are hardcoded (compiled into these player-module binaries), thus can not be changed from the userspace. 

I still hope that one day we see a "squeezelite" player-plugin popping up on LMS offering a simple 101 step data mapping, so that the actual volume control algorithm can be implemented within squeezelite - the application. 


Anyhow. For now it means we (all the folks who are not compiling LMS manually) have to accept above LMS VC curve incl. flaws.


Now we come to the subject that actually started this exercise. 


 A linear volume control


My naive understanding of a linear volume control simply was and still is 1click = 1dB.  That's what I falsely assumed when reading the squeezelite help menu and stumbling over a linear volume control option.

I expected a linear function, in terms of a straight line in a semi-logarithmic graph:

>> 100 clicks on x and -100dB on y. 
>> 100=0dB; 99=-1dB; 98=-2dB; asf.

In other words. If iPeng/Orange Squeeze/Material Skin, asf. would e.g. show "94" on their 100-0 scale, I'd expect the DAC level to be "6dB" down.

I mean - squeezelite offers that feature called "linear" volume control. 


FFW. Of course this feature wasn't linear the way I thought it was. I figured that out later on.
It simply can't be. Meanwhile you know why!  Just have a look at the above LMS curve!
And as mentioned, there's no other algorithm or curve applied inside squeezelite.

I then looked at the DAC side. 

Many DACs offer a very capable and sometimes even preferable "hardware" volume control, such as the modern Sabre ES9038 DAC family.  I noticed people get confused by the term HW volume control. In an estimated 99% of all cases it's actually NOT a "hardware" volume control we're talking about, like an Alps potentiometer which directly affects the analog audio signal. Nope, it's actually a software volume control located on the DAC "hardware". It's basically another, usually a 2nd SW volume control. 
You've basically got one SW volume control inside squeezelite and another SW volume control on your DAC. The good part. squeezelite offers access to either of them - as long as the audio device driver maps the external VC back into the OS.

The hardware volume control is made available via the Linux kernel - or more precisely - through the audio interface drivers. 
Linux/Alsa (Alsa is the Linux sound layer) offers an 8-bit range for mapping an external volume control scale. That means we'll have 256 values (max) at hand to implement a "HW" VC scale. With this limited number of steps though we can only mimic a linear mapping (or scale) of course!

As mentioned, all VC curves are logarithmic. That means, the DAC designer needs to introduce a  logarithmic >> linear scale conversion on his audio device to get these 256 values mapped to a certain logarithmic (dB) value. 
That sounds like a good start. Building a linear VC externally on the DAC side would then be possible.

How these 8 bits actually get mapped to the DAC solely depends on the HW (driver) designer.
The associated "creativity" leads to all kinds of scales. For the same DAC chip, e.g. ESS Sabre 9038Q2M, you'll find all kinds of implementations.
E.g. the Audiophonics 9038Q2M driver offers 101 steps at 1dB/step. For the same DAC Allo offers 256 steps at 0.5dB/step.
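
Here's what that means for an application that wants to set a certain dB level on such a "HW" volume control. A minimal sketch using the step counts and step-widths from the two examples above. That the highest control value equals 0dB is my assumption for illustration; a real driver may define its scale differently.

```python
def db_to_hw_value(db: float, steps: int, step_db: float) -> int:
    """Control value for a target level, assuming the top value is 0dB."""
    max_value = steps - 1
    value = max_value + round(db / step_db)
    # clamp to what the control can actually represent
    return max(0, min(max_value, value))

# same target level, two different scales for the same DAC chip
print(db_to_hw_value(-6.0, steps=101, step_db=1.0))  # 94
print(db_to_hw_value(-6.0, steps=256, step_db=0.5))  # 243
```

Same chip, same -6dB, two completely different numbers. The application has to know the scale of each and every driver.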

That means the application developer respectively the user has to cope with that kind of creative freedom out there.

What's bugging me. The application designers usually do not care (or do not even understand!?!?) how the users are affected.
I tried to talk to Allo about it. I asked them who'd need -128dB if nobody would hear any useful sound below -80dB. It is also common sense that a 1dB step-width is more than sufficient. Why then introduce a 0.5dB step-width!?!? It remains Allo's secret how they got there.

All that leaves the user stuck with the chaos out there. 

And now comes the challenge.

To be able to enjoy a linear scale (1click = 1dB ), the audio application designer would have to offer a very simple linear scale, which would lead to e.g. 100=0dB, 99=-1dB,  88=-12dB asf.  
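
That wished-for scale is about as simple as it gets. A minimal sketch of the 1click = 1dB idea; treating position 0 as -100dB instead of a hard mute is my simplification.

```python
def click_to_db(pos: int) -> float:
    """dB level for a slider position 0..100 at 1dB per click."""
    if not 0 <= pos <= 100:
        raise ValueError("slider position must be within 0..100")
    return float(pos - 100)

def click_to_gain(pos: int) -> float:
    """Linear amplitude factor for a slider position."""
    return 10.0 ** (click_to_db(pos) / 20.0)

for pos in (100, 99, 94, 88, 40):
    print(pos, "->", click_to_db(pos), "dB, factor", round(click_to_gain(pos), 4))
```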

Are you still with me??


The bad - meanwhile old - news. That linear scale exists neither on LMS nor on squeezelite. 

Logarithmic scales - Why?


Pretty much all VCs are logarithmic. That's done for a reason.  On loud sounds we are less sensitive, on low sounds we're much more sensitive. Our hearing follows a roughly logarithmic scale. Logarithmic VCs basically mimic our hearing sensitivity scale. Using linear scales for the sound level simply 

  1. wouldn't make sense and 
  2. wouldn't work satisfactorily


Either you'd need a huge number of steps to cover the audible range on a linear scale. Or, if you limit the number of steps, you'd need to accept anything but a smooth VC, with big jumps on the way up'n down...
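
The "big jumps" are easy to show with numbers. A small sketch, assuming a 0..100 slider where the position is used directly as a linear amplitude percentage (the scale that doesn't work). It prints how many dB a single click is worth at different positions.

```python
import math

def linear_step_db(pos: int) -> float:
    """dB change when going from pos down to pos-1 on a linear amplitude scale."""
    if pos < 2:
        raise ValueError("the last click down to 0 is minus infinity dB")
    return 20.0 * math.log10(pos / (pos - 1))

for pos in (100, 50, 10, 2):
    print(pos, "->", pos - 1, ":", round(linear_step_db(pos), 2), "dB")
```

At the top one click is worth less than 0.1dB, which you won't notice. At the bottom one click is worth about 6dB.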


Technical implementation

A technical SW VC slider or switch usually offers a certain number of steps. 0,1,2,3,4 ... 100. Such a scale obviously reflects a linear scale.

Now we've got that logarithmic hearing scale. How do we get both scales in sync?

We need to map them! 

Right. Looks trivial - doesn't it?  Case closed! 


Wait a minute! How exactly is this mapping gonna work !?!? 

Trivial???

The logic. Yep.
The implementation. Oh boy! 




Each VC step on the linear scale gets mapped to a 1dB step.  

Let's have a look at a real-world example. 

1.
iPeng, Orange Squeeze, Material Skin, asf. usually offer a 101 step (100-0) linear scale. Note: It's the app that offers that linear scale!

2.
LMS converts the app scale to the volume scale and tells squeezelite what to apply. LMS provides 101 different values.


3.
squeezelite applies one of these 101 values to the audio signal.



I realized I had to rewrite squeezelite to get the situation under control. 

The idea. 

Make squeezelite the entity that controls the volume curve by taking the LMS volume curve out of the loop.

I did it by introducing a mapping function with mapping tables.

Each of the 101 different volume values sent by the Logitech Media Server gets mapped to a unique value of my own linear dB-scale curve. The LMS supplied value won't reach the audio samples anymore.  

E.g. if your app now shows "94" you'll have "-6dB", or roughly "50% DAC output voltage".
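
The basic idea of such a mapping table can be sketched like this. It is NOT the code from my squeezelite fork and NOT the real LMS table. The LMS curve below is an invented placeholder (0.5dB per step), just to have 101 distinct incoming values to work with, and all names are hypothetical.

```python
def placeholder_lms_curve():
    """Invented stand-in for the 101 values LMS sends (not the real curve)."""
    return [round(65536 * 10 ** (-(100 - pos) * 0.5 / 20)) for pos in range(101)]

def build_mapping(lms_values):
    """Map each incoming LMS value to a linear gain at 1dB per click."""
    if len(set(lms_values)) != len(lms_values):
        # swapped or ambiguous values would need special handling
        raise ValueError("ambiguous LMS values, no unique mapping possible")
    return {value: 10 ** ((pos - 100) / 20) for pos, value in enumerate(lms_values)}

curve = placeholder_lms_curve()
table = build_mapping(curve)

# slider at 94: the LMS value only serves as lookup key, the gain is -6dB
print(round(table[curve[94]], 4))
```

The LMS value is only used to find out which slider position was meant. This sketch simply refuses to work on ambiguous input values, which is a simplification.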


That'll finally make the LMS volume curve irrelevant and my volume control linear.


By introducing the mapping I also managed to somehow limit the impact of the earlier mentioned LMS related volume curve issues. 


I've now added two new features to squeezelite. 

  • a linear internal  VC  - 1dB/step  *** option "-Y" 
  • a linear external VC  - 1dB/step  *** option "-X"
You can also run my squeezelite version without these features enabled to simply see how it compares.

I haven't tried to get these new features introduced to the official squeezelite development tree.

For now I make it accessible through my own squeezelite fork.  I usually try to keep that fork in sync with the squeezelite main tree.



Wrap Up

What's been done/accomplished after all:

  • Introducing a linear volume control option for squeezelite
  • I had a deep and scary look into the digital volume control jungle
  • I've written a completely new way of handling a linear dB-scale properly for squeezelite
  • two trouble tickets have been issued towards LMS (I do not expect much movement on that side) - the related issues are still open
  • Allo got the Katana driver fixed
  • Shared all that with you folks out there

That new linear scale IMO also feels really nice (impact/click) when going up'n down the scale. Things get predictable.

You also get a very good feeling for how far down you are with the signal when approaching the digital loss threshold - remember, that's roughly -60dB on 32bit data!

If you're a hardware focused fellow, you now also know that at e.g. 94 = -6dB your DAC output level should be down at roughly 50% voltage.  

You can call the whole approach WYSIWYG for audio.

I've been testing my solution on 4 different HAT DACs and some USB DACs I have around here. So far it works really well.
   
In my "The Engine" article I show how to get my own fork of squeezelite (beta), which includes these features, installed on piCorePlayer.


I'd love to hear your opinion about the Pro-Audio style linear volume control. Just leave a comment.


Enjoy.

2 comments:

  1. Hi Klaus,
    thanks a lot for sharing this! I' thinking about getting an Audiophonics Evo Sabre (equipped with a ES9038Q2M) and a power amp (Box'em Arthur). I'd like to use picoreplayer/squeezelite and make use of the volume control inside the dac. I do have some concerns about this, exspecially how I could prevent accidently setting to high volume levels. Do you know if there is a way (an alsa setting? audio.conf?) to limit the volume range?
    Thx again
    Paul

    Replies
    1. I am not aware of any feature that would limit the output level. Even if such a feature would be programmed - could easily be done - in case of errors you could still see 0dB on the output towards the DAC. If you are worried, you won't get around using a VC on your amp or your DAC.
