Tuesday, April 10, 2018

SoX on steroids

The quest for better performance will never end.

This time it's SoX - "the Swiss Army knife of sound processing programs".

We use it, for example, on Logitech Media Server (LMS) for sample rate conversion (SRC).
I use it for highest-quality SRC to 352.8kHz/384kHz for one of the DACs I'm running.

Perhaps there's something to gain in that area!?!?




If you have followed earlier articles, you might have come across the part where we built a customized squeezelite binary. The basic idea was, and still is: the more closely a binary is tailored to the target hardware (CPU), the better it may perform.

Seen from the other side, it also means: the more generic a binary is made, the slower it might get.

The usual compiler on Linux systems is "gcc". It accepts numerous optimization options at compile time. The binary packages that you install
via package managers on your Linux system are precompiled with pretty generic options - to be on the safe side.
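
To see what "more specific" means on a given machine, gcc can be asked which CPU it would target when told to optimize for the local processor. This is only a sketch: "-march=native" is a common choice for CPU-specific builds and an assumption here, not necessarily what was used for the binary in this article. The function name show_native_target is made up for illustration.

```shell
# Print the -march/-mtune values gcc would pick for this machine.
# Assumption: -march=native stands in for "CPU-specific options".
show_native_target() {
    if ! command -v gcc >/dev/null 2>&1; then
        echo "gcc not installed"
        return 0
    fi
    # -Q --help=target lists the target options in effect for the given flags
    gcc -march=native -Q --help=target 2>/dev/null | grep -E -- '-m(arch|tune)=' \
        || echo "gcc did not report -march/-mtune for this target"
}

show_native_target
```

A distribution package is typically built for a generic baseline instead, so that it runs on every CPU of the architecture.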


There's another factor - related to LMS in particular. Pretty much every program uses libraries, and sox uses many of them.
You need pulseaudio, flac, ogg, alsa and more libraries to get sox compiled.

Then there's dynamic linking and static linking of these libraries.
With dynamic linking, the binary loads the shared libraries installed on the system at run time.
If you configure static linking during the compilation process, all library-related code is put into the target binary.

Consequence? You'll get a really bloated binary.


Here we are. 

The sox binary supplied with the Logitech Media Server package is
fat, really fat. And it's as generic as it can get. (E.g. LMS sox = 4.8MB, Ubuntu sox = 71kB.)
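
The size and linkage of a binary can be checked with standard tools. The helper below is a rough sketch; the name binary_info is made up, and it simply counts the "=>" lines that "ldd" prints for resolved shared libraries.

```shell
# Report the size of a binary and how many shared libraries ldd resolves for it.
binary_info() {
    size=$(wc -c < "$1" | tr -d ' ')
    if ldd "$1" >/dev/null 2>&1; then
        libs=$(ldd "$1" | grep -c '=>')
        echo "$1: $size bytes, $libs shared libraries resolved by ldd"
    else
        # ldd is missing, or the file is not a dynamic executable
        echo "$1: $size bytes, no shared library dependencies reported"
    fi
}
```

Running it as "binary_info /usr/bin/sox" and then on the sox file in the LMS Bin directory shows the difference: a binary with its codec libraries linked in statically is larger and lists fewer shared libraries.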


Consequence? Yep. It's potentially slow and potentially inefficient.

It's not a fault or a flaw though! Logitech does it to provide a stable platform.
Logitech can't rely on every OS providing the right libraries at the required revisions.
And they cannot provide a binary just for a Broadwell NUC. It has to cover the entire x64 family.

However, I couldn't care less. I'm digging for performance improvements. You can't leave any stone unturned if you want to succeed.


I compared three different sox binaries on my NUC server. Remember: that's the one that also runs Logitech Media Server.

1. LMS supplied sox (sox-lms) - nightly LMS 7.9 -- Apr-10th-2018  
2. Ubuntu supplied sox (sox-ubu) - Ubuntu 17.10 -- Last update Apr-10th-2018
3. My own compiled sox (sox-opt) - sources from here and using special compiler options

As a test case I've chosen a real-world scenario:

A 6.29min 44.1kHz flac file that gets sample-rate converted to 352800Hz.

Here's the little benchmarking snippet I've written and been using:

########################################################

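# sox effect chain: resample to 352800 Hz, then dither
SRC="rate -v -b 95.0 -p 50 -a 352800"

DITHERMODE="dither -S"

IF="/tmp/test.flac"   # input file
OF="/tmp/tmp.flac"    # output file, deleted after each run

for i in sox-lms sox-ubu sox-opt ; do
   BIN="/tmp/$i"
   echo "****************************"
   echo "Binary = $BIN"
   # $SRC and $DITHERMODE stay unquoted so they split into separate arguments
   time "$BIN" -t flac "$IF" -t flac -C 0 -b 24 "$OF" $SRC $DITHERMODE
   sleep 1
   rm -f "$OF"
   echo
done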
#########################################################


Basically, the music file is resampled three times, each time using a different sox binary.
The "time" command measures and displays the actual execution time.

And here comes its output:

****************************
Binary = /tmp/sox-lms

real 1m51,729s
user 1m51,424s
sys 0m0,294s

****************************
Binary = /tmp/sox-ubu

real 0m37,420s
user 0m37,189s
sys 0m0,230s

****************************
Binary = /tmp/sox-opt

real 0m14,821s
user 0m14,552s
sys 0m0,269s

****************************

SUMMARY

Processing times:

* 112s for LMS. 
* 37s  for Ubuntu and 
* 15s  for my own optimized binary.


Folks, that's what I'd call a serious difference!
Just using the Ubuntu-supplied binary yields almost a factor 3 improvement over the LMS binary.
Using my own binary gives a factor of 7.5!
Wow. I didn't expect such a difference.
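
For anyone who wants to redo the arithmetic: the factors follow directly from the "real" times above. The two helpers below are an illustrative sketch (the names to_seconds and speedup are made up). They convert the comma-decimal output of "time" into seconds and divide the LMS time by the other two.

```shell
# Convert a "time"-style value such as 1m51,729s into seconds.
to_seconds() {
    echo "$1" | tr ',' '.' | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Speedup factor: baseline seconds divided by candidate seconds.
speedup() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

lms=$(to_seconds 1m51,729s)
ubu=$(to_seconds 0m37,420s)
opt=$(to_seconds 0m14,821s)

speedup "$lms" "$ubu"   # prints 2.99
speedup "$lms" "$opt"   # prints 7.54
```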

And that's exactly the reason why I wanted to share this story with you.

Bottom line. Mission overaccomplished!

Just to mention it: I also ran the test for the "flac" binary that comes with LMS. Same process, except that I tested flac decoding to wav.
The Ubuntu binary and my optimized binary both show a reproducible 20% improvement over the LMS flac binary. That's what I call a decent improvement. Not earth-shattering though.
Surprisingly, my compiler optimization didn't show any additional effect over the Ubuntu binary this time.


I'm also adding a HowTo. It outlines how to get these binaries enabled on your LMS installation.


Enjoy.


##############################################################

HowTo

Part1  - Using Ubuntu binaries

It is assumed that LMS is installed on a Debian/Ubuntu-based Linux PC.
First I'll show you how to use the Ubuntu-supplied sox and flac binaries.
That alone yields a worthwhile performance increase.

Open a terminal and run the commands below.
You can copy/paste them: first the sudo line, then the rest all at once.

##################################################

sudo su

##################################################

apt-get -y install flac sox
cd /usr/share/squeezeboxserver/Bin/x86_64-linux
cp sox{,.orig}
cp flac{,.orig} 
cp /usr/bin/{flac,sox} .


###################################################

Now restart the LMS server or your machine. 

Note1: Using the Ubuntu flac binary has some small limitations. The biggest drawback is the loss
of FWD/REW functionality. I don't use it at all; some of you probably can't live without it.
You can try the above first and then - if you don't like it - restore the original binary.

Note2: Every LMS server update will overwrite the binaries. You'll have to redo the above
exercise once in a while.
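
Both notes can be scripted. The following is a minimal sketch, assuming the same paths as in the commands above. The function names restore_orig and reapply are made up, and the LMS_BIN and SYS_BIN variables are only there so the paths can be overridden.

```shell
# Paths as used in the commands above; override via environment if yours differ.
LMS_BIN="${LMS_BIN:-/usr/share/squeezeboxserver/Bin/x86_64-linux}"
SYS_BIN="${SYS_BIN:-/usr/bin}"

# Note1: put the saved LMS original back in place, e.g. "restore_orig flac".
restore_orig() {
    cp "$LMS_BIN/$1.orig" "$LMS_BIN/$1"
}

# Note2: copy the system binary over the LMS one again if an update replaced it.
reapply() {
    if cmp -s "$SYS_BIN/$1" "$LMS_BIN/$1"; then
        echo "$1: already the system binary"
    else
        cp "$LMS_BIN/$1" "$LMS_BIN/$1.orig"   # keep the freshly shipped LMS binary
        cp "$SYS_BIN/$1" "$LMS_BIN/$1"
        echo "$1: replaced"
    fi
}
```

Run as root, "reapply sox" and "reapply flac" after an LMS update redo the copy step, and "restore_orig flac" brings back FWD/REW. Restart LMS afterwards in either case.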


Part2 - Compiling your own sox binary

Coming soon!




2 comments:

  1. great article. can you please provide instructions for picoreplayer, as well?
