FLAC, OGG, MP3, CDs: setting recording, mixing, compression and printing levels

kvk

I'm having a lot of fun with Ardour, and I know I'll never even scratch the surface because my needs as a singer-songwriter aren't that complex. I'm stuck, however, on this issue.

I have two stereo tracks: one guitar and one vocal. In mixing, I set whatever plugins I have (EQ, some reverb, compression and possibly a high/low pass filter) and mix. I listen to the mix on two sets of speakers in the room (one little doinky pair and a larger pair of KRK Rokit 8 monitors). I then burn whatever mixes might potentially be final onto a CD and go listen to them in my car. The problem is that levels and mixes that sound good in the room change when I burn them to CD: levels that sound good and solid when playing a .WAV file in the room are barely audible when burned to CD, but when I try to raise the printing level (via the Master Output, the compressor output, raising both track levels, and/or normalizing the tracks), things shift and are no longer in balance, even if they still sound balanced on the speakers! (Everything sounds good on the headphones, so I try not to rely on those.)

This is a critical issue for me because I don't make CDs anymore. I give away all my music for free on my website as FLAC, OGG, or MP3 files. Up until recently I went into a studio, but it shut down, so I bought my own gear to do it in-house. Now I'm trying to figure out what reference I should use for these tunes before I release them. Most people are NOT going to burn their downloads to a CD; they will either listen to them on their computer or put them on an MP3 player or similar. I get occasional emails asking for CDs, but that's secondary for now. So do I base my mix decisions on the digital sound or on the CD, and why does it change?

GEAR:

Guitar mic: AKG C1000S
Vocal mic: Shure KSM44
Presonus FP10
AV Linux

I set my FP10 input channels to about 7, Master Gain to 7 as well. I record with the track levels at -30 dB and generally try to mix so that any peaks reach about -12 dB or so, and I place a vocal compressor a few dB below that. But in trying to match everything up, I play around with all those levels quite a lot to see what I can find out.

Sorry for the length. Thanks!!

kvk

beejunk

Some of your post is confusing, so a few things need to be clarified before we can track down the source of your problem.

First, I'm curious why you would be using any stereo channels to do your tracking. If you're recording from one source, in this case a single mic, then there is no need for stereo. Just use mono tracks. This will also be less labor intensive for your gear, since recording in stereo with a mono signal is in effect recording two identical tracks when all you need is one.

It is also a little confusing that you mention you are setting the FP10's pre-amps to 'about 7'. It seems unlikely that your guitar mic would require the same gain as your vocal mic, and anyhow the actual number that the pre-amps are set to is not so important as long as you've got enough gain to get an appropriate signal strength. Also, I do not know what you're talking about when you refer to a Master Gain. The Main Level output of an FP10 refers to the strength of the signal being sent to the Main CR outputs used for monitoring (which are the same as whatever is being sent to system output channels 1 and 2 via JACK). This knob will not affect the levels coming from your pre-amps. The Mix knob that is on the front of the FP10 controls whether you want to monitor from the hardware or from the software. Turning it all the way to the left gives you the signal straight from the pre-amps, and all the way right gives you only the signal from your DAW. Either way, this knob also does not affect the pre-amp levels.

You state that you're printing the tracks at around -30 dB, which seems very low. In the digital realm, and as long as you're recording in at least 24-bit, you should probably be shooting more for an average of -18 to -12 dBFS (if you're recording in 16-bit, you'll probably want to go louder). As for your peaks, all that matters is that none of them cause the signal to clip. Otherwise, if a few peaks get close to that 0 dBFS line, then it should be fine.

It is difficult to quantify the statement 'barely audible' when talking about the levels coming from your CD. If you're talking about the levels being much quieter than other CDs, then this is to be expected, since you need to do some appropriate mastering to get those 'hot' levels a lot of people have come to expect. However, if you mean that you actually can't turn your radio up loud enough to get a listenable volume out of the CD, then there is something wrong with the way the CD has been created that has nothing to do with your recording, mixing and mastering levels.

Last, could you clarify what you mean by 'things shift and are no longer in balance'? Do you mean the level of the vocals compared to the guitar will sound different when playing back from a CD than they did while mixing?

kvk

Sorry for not being clear!

On the FP10, each channel has its own gain, of course, but the unit as a whole has a MAIN knob, in addition to the HEADPHONES and INPUT/PLAYBACK knobs. Does this MAIN not affect the signal strength going into the DAW?

I was under the impression that recording stereo resulted in a "fatter" signal that could be panned, giving a more full and roomy feeling to the mix itself. This is perhaps another error?

Yes, -30 is a bit low; I wanted to err on the side of caution and record a solid signal with plenty of headroom left to raise later. I am confused about the difference between the individual track levels and the Master Output level, which I increase until it sits around -10 or more before exporting the .WAV file. Should the Master Out not be used this way?

As for the CD level, I mean that I have to turn the car radio up to its full level to hear the CD at a decent volume. Now, by normalizing the tracks and cranking both tracks' levels AND the Master Out, I can get a decent level, but it clips into the red and definitely goes above 0.

When I say things shift, I mean a mix that sounds good on the speakers appears to lose some of its compression on the CD, so that the bridge, which was fine on the speakers and phones, is suddenly far too hot on the CD and jumps out of the mix as uncomfortably loud.

Interestingly enough, when the CD levels are too low and I have to raise the levels in the car, those are generally the best mixes, unless I'm not hearing the hot bridge simply because the entire thing is too quiet.

beejunk

Okay, that does clear a lot of things up.

The Main knob indeed does not affect the signal strength going into the DAW. That knob controls the strength of the signal coming out of the Main CR outputs that are labeled on the back of the unit. This is usually where you connect your monitors, and those outputs are carrying the same signal as whatever is being sent to the system outs 1 and 2, as mentioned earlier (this signal is also sent to the Cue Mix outs, the headphone out and, of course, the standard output channels labeled 1 and 2). Ardour will usually automatically send the Master channel out through those. If you are not doing this, I'm curious: where are you currently connecting your monitors?

You do not need to record in stereo in order to be able to pan a channel. As long as the channel has two outputs into the Master channel (which Ardour does automatically when creating mono tracks), then you can pan. You'll also want to set the channel to 'post' mode if you want to see the effects of the panning on the meter. Recording in stereo would, I believe, increase the loudness of the tracks because you're playing back two exactly identical recordings at the same time, but there's no reason that would give you a better tone or fidelity, and it certainly uses up a lot of hard disk space and processing power. You could just as easily turn up one track.

As for tracking and mixing levels, I'll try to write out a quick primer that might make things more clear. Your signal chain is going to look something like this:

Going in: Mic > FP10 > Ardour
Coming out: Ardour > FP10 > Monitors

When the mic signal hits the pre-amps, your goal is to boost the signal to the point where you are getting the most fidelity. The FP10 is designed, like all pieces of audio equipment, to sound its best at a certain signal strength, which is usually around 0 dBVU. The amount you can go above that point before getting digital distortion is the headroom, and in the FP10's case I believe that headroom is something like 18 dB, although I'd have to look through the manual to be sure. So, while your -30 dB tracking level is certainly cautious, there is reason to believe you are not getting the best sound out of it, because your converter is designed for optimal sound somewhere in the -18 to -12 range. You really shouldn't need any more headroom than that.
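To put rough numbers on that, here is a small Python/NumPy sketch of how you might check the peak and average level of a take in dBFS; the noise signal and the figures in it are made up purely for illustration, not taken from any real session:

    import numpy as np

    def peak_dbfs(x):
        # peak level of a float signal (full scale = 1.0), in dBFS
        return 20 * np.log10(np.max(np.abs(x)))

    def rms_dbfs(x):
        # average (RMS) level, in dBFS
        return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

    rng = np.random.default_rng(0)
    take = 0.05 * rng.standard_normal(48000 * 5)   # stand-in for a tracked take, roughly -26 dBFS average
    print(f"peak {peak_dbfs(take):.1f} dBFS, average {rms_dbfs(take):.1f} dBFS")
    # while tracking at 24-bit, an average somewhere around -18 to -12 dBFS is plenty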

Once in Ardour, you'll use your individually recorded channels to do some mixing, after which your master volume should generally be louder due simply to the fact that you're layering sound. It should not be significantly louder, and in general you shouldn't worry about the end volume at this point. The goal should be just to get the best sounding mix possible. Also, I personally almost never use the Master channel fader to adjust the volume, since it is almost never necessary. Sometimes I will use it to turn a mix down that has perhaps gotten too loud during the mixing process.

After you export a mix, if you want that mix to sound louder but still good, you'll need to do some mastering, which I won't go into here. This is a handy link that should get you going, though: http://www.64studio.com/node/823

That tutorial will also explain what levels you might want to shoot for and why.

It's not surprising that the mixes you attempted to make louder did not sound as good as the originals, because the processes you used (such as normalizing, which is always a no-no, or just trying to boost the overall levels even if they go into the red) would almost certainly maul the sound in unpredictable ways.

That's all I've got time for. Hope this is helpful and good luck!

kvk

Thanks for the feedback and the link! I found an enormously long thread over on the Tape Op message board regarding recording levels and the difference between analog and digital gear and such; very similar info.

Yes, you are correct: my monitors are connected to the CR outputs on the rear of the FP10.

I'll check out the link and read it over- thanks!!

And (maybe the tutorial will explain it) - why is normalizing a no-no?

Thanks again!

beejunk

I was thinking of linking to that thread. It is somewhat of a classic on the Tape Op forums, and that's pretty much where I got all the info in that last post.

As for normalizing, I'll just reference good ol' Bob Katz:

"I'll give you two reasons. The first one has to do with just good old-fashioned signal deterioration. Every DSP operation costs something in terms of sound quality. It gets grainier, colder, narrower, and harsher. Adding a generation of normalization is just taking the signal down one generation.

The second reason is that normalization doesn't accomplish anything. The ear responds to average level and not peak levels, and there is no machine that can read peak levels and judge when something is equally loud."

There you go.

roaldz

That's not entirely true when your highest peak is around -10 dB. When you normalize that, it will just be 10 dB louder, so your peak is at 0 dB, and then you're using all your headroom. This could also be done with a fader, but hey, that's just another DSP operation so that's bad...?

beejunk

You are right, roaldz, using a fader is a DSP operation, and in a strict sense that is in fact bad. This is why a lot of engineers who are mixing digitally recorded audio on an analogue board go out of their way not to change any of the levels in the DAW. Of course, most of us are mixing in the box, and so lots of signal processing is necessary. The goal is to use only the processing we need, at least if we want to maintain the highest audio quality. I find it extremely unlikely that someone who is mixing will normalize a track and then proceed to never adjust the volume of that track again for that session, though I suppose it is possible. In most cases, though, you are adding an extra unnecessary DSP operation, and that's just bad practice.

You are also correct in saying that a track with a highest peak level of -10 dB would be made louder with normalization, but Katz's point is that the perceived loudness will likely not change much. What matters is the average sound of the track. If that peak is, say, 20 dB louder than the average volume on the track (yes, it is an extreme example), then raising the whole track 10 dB is not going to make it sound that much louder. And in many cases, at least when recording live instruments, the peak sound is caused by some transient that is much louder than the average sound of the track.
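To make that concrete, here is a little Python/NumPy sketch with a made-up signal: one short transient sets the peak, so peak-normalizing pushes the peak to 0 dBFS, but the average (RMS) level, which is closer to what the ear tracks, only comes up by the same modest amount:

    import numpy as np

    def peak_db(x): return 20 * np.log10(np.max(np.abs(x)))
    def rms_db(x):  return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

    t = np.arange(48000) / 48000
    track = 0.03 * np.sin(2 * np.pi * 220 * t)   # quiet sustained material
    track[100] = 0.3                             # one brief transient sets the peak near -10 dBFS

    normalized = track / np.max(np.abs(track))   # peak-normalize to 0 dBFS

    print(f"before: peak {peak_db(track):.1f} dBFS, RMS {rms_db(track):.1f} dBFS")
    print(f"after:  peak {peak_db(normalized):.1f} dBFS, RMS {rms_db(normalized):.1f} dBFS")
    # the peak now hits 0 dBFS, but the average only rose by the same ~10 dB,
    # so the track still sounds far quieter than a mastered release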

seablade

beejunk-

I disagree with some of the comments there, sorry.

Normalization can be done in a way that undoing it causes the signal to revert exactly back to its initial state. In fact Ardour does a variant of this, I believe. It is not an 'extra' level of DSP that 'degrades' the signal at all, IMO. Yes, I am aware of who Bob Katz is, but I am also more than slightly aware of how the math works, and I will strongly disagree with that statement as it is quoted. I would not be surprised to hear there is vastly more to that story than has been let on in this thread; I have not read the TapeOp thread, however.

Seablade

macinnisrr

Agreed. The whole point of digital media of ANY KIND is that reproduction does not degrade the copy. When a person makes a photocopy of a photocopy, THAT degrades with each generation, but if I copy a file on the computer an infinite number of times, it will stay the same (provided the software is doing its job correctly). Normalizing, moving a fader, panning, etc. should not degrade the signal if the software is coded correctly, just as copying a track to multiple other tracks should make PERFECT copies. In fact, while I agree that analog recordings CAN sound better than digital, this is not inherent in the technology. It's just that we're so used to hearing the distortion (and I guarantee that's what it is) that analog gear ADDS to the signal. Take tape, for instance. Many people, myself included, think that tape sounds "warmer" than a digital recording. That is absolutely not because tape is more accurate; the opposite is in fact true. The only reason we think tape sounds better is that it's what we've been used to hearing for the last 100 years on recordings, and the recordings we've heard are our reference for how our recordings should sound. Don't get me wrong, I love analog gear and the sound it ADDS to a recording. But is analog more true? Is a mixing console more accurate than a DAW? Does a tube amplifier make an electric guitar sound cleaner than a direct feed? Absolutely not. It just sounds better because that's what we're used to. In several recent studies, college-age students and younger PREFERRED the sizzle that MP3 compression at 128 kbps adds to a recording, when compared to an uncompressed recording. Does that make sense? Absolutely, but only when one considers that that's what the people taking the survey are USED to hearing. But I digress...

Most likely the reason that normalization changes the level of the bridge in your song relative to the rest is that you have a separate region for the bridge on at least one track (or an additional track). So what happens when you normalize all the regions is that the bridge jumps out, because it had lower peaks than the rest of the song BEFORE normalization, and normalizing raises its peaks to the same level, thereby causing the AVERAGE level of the bridge to be much higher than before. If you were to consolidate the regions into one new region BEFORE normalization, the result would set the peak of the ENTIRE track to 0, keeping the relative loudness of the different sections of the song the same.
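Here is a quick Python/NumPy sketch of that effect with two made-up "regions" (the numbers are arbitrary): normalize the verse and bridge separately and their balance collapses; consolidate first and the balance survives:

    import numpy as np

    def normalize(x):
        # one multiplier for the whole region, set by its loudest peak
        return x / np.max(np.abs(x))

    def rms_db(x): return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

    rng = np.random.default_rng(0)
    verse  = 0.12 * rng.standard_normal(48000)   # louder performance
    bridge = 0.05 * rng.standard_normal(48000)   # quieter performance

    separate = np.concatenate([normalize(verse), normalize(bridge)])   # each region on its own
    whole    = normalize(np.concatenate([verse, bridge]))              # consolidated first

    print(f"separate regions: verse {rms_db(separate[:48000]) - rms_db(separate[48000:]):.1f} dB above bridge")
    print(f"whole file:       verse {rms_db(whole[:48000]) - rms_db(whole[48000:]):.1f} dB above bridge")
    # normalized separately, the difference collapses toward 0 dB;
    # consolidated first, the original balance between sections is kept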

Sorry about the rant, but I've just heard this "analog is inherently better" argument too many times and I simply disagree.

kvk

There's so much to learn!! :-p

When I made my first commercial album, it was a tape (CDs were beyond the budgets of most of us back then) and the Mac Classic II hadn't even come out yet!

Thanks for all the input!

Macinnisrr, I have the entire vocal track, including the bridge vocals, on a single track. I think I just sang it hotter.

I thought I had it set to levels comparable to some of my other tunes that had been done in the studio, but when I listened to it off the website today it was definitely much quieter. Time for more messing about!

beejunk

seablade, no need to add a 'sorry' to your comment; this is all in the spirit of informative debate. You may very well be right about how normalization is done in Ardour; I personally am not versed enough in how Ardour works under the hood to disagree with you. I can say that, for many DAWs (including Cubase 4, which is the only other program with which I have real familiarity), normalization is a destructive process that instantly creates a new file, and this is possibly what Katz is referring to in the quote. I should also say that the quote is relatively old, about nine years, so take from that what you will.

It is still true, though, that normalizing quite often does not accomplish anything, due to the second point that Katz makes about the way the ear reacts to sound.

macinnisrr, I agree with your point on analogue vs. digital, and I want to make clear that this has little to do with that tired debate. We are just talking about the pros and cons of normalization. However, I have to take issue with your comments about digital copies, because we are not talking about making copies. When you process the signal in a DAW and eventually mix it down, you are creating a fundamentally different file that is going to have some level of degradation, even if it is relatively slight. Suggesting that correctly coded software can avoid this is, I think, putting a bit too much faith in digital processing.

Here is a link to the Tape-Op thread. It's good stuff:

http://messageboard.tapeop.com/viewtopic.php?t=38430

macinnisrr

Touché! Thanks for the link, the thread is amazing. I've always tried to avoid extra steps in mixing, and as such have not normalized tracks for quite some time because I'd have to turn them down later (basically just to save a step and streamline), but I had no idea about printing at -18 just for the headroom alone. In fact, I usually have tried to print lower depending on how many tracks will be in the same frequency range (recording bass guitar higher than electric guitars which I'm going to be layering) just to avoid touching the faders as much as possible. It's nice to find out WHY that makes everything sound better, and to be sure, I'll be keeping a closer eye on this in the future! Apologies for my tone, and thanks again for the wonderful info!!

seablade

"It is still true, though, that normalizing quite often does not accomplish anything, due to the second point that Katz makes about the way the ear reacts to sound."

Actually, I will again disagree with such a blanket statement. The concept of what you are saying is indeed correct for a track with large peaks that go well above the average level of the track, or for material that is largely dynamic; in those cases normalization's usefulness is limited.

However, in my work at least, those two cases are not the norm the majority of the time, especially when recording at 24-bit. When I was dealing with 16-bit recording this was more true, but 24-bit has become the standard, and it allows people to record at lower levels more safely, without as much risk from the higher noise floor that 16-bit recording had. That increases the effect normalization will have in many cases, since there will often be more headroom left on a track just for the sake of safety and preventing clipped peaks.

For the record, Ardour's editing is almost entirely non-destructive. There are a few exceptions, but those are very specific exceptions rather than the rule. This includes normalization, I believe (haven't double-checked this, mind you, just going off the top of my head at the moment), which these days will often work on the concept of simply amplifying the existing signal by X amount. This then gets added to any amplification from the fader, and if you move the fader down by X amount, you will end up with an identical signal to what you started with. There is no degradation of the signal there in the digital realm so long as these functions are properly implemented (and I am fairly certain Ardour's are).

As has been mentioned, file system operations such as copying and reading files have to be VERY dependable to produce identical results. One single bit being off when loading an executable of hundreds of megabytes will cause it not to run. So there should again be absolutely no degradation of the signal, even if a new file is created during the normalization process. The new file, if modified in the negative direction by the same amount of gain as the normalization applied, should be exactly identical to the source. So even in the destructive process you alluded to, normalization should NEVER affect the quality of the signal.

Normalization is just a form of amplification. If implemented correctly and occurring strictly in the digital domain, this should result in a signal that is only mathematically louder; no other degradation should have occurred. So if you then take that signal and modify it in the negative direction with an amplification process (i.e. a fader) by an identical amount, you will end up with a mathematically identical signal. Again, this is all dependent on the two processes being implemented correctly (and no plugins in the line). And again, I am fairly certain Ardour has those processes implemented correctly.

Seablade

lowen
Also take a read through the Gearslutz forums; specifically, read http://www.gearslutz.com/board/so-much-gear-so-little-time/420334-reason... But do search through these; while the advice is all over the board, there are some real gems in there.
beejunk

Seablade, I see what you're saying about the processing, but it clashes with a whole lot of things that I've heard and read. I'll add another quote from The Mastering Engineer's Handbook, which is where that previous Bob Katz quote was taken from (by the way, it might clarify things to point out that Katz's statements about the usefulness of normalizing were in reference to mastering only, and not any other stage of recording).

"Even the smallest adjustment inside the DAW causes some massive DSP recalculations, all to the detriment of the ultimate sound quality."

It then goes on to make the point about average listening levels. My question to you is: is this incorrect, and if so, why? This is a very widely held belief that I've read in many places. I understand that the goal of the software is not to degrade the signal, but I don't want a response that amounts to saying the numbers add up; I want to know if people have confirmed that the actual sound of the signal has not degraded. This isn't meant to be combative; I'm not even sure how you would go about confirming such a thing. But a lot of very experienced people believe that the audio is degraded by any digital processing, and if that is not the case, it would be good to know. If I have time after work, I'll do a little more extensive Googling on it, but if someone on this forum has already researched this, that would be quicker! :)

As far as the usual levels one sees in mixes, my experience has been quite different from yours. Most projects I've worked on have had recorded levels that were either in the correct range or WAY too loud, making normalization useless to me. There really is a widespread misconception that you should be recording as loud as possible. Perhaps it's all those rock 'n' roll bands I deal with.

macinnisrr

Hmmm... While I've never done a side-by-side comparison of different levels, it's hard for me to argue with either point. It makes perfect sense to me now (I hadn't even thought of it before) why the mixing board I use to feed my DAW shows peaks at 0 dB (or rather 0 dBVU) while the tracks all appear in software as being WAY below that (around -20). On the one hand, I completely agree that one must not overdrive the analog portion of the signal past 0 dBVU (or -18 dBFS); on the other, I also agree with Seablade in that I can't understand why normalizing a level in the box (where even if it DOES clip, it really doesn't, due to 32-bit floating point) would possibly result in a degradation of the signal. Seablade is absolutely right in saying that Ardour does not edit destructively when normalization takes place. Just look at the interchange folder in your project and you'll see that the file remains the same, and no extra CPU cycles are being used when playing back a normalized file (which suggests no processing is taking place in realtime). Given these two facts, one must assume that the math is correct and no actual processing is taking place (even compared to running a gain plugin). I completely agree that some people (especially those without metering OUTSIDE the box) record their signals far too hot to begin with, but it's hard for me to concede that EVERY DAW maker (PT, Steinberg, MOTU, Ardour, Apple, SAW, etc...) has for the last twenty years been providing us with meters in their software that are, in effect, completely wrong. But then again, stranger things have happened ;-)

beejunk

Not feeling well, so I just called in sick to work and am drinking tea while reading through articles on DSP. :)

Specifically, this article: http://www.masterworksound.com/Signal%20Degradation.pdf

Here's the money quote. It's a long one, sorry:

"This unavoidably destructive component in DSP can be largely explained through a discussion of bit depth and word length. In the DSP environment, a computer's processor conducts all calculations to a pre-set point of precision, which is computed every time there is a sample in the signal (discussed later). To define this pre-set point of precision, bit depth is used, which is a measure of the number of binary digits used to compute a 'digital word'. An audio signal that has a bit depth of 16 then has its digital words computed to 16 binary digits (and no more beyond that); likewise, an audio signal of bit depth 24 has its digital words computed to 24 binary digits, and therefore has greater precision. When a higher bit depth is used, a computer's processor is able to conduct every DSP operation (such as adding gain, reverb, or compression) to a longer word length relative to using a shorter bit depth. Practically speaking, every DSP operation yields some sonic degradation, because after conducting many DSP operations, the intended word length of the audio signal will be truncated to the actual word length of the audio signal, thus altering the waveform of the audio signal in an unintended way. Over time, this cumulative effect can have a negative, audible result, but by using a longer word length this degradation, which is really just data truncation, can be kept to a bare minimum."

This seems to suggest that even adding gain, such as in normalization, can cause a minute amount of degradation. I am not even remotely an expert when it comes to programming, though, so as far as I know this could be way off base. Thoughts, anyone?

linuxdsp

@beejunk: Any numerical process inside the CPU will be limited to a finite degree of precision, even using floating point, and as such even a simple multiply operation (which is all that is happening when you change gain) will lose some of the original precision. However, in the case of 32-bit floating point this is likely to be so minute as to be inaudible, especially when you consider that by far the greatest cause of signal degradation is the conversion back to analogue voltages in the sound card, and then ultimately to vibrating paper cones attached to coils of wire in order to make the sound. The DSP processes that are most likely to suffer from rounding errors etc. are recursive processes such as IIR filters, where a small rounding error can be accumulated in successive passes around the filter.
In simple terms, I think you can consider a simple gain change in the floating-point domain as not introducing any degradation to the signal. However, you should try to ensure the signal makes the best use of the D/A converter resolution (even for 24-bit), since this is generally where the greatest degradation occurs.

paul

@beejunk: whenever you perform a mathematical operation with two numbers, you get a result. There are two versions of the result: the theoretical answer, and the answer you can store in a piece of computer memory in a given format. In an ideal world, these are never different. In the real world, they often are. Take a truly trivial case: 1/3. Store the irrational result of this division in a floating point (or double precision floating, or fixed point) value, and you do not have the answer totally correct. Obviously, the error is extremely small, but there is an error. Now take the two numbers either side of 3 and do the same: 1/2 and 1/4 - both of these can be represented with 100% accuracy in any binary computer. Next imagine that this is a 3 element number sequence and you performed this operation on each number. The correct answer is a series of 3 numbers whose reciprocals are exactly 2, 3 and 4. However, because the storage of 0.33333.... is truncated, what you end up with is something very slightly different. If this represented a waveform, you have introduced distortion, although arguably inaudible distortion.

There are a lot of operations that can produce numbers that can't be represented with absolute fidelity, and there is every chance that even in just applying gain you will introduce tiny distortions like this. Those who believe in the magic of analog point to such distortions as one of the reasons to be wary of digital (conveniently ignoring the 2 or even 3 orders of magnitude greater errors in the analog domain). They are not wrong that, from a purely mathematical perspective, these distortions do exist. Whether they represent anything that can be detected by any human being is another story entirely.
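For anyone who wants to see that in actual numbers, here is a tiny Python/NumPy check, using single precision just to make the error easier to spot; 1/2 and 1/4 store exactly, 1/3 does not:

    import numpy as np

    exact  = [1/2, 1/3, 1/4]
    stored = [np.float32(v) for v in exact]   # truncate each value to a 32-bit float

    for e, s in zip(exact, stored):
        print(f"exact {e:.17f}  stored {float(s):.17f}  error {e - float(s):+.1e}")
    # 0.5 and 0.25 come back exact; 0.333... picks up a tiny error, i.e. a tiny distortion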

seablade

There are some concepts in there that I believe you are confusing. This may get somewhat technical, but hopefully I won't end up typing out an entire book...

Normalization in an ideal digital audio system is only applying gain. That means you are mathematically increasing every value in a 16-, 24-, or 32-bit signal by a set amount. In floating-point systems your range of values will typically be between 1.0 and -1.0. If your audio signal peaks at .5 and -.5 and you normalize it, you are taking the entire signal and increasing all of it by an identical amount, so that the points in time that reached .5 or -.5 before now reach 1.0 and -1.0 (simplifying slightly). The added effect this has is that you are also increasing the quietest parts of the signal, which for real-world recordings (and even if you aren't dealing with those, in many cases) will NOT be equal to 0, but instead minutely above or below zero. In increasing the level of the entire signal, you are also going to increase that minute level, otherwise known as the noise floor.

In a 16-bit signal, that noise floor by default can only be so quiet, so normalizing would create a very noticeable increase in the noise. This is why, when recording at 16-bit, you need to make sure you are recording with levels just below peak. In a 24-bit signal, it is possible for that quiet value to be much better defined and much quieter, so normalizing the signal does not raise the noise floor in as noticeable a way, thus allowing people to get more usable signal out of a recording that was made with more headroom. I really don't want to dwell on this topic as much as I probably should, as it is a conversation in itself.

So when people say normalization adds to the noise, they are technically correct in that it will raise the noise floor. However, the ratio of signal to noise will not change, and because of this, in practice no additional noise or distortion is introduced in the process.
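A small Python/NumPy sketch of that point, with a made-up tone and a made-up noise floor: the noise comes up by the normalization gain, but the signal-to-noise ratio does not move:

    import numpy as np

    rng = np.random.default_rng(1)
    signal = 0.25 * np.sin(2 * np.pi * 440 * np.arange(48000) / 48000)   # the performance
    noise  = 1e-4 * rng.standard_normal(48000)                           # noise from the recording chain
    track  = signal + noise

    gain = 1.0 / np.max(np.abs(track))   # the normalization multiplier

    def rms_db(x): return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

    print(f"noise floor: {rms_db(noise):.1f} dBFS -> {rms_db(noise * gain):.1f} dBFS")   # comes up by the gain
    print(f"SNR: {rms_db(signal) - rms_db(noise):.1f} dB -> "
          f"{rms_db(signal * gain) - rms_db(noise * gain):.1f} dB")                      # unchanged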

Now all this aside...

When dealing with algorithmically generating or modifying sound, you can end up with values that cannot be recorded precisely at a fixed bit depth. For instance, multiply .12 by 1.2 and you will end up with .144, which actually needs more precision than either of those values. This tends to be the largest issue when dealing with actual processing more advanced than a simple multiplication, e.g. reverb DSP. You can always increase the bit depth of the saved value, and in fact many systems use a much larger data path internally than the source might traditionally need, Ardour included (digital audio only gets recorded at about 20 bits by the best A/D converters to my knowledge; Ardour IIRC uses a 32-bit data path in its mix engine, and Harrison I believe uses 64-bit data paths in its large-format consoles). This allows a tremendous amount of precision for the results to grow into. However, you can still lose a slight (literally inaudible) amount of precision depending on the processing. So yes, in those cases you will suffer a slight degradation of the signal IF YOU ARE DEALING WITH A DESTRUCTIVE PROCESS THAT DOES NOT ACCOUNT FOR THIS BY PROVIDING AN INCREASED BIT DEPTH FOR THE ADDITIONAL NEEDED PRECISION. Otherwise you will always have the source signal, which will be unaffected.

OK, all this said, let's look at Ardour and why this really isn't an issue in a properly designed and implemented non-destructive workflow. With Ardour's normalization, the audio file is analyzed and the gain value needed to normalize it is stored in the region description in the session file. For instance, one value from a normalized region I have right now is 1.8104865551. This value is then combined with the gain from the fader itself, the region gain, and any other value applied in that way, to come up with the total gain applied to a region at any given point in time; a single gain operation is applied to the audio, not multiple. Thus normalization, even in the extremely minute cases mentioned above, does not actually affect the audio at all by itself, but instead is combined with the other gain stages before the audio data is modified at all.
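To sketch that idea in Python (this is a toy model of the concept, not Ardour's actual code or class names): the samples on disk never change, the normalization value is just folded into one combined gain applied on the way out, so undoing it is exact:

    import numpy as np

    class Region:
        def __init__(self, samples):
            self.samples = samples     # the recorded file, never rewritten
            self.norm_gain = 1.0       # what "normalize" stores in the session
            self.fader_gain = 1.0      # what the mixer fader contributes

        def render(self):
            # gains are combined first, then applied in a single multiply
            return self.samples * np.float32(self.norm_gain * self.fader_gain)

    rng = np.random.default_rng(0)
    r = Region(np.float32(0.3 * rng.standard_normal(48000)))
    r.norm_gain = 1.8104865551         # e.g. the session value quoted above
    louder = r.render()                # scaled once, at playback/export time
    r.norm_gain = 1.0                  # "undo": nothing to reconstruct, the file was never touched
    assert np.array_equal(r.render(), r.samples)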

So... how much of that makes sense?;)

Seablade

seablade

And of course while I was typing, you get two other very qualified answers on the topic;)

Seablade

beejunk

I appreciate the very informative replies. Yes, Seablade, your post makes sense. :)

Zzeon

You guys are so silly....

Normalize amplifies the samples equally; it is a simple find-the-peak, what's-it-take-to-get-it-to-100% calculation, with that multiplier then used for all sample points.

Guys it's like:
2 times 2 = 4
2 times 3 = 6

The ratio is still the same; that is why the dynamic range is preserved and not affected. You want your old 2 and 3 back? Cut the level by 50%.

From Adobe, and a very smart CoolEdit author:

Now when we choose the "normalize" effect, the software looks for the loudest point of the waveform, and then raises (or lowers) the amplitude of the entire waveform until the volume reaches a particular percentage of the clipping point. Every point in the waveform is amplified equally, so that the original dynamics of the piece are preserved. The default figure of 98% is a typical percentage figure for normalization.
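That whole operation fits in a few lines of Python/NumPy; this is just a sketch of the idea, not any particular editor's implementation:

    import numpy as np

    def normalize(samples, target=0.98):
        # find the loudest peak, then scale every sample by the same multiplier
        peak = np.max(np.abs(samples))
        return samples * (target / peak)

    x = np.array([0.2, -0.5, 0.1, 0.4])
    y = normalize(x)
    print(y)       # peaks at 0.98; everything else scaled by the same factor of 1.96
    print(y / x)   # the same multiplier everywhere, so the dynamics are untouched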

Sure, the levels from track to track are lost and the noise comes up, but isn't it the role of the mixing and mastering processes to define what those should really be, relatively speaking? YES, you want your quiet passage preserved, so just back the fader down for that track.

In one case, you're pushing the level up because it is too low - no normalization.
In the other, you're backing the fader/level down because it's too high - attenuation.

It's a matter of preference, and experience. If you have a bad experience, then that tends to affect your approach to workflow.

Don't be ascared of the big bad gain, it's your friend, or can be if you treat it right.

Hey, after all... you're gonna pump it through some non-linear process to "normalize-by-ear" to a final master using discrete frequency bands of compression/expansion. That is why Katz gets paid the big money: his ability to hear how to make the final work the best it can be, in his mind, is why artists are drawn to his way of thinking (sonically, not technically).

Compression, now there is a word that carries more fear in my mind than linear transforms, and requires the most work to get it right...

sha!

seablade

Zzeon..

Multiply .04523 by 1.86.

You get .0841278, a number that requires more bits to store precisely than either of the original numbers. You only have finite storage space in a 32-bit float; same with 64-bit. So eventually, in some math, you will get a result that requires more precision than you have available. That is what they were referring to when talking about normalization being a process that removes information, and it is the basis of Bob Katz's statements. Your post ignores this completely.
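Concretely, in Python (16-bit is chosen arbitrarily here as the storage format, just to illustrate the truncation):

    product = 0.04523 * 1.86                      # 0.0841278..., more digits than either factor
    as_16bit = round(product * 32767) / 32767     # what survives if the result must live in a 16-bit sample
    print(product, as_16bit, product - as_16bit)  # the leftover difference is the information that gets dropped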

However as I pointed out above, normalization as done in Ardour does not suffer to the same degree as it would via a destructive process. However in either case the actual difference is VERY minimal.

Seablade

catraeus

@seablade ...

Here's a little number theory. 0.04523000000 has as much absolute precision as 1.86000000000 ... and the result, 0.08412780000, has exactly the same absolute precision (using the decimal analogy that you started). For exactly that answer, there was no round-off error if the storage mechanism was the 11-decimal-place system I just proposed. That is how such numbers are stored in the computer. The real problem, as Paul pointed out, is that the intermediate result of a long string of operations is what really needs the larger bit depth of storage. The Pentium class of processors uses a really big accumulator for some of its floating point, so Ardour, Cubase, ProTools, Logic, all of the plug-in developers etc. take advantage of this before they store back out to a newly generated intermediate track.

Really, the lost precision for a floating-point system (single-precision) is that the mantissa has only 24 bits, which is 122 dB of background noise for a 0 dBFS signal. So the processing noise gain comes up by these nano-scale amounts each and every multiply. The multiply is in any mix, and the rendering to a master-ready set of tracks has to do this whether you use non-destructive or destructive editing tools. It's literally nano-bits of noise addition, so it isn't heard.
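One way to put a number on those nano-scale amounts is to compare a single-precision multiply against a double-precision reference; this Python/NumPy sketch uses an arbitrary test signal and an arbitrary -2 dB gain, so the exact figure it prints will vary with the material:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.uniform(-1.0, 1.0, 48000)              # one second of full-scale test signal, double precision
    gain = 10 ** (-2 / 20)                         # an arbitrary -2 dB trim

    reference = x * gain                           # the multiply done in double precision
    single    = np.float32(x) * np.float32(gain)   # the same multiply done in 32-bit float

    err = reference - single.astype(np.float64)
    err_db = 20 * np.log10(np.sqrt(np.mean(err ** 2)) / np.sqrt(np.mean(reference ** 2)))
    print(f"round-off noise from one float32 multiply sits about {abs(err_db):.0f} dB below the signal")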

The only hearable (my word) processing noise comes from the tweaky filters we love to put in. They turn out to subtract two numbers from each other which are very close to each other. The numbers there leave only a small number of mantissa bits, which can't be recovered when the floating-point accumulator puts the mantissa back up to full scale. When I say tweaky, I mean a radically high-Q filter. The gentle filtering of a brightness/darkness or similar correction doesn't add this kind of noise. A high-Q filter has really obvious noise generation, which is frequency-shaped on top of that; it isn't white.

As a matter of course, I normalize all tracks immediately after capturing, then set levels in the mix. But when I do a good job of level-setting during capture, there were a few peaks that went over 0 dBFS and the normalizer gives up.

Now for the punch-line.

The reason that @kvk's mix changes radically between listening venues ... both Fletcher-Munson and patterning interference are IMHO the most likely culprits.

The apparent loudness of program material with differing frequency content will vary dramatically because of the non-linear loudness detection that our ears impose. The mix in the monitors will not sound the same as the mix in the car. The only way to fix this is to make your engineering monitor room shaped like a car, with motor and highway noises to boot. Or learn what gets boosted and ducked in the car compared to the studio.

Patterning interference is a nasty problem of multiple sources. You mentioned that you have two sets of monitor speakers. If they aren't very nicely managed surround channels, then the two lefts will cause focusing that makes nulls in frequency which change as you move around the room (likewise, of course, the two rights). Get rid of the second set of speakers (keep the Rokits!) and you will have a better listening situation at the monitors.

Finally, I personally agree with the stereo sourcing for a fatter, sweeter sound. I record classical guitar and have found a stereo arrangement for two source positions to give me the best ability to image the sound.

I would say, however, that for a layered highly-tracked, dubbed, produced-till-it-screams session this technique will get lost in the juicy sauce that is being cooked. The mono source material is MUCH easier to keep clean, while still providing panning. There are some really great panning-reverb plugins that make image placement downright fun from a mono source.

my 2 ¢.

b.t.w. Paul, 1/3 is quite by definition rational ... sqrt(2) is irrational.

catraeus

seablade

"Here's a little number theory. 0.04523000000 has as much absolute precision as 1.86000000000 ... and the result, 0.08412780000, has exactly the same absolute precision (using the decimal analogy that you started). For exactly that answer, there was no round-off error if the storage mechanism was the 11-decimal-place system I just proposed. That is how such numbers are stored in the computer. The real problem, as Paul pointed out, is that the intermediate result of a long string of operations is what really needs the larger bit depth of storage. The Pentium class of processors uses a really big accumulator for some of its floating point, so Ardour, Cubase, ProTools, Logic, all of the plug-in developers etc. take advantage of this before they store back out to a newly generated intermediate track."

You are correct; however, you also didn't demonstrate any floating-point operations that would require more precision, which, as you and Paul both stated, tends to happen with multiple operations stacked on top of each other. That is ALL I was trying to explain. Now, if you want to be technical, in any of the cases you mentioned none of the numbers NEEDS as much precision as 11 decimal places in a base-10 system. In a base-2 system, which is what computers work off of, they still don't need as much as you gave, but that isn't the point. My point was entirely to give an example of how a destructive normalization process COULD cause a loss of precision, in response to the poster directly above my post. If you read my points above you would realize I was not saying that it would be noticeable, and in fact in proper non-destructive systems (such as Ardour) it isn't an issue at all.

"The reason that @kvk's mix changes radically between listening venues ... both Fletcher-Munson and patterning interference are IMHO the most likely culprits."

There are many possible reasons, none of which have to do with the topic of normalization degrading signal quality, which was all I was focused on, and very few of which I could even guess at without knowing a lot more about the reproduction systems and acoustics in question. I could give likely candidates, of course (and yours are quite possible, but I would also add room acoustics as a major contributor), but to give accurate responses I would need much more information.

Seablade

catraeus

Right on @seablade. Thanks for the corrections and expansions.

The possible causes of the original problem are many, and the number-theory problems are both numerous and arcane, and not @kvk's problem.

My point is that normalization and mixing in general are not a sonic problem in any modern DAW. The only problems I ever see with sonics due to number systems come from tight filtering or other radical processing. Even after 100 destructive operations in IEEE 32-bit float, the noise floor due to the number system alone sits at -102 dBFS. Starting off with normalized inputs, healthy record levels, etc. stops that from becoming audible.

Finally, won't the 32-float vs. 32-int wars be just as fun as the analog vs. digital wars 8-o

Catraeus