FLAC, OGG, MP3, CDs: setting recording, mixing, compression, and printing levels

I’m having a lot of fun with Ardour, and I know I’ll never even scratch the surface because my needs as a singer-songwriter aren’t that complex. I’m stuck, however, on this issue.

I have two stereo tracks: one guitar and one vocal. In mixing, I set whatever plugins I have (EQ, some reverb, compression, and possibly a high/low-pass filter) and mix. I listen to the mix on two sets of speakers in the room (one little doinky pair and a larger pair of KRK Rokit 8 monitors). I then burn whatever mixes might potentially be final onto a CD and go listen to them in my car. The problem is that levels and mixes that sound good in the room change when I burn them to CD: levels that sound good and solid when playing a .WAV file in the room are barely audible when burned to CD, but when I try to raise the printing level (via the Master Output, the compressor output, raising both track levels, and/or normalizing the tracks), things shift and are no longer in balance, even if they sound that way on the speakers! (Everything sounds good on the headphones, so I try not to rely on that.)

This is a critical issue for me because I don’t make CDs anymore. I give away all my music for free on my website as FLAC, OGG, or MP3 files. Until recently I went into a studio, but it shut down, so I bought my own gear to do it in-house. Now I’m trying to figure out what reference I should use for these tunes before I release them. Most people are NOT going to burn the downloads to a CD; they will either listen to them on their computer or put them on an MP3 player or similar. I get occasional emails asking for CDs, but that’s secondary for now. So do I base my mix decisions on the digital sound or the CD, and why does it change?

GEAR:

Guitar mic: AKG C1000S
Vocal mic: Shure KSM44
Presonus FP10
AV Linux

I set my FP10 input channels to about 7, and the Master Gain to 7 as well. I record with the track levels around -30 dB and generally try to mix so that any peaks reach about -12 dB or so, with the vocal compressor set a few dB below that. But in trying to match everything up, I play around with all those levels quite a lot to see what I can find out.

Sorry for the length. Thanks!!

kvk

There’s so much to learn!! :-p

When I made my first commercial album, it was a tape (CDs were beyond the budgets of most of us back then) and the Mac Classic II hadn’t even come out yet!

Thanks for all the input!

Macinnisrr, I have the entire vocal, including the bridge, on a single track. I think I just sang it hotter.

I thought I had it set to levels comparable to some of my other tunes that had been done in the studio, but when I listened to it off the website today it was definitely much quieter. Time for more messing about!

You are right, roaldz, using a fader is a DSP operation, and in a strict sense that is in fact bad. This is why a lot of engineers who mix digitally recorded audio on an analogue board go out of their way not to change any of the levels in the DAW. Of course, most of us are mixing in the box, so a lot of signal processing is necessary. The goal is to use only the processing we need, at least if we want to maintain the highest audio quality. I find it extremely unlikely that someone who is mixing will normalize a track and then never adjust the volume of that track again for that session, though I suppose it is possible. In most cases, though, you are adding an extra, unnecessary DSP operation, and that’s just bad practice.

You are also correct in saying that a track with a highest peak level of -10 dB would be made louder by normalization, but Katz’s point is that the perceived loudness will likely not change much. What matters is the average level of the track. If that peak is, say, 20 dB louder than the average level of the track (yes, it is an extreme example), then raising the whole track by 10 dB is not going to make it sound that much louder. And in many cases, at least when recording live instruments, the peak is caused by some transient that is much louder than the average level of the track.
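To put rough numbers on that, here is a throwaway Python sketch with a made-up signal (nothing to do with Ardour’s internals, just the arithmetic):

```python
import numpy as np

sr = 48000
t = np.arange(sr) / sr

# Mostly quiet material with one short, loud transient.
signal = 0.03 * np.sin(2 * np.pi * 220 * t)   # the "average" body of the track
signal[1000:1050] += 0.3                      # a brief spike roughly 20 dB above the body

def dbfs(x):
    return 20 * np.log10(x)

peak = np.max(np.abs(signal))
rms = np.sqrt(np.mean(signal ** 2))
print(f"peak {dbfs(peak):.1f} dBFS, average (RMS) {dbfs(rms):.1f} dBFS")

# Peak-normalize: scale so the loudest sample hits 0 dBFS.
normalized = signal / peak
print(f"after normalizing: average (RMS) {dbfs(np.sqrt(np.mean(normalized ** 2))):.1f} dBFS")
```

The peak comes up to 0 dBFS, but the average only comes up by the same handful of dB and still sits way down around -24 dBFS, which is why the track does not suddenly sound “loud”. That is Katz’s point about the ear following average level rather than peaks.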

Touché! Thanks for the link, that thread is amazing. I’ve always tried to avoid extra steps in mixing, and as such have not normalized tracks for quite some time because I’d just have to turn them down later (basically to save a step and streamline), but I had no idea about printing at -18 just for the headroom alone. In fact, I have usually tried to print lower depending on how many tracks will be in the same frequency range (record bass guitar higher than the electric guitars I’m going to be layering) just to avoid touching the faders as much as possible. It’s nice to find out WHY that makes everything sound better, and to be sure, I’ll be keeping a closer eye on this in the future! Apologies for my tone, and thanks again for the wonderful info!!

seablade, no need to add a ‘sorry’ to your comment, this is all in the spirit of informative debate. You may very well be right about how normalization is done in Ardour; I personally am not versed enough in how Ardour works under the hood to disagree with you. I can say that, in many DAWs (including Cubase 4, which is the only other program with which I have real familiarity), normalization is a destructive process that immediately creates a new file, and this is possibly what Katz is referring to in the quote. I should also say that the quote is relatively old, about nine years, so take from that what you will.

It is still true, though, that normalizing quite often does not accomplish anything, due to the second point that Katz makes about the way the ear reacts to sound.

macinnisrr, I agree with your point on analogue vs. digital, and I want to make clear that this has little to do with that tired debate. We are just talking about the pros and cons of normalization. However, I have to take issue with your comments about digital copies, because we are not talking about making copies. When you process the signal in a DAW and eventually mix it down, you are creating a fundamentally different file that is going to have some level of degradation, even if it is relatively slight. Suggesting that correctly coded software can avoid this is, I think, putting a bit too much faith in digital processing.

Here is a link to the Tape-Op thread. It’s good stuff:

http://messageboard.tapeop.com/viewtopic.php?t=38430

Some of your post is confusing, so clarifying a few issues first is necessary in order to find the source of your problem.

First, I’m curious why you would be using stereo channels to do your tracking at all. If you’re recording from one source, in this case a single mic, then there is no need for stereo; just use mono tracks. This will also be less taxing on your system, since recording a mono signal to a stereo track is in effect recording two identical tracks when all you need is one.

It is also a little confusing that you mention you are setting the FP10’s preamps to ‘about 7’. It seems unlikely that your guitar mic would require the same gain as your vocal mic, and in any case the actual number the preamps are set to is not as important as having enough gain to get an appropriate signal strength. Also, I do not know what you’re referring to as a Master Gain. The Main level on an FP10 controls the strength of the signal being sent to the Main CR outputs used for monitoring (which carry the same signal as whatever is being sent to system output channels 1 and 2 via JACK). This knob will not affect the levels coming from your preamps. The Mix knob on the front of the FP10 controls whether you monitor the hardware or the software: turning it all the way to the left gives you the signal straight from the preamps, and all the way to the right gives you only the signal from your DAW. Either way, this knob does not affect the preamp levels either.

You state that you’re printing the tracks at around -30 dB, which seems very low. In the digital realm, and as long as you’re recording in at least 24-bit, you should probably be shooting for an average of -18 to -12 dBFS (if you’re recording in 16-bit, you’ll probably want to go louder). As for your peaks, all that matters is that none of them cause the signal to clip. If a few peaks get close to that 0 dBFS line, that’s fine.
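If you ever want to sanity-check an exported take outside of Ardour’s meters, something like this rough sketch works (it assumes the numpy and soundfile Python packages, and ‘guitar_take.wav’ is just a hypothetical file name):

```python
import numpy as np
import soundfile as sf  # pip install soundfile

# Hypothetical exported take; any WAV will do.
data, sr = sf.read("guitar_take.wav")
if data.ndim > 1:
    data = data.mean(axis=1)  # fold stereo to mono just for metering

peak_dbfs = 20 * np.log10(np.max(np.abs(data)) + 1e-12)
rms_dbfs = 20 * np.log10(np.sqrt(np.mean(data ** 2)) + 1e-12)

print(f"peak: {peak_dbfs:.1f} dBFS")
print(f"rough average (RMS): {rms_dbfs:.1f} dBFS")
# For 24-bit tracking, an average somewhere around -18 dBFS with peaks
# comfortably below 0 dBFS is the ballpark described above.
```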

It is difficult to quantify the statement ‘barely audible’ when talking about the levels coming from your CD. If you’re talking about the levels being much quieter than other CDs, then this is to be expected, since you need to do some appropriate mastering to get those ‘hot’ levels a lot of people have come to expect. However, if you mean that you actually can’t turn your radio up loud enough to get a listenable volume out of the CD, then there is something wrong with the way the CD has been created that has nothing to do with your recording, mixing and mastering levels.

Last, could you clarify what you mean by ‘things shift and are no longer in balance’? Do you mean the level of the vocals compared to the guitar will sound different when playing back from a CD than they did while mixing?

Sorry for not being clear!

On the FP10, each channel has its own gain, of course, but the unit as a whole has a MAIN knob, in addition to the HEADPHONES and INPUT/PLAYBACK knobs. Does this MAIN knob not affect the signal strength going into the DAW?

I was under the impression that recording stereo resulted in a “fatter” signal that could be panned, giving a more full and roomy feeling to the mix itself. This is perhaps another error?

Yes, -30 is a bit low; I wanted to err on the side of caution and experiment with an input that had a lot of headroom to be increased later. I am confused about the difference between the individual track levels and the Master Output level, which I increase until it sits around -10 or more to export the .WAV file. Should the Master Out not be used this way?

As for the CD level, I mean that I have to turn the car radio up to its full level to hear the CD at a decent volume. Now, by normalizing the tracks and cranking both tracks’ levels AND the Master Out, I can get a decent level, but it clips into the red and definitely goes above 0.

When I say things shift, I mean that a mix that sounds good in the speakers appears to lose some of its compression on the CD, so the bridge, which was fine in the speakers and phones, is suddenly far too hot on the CD and jumps out of the mix as uncomfortably loud.

Interestingly enough, when the CD levels are too low and I have to raise the levels in the car, those are generally the best mixes, unless I’m not hearing the hot bridge simply because the entire thing is too quiet.

Okay, that does clear a lot of things up.

The Main knob indeed does not affect the signal strength going into the DAW. That knob controls the strength of the signal coming out of the Main CR outputs labeled on the back of the unit. This is usually where you connect your monitors, and those outputs carry the same signal as whatever is being sent to system outs 1 and 2, as mentioned earlier (this signal is also sent to the Cue Mix outs, the headphone out, and, of course, the standard output channels labeled 1 and 2). Ardour will usually send the Master channel out through those automatically. If you are not doing this, I’m curious: where are you currently connecting your monitors?

You do not need to record in stereo in order to be able to pan a channel. As long as the channel has two outputs into the Master channel (which Ardour sets up automatically when creating mono tracks), you can pan. You’ll also want to set the channel’s meter to ‘post’ mode if you want to see the effect of the panning on the meter. Recording in stereo would, I believe, increase the loudness of the tracks because you’re playing back two exactly identical recordings at the same time, but there’s no reason that would give you better tone or fidelity, and it certainly uses up extra hard disk space and processing power. You could just as easily turn up one track.

As for tracking and mixing levels, I’ll try to write out a quick primer that might make things more clear. Your signal chain is going to look something like this:

Going in: Mic > FP10 > Ardour
Coming out: Ardour > FP10 > Monitors

When the mic signal hits the preamps, your goal is to boost the signal to the point where you are getting the most fidelity. The FP10 is designed, like all pieces of audio equipment, to sound its best at a certain signal strength, which is usually around 0 dBVU. The amount you can go above that point before getting digital distortion is the headroom, and in the FP10’s case I believe that headroom is something like 18 dB, although I’d have to look through the manual to be sure. So, while your -30 dB tracking level is certainly cautious, there is reason to believe you are not getting the best sound out of it, because your converter is designed for optimal sound somewhere in the -18 to -12 range. You really shouldn’t need any more headroom than that.
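If it helps to see the arithmetic, here is that relationship as a trivial sketch. The +18 dB calibration figure is the “something like 18 dB” guess from above, so check the FP10 manual rather than trusting it:

```python
# Assumed calibration: 0 dBVU on the analog side lines up with -18 dBFS
# at the converter (the exact offset varies from interface to interface).
DBFS_AT_ZERO_DBVU = -18.0

def dbvu_to_dbfs(level_dbvu: float) -> float:
    return level_dbvu + DBFS_AT_ZERO_DBVU

for vu in (-12, -6, 0, 6, 12, 18):
    print(f"{vu:+3d} dBVU  ->  {dbvu_to_dbfs(vu):+6.1f} dBFS")
# 0 dBVU (the converter's sweet spot) maps to -18 dBFS; +18 dBVU hits 0 dBFS (clipping).
```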

Once in Ardour, you’ll use your individually recorded channels to do some mixing, after which your master volume should generally be a bit louder, due simply to the fact that you’re layering sound. It should not be significantly louder, and in general you shouldn’t worry about the end volume at this point; the goal should just be to get the best-sounding mix possible. Also, I personally almost never use the Master channel fader to adjust the volume, since it is rarely necessary. Sometimes I will use it to turn down a mix that has gotten too loud during the mixing process.

After you export a mix, if you want that mix to sound louder but still good, you’ll need to do some mastering, which I won’t go into here. This is a handy link that should get you going, though: http://www.64studio.com/node/823

That tutorial will also explain what levels you might want to shoot for and why.

It’s not surprising that the mixes you attempted to make louder did not sound as good as the originals, because the processes you used (such as normalizing, which is always a no-no, or just trying to boost the overall levels even if they go into the red) would almost certainly maul the sound in unpredictable ways.

That’s all I’ve got time for. Hope this is helpful and good luck!

Thanks for the feedback and the link! I found an enormously long thread over on the Tape Op message board regarding recording levels and the difference between analog and digital gear and such; very similar info.

Yes, you are correct: my monitors are connected to the CR outputs on the rear of the FP10.

I’ll check out the link and read it over- thanks!!

And (maybe the tutorial will explain it) - why is normalizing a no-no?

Thanks again!

I was thinking of linking to that thread. It is somewhat of a classic on the Tape Op forums, and that’s pretty much where I got all the info in that last post.

As for normalizing, I’ll just reference good ol’ Bob Katz:

"I’ll give you two reasons. The first one has to do with just good old-fashioned signal deterioration. Every DSP operation costs something in terms of sound quality. It gets grainier, colder, narrower, and harsher. Adding a generation of normalization is just taking the signal down one generation.

The second reason is that normalization doesn’t accomplish anything. The ear responds to average levels and not peak levels, and there is no machine that can read peak levels and judge when something is equally loud."

There you go.

That’s not entirely true when your highest peak is around -10 dB. When you normalize that, it will just be 10 dB louder, so your peak is at 0 dB and you’re using all your headroom. This could also be done with a fader, but hey, that’s just another DSP operation, so that’s bad…?
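For what it’s worth, peak normalization is just computing one gain figure, the same number you could dial in on a fader. A minimal sketch of that calculation (illustrative only, not any DAW’s actual code):

```python
import numpy as np

def normalize_gain_db(samples: np.ndarray, target_dbfs: float = 0.0) -> float:
    """Gain in dB that would bring the highest peak up to target_dbfs."""
    peak_dbfs = 20 * np.log10(np.max(np.abs(samples)))
    return target_dbfs - peak_dbfs

# A test tone whose loudest sample sits at about -10 dBFS:
tone = 10 ** (-10 / 20) * np.sin(2 * np.pi * 440 * np.arange(48000) / 48000)
print(f"{normalize_gain_db(tone):+.1f} dB")  # ~ +10.0 dB, the same move a fader could make
```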

beejunk-

I disagree with some of the comments there, sorry.

Normalization can be done in a way that undoing it causes the signal to revert exactly back to its initial state. In fact, Ardour does a variant of this, I believe. It is not an ‘extra’ level of DSP that ‘degrades’ the signal at all, IMO. Yes, I am aware of who Bob Katz is, but I am also more than slightly aware of how the math works, and I strongly disagree with that statement as it is quoted. I would not be surprised to hear there is vastly more to that story than has been let on in this thread; I have not read the TapeOp thread, however.

   Seablade

Agreed. The whole point of digital media of ANY KIND is that reproduction does not degrade the copy. When a person makes a photocopy of a photocopy, THAT degrades with each generation, but if I copy a file on the computer an infinite number of times, it will stay the same (provided the software is doing its job correctly). Normalizing, moving a fader, panning, etc. should not degrade the signal if the software is coded correctly, just as copying a track to multiple other tracks should make PERFECT copies.

In fact, while I agree that analog recordings CAN sound better than digital, this is not inherent in the technology. It’s just that we’re so used to hearing the distortion (and I guarantee that’s what it is) that analog gear ADDS to the signal. Take tape, for instance. Many people, myself included, think that tape sounds “warmer” than a digital recording. That is absolutely not because tape is more accurate; the opposite is in fact true. The only reason we think tape sounds better is that it’s what we’ve been used to hearing on recordings for the last 100 years, and the recordings we’ve heard are our reference for how our own recordings should sound. Don’t get me wrong, I love analog gear and the sound it ADDS to a recording. But is analog more true? Is a mixing console more accurate than a DAW? Does a tube amplifier make an electric guitar sound cleaner than a direct feed? Absolutely not. It just sounds better because that’s what we’re used to. In several recent studies, college-age students and younger listeners PREFERRED the sizzle that MP3 compression at 128 kbps adds to a recording, when compared to an uncompressed recording. Does that make sense? Absolutely, but only when one considers that that’s what the people taking the survey are USED to hearing. But I digress…

Most likely the reason that normalization changes the level of the bridge in your song relative to the rest is that you have a separate region for the bridge on at least one track (or an additional track). So what happens when you normalize all the regions is that the bridge jumps out, because it had lower peaks than the rest of the song BEFORE normalization, and normalizing raises its peaks to the same level, thereby making the AVERAGE level of the bridge much higher than before. If you were to consolidate the regions into one new region BEFORE normalization, the result would set the peaks of the ENTIRE track to 0, keeping the relative loudness of the different sections of the song the same.
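Here’s a toy illustration of that difference with made-up numbers (this is not Ardour’s code, just the idea):

```python
import numpy as np

def normalize(region: np.ndarray) -> np.ndarray:
    return region / np.max(np.abs(region))

def rms_db(x: np.ndarray) -> float:
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

rng = np.random.default_rng(0)
verse = 0.8 * rng.uniform(-1, 1, 48000)    # hotter section, peaks near -2 dBFS
bridge = 0.2 * rng.uniform(-1, 1, 48000)   # quieter section, peaks near -14 dBFS

# Normalizing each region separately: both now peak at 0 dBFS, so the
# bridge comes up ~12 dB more than the verse and jumps out of the mix.
separate = np.concatenate([normalize(verse), normalize(bridge)])

# Consolidating first, then normalizing once: everything comes up by the
# same amount, so the verse/bridge balance is preserved.
consolidated = normalize(np.concatenate([verse, bridge]))

print("separate:     verse %.1f dB, bridge %.1f dB" % (rms_db(separate[:48000]), rms_db(separate[48000:])))
print("consolidated: verse %.1f dB, bridge %.1f dB" % (rms_db(consolidated[:48000]), rms_db(consolidated[48000:])))
```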

Sorry about the rant, but I’ve just heard this “analog is inherently better” argument too many times and I simply disagree.

"It is still true, though, that normalizing quite often does not accomplish anything, due to the second point that Katz makes about the way the ear reacts to sound."

Actually, that is another blanket statement I will disagree with. The concept behind what you are saying is indeed correct: for a track with large peaks that go well above the average level, or for a track that is largely dynamic, normalization’s usefulness is limited.

However, in my work at least, those two situations are not the case the majority of the time, especially when recording at 24-bit. When I was dealing with 16-bit recording this was more often true, but 24-bit has become the norm, and it allows people to record at lower levels more safely, without as much risk from the raised noise floor that 16-bit recording had. That increases the effect normalization will have in many cases, since there will often be more headroom left on a track simply for the sake of safety and preventing clipped peaks.

For the record, Ardour’s editing is almost entirely non-destructive. There are a few exceptions, but those are very specific exceptions rather than the rule. I believe this includes normalization (I haven’t double-checked this, mind you, just going off the top of my head), which these days will often work by simply amplifying the existing signal by X amount. That amplification then gets combined with any amplification from the fader, and if you move the fader down by the same X amount, you end up with a signal identical to what you started with. There is no degradation of the signal there in the digital realm, so long as these functions are properly implemented (and I am fairly certain Ardour’s are).

As has been mentioned, file system operations, copying files and reading files, have to be VERY dependable and produce identical data. A single bit being off when loading an executable of hundreds of megabytes will cause it not to run. So there should again be absolutely no degradation of the signal, even if a new file is created during the normalization process. The new file, if modified in the negative direction by the same amount of gain the normalization applied, should be exactly identical to the source. So even in the destructive process you alluded to, normalization should NEVER affect the quality of the signal.

Normalization is just a form of amplification. If implemented correctly and occurring strictly in the digital domain, it should result in a signal that is only mathematically louder; no other degradation should have occurred. So if you then take that signal and modify it in the negative direction with another amplification process (i.e. a fader) by an identical amount, you will end up with a mathematically identical signal. Again, this is all dependent on the two processes being implemented correctly (and no plugins in the line), and again, I am fairly certain Ardour has those processes implemented correctly.
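If you want to see the scale of the numbers involved, here is a quick check in plain 32-bit float math (my own toy sketch, not Ardour’s gain code):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-0.5, 0.5, 1_000_000).astype(np.float32)  # stand-in for a recorded track

peak = np.max(np.abs(x))
gain_up = np.float32(1.0) / peak    # "normalize": bring the peak to 0 dBFS
gain_down = peak                    # the fader move that undoes it

roundtrip = (x * gain_up) * gain_down

print("bit-identical:", np.array_equal(roundtrip, x))
print("worst-case difference:", np.max(np.abs(roundtrip.astype(np.float64) - x.astype(np.float64))))
# Any difference is on the order of one float32 rounding step, roughly -140 dBFS
# relative to full scale -- far below anything a converter or monitor can reproduce.
```

So whether the round trip comes back bit-exact or off by the last bit of the mantissa, it is nowhere near anything you could hear.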

      Seablade

Also take a read through the Gearslutz forums; specifically, read http://www.gearslutz.com/board/so-much-gear-so-little-time/420334-reason-most-itb-mixes-don-t-sound-good-analog-mixes.html

But do search through these; while the advice is all over the board, there are some real gems in there.

Seablade, I see what you’re saying about the processing, but it clashes with a whole lot of things that I’ve heard and read. I’ll add another quote from The Mastering Engineer’s Handbook, which is where that previous Bob Katz quote was taken from (by the way, it might clarify things to point out that Katz’s statements about the usefulness of normalizing were in reference to mastering only, and not any other stage of recording):

"Even the smallest adjustment inside the DAW causes some massive DSP recalculations, all to the detriment of the ultimate sound quality."

It then goes on to make the point about average listening levels. My question to you is: is this incorrect, and if so, why? This is a very widely held belief that I’ve read in many places. And I understand that the goal of the software is to not degrade the signal, but I don’t want a response that amounts to saying the numbers add up; I want to know whether people have confirmed that the actual sound of the signal has not degraded. This isn’t meant to be combative, and I’m not even sure how you would go about confirming such a thing. But a lot of very experienced people believe that the audio is degraded by any digital processing, and if that is not the case, it would be good to know. If I have time after work, I’ll do a little more extensive Googling on it, but if someone on this forum has already researched this, that would be quicker! :slight_smile:

As far as the usual levels one sees in mixes, my experience has been quite different from yours. Most projects I’ve worked on have had recorded levels that were either in the correct range or WAY too loud, making normalization useless to me. There really is a widespread misconception that you should be recording as loud as possible. Perhaps it’s all those rock ’n’ roll bands I deal with.

Hmmm… while I’ve never done a side-by-side comparison of different levels, it’s hard for me to argue with either point. It makes perfect sense to me now (I hadn’t even thought of it before) why the mixing board I use to feed my DAW shows peaks at 0 dB (or rather 0 dBVU) while the tracks all appear in software as being WAY below that (around -20). On the one hand, I completely agree that one must not overdrive the analog portion of the signal past 0 dBVU (roughly -18 dBFS); on the other, I also agree with Seablade in that I can’t understand why normalizing a level in the box (where even if it DOES clip, it really doesn’t, due to 32-bit floating point) would possibly result in a degradation of the signal. Seablade is absolutely right in saying that Ardour does not edit destructively when normalization takes place. Just look at the interchange folder in your project and you’ll see that the file remains the same, and no extra CPU cycles are used when playing back a normalized file (which suggests no processing is taking place in real time). Given these two facts, one must assume that the math is correct and no actual processing is taking place (even compared to running a gain plugin). I completely agree that some people (especially those without metering OUTSIDE the box) record their signals far too hot to begin with, but it’s hard for me to concede that EVERY DAW maker (PT, Steinberg, MOTU, Ardour, Apple, SAW, etc.) has for the last twenty years been providing us with meters in their software that are, in effect, completely wrong. But then again, stranger things have happened :wink:

Not feeling well, so I just called in sick to work and am drinking tea while reading through articles on DSP. :slight_smile:

Specifically, this article: http://www.masterworksound.com/Signal%20Degradation.pdf

Here’s the money quote. It’s a long one, sorry:

"This unavoidably destructive component in DSP can be largely explained through a discussion of bit depth and word length. In the DSP environment, a computer's processor conducts all calculations to a pre-set point of precision, which is computed every time there is a sample in the signal (discussed later). To define this pre-set point of precision, bit depth is used, which is a measure of the number of binary digits used to compute a 'digital word'. An audio signal that has a bit depth of 16 then has its digital words computed to 16 binary digits (and no more beyond that); likewise, an audio signal of bit depth 24 has its digital words computed to 24 binary digits, and therefore has greater precision. When a higher bit depth is used, a computer's processor is able to conduct every DSP operation (such as adding gain, reverb, or compression) to a longer word length relative to using a shorter bit depth. Practically speaking, every DSP operation yields some sonic degradation, because after conducting many DSP operations, the intended word length of the audio signal will be truncated to the actual word length of the audio signal, thus altering the waveform of the audio signal in an unintended way. Over time, this cumulative effect can have a negative, audible result, but by using a longer word length this degradation, which is really just data truncation, can be kept to a bare minimum."

This seems to suggest that even adding gain, such as in normalization, can cause a minute amount of degradation. I am not even remotely an expert when it comes to programming, though, so as far as I know this could be way off base. Thoughts, anyone?
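To put a rough number on what that truncation amounts to, here is a toy Python sketch I threw together (my own back-of-the-envelope experiment, not anything from the article):

```python
import numpy as np

def truncate_to_word(x: np.ndarray, bits: int) -> np.ndarray:
    """Round a signal in [-1, 1) onto an integer word of the given width."""
    scale = 2 ** (bits - 1)
    return np.round(x * scale) / scale

rng = np.random.default_rng(2)
x = 0.5 * rng.uniform(-1, 1, 1_000_000)  # stand-in for a mixed signal

for bits in (16, 24, 32):
    err = x - truncate_to_word(x, bits)
    err_dbfs = 20 * np.log10(np.sqrt(np.mean(err ** 2)))
    print(f"{bits}-bit word: truncation error around {err_dbfs:.0f} dBFS")
# Every processing pass that ends by shortening the word adds another layer of
# this error; longer internal word lengths keep the accumulated damage tiny.
```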

@beejunk: Any numerical process inside the CPU will be limited to a finite degree of precision, even using floating point, and as such even a simple multiply operation (which is all that is happening when you change gain) will lose some of the original precision. However, in the case of 32-bit floating point this is likely to be so minute as to be inaudible, especially when you consider that by far the greatest cause of signal degradation is the conversion back to analogue voltages in the sound card, and then ultimately to vibrating paper cones attached to coils of wire in order to make the sound. The DSP processes most likely to suffer from rounding errors and the like are recursive processes such as IIR filters, where a small rounding error can accumulate over successive passes around the filter.
In simple terms, I think you can consider a simple gain change in the floating-point domain as not introducing any degradation to the signal. However, you should try to ensure the signal makes the best use of the D/A converter's resolution (even at 24-bit), since this is generally where the greatest degradation occurs.

@beejunk: whenever you perform a mathematical operation on two numbers, you get a result. There are two versions of that result: the theoretical answer, and the answer you can store in a piece of computer memory in a given format. In an ideal world, these are never different. In the real world, they often are. Take a truly trivial case: 1/3. Store the result of this division in a floating-point (or double-precision floating-point, or fixed-point) value, and you do not have the answer totally correct, because 1/3 has no exact binary representation. Obviously, the error is extremely small, but there is an error. Now take the denominators on either side of 3 and do the same: 1/2 and 1/4 can both be represented with 100% accuracy in any binary computer. Next, imagine these three results as a number sequence. The correct answers stand in the exact ratios 6:4:3 (i.e. 1/2 : 1/3 : 1/4). However, because the stored 0.33333… is truncated, what you end up with is something very slightly different. If this represented a waveform, you have introduced distortion, although arguably inaudible distortion.
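You can see this directly in Python, which can print the exact value a float actually stores (a throwaway demo, nothing to do with audio code):

```python
from decimal import Decimal

for denominator in (2, 3, 4):
    value = 1 / denominator
    # Decimal(float) shows the exact binary value that was actually stored.
    print(f"1/{denominator} is stored as {Decimal(value)}")

# 1/2 and 1/4 print exactly; 1/3 prints as a long string of digits slightly
# below the true value, so the stored ratios are not quite the intended ones.
```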

There are a lot of operations that can produce numbers that can’t be represented with absolute fidelity, and there is every chance that even in just applying gain you will introduce tiny distortions like this. Those who believe in the magic of analog point to such distortions as one of the reasons to be wary of digital (conveniently ignoring errors two or even three orders of magnitude greater in the analog domain). They are not wrong that, from a purely mathematical perspective, these distortions do exist. Whether they represent anything that can be detected by any human being is another story entirely.