44.1 vs 48 vs 88.2 vs 96

44 replies
joegiampaoli

I know this has probably already been discussed before, maybe over a thousand times....

Now that I am almost ready to start recording (moving on from the composition stage), I need a little help on this subject from the experienced users, or from those real audiophiles who can perhaps perceive spiritual voices.

I have often seen and heard that you should just record at 24-bit/44.1 simply because "it's all just going to end up on CD anyway", but that argument doesn't really convince me.

What I really need to know is: when mixing a song or working with an array of plugins at rates above 44.1, is there a sonic difference while working with the audio? Do plugins sound better? Is mixing easier? Do EQs respond better?

And then, when I downsample the song to 44.1, will the difference really carry over, or does it all get readjusted as if I had done it directly at 44.1?

Some say you should record and work at 88.2 simply because of the "theory" that, being exactly double 44.1, the downsample should be smoother and better. I find that one a bit hard to believe: whether it's precisely half or not (or even the golden ratio, if you like), it's still going through a resample process, just like scaling an image. It doesn't matter whether you scale to a half or a quarter; it should still look smooth at the end, I guess.

Then some say to do it at 48000, simply because you get a little extra headroom, you're still not wasting drive space, and you do hear a little more, and better, during the mix.

And then others (maybe 1%) say: "If you consider yourself, or want to become, good at mixing, do it at 44.1", simply because the power of a good-quality song lies in the mixing technique, not in all this computer nonsense. That last one really meant a lot to me, so I would like to see your thoughts.

So let's try to stick to 48000 vs 44100.

Can you really hear a difference WHEN WORKING at 48000 versus 44100, mostly during the mixing process? Do those EQs or plugins sound better?

And is it really reflected in the final resampled 44100 file?

Thanks to all.

dougal2

Hi Joe,

For years now I've worked only at 44.1 because of the majority of stuff being mastered to CD.

Not once in any part of recording, editing, mixing or mastering have I ever thought that the sample rate was a limiting factor in the quality of the recordings. What I mean is that an improvement in any one of:
- the performance
- mic choice / mic placement
- editing skills
- mixing skills
- time available
would have produced a vastly more noticeable difference in the final master than choosing to use a different Fs to start with.

I'm sure there are valid arguments for using other rates, but from my experiences so far, it hasn't really mattered that much.

seablade

If it is a choice between 48k and 44.1k, use whatever the end result is going to be. If you are going to be doing a fair amount of processing, there can be a benefit to working at 96k or 88.2k, depending on the processing algorithms involved. But honestly, most of what I have worked on recently is still recorded at 44.1k or 48k; there isn't too much of a difference.

Seablade

Ricardus

I've seen several double blind studies that show that no one (normal folk and audio engineers) can tell the difference between 16/44.1, 24/96, and live source material.

Having said that, in a real studio environment, with good equipment, I track to 24/88.2, and mix to 24/88.2.

joegiampaoli

Thanks guys. I'm just learning all this mixing stuff, which I'm still really a noob at, and I just want to make sure I start off on the right foot, rather than reach an important point in my progress and then find out I've been doing it all wrong.

@ seablade:

You said "there isn't too much of a difference" between recording at 44.1 and 48k, so in other words there is one, just not that much. What is this difference like? Is it ultra-high frequencies? Is it worth it? And can you hear it during mixing, or maybe EQ-ing?

Thanks!

Ricardus

If anyone can hear the difference between 44.1 and 48, they're either a genetically modified cat, or an alien.

seablade

@joegiampaoli

You will hear more of a difference from the quality loss of resampling between 48k and 44.1k than you will from using either of those sample rates from the get-go. In other words, Ricardus is pretty much dead on.

The reason for those two sample rates has nothing to do with any difference in audio quality, but rather with what they are being synced to. 44.1 kHz was chosen because it allows recordings to cover the audio spectrum up to 22 kHz, above the range of most people's hearing, if not everyone's. 48 kHz was used for syncing to film; I can't find a good article explaining exactly why, and I'm not sure I could explain it well from memory at the moment. The end destination should be the only determining factor in choosing between those two rates, so as to avoid resampling and the associated quality loss. Any benefit gained from working at 48 kHz will be more than overshadowed by the slight loss in quality from the resampling process.

Seablade
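To put a number on why 44.1-to-48 conversion is the awkward one: the two rates reduce to the ratio 147:160, while 88.2-to-44.1 is exactly 2:1. A quick sketch of the arithmetic in Python (this just illustrates the ratios, not how any particular resampler is implemented):

```python
from fractions import Fraction

# A rational resampler converts between rates by upsampling by L and
# downsampling by M, where L/M is the rate ratio in lowest terms.
ratio_44_to_48 = Fraction(48000, 44100)   # upsample by 160, downsample by 147
ratio_88_to_44 = Fraction(44100, 88200)   # exactly 1/2: plain decimation

print(ratio_44_to_48)  # 160/147
print(ratio_88_to_44)  # 1/2
```

So 88.2 to 44.1 is a clean 2:1 step, while 44.1 and 48 are related by 147:160, which is the kind of conversion where the small resampling loss mentioned above comes from.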

Ricardus

The reason 44.1 kHz was originally chosen as the sampling frequency for CD audio (20 Hz to 20 kHz frequency response) was that, at the time, the filters built to cut frequencies above 20 kHz needed a slope to work effectively and could not simply kill everything above 20 kHz. They needed the additional 4.1 kHz to roll off the higher frequencies. In an ideal world, the sampling frequency for CD audio would only need to be 40 kHz.

linuxdsp

@Ricardus: The choice of 44.1 kHz is also linked to the video field rate. In the early days of digital audio, hard disks had the required storage bandwidth to capture the audio but not the capacity, so video tape recorders were used to store the digital data. As such, an integer number of samples has to be contained in one video field, and you end up with:

Field rate × number of active lines per field × samples per line:

60 × 245 × 3 = 44,100 samples/s for 60 Hz video, since there are 245 active lines per field (490 lines per frame)

50 × 294 × 3 = 44,100 samples/s for 50 Hz video, since there are 294 active lines per field (588 lines per frame)
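The arithmetic is easy to verify (nothing here is specific to any audio software):

```python
# Samples per second = field rate x active lines per field x samples per line
ntsc = 60 * 245 * 3   # 60 Hz (NTSC-style) video
pal  = 50 * 294 * 3   # 50 Hz (PAL-style) video

print(ntsc, pal)  # 44100 44100
```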

Ricardus

Numbers. If I wanted to see numbers I'd go back to school. (that was a paraphrase of a Beavis and Butthead quote). :-)

But @joegiampaoli, record at 44.1 for now. It will be fine.

I work at 88.2 when I work digitally because it's a multiple of 44.1, and many of the mastering engineers I work with like to master at the higher sample rates. They either resample algorithmically, or recapture the audio in the analog domain through a high-end set of converters, to bring it down to 44.1.

Mastering engineers who cut vinyl always prefer high-res masters to work with.

rephil

The other consideration in choosing sample rates has nothing to do with software. An A/D or D/A needs to be built well to achieve its theoretical noise and jitter specs. An industrial-grade A/D at 16-bit/44.1 kHz may sound better than a consumer-grade box at double the sample rate, because of things like how well the power supply is decoupled, grounding, and so on. Even a lot of this matters less than it used to, since so much is done in monolithic ICs these days.

There are a couple of paper advantages to high sample rates for CD: a lot of the noise your DSP introduces (yes, it does introduce noise) will be in the octave above the band of interest and will get filtered out during resampling. The downside is that DSP at high sample rates takes a lot more horsepower, though even that isn't much of a concern today.

So my take (as a mastering engineer) - as long as you don't do anything *too* weird, the differences in sample rates won't make as big a difference in the recording as having good or bad lyrics will! ;)

joegiampaoli

Thanks guys, I've got a better understanding of all this now.

So I'll just keep it simple and straightforward: I'll record and work on my mixes at 24-bit/44100.

Cheers!

Azeroth

Hi everybody

Apart from the sample rate (44.1, 48, 88.2, 96...):

Can anyone tell me what the number of bits is all about? The "Red Book" standard for CD is 16-bit/44.1 kHz, I think. So if you record at 24-bit, does everything have to be converted down? Or not?

I'm always recording at 16/44.1 because I'm afraid of resampling.

Sorry for changing the topic. :-)

joegiampaoli

Well, I use 24-bit myself, for two reasons:

1) I do hear a very subtle difference when I record at 24-bit vs 16-bit, but it only affects certain instruments, not all. I noticed a fuller and richer sound in clean acoustic guitars, in electric guitars with overdrive and distortion, and maybe some synths. It's a bit hard to explain, but I could hear more chime and more harmonics. I discovered this when I started using Ardour. The difference is very, very subtle, though, and I have not yet tried exporting both versions to see whether it is really reflected in the final file.

2) I busted my butt getting my M-Audio Fast Track Pro to run at 24-bit under Linux, just for the heck of it. It was really hard to achieve, and I'm probably one of the first (maybe the first) to do it, not to blow my own horn, so I keep doing it out of respect for my hard work.

I have seen and read in many places that almost all studios work at 24-bit or 32-bit float, and then just convert down to 16-bit for the CD.

I just wanted to know if there was such a difference between 44.1, 48, etc.; that's why this post.

Don't worry about changing topic, I'd like to hear other opinions about your question.

So yes, I do hear a very, very small difference between 16 and 24-bit, but nothing out of this world: just at the high end of the range, and only some instruments seem to show it.

Again it could just be me and my imagination.......

Azeroth

I'm using a Presonus FP10 with FireWire (FFADO), and it works very, very well.

When I'm home from work I think I'm going to play around with JACK and Ardour, so maybe I can see/hear some effects on my recordings. Doesn't changing from 16 to 24-bit influence your latency? I still don't know how JACK calculates it. BTW, I never had problems with xruns with my settings, so I left everything as it was.

When changing from 16 to 24-bit, don't you get much bigger files?

Guess I'll have to learn some basics about recording and what all these numbers are about.
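On the file-size question: uncompressed PCM size is simply rate × depth × channels × time, so 24-bit files are exactly 1.5 times the size of 16-bit files at the same rate. A back-of-envelope sketch (ignoring WAV header overhead):

```python
def mb_per_minute(fs, bits, channels=2):
    """Uncompressed PCM data rate in megabytes (10^6 bytes) per minute."""
    return fs * (bits // 8) * channels * 60 / 1_000_000

print(round(mb_per_minute(44100, 16), 1))  # 10.6
print(round(mb_per_minute(44100, 24), 1))  # 15.9
```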

Seb

I use 48 kHz just because the jconv data files are at 48 kHz, so I can use the wonderful jconv reverb, even if my CPU hates me for it!

linuxdsp

@Azeroth: I would say that the number of bits used is probably more important than choosing between 44.1, 48 or 88.2 etc. If you make sure the signal uses most of the available resolution, you probably won't hear much difference just recording a file and listening back to it. The main advantage comes when processing the audio.
I would recommend that you record at the highest resolution (probably 24-bit), make sure the file is stored as a floating point file (especially if you bounce or export tracks), and only convert to 16-bit if or when you have to for CD. This means you keep the highest-quality digital representation of the sound, so that when you apply processing to the audio you do not degrade it noticeably through rounding errors etc.
There probably won't be any effect on latency: JACK latency is expressed in frames (samples), so the sample 'size' shouldn't affect it; it's still 'n' samples of latency. There may be a small change in the ADC conversion latency, the time the soundcard takes to compute the sample value, but this depends on the soundcard, the ADCs used, etc., and the difference is normally not worth worrying about.
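To put numbers on the latency point: JACK's period latency is just frames divided by sample rate, and bit depth appears nowhere in that formula. A quick sketch (the buffer sizes shown are just common examples):

```python
def period_latency_ms(frames, fs):
    """One JACK period of latency, in milliseconds: frames / rate."""
    return frames / fs * 1000

# Same frames and rate give the same latency whether samples are 16- or 24-bit
print(round(period_latency_ms(256, 44100), 2))  # 5.8
print(round(period_latency_ms(128, 48000), 2))  # 2.67
```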

calimerox

24-bit is like the "volume depth" of your recorded file: if you record everything close to 0 dB then with 16 bits you get the full 16 bits of depth, but if you record at a lower level, or you record something very dynamic, some quiet parts may effectively only have 8 bits of depth. Therefore it makes sense to use 24 bits.
The same goes for fadeouts: a digital fadeout at 16 bits has 2^16 steps from full volume to nothing. That sounds like a lot, but you can actually hear a "break" between very quiet and nothing, because hearing is logarithmic.
I think 24 bits always makes sense. Also, even though you can't burn an audio CD with 24 bits (the audio CD is quite an old standard in the digital world), you can already encode your FLACs at 24-bit.
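The "volume depth" point can be quantified: each bit adds roughly 6 dB of dynamic range, so 16-bit gives about 96 dB and 24-bit about 144 dB. A minimal sketch of that rule of thumb:

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20*log10(2^bits), ~6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))  # 96.3
print(round(dynamic_range_db(24), 1))  # 144.5
```

So a track recorded 48 dB below full scale at 16-bit has the effective resolution of roughly 8 bits left, which is exactly the quiet-passages point above.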

seablade

linuxdsp is completely correct on all counts. Record in 24-bit for a number of reasons, but mostly because it makes a HUGE difference to the noise floor in your recordings, and makes it much easier to avoid clipping while still keeping a decent noise floor. It will NOT affect your latency in any noticeable way (we are talking microseconds, much smaller than a millisecond, if that; maybe someone familiar with hardware AD/DA design can comment here). JACK uses 32-bit floats internally anyway, so you won't be doing yourself any favors by limiting it.

Seablade

philip8888

The "Head Room" has greatly improved around here:-}

peder

@seablade: my guess as to why it's 48 kHz would be that film is usually 24 fps, which makes things nicely divisible.

As for 44.1 vs 48, my take is that even if the difference might be practically inaudible, the accuracy of the resulting waveform when mixing multiple tracks together should make 48 kHz the right choice.
I don't think the downsampling introduces more inaccuracy than the rounding errors from mixing multiple 44.1 kHz tracks together. After all, libsamplerate is supposed to be very accurate.

And didn't your math teacher always tell you not to round until you had calculated the result?

That said, nowadays, with all the bitcrunching in mastering and the kids listening to it through crappy PC speakers, you could probably get away with 22kHz/8bit ...

Erdie

@Seb: I would suggest resampling the impulse responses to 44100 Hz and then working at 44100 for the whole project, instead of using 48 kHz just to be compatible with the 48 kHz impulses.
Working at 48 kHz means resampling your whole project at the end; resampling the impulses means you only resample the impulse file.

Just my opinion
Erdie

duffrecords

One of the advantages of 48 kHz (and especially 88.2 or 96) is that it reduces distortion and audible aliasing. To understand this, an explanation of the Nyquist frequency is necessary. The Nyquist frequency is the highest frequency that can be represented by a particular sample rate. Since the minimum number of samples required to represent a single frequency is two (one positive and one negative turn) a 20 kHz frequency would need at least a 40 kHz sample rate. We use sample rates such as 44.1 kHz because its corresponding Nyquist frequency is 22.05 kHz, roughly the upper threshold of human hearing (at least during early childhood).

So if 44.1 kHz completely covers the range of hearing, what's the problem? As a waveform approaches the Nyquist frequency, it is represented by fewer and fewer samples. Graphically, it begins to resemble a staircase, and at the Nyquist frequency itself (remember, only two samples) it is a pure square wave. If you've heard a square wave before you know how harsh and distorted it sounds.

Also, frequencies above the Nyquist frequency cannot be correctly represented (because the sample rate is not fast enough), so you end up with incorrect samples mixed in there that don't belong, creating phantom noise known as aliasing. Typically, low-pass filters are used to eliminate data above the Nyquist frequency before they are sampled but this is a gradual roll-off, not a brick wall filter.
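The frequency an out-of-band tone "folds" to is easy to compute: it is periodic in the sample rate and mirrors around half of it. A sketch of that arithmetic (an illustration only; real converters filter these components out before the ADC ever sees them):

```python
def alias_frequency(f, fs):
    """Apparent frequency after sampling a tone of frequency f at rate fs."""
    f = f % fs               # folding repeats every fs...
    return min(f, fs - f)    # ...and mirrors around fs/2

print(alias_frequency(26000, 44100))  # a 26 kHz tone shows up at 18100 Hz
print(alias_frequency(18000, 44100))  # in-band content is unchanged: 18000 Hz
```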

So even though a 44.1 kHz sample rate includes all the frequencies we can hear, the quality of the upper end of that range progressively degrades. This is the range where the sizzling upper harmonics of instruments such as cymbals and muted trumpets reside. Just ask any audiophile about the high-end response of vinyl versus compact disc. Most of us have been exposed to enough rock concerts and power tools to have permanently numbed that part of our hearing, so we don't really notice it. But when you consider that distortion is fed into each DSP effect you use and included in every calculation when channels are mixed, it gets distorted further and processed into all sorts of unexpected artifacts, thus amplifying the problem. By using a 48 kHz sample rate, we're shifting all of that chaos up and out of the range of human hearing. That's why it's so difficult to discern the difference between 44.1 and 48. What you're listening for is the absence of artifacts that occur near the Nyquist frequency, not the inclusion of higher frequencies that a typical human can't even hear.

If you start with 48 and stick with it throughout the tracking/mixing/mastering processes, you will avoid the unwanted artifacts inherent to digital recording. But if you're eventually downsampling it to 44.1 kHz for CD, does that negate all the measures you've taken to preserve the quality up to this point? I've heard one of my professors propose this argument but I will unashamedly admit I don't hear it, and I don't know anyone else who can. Ultimately, any degradation is probably more due to the quality of the anti-aliasing filter used prior to downsampling rather than the uneven relation between the two sample rates.

Then again, I've got a bit of tinnitus due to a rather unsportsmanlike paintball gun maneuver, so bear in mind my advice tends to be more theoretical than practical.

matt_fedora

Thanks duffrecords. I found a few videos on YouTube to illustrate this for the visually minded.

Short:
http://www.youtube.com/watch?v=Fy9dJgGCWZI

Long:
http://www.youtube.com/watch?v=VF2DHsJmf7s

From what I understand so far, you need exactly twice the sample rate of a frequency to be able to reproduce the square wave of that frequency, assuming that the samples are taken at peak and trough. If the sample is taken when the wave form intersects at 0, then the recorded wave from is flat. If the samples are taken in between peak and 0, then the wave form is recorded as a square wave with less amplitude. If you sample slightly above or below twice the sample rate of the frequency, then you get a horrible modulation added to the recording. This is often the case because we do not record perfect high frequencies in relation to the sample rate. In order to prevent aliasing artifacts from appearing in our recordings, we should probably sample at 5 times the highest frequency that we are recording for a lo-fi recording and much higher for a hi-fi recording.
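One part of this is easy to verify numerically: at exactly half the sample rate, the sampled amplitude depends entirely on where the samples land relative to the waveform's phase, which is why the sampling theorem requires the rate to strictly exceed twice the highest frequency. A sketch (the eight-sample count is arbitrary):

```python
import math

def sample_sine(freq, fs, phase, count):
    """Sample a sine of the given frequency and phase at rate fs."""
    return [math.sin(2 * math.pi * freq * i / fs + phase) for i in range(count)]

fs = 44100
at_zero_crossings = sample_sine(fs / 2, fs, 0.0, 8)          # every sample ~0
at_peaks          = sample_sine(fs / 2, fs, math.pi / 2, 8)  # alternating +1/-1
```

Both lists describe the same input tone, yet one set of samples is silence and the other is full scale; any rate even slightly above twice the frequency removes this ambiguity.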

linuxdsp

@matt_fedora: I think there are a few misunderstandings here...

The sampling theorem - which is what we are really talking about here states that:

A band-limited analogue signal that has been sampled can be perfectly reconstructed from an infinite series of samples if the sampling rate EXCEEDS 2B samples per second where B is the highest frequency component of the original signal.

So, when you talk about (re-)constructing a square wave at precisely 1/2 Fs where Fs is the sample rate, you are correct that it cannot be done, but that is because a square wave of frequency 1/2Fs has significant energy at frequencies far in excess of its fundamental frequency (that's what makes it a square wave and not a sine wave...) and thus violates the sampling theorem in this case.

In a completely band-limited system, an anti-alias filter at the ADC prevents significant energy at frequencies at or above 1/2Fs from entering the system. So, obviously, if a square wave close to 1/2Fs is presented to such a system then it won't look much like a square wave when it's reconstructed, but as previously stated, this is because such a wave only has that form because of its other harmonics. However, I would be surprised if most people can distinguish the difference between a square and a sine wave at such extreme frequencies (or even hear them at all), so the debate about whether the sample rate should be high enough to capture such extreme frequencies is largely irrelevant.

Problems can occur if a system creates signals with significant energy at or above 1/2Fs as part of its signal processing, as when these are reconstructed (in the DAC), they will violate the sampling theorem, and these frequency components will manifest as aliases, which will fold down below 1/2Fs and become audible. Simple DSP processing such as filtering or level changes will not normally cause this, but any type of wave-shaping (or clipping) almost certainly will, which is why most correctly designed DSP code will do this at a higher sample rate internally, and then decimate to the correct Fs using an appropriate filter before the output, thereby removing the aliases.
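The wave-shaping case is straightforward to demonstrate: hard-clipping a 15 kHz tone sampled at 44.1 kHz generates a 3rd harmonic at 45 kHz, which folds down to |45000 - 44100| = 900 Hz. A rough sketch in pure Python, measuring single DFT bins (the tone, clip level and one-second length are arbitrary choices for the demonstration):

```python
import math, cmath

fs = 44100
n = fs                                 # one second, so DFT bins land on whole Hz
tone = 15000

clean = [math.sin(2 * math.pi * tone * i / fs) for i in range(n)]
clipped = [max(-0.5, min(0.5, s)) for s in clean]   # crude digital hard clip

def bin_magnitude(signal, freq, fs):
    """Amplitude of the sinusoidal component at `freq` Hz (single DFT bin)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * i / fs)
              for i, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

# The clean tone has essentially nothing at 900 Hz; the clipped one does.
print(bin_magnitude(clean, 900, fs))    # ~0
print(bin_magnitude(clipped, 900, fs))  # a clearly measurable aliased component
```

The aliased component sits nowhere near a harmonic of 15 kHz, which is why this kind of distortion sounds so unmusical, and why well-designed wave-shaping code oversamples internally as described above.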

projectMalamute

Some bad information here. A sampled sine wave does not degrade to a square wave as it approaches the Nyquist frequency. That ignores the whole process by which audio is reconstructed from sampled data. Although it looks counterintuitive, even with only two data points per cycle a perfectly good sine wave can be reconstructed, right up to the Nyquist frequency.

Good information here: http://www.lavryengineering.com/documents/Sampling_Theory.pdf
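The counterintuitive part (recovering the waveform from barely more than two samples per cycle) can be checked numerically with truncated Whittaker-Shannon (sinc) interpolation. A sketch, with an arbitrary window of a few thousand samples; truncating the ideally infinite sum leaves only a small residual error:

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

fs, f = 44100.0, 21000.0
n = 4001
samples = [math.sin(2 * math.pi * f * k / fs) for k in range(n)]

# Reconstruct the waveform halfway BETWEEN two samples, near the window center,
# where the raw sample values look nothing like the original sine
t = (n // 2 + 0.5) / fs
reconstructed = sum(s * sinc(t * fs - k) for k, s in enumerate(samples))
expected = math.sin(2 * math.pi * f * t)

print(abs(reconstructed - expected))  # small: the sine comes back intact
```

With only about 2.1 samples per cycle the raw data looks like a heavily amplitude-modulated mess, yet the interpolated value still lands on the original sine, which is the Lavry paper's point.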

linuxdsp

@projectMalamute: I don't know if you are referring to my post or to other parts of this thread in general, and I would agree that others have posted incorrect information. However, what I was trying to say was NOT that a sampled sine wave degrades to a square wave; it does not. I was trying to point out that you can, as the sampling theorem states, perfectly reconstruct a band-limited signal from an infinite series of samples if the sampling rate EXCEEDS 2B samples per second, where B is the highest frequency component of the original signal.
A square wave with fundamental frequency 1/2Fs (where Fs is the sampling frequency) contains frequency components with significant energy far in excess of 1/2Fs, which is why a digital system will have trouble reproducing it correctly. A pure sine wave, however, can be properly reconstructed, provided its frequency is below 1/2Fs.

projectMalamute

@linuxdsp: It was not your post I was referring to. It was this:

"So if 44.1 kHz completely covers the range of hearing, what's the problem? As a waveform approaches the Nyquist frequency, it is represented by fewer and fewer samples. Graphically, it begins to resemble a staircase, and at the Nyquist frequency itself (remember, only two samples) it is a pure square wave. If you've heard a square wave before you know how harsh and distorted it sounds."

Which is a common and understandable misconception. It is not intuitively obvious that a sine wave can be reconstructed from so little data, but it really can. The Lavry paper I linked to is good reading on the subject. A sine wave at 21K sampled at 44.1k is absolutely reconstructed as a sine wave, not a square wave.

Another point: moving from 44.1 to 88.2 or from 48 to 96 gives you an octave of extra bandwidth. One could argue that this allows for a less artifact prone anti-aliasing filter and thus cleans up the top octave. Moving from 44.1 to 48 has no such advantage. This only extends your bandwidth a little over a semitone.

seablade

@linuxdsp: It was not your post I was referring to. This forum software does not seem to allow one to quote another's message. It was this:

Try using the BLOCKQUOTE tags as listed below the text entry;)

Seablade

peder

projectMalamute :
A sampled sine wave does not degrade to a square wave as it approaches the Nyquist frequency.

You've got it the wrong way around.
A square wave is constructed from a sine with added (higher-frequency) harmonics. So, by the theorem, the reconstructed analog signal of a sampled 20 kHz square wave will be more or less exactly the same as a sine, since all the harmonics that make up the square haven't been sampled in the first place.

projectMalamute

No, I really don't.

By the theorem a square wave sampled close to the Nyquist frequency would be reconstructed as a sine wave and a whole ton of inharmonically related garbage due to aliasing.

On the other hand, a sine wave sampled near the Nyquist frequency can be reconstructed as a sine wave, not as some sort of staircase or square wave as was being claimed.