Pitch Correction options

I’m converting some commercial cassette recordings to CDs and I’m trying to verify the pitch accuracy of the recordings against the actual musical scores. I don’t know how accurate the original studio equipment was (producers sometimes time-compress to ‘fit’ the recordings onto the media), nor do I trust the absolute accuracy of the TASCAM 302 deck I’m using.

I recorded the cassettes at 24/96 using my new TASCAM DR-40 digital recorder, and uploaded the resulting files into WL7, one file per side of cassette. My strategy is to correct for mechanical anomalies first and performance problems second.

Here is the process so far, and where I ran aground:

  1. I noticed that the left channel was lower than the right one, so I used Process->Change Level->Find Current Peak Level on each channel to determine the overall level difference between them. I then boosted the lower one by that difference so both are roughly equal. Since this is a single overall adjustment, any balance in the mixdown (instrument location on stage) should still be valid.

  2. I selected two passages from two songs where the performers sustained one unison note. #1 was a flute, #2 was a children’s choir. I created two new audio files and pasted the copied ‘phrases’ into them. I checked for DC offset (Analysis->Global Analysis->Extras->Find DC Offset) and found it was 10 dB for one and 14 dB for the other. So each recording has a different offset and will need to be corrected separately.

  3. Having corrected the DC offset, I then looked at the pitch of the single tones. There appear to be two places to get this information. The first is Analysis->Global Analysis->Pitch tab, which provides the average pitch of the left and right channels (even when only one channel is being analyzed!) and they are different even with this sample of a single tone (F#5 and D#5). The second is Process->Pitch Correction->Find current pitch of audio selection (E1). Since it reports E1/-16 cents, I chose to ‘correct’ it for this note. When I select the pitch E1 and have it calculate the correction, it comes up +16 as expected (auto/preview, best, correct formants, Modulate formants correction). When I check the pitch again using the same tool, it tells me E1 +2 cents. I can go -2 and it will be E1 -1 cent, etc. It does everything but land right on 0 cents … Why?
    Am I correct in assuming that I can determine the pitch difference (i.e. +16) and just apply it to the entire file to correct the overall pitch variance from the cassette speed differences? Or does this kind of variation usually go all over the place depending on the skill of the performers and trying to correct it universally is a waste of time?
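
For reference, a correction in cents maps to a playback-speed (frequency) ratio of 2^(cents/1200). The sketch below is plain arithmetic, not anything WaveLab exposes; the function names are made up for illustration.

```python
def cents_to_ratio(cents):
    """Frequency (and tape-speed) ratio for a pitch shift given in cents."""
    return 2.0 ** (cents / 1200.0)

def speed_change_percent(cents):
    """The same shift expressed as a percentage change in playback speed."""
    return (cents_to_ratio(cents) - 1.0) * 100.0

# The +16 cents reported for the E1 example in step 3
print(f"ratio {cents_to_ratio(16):.5f}, speed change {speed_change_percent(16):+.2f}%")
```

So +16 cents is a speed change of a little under 1%, provided the error really is one constant tape-speed offset.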

What is the difference between the Pitch Correction analysis, the Global Analysis Pitch tab, and the Pitch Quantizing choice under Process? Which should be used where?

I also noticed that artifacts accumulate if you correct the pitch up and down 1 cent repeatedly. The effect is a percussive effect at the start of the sample. Why? It seems to me that this should be simple addition/subtraction of numbers from the frequencies.

What should I be doing instead? I tried this approach because I know I have voice recordings from old radio programs/dramas that are recorded at incorrect speeds and need correction as well. So, I need to understand the preferred process.

Thanks in advance for your help.

First of all, start by removing the DC offset before doing anything else to the audio. Otherwise you’d have to do it again.
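
As a rough illustration of what DC offset removal does (a minimal sketch, assuming samples held as floats in a plain list; this is not WaveLab’s implementation): the offset is the mean sample value, and removing it means subtracting that mean from every sample.

```python
def dc_offset(samples):
    """Mean sample value; zero for a signal centred on the axis."""
    return sum(samples) / len(samples) if samples else 0.0

def remove_dc_offset(samples):
    """Return a copy of the samples with the mean subtracted."""
    offset = dc_offset(samples)
    return [s - offset for s in samples]

# Hypothetical samples riding on a +0.25 offset
shifted = [0.25, 0.75, 0.25, -0.25]
print(dc_offset(shifted), remove_dc_offset(shifted))
```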

As for finding the current pitch, measure a couple of places in one song to see how far off it is, and then correct once for the average. The measurement cannot be exact because every note also contains harmonics and room acoustics, so don’t try to reach perfect pitch; it’s impossible with such source material. Another thing is that this will only work if the source recording was at least somewhat constant: if the speed of the original tape fluctuated, you can’t correct it this way, of course. How the separate pitch processes in WL differ I wouldn’t know without being in front of it…
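
The "measure a few places, correct once for the average" idea can be sketched as follows. It assumes equal temperament with A4 = 440 Hz and that each measured note is less than 50 cents from its intended pitch; the measured frequencies are invented for illustration.

```python
import math

A4_HZ = 440.0  # assumed reference tuning

def cents_from_nearest_note(freq_hz, a4_hz=A4_HZ):
    """Deviation in cents from the nearest equal-tempered note."""
    semitones = 12.0 * math.log2(freq_hz / a4_hz)
    return (semitones - round(semitones)) * 100.0

def average_correction(measured_hz):
    """Single correction in cents that zeroes the average deviation."""
    deviations = [cents_from_nearest_note(f) for f in measured_hz]
    return -sum(deviations) / len(deviations)

# Hypothetical readings from three sustained notes in one song
readings = [436.0, 654.5, 872.3]
print(f"apply {average_correction(readings):+.1f} cents once")
```

If the individual deviations disagree by much more than the measurement noise, that points to a speed that was not constant, or to performers who were not exactly on pitch.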

In general, depending on how much you intend to do to this audio, I’d say use as few processing steps as possible (one pitch correction, one level correction, etc.). If you need to save in between passes, use the highest resolution: 32 bit float files.

Perhaps someone will correct me if I’m wrong, but …

… if the reason why the pitch is incorrect is that a tape machine was running at a slightly wrong (but constant) speed, I’d suggest it would be best to use a type of pitch correction that corresponds to playing the sound file at a slightly higher or lower speed, rather than a method that (chops the sound into short segments and) alters the pitch without varying the duration. As well as (I think) causing fewer pitch-change artifacts, a “speed-change” method would also adjust the duration of the playback, correcting a slightly wrong duration arising from the wrong tape speed.
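
A "speed-change" correction amounts to resampling: reading through the samples slightly faster or slower and playing the result at the original sample rate. The sketch below uses linear interpolation purely to keep it short (real tools use much better interpolation filters), and `change_speed` is a hypothetical name.

```python
def change_speed(samples, ratio):
    """Resample so playback is `ratio` times faster.

    Pitch rises by `ratio` and duration shrinks to 1/ratio of the original.
    Linear interpolation is used for brevity, not for quality.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    out = []
    for i in range(int(last / ratio) + 1):
        pos = i * ratio                  # read position in the source
        j = min(int(pos), last)
        frac = pos - j
        nxt = samples[min(j + 1, last)]
        out.append(samples[j] * (1.0 - frac) + nxt * frac)
    return out

ramp = [float(n) for n in range(11)]
print(change_speed(ramp, 2.0))  # half as many samples: twice the speed, one octave up
```

Because pitch and duration move together, one pass fixes both, which is what a constant tape-speed error calls for.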

And (in agreement with Arjan P) even if you do run some trials involving a sequence of pitch changes until you find the correct amount, don’t keep the result in your final version - start again and do the whole pitch change in one step, to avoid the cumulative errors that you’ve noticed.

You asked why artifacts accumulate with repeated adjustments - that’ll be because pitch change involves altering the wave data, and the process is inherently one that causes artifacts with each change (especially when the duration is unchanged(?)). So, for instance, moving the pitch up and then back down won’t precisely recreate the original data.
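
A small experiment shows that even the simplest kind of pitch change is not exactly reversible. This is only a sketch using linear-interpolation resampling on a synthetic 1 kHz tone; it does not reproduce WaveLab’s algorithm or the percussive artifact described above, only the fact that up-then-down does not return the original samples.

```python
import math

def resample_linear(samples, ratio):
    """Read `samples` in steps of `ratio` using linear interpolation."""
    last = len(samples) - 1
    out = []
    for i in range(int(last / ratio) + 1):
        pos = min(i * ratio, float(last))
        j = min(int(pos), last - 1)
        frac = pos - j
        out.append(samples[j] * (1.0 - frac) + samples[j + 1] * frac)
    return out

def round_trip_error(samples, cents, passes):
    """Shift up then down by `cents`, `passes` times; return the worst sample error."""
    ratio = 2.0 ** (cents / 1200.0)
    work = list(samples)
    for _ in range(passes):
        work = resample_linear(work, ratio)        # pitch up
        work = resample_linear(work, 1.0 / ratio)  # pitch back down
    return max(abs(a - b) for a, b in zip(samples, work))

rate = 96000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / rate) for n in range(960)]
for passes in (1, 5):
    print(passes, "round trip(s), worst error:", round_trip_error(tone, 1.0, passes))
```

The error is nonzero after a single round trip, because every pass computes new sample values by interpolation rather than adding or subtracting a number from a stored frequency.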

Anyway, before tweaking the pitch/duration of the recording, I wonder whether you ought, perhaps, to be confident that the pitch of the performances wasn’t slightly off standard?

Yes, you’re right chase, of course the length of the recording is involved. But you either use Time Stretch in WL without keeping pitch, or you use Pitch Correct without maintaining time. I think Time Stretch works better actually, but you have to experiment a bit with it (and use undo to go back every time!).

Thanks arjan and chase for your replies, I did find the info useful. I changed my process to always do DC offset correction first.

I don’t understand why:

A) Two pitch correction tools give completely different results for the same file

B) Corrections to pitch using an absolute number as a goal can’t achieve the target frequency

C) How, in this day and age, I’m not provided with a tool that will adjust the overall speed of a recording to achieve a specific target pitch change (it is just a calculation after all), rather than forcing me to spend endless hours of trial and error divining the speed change needed to shift an overall recording’s resulting pitch by a certain amount.

The recording I’m working with at the moment is a final mix of three separate recordings (flute/organ, then recorders, then 3-part children’s choir), so each of the sections may require different correction, even if I correct for the overall speed of the tape.

We’re dealing with music, something that has well-defined physics behind it - at least terminology (pitch/frequency/formants/harmonics) that are well documented. I guess I’m hoping to have a tool with a selection of ‘how’ to make the pitch correction (vary frequency vs change speed of recording).

The pitch recognition only works if the pitch is the same from start to end and does not vary. I guess that is not the case here. Otherwise, isolate a small chunk and analyse that part only.

A) Don’t know about this without more info, but it’s probably due to different approaches and algorithms.
B) It can, but not with such source material. As a test, create a pure 440 Hz sine wave in WL, then change the pitch: you’ll see it will be 100% accurate.
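
The sine-wave test can also be tried outside WL. The sketch below uses a deliberately naive zero-crossing detector, which is an assumption for illustration and certainly not what either WL tool does. It reads a pure 440 Hz sine correctly, but is fooled into reporting the octave above once a strong second harmonic is added, which is one plausible way two analysis methods can disagree on real instruments.

```python
import math

RATE = 96000  # matches a 24/96 transfer; any rate would do

def tone(freq_hz, seconds, second_harmonic=0.0):
    """Sine at freq_hz, optionally with a second harmonic of the given level."""
    w = 2.0 * math.pi * freq_hz / RATE
    return [math.sin(w * i) + second_harmonic * math.sin(2.0 * w * i)
            for i in range(int(seconds * RATE))]

def zero_crossing_freq(samples):
    """Estimate frequency from the spacing of upward zero crossings."""
    crossings = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < 0.0 <= b:
            crossings.append(i - 1 + (-a) / (b - a))  # interpolated position
    if len(crossings) < 2:
        raise ValueError("need at least two upward zero crossings")
    return (len(crossings) - 1) * RATE / (crossings[-1] - crossings[0])

print("pure sine:     ", zero_crossing_freq(tone(440.0, 0.5)))
print("with harmonic: ", zero_crossing_freq(tone(440.0, 0.5, second_harmonic=1.5)))
```

The first reading comes out very close to 440 Hz; the second comes out close to 880 Hz, even though the fundamental has not changed.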

Yes, I agree a bit more could be done in this department, especially in the Time Stretch dialog it would be nice to enter a source and target pitch and have the time stretch value (speed of the recording) based on that. Might be something for the Wavelab 8 feature request topic.
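
Until such a field exists, the stretch value can be worked out by hand. A minimal sketch, assuming equal temperament, A4 = 440 Hz, single-digit octave numbers, and the octave numbering in which A4 is the A above middle C (WL may number octaves differently, but the ratio is unaffected as long as source and target use the same convention); the function names are hypothetical.

```python
NOTE_OFFSETS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def pitch_to_hz(note, cents=0.0, a4_hz=440.0):
    """Frequency of a note such as 'E1' or 'F#5', offset by `cents`."""
    name, octave = note[:-1], int(note[-1])
    midi = 12 * (octave + 1) + NOTE_OFFSETS[name]
    return a4_hz * 2.0 ** ((midi - 69 + cents / 100.0) / 12.0)

def stretch_percent(source, target):
    """New duration, as a percentage of the original, for a speed-change
    correction that moves `source` to `target` (each a (note, cents) pair)."""
    return 100.0 * pitch_to_hz(*source) / pitch_to_hz(*target)

# The reading from the original question: E1 at -16 cents, to be moved to E1
print(f"{stretch_percent(('E1', -16.0), ('E1', 0.0)):.2f}% of the original length")
```

A value below 100% means the file gets shorter, that is, it plays faster and therefore higher.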