Chipmunk voices come from speeding up audio. The cartoon chipmunks were sung by men at a low pitch and a slow tempo. When the tape was played back at high speed, the pitch and tempo went up together, and the formants went up with them.
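To put rough numbers on that, here is a minimal Python sketch of what a plain speed-up does. The function name and the sample figures (a 110 Hz sung note, a formant near 700 Hz) are made up for illustration. The only assumption is that faster playback multiplies every frequency by the same factor and divides the duration by it.

```python
import math

def speed_up(factor, duration_s, pitch_hz, formant_hz):
    """Describe a recording after playback at `factor` times normal speed.

    Hypothetical helper: every frequency in the recording is multiplied
    by `factor`, and the duration is divided by it.
    """
    return {
        "duration_s": duration_s / factor,       # tempo picks up
        "pitch_hz": pitch_hz * factor,           # the sung note rises
        "formant_hz": formant_hz * factor,       # the formants rise with it
        "semitones": 12 * math.log2(factor),     # the shift in musical terms
    }

result = speed_up(2.0, duration_s=10.0, pitch_hz=110.0, formant_hz=700.0)
```

At double speed everything lands an octave up and the take lasts half as long. Nothing in a plain speed-up lets you move the pitch without dragging the formants along.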
Formants are the key.
[Formants are the near continuum of overtones that appear in the kHz range of a note. The harmonic series goes something like: root, root, 5th, root, 3rd, 5th, flat 7th, root, 2nd, 3rd, etc., with the notes getting closer and closer together as you ascend, because the series increases linearly in frequency while the musical scale increases exponentially. It doesn't take long for elements of the series to land on microtones grouped closely together. Then you get a near continuum of frequencies.]
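The crowding is easy to check with a little arithmetic. This is a rough Python sketch, assuming equal temperament and rounding each harmonic to the nearest semitone, so the interval names are approximations (the 7th harmonic, for example, is noticeably flatter than a tempered flat 7th). The function names are my own.

```python
import math

# Interval names by semitone distance above the root (equal temperament).
NAMES = ["root", "flat 2nd", "2nd", "minor 3rd", "3rd", "4th",
         "tritone", "5th", "flat 6th", "6th", "flat 7th", "7th"]

def interval_name(n):
    """Nearest scale degree for the n-th harmonic (n = 1 is the root)."""
    semitones = 12 * math.log2(n)
    return NAMES[round(semitones) % 12]

def gap_in_semitones(n):
    """Distance in semitones between harmonic n and harmonic n - 1."""
    return 12 * math.log2(n / (n - 1))

def first_gap_below(limit):
    """First harmonic that sits less than `limit` semitones above the previous one."""
    n = 2
    while gap_in_semitones(n) >= limit:
        n += 1
    return n

first_ten = [interval_name(n) for n in range(1, 11)]
crowded_from = first_gap_below(1.0)
```

The first ten names come out in the order listed above, and by the 18th harmonic the neighbours are already less than a semitone apart. From there on the spacing only gets tighter.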
Pitch shift tools don’t (or shouldn’t) move the formants, just the lower harmonics. Some kind of sampling is required, a bit like what happens in granular synthesis. It’s a much more elaborate process than merely speeding up time.
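A toy model shows the difference. In this sketch a voice is just a stack of harmonics whose loudness is set by a single bell-shaped formant peak. The 700 Hz peak, its width, and the function names are all invented for illustration. It only tracks where the loudest harmonic ends up and is not how any real pitch shifter works internally.

```python
import math

def envelope(freq_hz, formant_hz=700.0, width_hz=150.0):
    """Toy spectral envelope: one bell-shaped formant peak (illustrative only)."""
    return math.exp(-((freq_hz - formant_hz) / width_hz) ** 2)

def loudest_harmonic_hz(f0_hz, ratio, preserve_formants, n_harmonics=40):
    """Frequency of the strongest harmonic after raising the pitch by `ratio`."""
    best_freq, best_amp = 0.0, -1.0
    for k in range(1, n_harmonics + 1):
        new_freq = k * f0_hz * ratio
        if preserve_formants:
            # The envelope stays where it was; the moved harmonic is
            # re-weighted by whatever the envelope says at its new frequency.
            amp = envelope(new_freq)
        else:
            # Plain speed-up: each harmonic keeps its old loudness,
            # so the whole envelope slides up with the pitch.
            amp = envelope(k * f0_hz)
        if amp > best_amp:
            best_freq, best_amp = new_freq, amp
    return best_freq

kept = loudest_harmonic_hz(100.0, 1.5, preserve_formants=True)    # 750.0
moved = loudest_harmonic_hz(100.0, 1.5, preserve_formants=False)  # 1050.0
```

With the formants preserved, the energy stays bunched near the original 700 Hz peak. With a plain speed-up the peak lands at 1050 Hz, which is 700 Hz times 1.5, and that shifted peak is the chipmunk sound.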
You need a tool that can shift formants independently from the lower part of the harmonic series. Elastique has some algorithm options when it comes to formants. However, many people prefer a more sophisticated tool and buy some version of Melodyne by Celemony. That's my plan too, once I can scrape together the cash.