Several threads here, as well as endless threads in DAW forums, have focused on the tools and tasks of musical expression for MIDI. Beyond doubt, getting MIDI to sound convincingly expressive requires a great deal of work, often with tedious attention to detail: articulation, dynamics, timbre/intensity, and so on. This is all in addition to the obvious musical requirements of pitch and note length.
I have always thought the most promising basis for a tool that renders musical expression as naturally as possible would be the human voice or whistling, used as a source to be analyzed and converted into the data needed to implement expression. It would not be exhaustive or perfect, but if we are talking about a tool that most everyone can use, what better medium for combining all the elements of musical expression at one time than the human voice? Pitch. Duration. Volume. Timbre/Intensity. Dynamics. Articulation. All available as one continuous, musical data source. With the voice, producing musically expressive material is almost effortless.
Imagine if you could sing or whistle a line (depending on your innate talents in this area) and have an audio-to-MIDI tool translate what it hears into appropriate MIDI information, including assigned CC values. Tidying up and replacing translations that don’t quite work as desired might be needed, but oh, what a savings in tedium.
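To make the idea a bit more concrete, here is a rough Python sketch of only the easiest part of such a translation. It assumes some other analysis step (not shown, and by far the harder problem) has already reduced the recording to a list of frames, each holding a time, a detected pitch in Hz (or None for silence), and a loudness level from 0.0 to 1.0. The function names, the frame layout, the -60 dB floor, and the choice of CC11 (Expression) are all my own assumptions for illustration, not how any existing product works.

```python
import math

def hz_to_midi_note(freq_hz):
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def level_to_cc(level, floor_db=-60.0):
    """Map a linear level (0.0-1.0) to a CC value (0-127) on a dB scale.

    The -60 dB floor is an arbitrary assumption; anything quieter maps to 0.
    """
    if level <= 0:
        return 0
    db = 20 * math.log10(level)
    db = max(floor_db, min(0.0, db))  # clamp to [floor_db, 0]
    return round(127 * (db - floor_db) / -floor_db)

def frames_to_events(frames, cc_number=11):
    """Turn (time_sec, freq_hz_or_None, level) frames into simple event tuples.

    Emits ("note_on"/"note_off", note) when the rounded pitch changes and
    ("cc", number, value) when the loudness-derived CC value changes.
    """
    events = []
    current_note = None
    last_cc = None
    last_time = None
    for t, freq, level in frames:
        note = hz_to_midi_note(freq) if freq else None
        if note != current_note:
            if current_note is not None:
                events.append((t, "note_off", current_note))
            if note is not None:
                events.append((t, "note_on", note))
            current_note = note
        cc = level_to_cc(level)
        if cc != last_cc:
            events.append((t, "cc", cc_number, cc))
            last_cc = cc
        last_time = t
    # Close a note that is still sounding when the frames run out.
    if current_note is not None:
        events.append((last_time, "note_off", current_note))
    return events

if __name__ == "__main__":
    demo = [(0.0, 440.0, 1.0), (0.1, 440.0, 0.5), (0.2, None, 0.0)]
    for event in frames_to_events(demo):
        print(event)
```

Even this toy version shows where the tidying up would come in: rounding to the nearest note means a wide vibrato or a slide would get chopped into separate notes, so a real tool would have to decide when a pitch wobble is a new note and when it is pitch bend. Timbre and articulation are not touched here at all.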
And consider how naturally humanized such a tool would be. What is more human than human? Right now our humanizations are either algorithmic or based on keyboard input. Those not proficient in keyboard input are severely handicapped. Myself, I can input notes, but I’m simply not wired for playing keyboard well enough for recording (although I do it from time to time, with predictable results).
I know there are existing tools for audio-to-MIDI conversion (the standard ones I’m aware of are Melodyne and Cubase VariAudio), but so far as I am aware most of those focus on pitch, duration, and perhaps volume. And they can require more effort than the limited results justify. None that I am aware of tackles the full potential of the source material. I’m sure the task of creating such a tool would be very hard. For all I know, the reason it hasn’t been done is that it’s impossible with current technology.
My guess is that, in addition to overall capability, such a tool would require the user to “train” the translator to correctly recognize what is intended.
Anyway, just throwing this out there.