More multi-processor and VST questions / discussion

I am running Cubase 6.5 on an i7 2600 Sandy Bridge system. I have multi-processing turned on, and core parking turned off via the known registry setting for Windows 7.

I have noticed that with a single-channel VST instrument, Cubase 6.5 uses only one processor, and that I do not get any multi-processor benefit unless I am running multiple VST instruments. When I use a patch for a single-channel, CPU-intensive VST instrument that has long envelope decays and I play a large number of notes (say, arpeggiated chords on my keyboard), the VST Performance meter goes pretty darned high, but in Task Manager I see that Cubase 6.5 is using only 1 core (or 1 hyperthread) pretty much exclusively. When I load multiple instances of the plugin on multiple MIDI tracks and play them all at once, the VST Performance meter doesn’t change much relative to one instance, but multiple cores (hyperthreads) get used.

So why is there no multi-core or hyperthreaded processing for a single instance of a virtual instrument (VST plugin)? It seems to me that processing MIDI note-on events is almost (but not quite) “embarrassingly parallel”, in the sense that each note-on event could be sent to whichever processor is least busy.

I have done some searching and read that the one-core-per-channel behavior has something to do with summing the outputs within the cache of one core. Is this the reason for the lack of load balancing that I am seeing?
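
To make the question concrete, here is a minimal Python sketch of the structure I have in mind. It is only an illustration under my own assumptions, not how Cubase or any real plugin works: `render_voice`, `mix`, `BLOCK_SIZE`, and the decaying sine "voice" are all made-up stand-ins. Each note is rendered independently (the nearly parallel part), but every voice has to finish before the results can be summed into the single output buffer (the "not quite" part).

```python
import math
from concurrent.futures import ThreadPoolExecutor

SAMPLE_RATE = 44100
BLOCK_SIZE = 64  # samples per audio block (hypothetical buffer size)


def render_voice(midi_note, block_size=BLOCK_SIZE):
    """Render one block of a decaying sine for one note (stand-in for a real voice)."""
    freq = 440.0 * 2.0 ** ((midi_note - 69) / 12.0)
    return [
        math.exp(-3.0 * n / SAMPLE_RATE)
        * math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
        for n in range(block_size)
    ]


def mix(voice_blocks, block_size=BLOCK_SIZE):
    """Sum the per-voice blocks into the one output buffer the instrument returns."""
    out = [0.0] * block_size
    for block in voice_blocks:
        for i, sample in enumerate(block):
            out[i] += sample
    return out


def render_block_serial(notes):
    """Everything on one thread: what a single-core instrument would do."""
    return mix([render_voice(n) for n in notes])


def render_block_parallel(notes, workers=4):
    """Hand each voice to any free worker, then sum once all of them are done."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Voices are independent, so any worker can take any note.
        blocks = list(pool.map(render_voice, notes))
    # Synchronization point: the sum cannot start until every voice has finished.
    return mix(blocks)
```

Calling `render_block_parallel([60, 64, 67])` gives the same buffer as `render_block_serial([60, 64, 67])`. Python threads will not actually spread this work across cores, so the sketch only shows the shape of the problem: the wait before the sum has to fit inside every tiny audio block, which I assume is where the difficulty lies.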

Or am I missing a setting somewhere in Windows 7 (64-bit) or Cubase 6.5 that would balance the load for a single VST across multiple cores or hyperthreads?