Render Speed

bob99 · January 20, 2015, 6:46am

I’m looking for a computer to replace an existing Win 7 Pro, and I’m running a basic speed test of renders on Win and Mac. Thought someone here might know about the results I’m getting. In all cases I’m rendering a one hour 96K montage to 44.1 using Crystal Resampler in Ultra. These are the results:

Win 7 Pro Intel Xeon 6-core 3.3 GHz, 6 GB 1333 MHz RAM. Render time: 10:00
Win 7 Pro Intel Core 2 4-core 2.33 GHz, 3 GB 800 MHz RAM. Render time: 15:00
Mac OS 10.7.5 Intel i7 4-core 2.3 GHz, 4 GB 1600 MHz RAM. Render time: 8:30
Mac OS 10.7.5 Intel i5 2-core 2.3 GHz, 8 GB 1333 MHz RAM. Render time: 10:15

The Windows computers are about 4 years old. The Macs are about 2 years old.

How can the 2.3 GHz quad core Mac be faster to render than the 3.3 GHz 6-core Windows? The faster RAM? The generation of processors? The Mac OS? Even the i5 on the Mac is almost as fast as the Xeon. Also, I can see in task manager that the Xeon is using all 6 cores during the render. But maybe I’m overlooking something.

If anyone has time to try this on their machine, I’d really appreciate seeing what you get.

Also a final question, maybe to PG, but does the RAM speed affect render speed?

bob99 · January 20, 2015, 7:13am

btw, for the Xeon, Turbo-Boost is on, Hyper-threading and Speed Step are off.

PG1 · January 20, 2015, 7:17am

The cpu generation is certainly the best bet.
I don’t think the RAM speed is very important for the resampler process.

robw · January 20, 2015, 7:41am

Let me paraphrase the OP: how can a “slower, fewer core” processor be faster than a “faster, more core” processor?

The answer lies in CPU generation and exact model. There isn’t sufficient information to get to the heart of the differences but in many newer generation processors operations like floating point Multiply and Add can be done in fewer clock cycles and indeed as a single instruction. Things like Multiply+Add are very common operations in certain types of signal processing - so if you can do that in 1/2 as many clock cycles then you’ll get similar performance to a processor that can’t at twice the clock speed.
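As a rough illustration of the clock-cycle point above, here is a back-of-the-envelope sketch in Python. The operation count, cycle counts and clock speeds are made-up numbers chosen to show the arithmetic; they are not measurements of any processor mentioned in this thread.

```python
def seconds_for(ops, cycles_per_op, clock_hz):
    """Time for one core to run `ops` multiply-add operations."""
    return ops * cycles_per_op / clock_hz

OPS = 1_000_000_000  # hypothetical number of multiply-adds in a job

# Hypothetical core A: multiply-add takes 2 cycles, clocked at 4 GHz.
core_a = seconds_for(OPS, cycles_per_op=2, clock_hz=4e9)
# Hypothetical core B: multiply-add takes 1 cycle, clocked at 2 GHz.
core_b = seconds_for(OPS, cycles_per_op=1, clock_hz=2e9)

print(f"core A: {core_a:.3f} s, core B: {core_b:.3f} s")
```

Under these assumptions the two cores finish in the same time even though one runs at half the clock speed, which is the effect described above.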

There are other factors that come into play too, such as Level 2 cache size, file I/O performance (SSDs rock), and the code being optimised for the particular processor architecture. I hope the explanation helps, though.

It’s a little surprising to see this result with commodity hardware, but in the high performance computing arena it’s not uncommon, though it is becoming less so as the technology becomes more mainstream.

In this case you may well be seeing a processor generation change that happens to suit your chosen benchmark. I’d be inclined to ensure your other factors, hard disk speed for example, are also equal in the comparison.

bob99 · January 23, 2015, 12:36am

Thanks for the replies PG and robw, and for the explanation of the possible fewer clock cycles. Robw, I’ve tried an SSD (Adata S599 esata) as Read/Write storage for this test, but it makes no difference in the render times vs the standard internal drives. Are you getting faster render times in Wavelab with SSD? (storage or system drive).

bob99 · February 2, 2015, 7:06am

More specifics for this test. Does this make sense? Is it possible this is Win vs Mac, or is this fewer clock cycles? Has anyone compared Wavelab Win and Mac with the same processor?

Win 7 Pro Intel Xeon (W3680) 6-core 3.3 GHz, 12MB L3 cache, 6 GB 1333 MHz memory. 7200rpm. Render time: 10:00

cpubenchmark.net CPU MARK for W3680: 9340.

Mac Mini OS 8.5 Intel i7 (3615QM) 4-core 2.3 GHz, 6MB L3 cache, 4 GB 1600 MHz memory. 5400rpm. Render time: 8:30

cpubenchmark.net CPU MARK for 3615QM: 7349.

All Wavelab 8.0.4 32bit. I tried 64bit on the Mac and it was about 15 seconds faster for this process.
I also tried enabling Hyper-threading on the Windows machine, but it made no difference to the render time.

Jarno · February 2, 2015, 3:26pm

Simple:
i7 performance per core is about 18% greater than the Xeon’s (9340/6 * 1.18 = 7349/4).
The i7 renders about 18% faster (510s * 1.18 = 600s).
Conclusion: Wavelab seems to render in a single thread and doesn’t do multiprocessing.
NOTE: “1.18” in the calculations above is an approximate value, not exact.
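
The arithmetic above can be checked with a few lines of Python. This is only a sketch using the numbers quoted in this thread; dividing a CPU Mark score by the core count is a crude stand-in for single-core speed, not an official figure.

```python
# CPU Mark scores and core counts quoted earlier in the thread.
xeon_mark, xeon_cores = 9340, 6
i7_mark, i7_cores = 7349, 4

# Render times quoted in the thread, converted to seconds.
xeon_render_s = 10 * 60       # 10:00
i7_render_s = 8 * 60 + 30     # 8:30

# Crude per-core score: total score divided by number of cores.
xeon_per_core = xeon_mark / xeon_cores
i7_per_core = i7_mark / i7_cores

per_core_ratio = i7_per_core / xeon_per_core
render_ratio = xeon_render_s / i7_render_s

print(f"per-core score ratio: {per_core_ratio:.3f}")
print(f"render time ratio:    {render_ratio:.3f}")
```

Both ratios come out close to 1.18, which is why the render times look like they track per-core speed and not total score.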

bob99 · February 3, 2015, 1:36am

Thanks Jarno. I’m still confused because I don’t know how single thread or multiprocessing affect the render, but I can see in task manager that all 6 cores of the Xeon are being used for this Wavelab SRC render process. (that was buried in one of my earlier posts). Does that still mean the 2.3 i7 4C with 7349 score should render faster than the 3.3 Xeon 6C with 9340 score? Sorry if I’ve misunderstood because of my lack of knowledge.

bob99 · February 3, 2015, 5:06am

When I say I don’t know how single thread or multiprocessing affect the render, I mean I don’t fully understand how they work, but my assumption has been that the total speed of the 6 cores operating at the speed they do would be greater than the total speed of the 4 cores operating at the speed they do. Is that not correct? Even based on those cpubenchmark numbers.

Jarno · February 3, 2015, 1:39pm

Processing that uses a single thread runs on only one CPU core at any given moment. What you see in Task Manager is an average over the last second or so. During that second the operating system has switched Wavelab from one core to another many, many times.

The performance of a single-threaded application is always* dependent only on the speed of a single processor core.

*) Not always, but I won’t go there. I wrote my bachelor thesis on computer performance analysis, and if I started writing about it nobody would understand anyway.
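
To make the Task Manager point concrete, here is a toy simulation in Python. It assumes a hypothetical scheduler that moves one busy thread to the next core on every time slice; real operating systems are far less regular, so treat it as a sketch of the averaging effect only.

```python
from collections import Counter

def per_core_utilisation(n_cores, n_slices):
    """Share of time slices each core spent running one busy thread.

    Toy assumption: the thread hops to the next core on every slice
    (round-robin). Only one core is ever busy at any given moment.
    """
    busy = Counter(slice_no % n_cores for slice_no in range(n_slices))
    return [busy[core] / n_slices for core in range(n_cores)]

usage = per_core_utilisation(n_cores=6, n_slices=6000)
print([f"{u:.0%}" for u in usage])   # every core looks partly busy
print(f"total: {sum(usage):.2f} cores' worth of work")
```

Averaged over the interval, all six cores show some load, yet the total adds up to the work of one core.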

bob99 · February 3, 2015, 7:48pm

Wow. So I’ve been mistakenly thinking plugin render processing speed in Wavelab was affected by more cores, but it’s not? The only thing that matters for this is the “single core performance” number at cpubenchmark.net? A dual core processor at 3.0GHz is just as fast as a 6 core processor at 3.0GHz in the case of Wavelab processing (as long as the SCP at cpubenchmark is the same)? All this time I thought otherwise, seeing the processing shared across the cores and thinking that was proving more cores was faster and performing the render faster.

So the advantage of a 6-core Mac Pro or PC is what? No real advantage if only using Wavelab and not using multiple tasks in Wavelab? A dual core would serve just as well, and would be just as fast?

PG1 · February 3, 2015, 8:21pm

WaveLab uses a single core to render a DSP task.
WaveLab uses several cores when encoding to several file formats at the same time.
WaveLab uses several cores during batch processing (one per task). This is the context where all your cores can really be used at the same time.
Apart from that, WaveLab uses one core for file read/write.
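
The difference between a single render and a batch can be sketched with a small scheduling model in Python. This is not WaveLab’s code; `batch_wall_time` is a hypothetical helper that assumes each task occupies exactly one core and that the next task goes to whichever core becomes free first. The 510-second task length is simply borrowed from the render time quoted earlier.

```python
import heapq

def batch_wall_time(task_seconds, cores):
    """Wall-clock time when each task runs on one core at a time.

    Greedy model: the next task goes to whichever core frees up first.
    """
    free_at = [0.0] * cores          # time at which each core is next idle
    heapq.heapify(free_at)
    for t in task_seconds:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + t)
    return max(free_at)

single_render = [510]    # one render task
batch = [510] * 6        # six similar tasks in one batch

for cores in (2, 6):
    print(cores, "cores:",
          batch_wall_time(single_render, cores), "s single,",
          batch_wall_time(batch, cores), "s batch")
```

In this model the single render takes the same time on 2 or 6 cores, while the six-task batch finishes three times sooner on 6 cores.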

bob99 · February 12, 2015, 5:59am

PG, is that how it will always be (single core per dsp task)? As I said I’ve been naively assuming 6 cores is generally better than 2 or 4 because it looked like Wavelab was using all cores when rendering a single task (see pics). But as Jarno said (if I understand correctly), that’s just the operating system switching cores, using only the performance of a single core. And his explanation of the speed (being based only on the single core performance of the processor) correlates exactly with my (originally confusing to me) results.

I’ve always thought that Wavelab performance of a single task is not helped by multiple processors, but the core thing was news to me.

I’m certainly not looking for another program, but I just wondered if there are any audio programs that render or bounce a session, edl, or montage, processing all plugins etc., where that single task is actually using more than one core? (if I only perform one task at a time, which is what I do, assuming rendering one montage with all plugins and processing is considered one task). Cubase, Nuendo, Pro Tools, Studio One, Logic? Anything?

PG1 · February 12, 2015, 7:13am

When there is a single plugin, a single core can be used (unless the plugin has special core handling, but I have never seen that). When several plugins are used, it is possible to imagine the usage of more than 1 core, but the expected gain is not obvious at all because of audio serialization: “plugin C needs to wait on B which needs to wait on A, etc…”. I mean, the slowest plugin will always be a bottleneck, even if there are 100 cores.
WaveLab does not perform this kind of optimization for plugins (maybe in the future). But it does so when rendering multiple file formats (each file format on a different core).
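
The bottleneck argument can be put into numbers with an idealised pipeline model in Python. The per-block plugin costs below are invented for illustration, and the model assumes one core per plugin with no hand-off overhead, which is more generous than any real host would manage.

```python
def serial_time(stage_costs, n_blocks):
    """All plugins run one after another on a single core."""
    return sum(stage_costs) * n_blocks

def pipelined_time(stage_costs, n_blocks):
    """Idealised pipeline: one core per plugin, no hand-off overhead.

    The first block passes through every stage; after that, one
    finished block emerges per slowest-stage interval.
    """
    return sum(stage_costs) + (n_blocks - 1) * max(stage_costs)

costs = [1, 1, 8]    # hypothetical per-block cost of plugins A, B, C
blocks = 1000

speedup = serial_time(costs, blocks) / pipelined_time(costs, blocks)
print(f"speedup with one core per plugin: {speedup:.2f}x")
```

With one plugin much heavier than the others, the speedup stays below sum(costs) / max(costs), here 1.25, no matter how many cores are available.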