DevCentral > Weblogs > Persistently Different - Not right, just different.
 Multi-Core III - a panacea of options
posted on Monday, September 22, 2008 12:13 PM

I find it amazing how many ways we can find to slice a pizza, skin a cat, or solve a technology problem. Truly amazing. Thus far in my multi-core odyssey, I have run into driver-based solutions, library-based solutions, and shim-based solutions to how to maximize use of multi-core. That's not to mention the upcoming version of C++, which is supposed to include the solution in the compiler.

Each of these approaches has its strengths, though I think the best option would be for the OS to resolve the problem directly. It already has a scheduler and mechanisms for semaphores and atomic code. Though SMP certainly isn't the optimal solution, it is already in contemporary operating systems, and extending it to include multi-core might be a good first step.

On my list of people to talk to or products to look at are RogueWave, a company that was near and dear to me ages ago for their libraries but that I haven't heard from in a very long time, and eXludus, a company I had not heard of before a comment on one of the posts in this series. Also on the list is a review of the Cilk++ documentation, which is now available in public beta.

After years of working as a reviewer, I have a tendency to want to compare all of these solutions and give you definitive answers on which is better. I'm not going to do that for several reasons - first is the completely different architectures of some of these solutions, and right after that is the fact that talking about each one's strengths and weaknesses (from a developer perspective and a resource utilization perspective) is more useful to you.

Cilk++, for example, is a library that handles parallelizing the code for you; you just have to make a few calls from within your code to make it work. eXludus claims no changes to source or OS... We don't believe it, since something is scheduling CPUs, but we'll give them a chance to prove it. Likely that's market-speak for "we hook the calls and change the processing", which in my book is the same as installing a shim, no?
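
To make "parallelizing the code" a bit more concrete, here is a minimal fork-join sketch in standard C++ (C++11 and later). It is not Cilk++ and does not use any vendor's API; `parallel_sum` and its `cutoff` parameter are hypothetical names made up for illustration. The idea is simply to split a range in half, hand one half to another thread with `std::async`, and combine the results.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper (not from any product mentioned here): sums
// data[begin, end) by splitting the range in two, running the left half
// as an asynchronous task and the right half on the calling thread.
long long parallel_sum(const std::vector<int>& data,
                       std::size_t begin, std::size_t end,
                       std::size_t cutoff = 1024) {
    assert(begin <= end && end <= data.size() && cutoff >= 1);

    // Small ranges are summed serially; spawning a task would cost more
    // than it saves. The cutoff value is an assumption, not a tuned number.
    if (end - begin <= cutoff) {
        return std::accumulate(
            data.begin() + static_cast<std::ptrdiff_t>(begin),
            data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
    }

    const std::size_t mid = begin + (end - begin) / 2;

    // "Fork": the left half runs on another thread.
    auto left = std::async(std::launch::async, [&data, begin, mid, cutoff] {
        return parallel_sum(data, begin, mid, cutoff);
    });

    // The right half runs here while the left half is in flight.
    const long long right = parallel_sum(data, mid, end, cutoff);

    // "Join": wait for the left half and combine the two results.
    return left.get() + right;
}
```

Calling `parallel_sum(v, 0, v.size())` on a vector of 10,000 ones returns 10000. On most toolchains this needs threading support at link time (for example `-pthread` with gcc). Note that this is the do-it-yourself route: the developer decides where to split and when to join, which is exactly the work the products above claim to take off your hands.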

RogueWave is also kind of sticky about how they handle implementation - they say in their marketure that the Hydra system doesn't require "significant recoding", but the examples in their video all include the phrase "if your developers use Hydra to develop the application..." I guess you're not doing significant recoding if you code it with their stuff from the ground up, heh. Again, I'll talk to them and find out for you.

My reasoning for pursuing this is simple - we can make things faster by pooling your servers to handle heavier load, but the other bit of the equation is making the servers themselves faster, and what impact that has on the network. Remember that I tested a box not so long ago (about 2 years) that buried the network cards before maxing out anything else. The network was the bottleneck, and when that was removed, disk became the bottleneck. CPU and memory were clipping along without any issues at all. So between our solutions and these solutions, you should be able to uber-optimize your infrastructure... if someone gets multi-core processing right, anyway.

I'd also like to point out the article Dale C. submitted in a comment on my first post in this Multicore series, "Some Thoughts on Concurrency". It addresses the issue of threading pretty well, and talks about the disparity between academia's brainstorming and real-world needs. Anyone else remember when ML was going to take over the world? Lori and I were subjected to it during our Masters studies, and the instructor was just gushing with the possibilities (by then ML was 20 years old, but OSS had given it a new lease on life). A fun language to play with, but not much in the productivity department.

There are more out there, and I'll keep digging into the topic as long as you all keep telling me it matters to you.

Don.


Feedback


9/23/2008 7:58 AM
To clarify, Cilk++ is a minimal set (3 keywords) of language extensions to C++ which allow the programmer to express parallelism while retaining the serial semantics of the code. The system comes with a compiler (Visual Studio and gcc currently supported), runtime system, development tools, and libraries. We're currently accepting applications for our Early Visibility Program (http://www.cilk.com/Home/sign-up-page).
Charles E. Leiserson

11/12/2008 1:43 AM
Don,

For the last 4-5 hours, I've been surfing the web to "catch up" on the SMP "craze". I'm writing to you because two things caught my eye: 1) your realization of the need for the OS to do more here, and 2) your statement about not wanting to preach what is better, simply because there is a vast range of considerations.

We have what I guess can be called, based on the labeling out there, a monolithic server. It is a 12-year-old high-end RPC client/server where the clients are modem and IP hosting servers. It uses traditional multi-threaded design models and, for the most part, when it came to scalability and performance questions, we defined the hardware needs for the customers to either scale up (add more CPU, memory, etc.) or scale out (spread out the hosting servers with load balancing).

Back in 2003 or so, I began to play with IOCP (I/O Completion Ports), but I determined there were still bottlenecks that Microsoft's IOCP technology did not resolve. More importantly, I was still looking for a reason why we would have to "break" what wasn't broken. The only issue we had was LOADING issues - making the PC bigger and faster was essentially the general solution. Our load needs are generally around 5-10 users on the low end, 50-100 in the middle, and 100 to 250 users on the high end.

The bottlenecks were generally GUI and database I/O - which have improved for us. Here again, we even tell customers to get a GAME CARD if they want faster GUI I/O, and to get a better backend database server that we support, and that's usually all it takes. We also tell them 'STOP USING THE MACHINE AS A DESKTOP', especially our small-market customers; if you don't use it as a desktop, then you don't need any high-overhead AVS system on it.

Anyway, reading your articles and others makes me believe that it's all still premature. You can do more, I guess, to leverage multi-cores, but there are still limits to what they can offer you.

Just consider that I see my VS2005 compiler running 300-400% slower on what is supposed to be a faster dual-core Intel XP box than on my single-CPU Windows 2000 box - and Microsoft has no real answer to this but to tell us to get even bigger machines than what we already have.

But what's funny about this? The slowdown came down to nothing but the hidden overhead in the COMPILER accessing the subsystem and libraries, in this case the CRYPTO library for some oddball reason. It had nothing to do with the C/C++ compiler itself, or dual core, or only having 2G of RAM, but was pure overhead from extra stuff the compiler is doing with slow subsystem items.





Hector Santos