hyperthreading question

Mate 2u on 1 May 2012
Hi there, I have MATLAB running on a quad-core i7 processor, which has 8 threads. I timed a loop around a test function while changing the number of labs. These are my results in seconds:
matlabpool 12: 21.61 s
matlabpool 10: 21.96 s
matlabpool 8: 22.27 s
matlabpool 6: 23.29 s
matlabpool 4: 25.54 s
I tested it on another loop with a different function and again found that using 12 labs was the fastest. How can this happen if I only have 4 cores and 8 threads available?
I look forward to a reply.
  Comments: 3
Ken Atwell on 2 May 2012
The timing differences are pretty subtle, all within 20% of each other. I'd be tempted to call this a "tie". :)
Richard Brown on 2 May 2012
That was my initial thought too, but the times decrease steadily as the number of labs goes up ... it would be interesting to know if that always happens.
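As a quick check on both comments, here is a small sketch of the arithmetic, written in Python rather than MATLAB purely so it is easy to run anywhere. It uses only the five timings posted in the question; the variable names are made up for illustration.

```python
# Timings (seconds) from the question, keyed by number of labs.
timings = {4: 25.54, 6: 23.29, 8: 22.27, 10: 21.96, 12: 21.61}

fastest = min(timings.values())
slowest = max(timings.values())

# Spread of the slowest run relative to the fastest one.
spread = (slowest - fastest) / fastest

# True if every increase in lab count gave a strictly shorter time.
ordered = [timings[n] for n in sorted(timings)]
monotonic = all(a > b for a, b in zip(ordered, ordered[1:]))

print(f"spread relative to fastest: {spread:.1%}")
print(f"strictly decreasing with more labs: {monotonic}")
```

Both observations hold for this one data set: the spread is under 20%, and the times fall at every step. A single run per pool size cannot say whether that trend is repeatable.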

Accepted Answer

Edric Ellis on 2 May 2012
There are several reasons why different numbers of workers can behave differently. In some situations, even if each PARFOR loop iteration takes quite a long time, it can be quicker to run fewer workers than you have cores available; other times, it can be quicker to run more workers than you have cores. Which one you see depends on the kind of resource contention your code runs into.
If your algorithm is memory bound - i.e. the main contention is for access to RAM (for example, adding together two large matrices - the amount of computation is trivial compared to the time it takes to get the data into the CPU), then you often find that fewer workers perform better.
If your algorithm is compute bound - i.e. not much memory access compared to the computational complexity, then more workers (up to the number of physical cores) work better.
In some cases, if your algorithm is bounded by some sort of latency elsewhere, running more workers than you have cores can work best.
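The latency-bound case can be illustrated with a rough sketch. This is Python, not MATLAB, and it is not a model of how MATLAB workers behave: threads that only sleep stand in for tasks waiting on something other than the CPU (disk, network, and so on). The function names, the 50 ms delay, and the task count are all assumptions chosen for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def latency_bound_task(_):
    # Hypothetical stand-in for waiting on an external resource:
    # the task spends its time idle, not computing.
    time.sleep(0.05)


def run_with_workers(n_workers, n_tasks=16):
    """Return wall-clock seconds to finish n_tasks using n_workers threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # list() forces all tasks to complete before the timer stops.
        list(pool.map(latency_bound_task, range(n_tasks)))
    return time.perf_counter() - start


# Time the same batch of waiting tasks with different worker counts.
timings = {n: run_with_workers(n) for n in (2, 4, 8, 16)}
for n, seconds in timings.items():
    print(f"{n:2d} workers: {seconds:.2f} s")
```

Because the tasks barely touch the CPU, adding workers keeps shortening the total time regardless of how many cores the machine has. Compute-bound or memory-bound work would not scale this way.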

More Answers (1)

Geoff on 2 May 2012
I'd like to see the results if you expand your operation to something that takes about 10 minutes. And do it with the utter minimum of background processes running.
You're talking about a few hundred milliseconds, which can easily be eaten up by, say, a piece of software doing some routine background work.
Also, in all fairness, you MUST ensure that the number of iterations in your parfor loop is a multiple of 4, 6, 8, 10 and 12, or some number sufficiently high as to cancel out the effect of some workers finishing the task early, while others have to perform one extra loop (I'm assuming your test function is a constant-time operation).
It would be interesting to see your test code, if you are happy to post it.
At this stage, I'm not open to accepting that 10 workers is faster than 8 on a machine with 8 logical cores. But that could be due to my own ignorance. =)
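The point above about iteration counts that do not divide evenly among workers can be sketched with a little arithmetic (in Python, for portability). This assumes constant-time iterations and a simple even split, which is an idealisation; the real parfor scheduler may hand out work differently. The iteration count of 100 and the function names are hypothetical.

```python
import math


def busiest_worker_load(n_iterations, n_workers):
    # With an even split, at least one worker gets the rounded-up share.
    return math.ceil(n_iterations / n_workers)


def efficiency(n_iterations, n_workers):
    # Useful work divided by total worker time when everyone waits
    # for the busiest worker to finish.
    return n_iterations / (n_workers * busiest_worker_load(n_iterations, n_workers))


n_iterations = 100  # hypothetical loop length
for n_workers in (4, 6, 8, 10, 12):
    load = busiest_worker_load(n_iterations, n_workers)
    print(f"{n_workers:2d} workers: busiest does {load:2d} iterations, "
          f"efficiency {efficiency(n_iterations, n_workers):.3f}")
```

With 100 iterations, 4 and 10 workers divide the work exactly, while 6, 8 and 12 leave some workers idle during the last round. That alone can shift timings by a few percent, which is comparable to the differences being discussed.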
  Comments: 1
Geoff on 2 May 2012
Just an afterthought about my comment on the number of parfor iterations. I would in fact prefer starting with prod([4,6,8,10,12]) iterations, and multiplying that by about a million. The goal is that you share out a decent amount of work evenly between all your workers, then set them loose.
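For reference, a short Python sketch of what that product works out to, alongside the least common multiple, which is the smallest count that also divides evenly among every pool size. The lcm comparison is an addition for illustration and not something suggested in the comment.

```python
import math
from functools import reduce

pool_sizes = [4, 6, 8, 10, 12]

# Equivalent of MATLAB's prod([4,6,8,10,12]).
product = math.prod(pool_sizes)


def lcm(a, b):
    # Least common multiple via the greatest common divisor.
    return a * b // math.gcd(a, b)


smallest = reduce(lcm, pool_sizes)

print(f"prod = {product}, lcm = {smallest}")
print(all(product % w == 0 and smallest % w == 0 for w in pool_sizes))
```

Either number, scaled up by a large factor, gives every pool size the same number of iterations per worker.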
