Why does innerjoin not work in parfor?

Question

Boram Lim, 4 May 2018


https://kr.mathworks.com/matlabcentral/answers/399159-why-innerjoin-does-not-work-in-parfor

Answered: Edric Ellis, 8 May 2018

I am trying to track down an error in a parfor loop. I found that the call to innerjoin (lines 10-12 below) causes the problem: the code runs fine with an ordinary for-loop but fails with parfor. Why does it cause a problem? I use innerjoin to randomly sample 'id' (one of the variables in my data) and merge the sample with the original dataset (dta2 here). Any ideas or solutions? Please let me know if anything needs to be clarified.

parpool(4)
N_boot = 5;
coeff_out2 = zeros(N_boot,N_coef);
parfor i = 1:N_boot
    dta2 = dta;
    decisions2 = unique(dta2.decision_id);
    Ndecisions2 = size(decisions2,1);
    sampled_id01 = randsample(decisions2,Ndecisions2,true);
    sampled_id2 = dataset2table(mat2dataset(sampled_id01));
    sampled_id2.Properties.VariableNames{1} = 'decision_id';
    resample_dta = innerjoin(sampled_id2,dta2,'Keys','decision_id');
    resample_dta = table2array(resample_dta);
    result1 = mean(resample_dta(:,1:4));
    coeff_out2(i,:) = result1;
end

Comments

Boram Lim, 5 May 2018

Error using mat2dataset (line 63)
Transparency violation error. See Parallel Computing Toolbox documentation about Transparency

Error in Model01_interpolated_May1 (line 62)
parfor i = 1:N_boot

Boram Lim, 5 May 2018

This is the error message.


Answer 1

Edric Ellis, 8 May 2018


https://kr.mathworks.com/matlabcentral/answers/399159-why-innerjoin-does-not-work-in-parfor#answer_319246


(Cross-posted from an identical question on Stack Overflow.)

Unfortunately, innerjoin uses the inputname function, which is causing the "transparency violation" error. There's a simple workaround, which is to wrap the call to innerjoin, like so:

% Define the wrapper outside the parfor loop.
innerjoinFcn = @(varargin) innerjoin(varargin{:});
parfor ...
    ...
    % Call the wrapper instead of innerjoin directly.
    resample_dta = innerjoinFcn(sampled_id2,dta2,'Keys','decision_id');
end
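To see the wrapper in a complete loop, here is a minimal sketch that combines it with the question's bootstrap. The sample data are made up (random values in the same shape as the test table used in Answer 2), and setting N_coef from the table width is an assumption, since the question never shows how N_coef is defined. The sketch also assumes randsample from the Statistics and Machine Learning Toolbox is available, and behavior may differ between MATLAB releases.

```matlab
% Hypothetical sample data: 50 rows with non-unique decision_id values.
decision_id = randi([1 9], 50, 1);
d1 = randi([-10 10], 50, 1);
d2 = randi([-2 2], 50, 1);
d3 = randi([0 255], 50, 1);
dta = table(decision_id, d1, d2, d3);

N_boot = 5;
N_coef = width(dta);                  % assumption: one mean per table column
coeff_out2 = zeros(N_boot, N_coef);

% Wrap innerjoin in an anonymous function defined outside the loop.
innerjoinFcn = @(varargin) innerjoin(varargin{:});

parfor i = 1:N_boot
    ids = unique(dta.decision_id);
    % Sample ids with replacement (randsample needs the Statistics Toolbox).
    sampled = randsample(ids, numel(ids), true);
    % Name the column explicitly instead of relying on a default name.
    sampled_id2 = table(sampled, 'VariableNames', {'decision_id'});
    % Call the wrapper rather than innerjoin itself.
    resample_dta = innerjoinFcn(sampled_id2, dta, 'Keys', 'decision_id');
    coeff_out2(i, :) = mean(table2array(resample_dta), 1);
end
```

Each row of coeff_out2 then holds the column means of one bootstrap resample.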


Answer 2

Walter Roberson, 5 May 2018


https://kr.mathworks.com/matlabcentral/answers/399159-why-innerjoin-does-not-work-in-parfor#answer_318776


I can get further:

decision_id = randi([1 9], 50, 1);
d1 = randi([-10 10], 50, 1);
d2 = randi([-2 2], 50, 1);
d3 = randi([0 255], 50, 1);
dta = table(decision_id, d1, d2, d3);
N_coef = 4;
cp = gcp('nocreate');
if isempty(cp); parpool(4); end
N_boot = 5;
coeff_out2 = zeros(N_boot,N_coef);
parfor i = 1:N_boot
    dta2 = dta;
    decisions2 = unique(dta2.decision_id);
    Ndecisions2 = size(decisions2,1);
    decision_id = randsample(decisions2,Ndecisions2,true);
    sampled_id2 = table(decision_id, 'VariableNames', {'decision_id'});
    resample_dta = innerjoin(sampled_id2,dta2,'Keys','decision_id');
    resample_dta = table2array(resample_dta);
    result1 = mean(resample_dta(:,1:4));
    coeff_out2(i,:) = result1;
end

This version fails at the innerjoin call rather than earlier, at the table conversion.

The conversion to a table ran into problems when no variable names were supplied at construction. That could hypothetically be explained if the variable names were not guaranteed to be the same on the workers, because by default a table takes each column name from the name of the variable being converted.

We could hypothesize that something similar might be happening with the innerjoin.

I am not sure how to fix it yet, as I am still trying to figure out what the intention of the code is, especially in regard to what should happen when there are multiple table entries with the same key.

Or is it safe to assume that the decision_id values will be unique? If so, the call to unique would seem to be redundant.

Comments

Walter Roberson, 5 May 2018

Right, but to do this efficiently I need to know whether decision_id is unique in dta, and if it is not, what sampling with it should mean.

Boram Lim, 5 May 2018

It is not unique. As shown in the example in the link, I first need to sample 5 ids from the unique decision_id values and then produce a new data set from them (it is for bootstrapping). Do you understand what I want to do in the link? Using innerjoin worked in a plain for-loop, as in the answer to the question in the link, but it seems I need to find an alternative way to do this in parfor.
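One possible alternative, sketched here rather than taken from the thread, avoids innerjoin inside the loop altogether: precompute, for each unique id, the rows of dta that carry it, then assemble each bootstrap sample by concatenating those row lists. The names X, rowsById, nIds and pick are hypothetical, and the data are made up. Row order differs from the innerjoin output, which does not affect column means.

```matlab
% Hypothetical sample data with repeated decision_id values.
decision_id = randi([1 9], 50, 1);
d1 = randi([-10 10], 50, 1);
d2 = randi([-2 2], 50, 1);
d3 = randi([0 255], 50, 1);
dta = table(decision_id, d1, d2, d3);

X = table2array(dta);                     % numeric copy of the data
[ids, ~, grp] = unique(dta.decision_id);  % grp(k) indexes ids for row k
% rowsById{j} lists the row numbers of dta whose decision_id equals ids(j).
rowsById = accumarray(grp, (1:height(dta))', [], @(r) {r});

N_boot = 5;
nIds = numel(ids);
coeff_out2 = zeros(N_boot, size(X, 2));
parfor i = 1:N_boot
    pick = randi(nIds, nIds, 1);          % sample id positions with replacement
    rows = vertcat(rowsById{pick});       % all rows of each sampled id, repeats included
    coeff_out2(i, :) = mean(X(rows, :), 1);
end
```

Like the join, an id drawn twice contributes all of its rows twice, so the resample keeps whole decision groups together.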
