MEX functions not accessible from a parpool declared as "Threads" instead of "Processes"
I have the following code. It runs completely fine in a parpool declared as Processes. However, when I change it to Threads, I run into two issues:
- memmapfile results in a crash
- decayCube_avx2 results in a crash
The fact that memmapfile crashes isn't a huge deal; in a threads environment I can handle the memory sharing in some other way. The problem is that my MEX functions written in C++ seemingly don't run. I've checked the path variable inside this function, and the directory containing those MEX files is on the path. I'm running the latest MATLAB R2024b Update 6, but the issue was the same on Update 5. I tested primarily on Linux (on an architecture that is not officially supported), but I was able to replicate the issue on Windows 11.
I would really like to use Threads, since Processes consume a whole lot of RAM, and on a 16 GB system I cannot declare a pool with more than 4 of them without running out of memory.
function processBatch(buffer, spreadPattern, rawCubeSize, yawBins, pitchBins, processRaw, processCFAR, decay)
    % ... some other cases here
    cfarCubeSize = rawCubeSize([1 3 4]);
    cfarCube = memmapfile('cfarCube.dat', ...
        'Format', {'single', cfarCubeSize, 'cfarCube'}, ...
        'Writable', true, ...
        'Repeat', 1);
    if (decay)
        batchDecay = single(prod([buffer.decay]));
        decayCube_avx2(cfarCube.Data.cfarCube, batchDecay);
    end
    for i = 1:numel(buffer.yawIdx)
        contribution = buffer.cfar(:, i);
        if (decay)
            batchDecay = single(prod(buffer.decay(i:end)));
            decayCube_avx2(contribution, batchDecay);
        end
        cfarCube.Data.cfarCube(:, buffer.yawIdx(i), buffer.pitchIdx(i)) = contribution;
    end
end
Just to be clear, this issue arises even with MEX functions that don't access memmapfile data, like here. For example, the function below uses only local variables, and the result is the same when it is called from a Thread.
#include "mex.h"
#include <immintrin.h>

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    // Validate inputs
    if (nrhs != 2) {
        mexErrMsgTxt("Two inputs required: adjPattern and rangerDoppler.");
    }
    if (!mxIsSingle(prhs[0]) || !mxIsSingle(prhs[1])) {
        mexErrMsgTxt("Inputs must be single-precision.");
    }

    // Get dimensions of adjPattern (MxN) and rangerDoppler (PxQ)
    const mwSize *adjDims = mxGetDimensions(prhs[0]);
    mwSize M = adjDims[0]; // Yaw dimension
    mwSize N = adjDims[1]; // Pitch dimension
    const mwSize *rangeDopplerDims = mxGetDimensions(prhs[1]);
    mwSize P = rangeDopplerDims[0]; // Fast time (range) dimension
    mwSize Q = rangeDopplerDims[1]; // Slow time (doppler) dimension

    // Create 4D output array [P, Q, M, N] - Fast time x Slow time x Yaw x Pitch
    mwSize outDims[4] = {P, Q, M, N};
    plhs[0] = mxCreateNumericArray(4, outDims, mxSINGLE_CLASS, mxREAL);
    float *output = (float *)mxGetData(plhs[0]);

    // Get input data pointers
    float *adjPattern = (float *)mxGetData(prhs[0]);
    float *rangerDoppler = (float *)mxGetData(prhs[1]);

    // Pre-calculate the total number of elements in the range-Doppler map
    mwSize rdSize = P * Q;

    // Iterate over each antenna pattern position (yaw, pitch)
    for (mwSize j = 0; j < N; ++j) {
        for (mwSize i = 0; i < M; ++i) {
            // Get pattern value for this yaw/pitch combination
            float patternVal = adjPattern[i + j * M];
            // Calculate base index in the output array
            mwSize baseIdx = i * rdSize + j * rdSize * M;
            // Process all range-doppler points at once with vectorized operations
            // This leverages the contiguous memory of range and doppler dimensions
            mwSize idx = 0;
            // Use SIMD for vectorized operations on contiguous memory
            __m256 pattern_vec = _mm256_set1_ps(patternVal);
            for (; idx + 7 < rdSize; idx += 8) {
                __m256 rangeDoppler_vec = _mm256_loadu_ps(&rangerDoppler[idx]);
                __m256 result = _mm256_mul_ps(rangeDoppler_vec, pattern_vec);
                _mm256_storeu_ps(&output[baseIdx + idx], result);
            }
            // there should never be a case where P*Q is not multiple of 8
            // for (; idx < rdSize; ++idx) {
            //     output[baseIdx + idx] = rangerDoppler[idx] * patternVal;
            // }
        }
    }
}
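The kernel above leaves its scalar tail loop commented out, so it is only correct while P*Q stays a multiple of 8. One way to check both that assumption and the column-major index arithmetic outside MATLAB is a plain scalar reference to compare against. The following is a minimal sketch, not the poster's code: the name expandPatternReference and the std::vector interface are hypothetical, and it uses ordinary loops instead of AVX2 intrinsics.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar reference for the MEX kernel (illustration only).
// adjPattern:   M x N values, column-major (yaw x pitch).
// rangeDoppler: P x Q values, column-major; only rdSize = P * Q matters here.
// Returns a P x Q x M x N column-major array in which
//   out[idx + rdSize * (i + M * j)] == rangeDoppler[idx] * adjPattern[i + M * j].
std::vector<float> expandPatternReference(const std::vector<float>& adjPattern,
                                          const std::vector<float>& rangeDoppler,
                                          std::size_t M, std::size_t N) {
    const std::size_t rdSize = rangeDoppler.size();
    std::vector<float> out(rdSize * M * N);
    for (std::size_t j = 0; j < N; ++j) {
        for (std::size_t i = 0; i < M; ++i) {
            const float patternVal = adjPattern[i + j * M];
            // Same base index as the MEX kernel: i * rdSize + j * rdSize * M
            const std::size_t baseIdx = i * rdSize + j * rdSize * M;
            // Plain loop, so any rdSize works (no multiple-of-8 requirement)
            for (std::size_t idx = 0; idx < rdSize; ++idx) {
                out[baseIdx + idx] = rangeDoppler[idx] * patternVal;
            }
        }
    }
    return out;
}
```

Comparing this output element by element with the MEX result for a small input whose P*Q is not a multiple of 8 would show whether the missing tail loop leaves trailing elements at zero.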
Comments: 2
Walter Roberson
11 May 2025
To confirm:
You are trying this with parpool("threads") instead of using backgroundPool() ?
Accepted Answer
Walter Roberson
11 May 2025
Check Thread Supported Functions
In general, functionality in Graphics, App Building, External Language Interfaces, Files and Folders, and Environment and Settings is not supported.
(emphasis added.)