Minimize Difference in Partition Sums from Experimental Data
I have a 2,000-row dataset of measured data. I would like to create 60 partitions of 30 measurements each (discarding the 200 outliers, or "worst contributors"), with the partitions chosen to minimize the differences between the partition sums. Put more simply, I want the sum of the measurements in each partition to be as close as possible to the same value.
My first attempt was based on a random sampling approach, which was inefficient as expected. I am considering a histogram-based approach for my next attempt, but wanted to sample the community for ideas or best practices first.
Thanks!
Comments: 2
Bjorn Gustavsson
26 Nov 2020
This sounds like a variant of the knapsack problem. That similarity makes me think it is a hard problem, but it also means that there should be algorithms available for it...
Bruno Luong
27 Nov 2020
Edited: Bruno Luong
27 Nov 2020
Do you really want to find the best partitions (which is very hard to solve), or do you just want partitions whose sums are "close enough"?
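For the "close enough" case, a minimal heuristic sketch may help frame the discussion. It is written in Python for illustration, not taken from any reply in this thread, and makes no claim of optimality. Assumptions: "outliers" are taken to be the 200 values farthest from the median (the question does not define them), and the function names `trim_outliers`, `greedy_partition`, and `refine_by_swaps` are hypothetical. The idea is a largest-first greedy assignment into fixed-size partitions, followed by pairwise swaps between the largest-sum and smallest-sum partitions.

```python
import random
import statistics

def trim_outliers(values, n_keep):
    # Assumed outlier rule: keep the n_keep values closest to the median.
    med = statistics.median(values)
    return sorted(values, key=lambda v: abs(v - med))[:n_keep]

def greedy_partition(values, n_parts, part_size):
    # Largest-first greedy: each value goes to the partition that
    # currently has the smallest sum and still has room.
    assert len(values) == n_parts * part_size
    parts = [[] for _ in range(n_parts)]
    sums = [0.0] * n_parts
    for v in sorted(values, reverse=True):
        open_parts = [i for i in range(n_parts) if len(parts[i]) < part_size]
        i = min(open_parts, key=lambda k: sums[k])
        parts[i].append(v)
        sums[i] += v
    return parts

def refine_by_swaps(parts, max_iter=2000):
    # Swap one element between the largest-sum and smallest-sum partitions
    # whenever that narrows the gap between them; stop when no swap helps.
    # Partition sizes never change because elements are exchanged one for one.
    for _ in range(max_iter):
        sums = [sum(p) for p in parts]
        hi = max(range(len(parts)), key=lambda k: sums[k])
        lo = min(range(len(parts)), key=lambda k: sums[k])
        gap = sums[hi] - sums[lo]
        best = None
        for a, x in enumerate(parts[hi]):
            for b, y in enumerate(parts[lo]):
                d = x - y  # after the swap the gap becomes gap - 2*d
                if 0 < d < gap and (best is None or abs(gap - 2 * d) < best[0]):
                    best = (abs(gap - 2 * d), a, b)
        if best is None:
            break
        _, a, b = best
        parts[hi][a], parts[lo][b] = parts[lo][b], parts[hi][a]
    return parts

# Synthetic stand-in for the 2,000 measurements (not real data).
random.seed(0)
data = [random.gauss(100.0, 15.0) for _ in range(2000)]
kept = trim_outliers(data, 60 * 30)
parts = greedy_partition(kept, n_parts=60, part_size=30)
spread_greedy = max(map(sum, parts)) - min(map(sum, parts))
parts = refine_by_swaps(parts)
spread_refined = max(map(sum, parts)) - min(map(sum, parts))
print("spread after greedy:", spread_greedy)
print("spread after swaps: ", spread_refined)
```

Each accepted swap moves both affected sums strictly inside the previous largest and smallest sums, so the spread never grows, but the procedure can stall in a local optimum. An exact solution would need something like an integer-programming formulation instead.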
Answers (1)
Bjorn Gustavsson
26 Nov 2020
Comments: 2
Jeffrey Corbets
27 Nov 2020
Bjorn Gustavsson
27 Nov 2020
That's a bit of a bummer, and also slightly confusing to me. I cannot see why they should be that restricted; perhaps it is a somewhat artificial limitation (since you will be using finite-precision numbers, they are not really real numbers anyway?). Perhaps some of the algorithms can be adapted anyway...
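If the restriction in question is that the available algorithms expect integer inputs (an assumption, since the thread does not spell this out), the finite-precision remark suggests one simple adaptation: round the measurements to a fixed number of decimals and work in integer multiples of that resolution. A minimal Python sketch, with the hypothetical helper name `to_scaled_ints`:

```python
def to_scaled_ints(values, decimals=3):
    # Express each value as an integer count of 10**-decimals units.
    # Anything finer than the chosen resolution is rounded away.
    scale = 10 ** decimals
    return [round(v * scale) for v in values], scale

# Hypothetical measurements; divide an integer sum by `scale` to recover units.
ints, scale = to_scaled_ints([1.25, 2.0, 0.5], decimals=2)
print(ints, scale)
```

The choice of `decimals` trades accuracy for problem size, since many integer partition algorithms scale with the magnitude of the inputs.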