MATLAB Answers

Quick removal of records in structure

Maciej Trzeciak 16 Jun 2020
Edited: per isakson 17 Jun 2020
I am dealing with a large amount of data stored in structure arrays. I start with a structure array struct1 with several fields, and I need to recast it into a structure array struct2 with different fields. Let's say every 8 records of struct1 go into 1 record of struct2. Both are pretty big, so I cannot keep both in RAM at the same time. I therefore do the conversion in a loop, and after each struct1 ==> struct2 step I delete the 8 records just used from struct1, simply with:
struct1(i1:i2) = [];
I loop backwards, so deleting does not shift the indices of the remaining records. This way struct1 gets smaller as struct2 gets bigger, and my memory is fine. The problem is that the delete operation is horribly inefficient.
Is there any smart (and quick) way to delete the records in a structure?
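For reference, the loop described above might look like the following minimal sketch. The field names `a` and `b`, the toy sizes, the `sum` placeholder for the real conversion, and the pre-allocation of struct2 are all assumptions made for illustration; they are not taken from the actual data.

```matlab
N = 8;                                        % records of struct1 per record of struct2
struct1 = struct('a', num2cell(1:32));        % toy input; field 'a' is hypothetical
nOut    = numel(struct1) / N;                 % assumes numel(struct1) is a multiple of N
struct2 = repmat(struct('b', []), 1, nOut);   % output; field 'b' is hypothetical

for k = nOut:-1:1                             % loop backwards over groups of N records
    i1 = (k-1)*N + 1;
    i2 = k*N;
    struct2(k).b = sum([struct1(i1:i2).a]);   % placeholder for the real recast
    struct1(i1:i2) = [];                      % delete the records just used
end
```

After the loop struct1 is empty and struct2 holds one record per group of N input records.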

Walter Roberson 16 Jun 2020
Sometimes it becomes easier to think in terms of
idx_to_save = setdiff(1:length(struct1), i1:i2);
struct1 = struct1(idx_to_save);
Especially if you have several groups to delete.
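As a small illustration of the "several groups" case, the sketch below removes three index ranges with a single indexing operation. The toy struct array, the field name `val`, and the chosen ranges are assumptions for the example only.

```matlab
% Toy struct array standing in for struct1; field 'val' is hypothetical
struct1 = struct('val', num2cell(1:20));

% Several index ranges to delete, collected in one vector
idx_to_delete = [3:5, 9:12, 18:20];

% Keep everything else with one indexing operation instead of three deletions
idx_to_save = setdiff(1:length(struct1), idx_to_delete);
struct1 = struct1(idx_to_save);

disp([struct1.val])   % values of the records that remain
```

Because setdiff returns its result sorted, the surviving records keep their original order.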
per isakson 16 Jun 2020
@Walter, I fail to reproduce your observation on R2018b/Win10.
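% Two struct arrays of the same size, one per deletion method
sa1 = struct( 'f1', num2cell( randi([0,9],1e6,1) ) );
sa2 = struct( 'f1', num2cell( randi([0,9],1e6,1) ) );
ix1 = 2e5;
ix2 = 9e5;
tic, sa1(ix1:ix2) = []; toc                  % delete by assigning []
tic, sa2 = sa2([1:ix1-1,ix2+1:end]); toc     % keep the rest by indexing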
>> cssm
Elapsed time is 0.051256 seconds.
Elapsed time is 0.063430 seconds.
>> cssm
Elapsed time is 0.063631 seconds.
Elapsed time is 0.063024 seconds.
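Single tic/toc measurements of this size fluctuate from run to run, as the two runs above show. A slightly more robust comparison, sketched here under the assumption that a median over a few repetitions is good enough, could look like this. The array size and repetition count are arbitrary choices, not values from the test above.

```matlab
sa  = struct('f1', num2cell(randi([0,9], 1e5, 1)));   % smaller array than above, to keep the run short
ix1 = 2e4;
ix2 = 9e4;
nRep = 5;                                             % number of repetitions (arbitrary)
tDel = zeros(nRep, 1);
tIdx = zeros(nRep, 1);

for k = 1:nRep
    sa1 = sa;                                         % fresh copies for each repetition
    sa2 = sa;
    tic; sa1(ix1:ix2) = [];                 tDel(k) = toc;   % delete by assigning []
    tic; sa2 = sa2([1:ix1-1, ix2+1:end]);   tIdx(k) = toc;   % keep the rest by indexing
end

fprintf('delete: %.4f s, index: %.4f s (median of %d runs)\n', ...
        median(tDel), median(tIdx), nRep);
```

Both statements leave the same elements behind, so any difference in the medians reflects the cost of the operation itself.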
Walter Roberson 17 Jun 2020
In the past, people have posted timings in which there was a clear difference. I do not recall the details at the moment.



per isakson 17 Jun 2020
Edited: per isakson 17 Jun 2020
Caveat: I don't fully understand your use case.
I assume that
  • "record" is what in the documentation is called "element of structure array".
  • the only purpose of deleting elements of struct1 is to free memory
  • struct2 is the only output of the process
  • the number N of records per group ("Let's say each 8 records of struct1"), whether eight or not, is known beforehand
I see two problems:
  • limit the number of times the large structure is rewritten (i.e. modified)
  • avoid "out of memory"
A possible approach:
  • split struct1 into a number of subarrays and save those to a mat-file. The size of the subarrays should be a multiple of N. Or let the subarrays overlap by N elements.
  • pre-allocate struct2
  • process one subarray at a time in a loop. Do not delete elements of the subarray; overwrite it with the next subarray.
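The last three steps above (split and save, pre-allocate, process one subarray at a time) can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names `val` and `total`, the chunk size, the `chunk_%03d.mat` file naming, one MAT-file per subarray, and the `sum` placeholder for the real conversion are all hypothetical.

```matlab
N        = 8;                      % records of struct1 per record of struct2
chunkLen = 4 * N;                  % subarray size, a multiple of N (arbitrary choice)

% Toy input standing in for struct1; field 'val' is hypothetical
struct1 = struct('val', num2cell(1:96));
nChunks = numel(struct1) / chunkLen;     % assumes exact division
nOut    = numel(struct1) / N;

% Step 1: split struct1 into subarrays and save each one to its own MAT-file
folder = tempname;
mkdir(folder);
for k = 1:nChunks
    chunk = struct1((k-1)*chunkLen + (1:chunkLen));
    save(fullfile(folder, sprintf('chunk_%03d.mat', k)), 'chunk');
end
clear struct1 chunk                      % free the large array once, not piece by piece

% Step 2: pre-allocate struct2; field 'total' is hypothetical
struct2 = repmat(struct('total', []), 1, nOut);

% Step 3: process one subarray at a time; each load overwrites the previous subarray
out = 0;
for k = 1:nChunks
    S = load(fullfile(folder, sprintf('chunk_%03d.mat', k)), 'chunk');
    chunk = S.chunk;
    for j = 1:N:chunkLen
        out = out + 1;
        struct2(out).total = sum([chunk(j:j+N-1).val]);   % placeholder for the real recast
    end
end

rmdir(folder, 's');                      % remove the temporary files
```

In this sketch struct1 is never modified element by element; it is written to disk once and cleared once. For real data, note that variables larger than 2 GB can only be saved with the '-v7.3' option of save.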
