How can I improve code efficiency?
Hello Experts,
I am a novice in this and would like to get help / advice.
I have written code that successfully loads and unpacks CAN message data in a for-loop. Once the messages are unpacked, I write them into a table for further processing and plotting. The code does the job, but it takes around 15 minutes (depending on the CAN data) to unpack the messages from ~200k lines (yes, it is ~200k). This can take even longer depending on the size of the CAN data file I receive, so I would like to reduce the time it takes to unpack the CAN messages. I am seeking expert advice on improving the code (if that is possible) to reduce the processing time.
To make it clear, I will briefly explain how I am unpacking the messages with the code below, which works in two steps.
The CAN file has data recorded every millisecond, with messages repeating over time. Every message in the CAN file has multiple signals / channels logged at the respective time step. First, the outer for-loop reads the message name for the first time step (say 0.1 second). Then the inner for-loop receives the message name, extracts the signals stored under that message name for that time step, and writes the signal information into a table.
for i = 1:length(unique(canData.Name))
    Signals = Msgs{idx(i)};
    MsgName = Msgs{idx(i),2}{1};
    for j = 1:size(Signals,1)
        db = Signals(j,:);
        value = double(unpack(Messages(i), db.StartBit, db.SignalSize, char(db.ByteOrder), char(db.Class))) * db.Factor + db.Offset;
        Name(j+offset) = db.Name(1);
        Time(j+offset) = Messages(i).Timestamp;
        Value(j+offset) = value;
    end
    offset = offset + size(Signals,1);
end
This process repeats for ~200k lines, which currently takes a long time. I have attached a screenshot of the canMsgs variable, with a red box marking an example message name and its signal info and showing how they repeat over time.
Please help me improve the code's efficiency, and let me know if you need any additional information.
Thanks in Advance. :)
Comments: 7
Guillaume
21 Jan 2020
I'm a bit confused by your code.
Is the "screenshot of canMsgs" actually a screenshot of the canData variable in your code? Your code doesn't have a canMsgs variable.
What is Msgs? It looks like it might be a table, possibly the same canMsgs or canData table, but in this case, the line:
Signals = Msgs{idx(i)};
would error. You can't {} index a table with a linear index.
We could do with an explanation of the type of each variable, in particular Name, Time and Value (not to be confused with value, which is just asking for trouble!)
As Bjorn answered, have you profiled the code? In particular, I would establish first that unpack isn't the bottleneck.
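To illustrate the point about brace indexing, here is a minimal sketch on a made-up table (the variable names `T`, `Name` and `NumSignals` are hypothetical and not taken from the original code). Brace indexing on a table needs both a row and a variable subscript; the single-subscript form is wrapped in try/catch because it is expected to throw.

```matlab
% Hypothetical two-row table standing in for Msgs (illustration only)
T = table({'EngineData'; 'BrakeData'}, [2; 3], ...
    'VariableNames', {'Name', 'NumSignals'});

% Two subscripts (row, variable): {} returns the cell content, {1} unwraps it
name1 = T{1, 'Name'}{1};

% Dot indexing on the variable, then ordinary indexing into the column
n2 = T.NumSignals(2);

% A single linear subscript is expected to be rejected for tables
linearWorked = true;
try
    x = T{1}; %#ok<NASGU>
catch err
    linearWorked = false;
    disp(err.message)
end
fprintf('name1 = %s, n2 = %d, linear {} indexing worked: %d\n', ...
    name1, n2, linearWorked);
```

If Msgs is a cell array rather than a table, Msgs{idx(i)} is valid, which is why the class of each variable matters here.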
Logesh Velusamy
21 Jan 2020
Guillaume
21 Jan 2020
Ok, it's a different table. Still if it's a table, Msgs{whatever} is an error.
I'm still trying to understand your code, but it's difficult since I don't know what canData is (possibly it's canMsgs, but you haven't answered that part). I also don't know what idx or Messages are. And you still haven't explained what Name, Time and Value are.
Logesh Velusamy
21 Jan 2020
Logesh Velusamy
21 Jan 2020
Edited: Logesh Velusamy
21 Jan 2020
Guillaume
21 Jan 2020
Still very confused
"'canData' is the data only part extracted from 'canMsgs' which does not include any Name or Signals"
On the first line of the code you've posted, you extract Name from canData!
What are the classes of Name, Time and Value? You said you're decoding the messages into a table, but these are clearly not tables.
Anyway, from the profile results, in particular file8, which is the script profile, you can see that it spends 50% of the time on the unpack line and 25% of the time on the dbInfo = SignalInfoTable(j,:); line. On the unpack line, I'm not sure whether the time is spent in unpack itself or in the table indexing. The two indexings could be combined into one, which would probably perform better.
Can you attach the whole Can_Test.m? It would be useful to see the actual code you're using, not a slightly edited version.
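One possible reading of the table-indexing suggestion, as a rough sketch only: pull the needed columns out of the signal table once, before the loop, so the loop body indexes plain arrays instead of slicing a one-row table on every iteration. The table contents and the `raw` vector below are invented stand-ins (`raw` plays the role of the integers that unpack would return), so this does not call unpack at all.

```matlab
% Hypothetical signal definitions (illustration only, not the real database)
SignalInfoTable = table([0; 8; 16], [8; 8; 16], [0.1; 1; 0.5], [0; -40; 0], ...
    'VariableNames', {'StartBit', 'SignalSize', 'Factor', 'Offset'});
raw = [100; 65; 2000];   % stand-in for the raw values unpack would return

% Extract each column once: one table access per column, not per row
factor = SignalInfoTable.Factor;
offs   = SignalInfoTable.Offset;
nSig   = height(SignalInfoTable);

% Loop over plain numeric arrays (cheap indexing)
value = zeros(nSig, 1);
for j = 1:nSig
    value(j) = raw(j) * factor(j) + offs(j);
end

% The scaling itself can also be done in one vectorised step
valueVec = raw .* factor + offs;
disp([value valueVec])
```

Whether this helps in the real script depends on how much of the 50% is table indexing rather than unpack itself, which the profiler's line-level view should show.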
Logesh Velusamy
21 Jan 2020
Accepted Answer
More Answers (1)
Bjorn Gustavsson
21 Jan 2020
The first step in improving code is to run the scripts and functions with the profiler on (see the documentation and help for profile) so that you can see which parts of your code use most of the time. Once that is determined, you can start to pick off the parts where most of the time is spent.
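A minimal sketch of that profiling workflow, with a dummy loop standing in for the real unpacking script (the loop and the variable names are placeholders). It collects the profile data programmatically and lists the most expensive functions; `profile viewer` opens the same data in the interactive report.

```matlab
profile on                      % start collecting timing data

% Placeholder workload; replace with a call to the actual script
s = 0;
for k = 1:1e5
    s = s + sqrt(k);
end

profile off                     % stop collecting
p = profile('info');            % struct holding the collected results

% List up to five functions with the largest total time.
% The table can be empty if only built-in code was executed.
ft = p.FunctionTable;
[~, order] = sort([ft.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.3f s  (%d calls)\n', ...
        ft(k).FunctionName, ft(k).TotalTime, ft(k).NumCalls);
end
```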
Even before that, you should have a look at the code advice you get in the MATLAB editor. It often points out where things look a bit dodgy (though the editor's code analysis doesn't know the exact sizes of your variables, the number of iterations and such), so you can start by cleaning out those warnings too.
In your case it seems that you dynamically grow your variables Name, Time and Value. That becomes expensive when you increase the variable sizes in increments of one. It is much better to preallocate the variables. Something like this:
maxSizeSignal = 12; % you might know this number, you might have to determine it before the loop
Value = zeros(1,length(unique(canData.Name))*maxSizeSignal); % this might be a little bit large
% then proceed as before.
Then after the loop you might prefer to cut the excess:
Value = Value(1:offset);
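Extending the same idea to all three outputs, here is a self-contained sketch with synthetic data. The message count, the upper bound of 12 signals per message, and the generated names and values are assumptions for illustration; in the real script they would come from the CAN data and the signal database.

```matlab
% Assumed sizes (illustration only)
nMsgs = 1000;
maxSignalsPerMsg = 12;
nMax = nMsgs * maxSignalsPerMsg;      % upper bound on the number of rows

% Preallocate every output to its maximum possible size
Name  = strings(nMax, 1);
Time  = zeros(nMax, 1);
Value = zeros(nMax, 1);

offset = 0;
for i = 1:nMsgs
    nSig = randi(maxSignalsPerMsg);   % stand-in for size(Signals,1)
    rows = offset + (1:nSig);         % rows filled by this message
    Name(rows)  = "Sig" + (1:nSig)';  % placeholder signal names
    Time(rows)  = i * 1e-3;           % placeholder timestamp
    Value(rows) = rand(nSig, 1);      % placeholder decoded values
    offset = offset + nSig;
end

% Cut the unused tail, then build the table once, outside the loop
Name  = Name(1:offset);
Time  = Time(1:offset);
Value = Value(1:offset);
T = table(Name, Time, Value);
```

Building the table once at the end avoids repeatedly growing or re-indexing a table inside the loop.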
HTH
Comments: 3
Logesh Velusamy
21 Jan 2020
Edited: Logesh Velusamy
21 Jan 2020
Bjorn Gustavsson
21 Jan 2020
Hello Logesh,
OK, I'll wait for the profile report. Then we can continue. Your description of "pre-defining variables with an empty table" sounds confusing to me: it is the actual variables Value, Name and Time that should each be initialized to their "final full size" to avoid wasting time on memory reallocation. But that will now wait for more information...
Logesh Velusamy
21 Jan 2020