How do I avoid getting fooled by 'implicit expansion'?

I am trying my very best not to be a grumpy old man here, but I just wasted the better part of an hour because of the addition of 'implicit expansion', and I need to hear from the proponents of this feature how to live with it. The offending bit of code was:
smoothed=smooth(EEG.data(iC,:),EEG.srate*60,'moving');
deviation=EEG.data(iC,:)-smoothed;
This bit of code kept giving me a 'memory error', which was odd, since 'deviation' should be a lot smaller than 'EEG'. However, EEG is very large, and I don't have room for two of them in memory, so I figured there might be some intermediate step in the calculation that was tripping me up, or perhaps Windows was hogging the resources, or something else out of my control. It took two restarts of MATLAB, one reboot of the PC, and a complicated rewrite of the script (which still didn't fix it) to finally realize that 'smooth' (which is a MATLAB built-in) was changing the dimensions, such that I was subtracting a column vector from a row vector.
What is the good coding practice for this not to occur? I would have thought that having a language that complained when you performed an ill-defined operation was the good solution to this problem, but I can see from my google search that apparently a lot of people think that 'implicit expansion' is a great good. How do you avoid pitfalls such as this? (please note that if I hadn't been running out of memory, I might have made it a good deal further down the script before noticing that something was off).
Should I just never trust any command to preserve the dimensions of my arrays, even if it's an inbuilt one?
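For readers who have not met this failure mode, the following minimal sketch reproduces it at a harmless size. The variables x and s are assumed stand-ins for EEG.data(iC,:) and for the output of smooth; the toolbox function itself is not called here.

```matlab
% Assumed stand-in data, not the original EEG structure.
x = rand(1, 5);      % 1-by-5 row, like a single channel EEG.data(iC,:)
s = x(:);            % 5-by-1 column, mimicking the orientation described in the question
d = x - s;           % R2016b and later: implicit expansion, no error is raised
% d is 5-by-5, not 1-by-5: every element of x is paired with every element of s
```

With N samples the same subtraction produces N*N elements, which is consistent with the out-of-memory error described above.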

Comments: 10

Stephen23 on 19 Feb 2018
Edited: Stephen23 on 19 Feb 2018
"I would have thought that having a language that complained when you performed an ill-defined operation was the good solution to this problem, but I can see from my google search that apparently a lot of people think that 'implicit expansion' is a great good"
Programmers often seem to follow the philosophy of "an answer at any cost", and prefer that their code returns a result, silently making any conversions required in order to get it.
Sadly this does not reflect the rules of linear algebra, so anyone expecting strict mathematical operations will, at some point, be caught by something like this.
"Should I just never trust any command to preserve the dimensions of my arrays, even if it's an inbuilt one?"
Yes.
Contrary to the answers below, I would suggest using (:) a lot, even internally in your code, then it does not matter what orientation vector a user supplies.
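A minimal sketch of the (:) approach, assuming made-up data: both operands are forced into columns before the arithmetic, and the result is reshaped back to the orientation of the input. The variable names are illustrative only.

```matlab
x = rand(1, 8);               % row vector input (assumed stand-in)
s = cumsum(x).';              % an intermediate result that happens to come back as a column
d = x(:) - s(:);              % both operands are columns, so the result is 8-by-1
d = reshape(d, size(x));      % restore the orientation of the original input
```

The cost is one extra reshape; the benefit is that the subtraction no longer depends on the orientation either operand arrived in.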
I would point out that essentially the same hazards as this one have always existed in MATLAB, even before implicit expansion. For example, a scalar vector multiplication
out = scalar*rowvector
would produce the same memory explosion if "scalar" were somehow inadvertently made a column vector.
Great question and great discussion. Very informative. Thanks to all.
I agree, I hate this feature. PLEASE PLEASE MathWorks, make implicit expansion optional. I know we should be able to debug, but when I have a 10000-element vector and I make a minor mistake where I haven't checked the row/column vector status, I would far rather have an error than my computer freezing while trying to generate a 10^8-element array.
@Alexander Thomas - Sigh. Yes, you hate it. I know. This is the kind of thing that new users probably get tripped up on too often. However, a flag that anyone could turn on or off would also make code suddenly fail. It would cause failures in provided code. And then people would scream about bugs in the code.
Implicit expansion is a tool that you will appreciate as you gradually become a more experienced user. You learn to know what size your arrays are. You know if you created a row or a column vector. And as much as you think you want TMW to remove implicit expansion, which you think you now hate, it is part of the language and here to stay.
Or, you could migrate all of your work to an old enough MATLAB release that did not have implicit expansion. I think it was R2016b that introduced it. Of course then you will not have new toolboxes, nothing introduced beyond R2016a. You will not have new features that were introduced. And before long, you will find that old release will no longer even run on newer computers.
Such is life. We move forwards, even when not everyone wants to move, even if you don't think of it as progress. :)
@John D'Errico, why the high horse? When I wrote the original question, I was a veteran matlab user. It’s been four years more now, and I still hate IE. Just because people disagree with you, doesn’t mean you get to write off their opinion as a sign of inexperience. That’s just you being arrogant and presumptuous.
In the original example, I knew what the size of my array was (it was a row). I was expecting matlab not to change it for me. And I certainly wasn’t expecting my code to break because matlab changed the dimensions. I was used to matlab not changing it for me (in 2015, the ‘deviation’ variable would have had exactly the size of EEG.data(iC,:)).
But, sure, IE is here to stay. I’m not. I’m slowly moving my (extensive) code base out of matlab. Obviously, not just due to this infuriating little feature, but the lack-of-a-solution to problems like this certainly hasn’t made me stay any longer. You get to keep your party to yourself, and to be as condescending about it as you need to.
@Alexander Thomas: "make it optional to implicitly expand" - No, this cannot work. Remember that Matlab's toolbox functions expect implicit expansion to be enabled. If a switch were introduced, each and every toolbox function would have to store the former setting, enable the expansion, and finally restore the former value. This would waste too much time.
Much of the code I write here in the forum to answer questions makes use of implicit expansion. In the first 3 years I marked this with "% Auto-expanding, >= R2016b" and added a comment with the corresponding bsxfun call. Today most of the Matlab users in this forum are familiar with this feature. Removing it, or even allowing it to be switched off manually, would cause serious incompatibilities.
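The commenting convention described above might look like the following sketch. The arrays are made up for illustration; bsxfun with @plus is the explicit counterpart of the auto-expanding + operator.

```matlab
a = (1:3).';                   % 3-by-1 column
b = 10:10:40;                  % 1-by-4 row
c1 = a + b;                    % Auto-expanding, >= R2016b
c2 = bsxfun(@plus, a, b);      % explicit equivalent, also runs on older releases
% c1 and c2 are both 3-by-4 and contain identical values
```

Writing the bsxfun form makes the intent to expand visible at the call site, which is the property several commenters in this thread say they miss in the plain operator.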
It was always both a feature and a problem that Matlab tries to be smart. Preferring to operate along the first non-singleton dimension is convenient and dangerous, because beginners tend to forget that the dimensions can differ from their expectations. The length command was a really bad idea, and so was findstr(a,b): it searched for the shorter of the two inputs in the longer one. Of course "hold on" saves some keystrokes compared to "hold('on')", but the non-functional form of commands was frequently a source of bugs in the past, when Matlab's guess about whether the argument is a char vector or a number failed. The command plot(1:10, rand(1,10)) creates an axes automagically, and a surrounding figure as well, unless one already exists.
Implicit expansion is another smart feature. It would have been a safer decision to introduce new operators like $+, $*, etc. But MathWorks decided to make it transparent. I was not happy about it, because I prefer a program to stop if something unexpected happens, whereas the auto-magic produces an unexpected result instead. But if the auto-expanding is applied intentionally, it is an efficient and powerful tool.
@John D'Errico: "Implicit expansion is a tool that as you gradually become a more experienced user, you will appreciate" - I do not agree. I'd prefer an explicit operator instead of increasing the power of existing operators. But, as you said: Such is life, especially as a programmer. It is part of Matlab now and I use it. The questions in this forum show, that the expanding is not a frequent cause of bugs, e.g. rand(1e6) appears more frequently.
@kaare: Thanks for caring about the tone in the forum. As I read John's answer, the most emotional part is: "Sigh". I do neither see a high horse, nor arrogance, nor presumptuousness, nor a condescending statement. Of course, politeness is the base of an efficiently working community.
I continually waste hours of time having to debug code because of this poorly implemented 'implicit expansion', and I agree with John D'Errico that it has nothing to do with levels of experience. Many times old code runs into this new behavior, and it takes working through 1000s of lines to find that the bug was just a bad update. The other update that is a bane to so many users is the requirement that figure data either all have time zones or none have time zones. As Matlab continues to design updates, it should be more careful not to make past code obsolete, especially with simple operators like '+'.
Matt J on 19 Nov 2025
Edited: Matt J on 19 Nov 2025
Many times old codes run this new function and it takes working through 1000s of lines to find that the bug was just a bad update.
Not me. Hasn't happened to me once in the 9 years since implicit expansion was introduced.
I was caught by implicit expansion roughly 3 times when implicit expansion was first introduced. In each case, it was sloppy code that would have crashed before implicit expansion. Code along the lines of
function c = addme(a,b)
c = a + b;
end
When I originally wrote the code, I "knew" that I had to pass in two vectors with the same orientation, but I forgot about that a couple of months later and passed in vectors of mixed orientation. Sure, the first time I encountered it, it might have taken 20-ish minutes to track down, as I wasn't used to looking for such problems, but I quickly learned to ask the right questions and it stopped being a problem.
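One way to turn that remembered requirement into an enforced one is sketched below. This is an assumed variant of addme, not code from the thread: it still allows scalar expansion but raises an error for any other size mismatch. The error identifier is made up for the example.

```matlab
% Save as addme.m, or place at the end of a script (R2016b or later).
function c = addme(a, b)
% ADDME  Add two arrays, refusing silent implicit expansion.
% Scalars may still expand; any other size mismatch is reported as an error.
if ~(isscalar(a) || isscalar(b) || isequal(size(a), size(b)))
    error('addme:sizeMismatch', ...
          'Inputs must have the same size (got %s and %s).', ...
          mat2str(size(a)), mat2str(size(b)));
end
c = a + b;
end
```

With this guard, addme([1 2 3], [4 5 6]) returns [5 7 9], while addme([1 2 3], [4; 5; 6]) stops with an error instead of returning a 3-by-3 matrix.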


Accepted Answer

Jan on 19 Feb 2018
Edited: Jan on 16 Feb 2022
Should I just never trust any command to preserve the dimensions of my arrays,
even if it's an inbuilt one?
The shapes of the outputs of built-in functions have been subject to changes in the past, and I expect this to happen in the future also. Therefore I capture my assumptions about shapes in a "unit-test"-like function. So when I write
smoothed = smooth(EEG.data(iC,:), EEG.srate*60, 'moving');
I add this to the test function:
y = smooth(rand(1, 100), 5, 'moving');
if ~isrow(y)
error('SMOOTH does not return a row vector for a row vector input.');
end
Nevertheless, it is a lot of work to do this for all assumptions. In many cases I'm not even aware of what I assume, e.g. for strncmp('hello', '', 2), which has changed its behavior in the past also.
In your case it would have been smart and efficient if
deviation = EEG.data(iC,:) - smoothed
caused an error. Unfortunately the implicit expansion tries to handle this smartly, but in many cases it is smarter than the programmer. When it is intended, implicit expansion is nice and handy, but it is also an invitation for bugs. All we can do is live with it, because it is rather unlikely that TMW will remove this feature. But it cannot hurt to send this to them as an enhancement request.
To answer the actual question:
How do I avoid getting fooled by 'implicit expansion'?
Use Matlab < R2016b, at least for testing your code.

More Answers (4)

Guillaume on 19 Feb 2018
Yes, some complain about implicit expansion. For me, it's a logical expansion (pun intended) of the 1-D scalar expansion to N-D. On the other hand, I was also happy with bsxfun.
As to your problem, at the heart it's a design failure on your part I'm afraid. If you never validate your assumptions, you can expect that things go wrong. Instead of a single large script, use functions. The first thing that a function should do is validate that its inputs are as expected. If two vectors are expected as input, check they're the same size, etc. Whenever I write a function, the first thing written is the help, with a clear listing of the input and their requirements, followed by actual validation of these requirements.
In your case, a simple
assert(isequal(size(v1), size(v2)))
would have caught the problem.
The implicit expansion forces you to be more careful about the shape of your vectors. In my opinion, it's not a bad thing.
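The input validation described in this answer could also be written with validateattributes, as in the sketch below. The vectors v1 and v2 are made-up stand-ins, and the try/catch is only there so the example can show the failing case without stopping.

```matlab
v1 = rand(1, 100);          % row vector
v2 = rand(100, 1);          % column vector with the same number of elements
try
    % Require v2 to be numeric and exactly the same size as v1.
    validateattributes(v2, {'numeric'}, {'size', size(v1)});
    sizesMatch = true;
catch
    sizesMatch = false;     % taken here, because 100-by-1 is not 1-by-100
end
```

Inside a real function the call would normally stand on its own at the top, so that a mismatch stops execution before any arithmetic is attempted.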

Comments: 2

... I'm sorry, but this is not very helpful advice. We are, literally, talking about two lines of code. Your suggestion amounts to checking my assumptions every time I call a built-in routine. Would you have placed 'deviation=EEG.data(iC,:)-smoothed;' inside a subfunction?
John D'Errico on 19 Feb 2018
Edited: John D'Errico on 19 Feb 2018
Admittedly, I would not be using assert here to check dimensions. But the fact remains that I would KNOW the shape of my arrays. If you don't know positively what shape arrays and vectors are produced by any operation, then it pays to learn to be more careful.
When you use a function, make sure you know what it returns. Read the help. Try a test case, in case the help was not sufficiently clear for you.


Matt J on 19 Feb 2018
Well, the error message would have told you that the out-of-memory was originating in the line
deviation=EEG.data(iC,:)-smoothed;
I should think that checking the dimensions of the right hand side quantities in the debugger would have been your first step.

Comments: 1

As it happens, my first step was to check the size (in bytes) of the ingredients to the offending line. That did not help me, since 'smoothed' was, as expected, a lot smaller than 'EEG'. It is very possible that next time I get this error, I'll remember that implicit expansion is a thing, and that the memory error might not in fact have anything to do with the size of the array.


Petorr on 20 Jun 2022

I would like the debugger to automatically highlight where implicit array expansion is taking place. Is that option available? For example:
Here, A might be 100*1 and B, 1*500 so the highlighting would let me know that these are compatible unequal array sizes and will be implicitly expanded.

Comments: 2

No such option exists. Would you expect this option to always highlight that operation in your code? A might be 100-by-1 and B might be 1-by-500. But they may both be scalars. There are circumstances where a static analysis of the code could prove at parse-time that the operation will perform implicit expansion, but what about this one?
function z = computeProduct(x, y)
z = x.*y;
end
Should the .* in this code be highlighted or not?
In this example, it would only be highlighted while debugging within computeProduct, as in putting a breakpoint at the line z=x.*y. It would be comparable to the hover-tip that shows array dimensions. I use that often, even though it depends on the program state. I see how it might seem a little inconsistent since all other highlighting is based on static parsing, but some variation of it might be worth considering. If I get so good with the implicit sizing that I never need this feature, I'll be sure to come back and comment ; )
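Until something like that exists, a run-time check can approximate it. The sketch below defines a hypothetical helper, willExpand, which is not a MATLAB built-in; it reports whether two operands differ in size without either being a scalar. Such an expression could, as an assumption about workflow, be used as the condition of a conditional breakpoint.

```matlab
% Hypothetical helper, not part of MATLAB: true when two non-scalar operands differ in size.
% Note: it also returns true for incompatible sizes, where the operation would error anyway.
willExpand = @(x, y) ~isequal(size(x), size(y)) && ~isscalar(x) && ~isscalar(y);

A = rand(100, 1);
B = rand(1, 500);
r1 = willExpand(A, B);     % true:  100-by-1 with 1-by-500 expands to 100-by-500
r2 = willExpand(A, A);     % false: identical sizes
r3 = willExpand(A, 2);     % false: plain scalar expansion
```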


Rav on 25 Mar 2024

Ok, since I use eeglab I managed to trace your screw-up.
EEG.data is sort-of horizontal, but 'smooth' outputs a vertical array.
Actually, the error message also shows what's wrong; look at the dimensions. That's not the expansion.
Correct debugging would be to call "size(smoothed)" and "size(EEG.data)".
Correct solution to your code is this:
deviation=EEG.data(iC,:)-smoothed.';
'smoothed' here is transposed - a difference of 2 symbols
A good coding practice is to run all unfamiliar and failing functions through console by hand and just look at the output - not just console output, but also environment.
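The transpose above assumes the smoothed result is always a column. A variant that does not depend on that assumption is sketched here; x and smoothed are made-up stand-ins for EEG.data(iC,:) and the output of smooth.

```matlab
x = rand(1, 6);                                  % stand-in for EEG.data(iC,:)
smoothed = x(:);                                 % stand-in for a result that came back as a column
deviation = x - reshape(smoothed, size(x));      % force smoothed into the shape of x first
% deviation is 1-by-6, the same orientation as x
```

reshape raises an error if the element counts differ, so a genuine length mismatch is still reported rather than silently expanded.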


Asked: 19 Feb 2018
Commented: 19 Nov 2025
