Is applying a binary operator (+,-,*,/) to char arrays supported by MATLAB or just a "trick"

Jon on 20 Jan 2022
Commented: Jon on 21 Jan 2022
I recently came across the use of the following way to obtain the individual digits of a binary number as an array of doubles, for example to get the individual binary digits of the decimal value 6:
digits = '110' - '0'
which provides the result:
digits =
1 1 0
This really surprised me. I had no idea that subtraction of character arrays was even defined.
Digging more deeply (https://www.mathworks.com/matlabcentral/answers/399557-explanation-of-num2str-x-0), I see that what is happening seems to be equivalent to
a = double('110')
b = double('0')
digits = a - b
where double() creates a vector holding the Unicode code point of each character. So we have:
a =
49 49 48
b =
48
digits =
1 1 0
So if MATLAB interprets the binary operator - applied to two character arrays as first converting them to vectors of doubles (the Unicode code point of each character) and then subtracting those, it makes sense that '110' - '0' = [1 1 0].
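The equivalence described above can be checked directly. This is only a minimal sketch; the variable names (txt, viaShortcut, viaDouble) are made up for illustration and are not from the linked thread.

```matlab
% Compare the char-array shortcut with the explicit conversion.
txt = '110';
viaShortcut = txt - '0';                  % char minus char
viaDouble   = double(txt) - double('0');  % convert first, then subtract
sameValues  = isequal(viaShortcut, viaDouble)
resultClass = class(viaShortcut)          % the result is double, not char
```

Note that the scalar '0' is expanded against the three-element vector, exactly as a numeric scalar would be.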
I then experimented further and found that not only the minus operator but also +, * and / give results when applied to character arrays.
In each case, MATLAB apparently first converts the pair of character arrays to vectors of doubles using the Unicode code points of the individual characters and then applies the operation to those. So for example:
>> '110'+'123'
ans =
98 99 99
>> '110'*'2'
ans =
2450 2450 2400
>> '110'/'2'
ans =
0.9800 0.9800 0.9600
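One caveat about the examples above: * and / are MATLAB's matrix operators, and they work here only because the second operand, '2', is a scalar. With two equal-length char vectors the element-wise forms .* and ./ are probably what is intended. A small sketch (the names a and b are illustrative):

```matlab
a = '110';
b = '123';
elementwiseProduct  = a .* b     % [49*49 49*50 48*51] = [2401 2450 2448]
elementwiseQuotient = a ./ b;    % [49/49 49/50 48/51]
try
    a * b;                       % 1x3 times 1x3: inner dimensions disagree
    matrixProductWorked = true;
catch
    matrixProductWorked = false; % matrix multiplication raises an error here
end
```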
My question though, is where is it documented that applying binary operators to character arrays is even defined? I couldn't seem to find this in the MATLAB documentation, but maybe I missed it.
Is this considered just a "trick" and maybe not even behavior that can be depended upon, or is it a supported operation in MATLAB?
Comments: 2
Steven Lord on 20 Jan 2022
"Note that using arithmetic operators on strings doesn't behave the same way. Addition is bafflingly treated as a concatenation operator for strings."
That's correct. See the last entry in the FAQ in the documentation. Using + for string concatenation is common in a number of other languages.
Jon on 20 Jan 2022
Thanks all for your very interesting and informative answers to my question. There were a lot of good answers here, so it was hard to select just one to accept.


Accepted Answer

John D'Errico on 20 Jan 2022
Edited: John D'Errico on 20 Jan 2022
The plus operator has long been supported for character arrays. I recall it existing for as long as I have been using MATLAB, which goes back to around the late 1980s, so 35 years or so.
Unary plus converts characters to their ASCII equivalents.
+'abcde'
ans = 1×5
97 98 99 100 101
And numerical operations on character arrays convert them to their ASCII-equivalent doubles first.
2 * 'ABC'
ans = 1×3
130 132 134
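Because the result of such an operation is a double array, getting text back requires an explicit char() call. A minimal sketch of the round trip (the variable names are illustrative only):

```matlab
codes    = 'ABC' + 1          % double array: [66 67 68]
shifted  = char(codes)        % back to text: 'BCD'
isDouble = isa(codes, 'double')
```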
And I am confident this will remain the case in the future, as MathWorks tries hard to keep features like this supported for compatibility unless there is a compelling reason to change them. There are huge bases of code that use this trick, so I doubt they will want to force many users to go into existing code and hack it just so they can remove an old feature.
At the same time, they MAY be preparing us for a long-term eventuality where unary plus will no longer apply to character arrays, because a quick search through the docs does not show this capability. Of course, + does not do the same thing when applied to strings.
1 + "ABC"
ans = "1ABC"
Anyway, my guess is this feature will be supported long after I am dead and converted to soylent green, even if it is not documented. There are other ways to convert a character array to its ASCII equivalents. double() may be the preferred way now:
double('abcde')
ans =
97 98 99 100 101
though I would need to think. (I am so used to just using the plus operator.)
Comments: 2
Jon on 20 Jan 2022
Thanks very much, for your informative reply. As I wasn't aware of how the binary operators acted on char arrays, I definitely didn't know about the unary plus. By the way, did you know that the movie of Soylent Green takes place in 2022?
John D'Errico on 20 Jan 2022
Sadly, I had heard that. We may want to avoid the green smoothies. :)


More Answers (4)

Matt J on 20 Jan 2022
I can't find the documentation, but binary operators in MATLAB can't define themselves. It had to be deliberate.
Also, the char binary operators have been there for 40 years, so I don't think they'd dare remove them now.
Also, the same definitions are found in C/C++ and are very commonly used for the kind of purpose shown in your example.

Yongjian Feng on 20 Jan 2022
A char array is a vector of chars. So you are basically doing a vector operation, right?

DGM on 20 Jan 2022
Edited: DGM on 20 Jan 2022
MATLAB is a weakly-typed language; consequently, implicit type conversions tend to happen all the time. I don't know where (or if) this specific behavior is explicitly documented, although the docs for these operators mention that 'char' is a supported type.
As to whether it can be used safely, I would say yes. It's fairly routine to use this sort of approach for converting text representations of numbers into numeric vectors:
mytextnum = '01234567';
mynum = mytextnum-'0'
Or perhaps for converting text into numeric indices:
bunchofchars = '123abcXYZ';
positioninalphabet = lower(bunchofchars(isletter(bunchofchars)))-'a'+1
Note that using arithmetic operators on strings doesn't behave the same way. Addition is treated as a concatenation operator for strings.
"110"+"0"
Trying to use the other operators on a pair of strings would result in an error.
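The difference between char arrays and strings described above can be seen side by side. This is just a sketch with made-up variable names; the try/catch is only there so the failing subtraction does not stop the script.

```matlab
s = "110";                          % string scalar, not a char vector
concatenated     = s + "0"          % "1100": plus concatenates strings
digitsFromString = char(s) - '0'    % convert to char first: [1 1 0]
try
    s - "0";                        % minus is not defined for strings
    minusWorked = true;
catch
    minusWorked = false;
end
```

Converting with char() first is one way to keep using the digit trick when the text arrives as a string.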
Comments: 2
Stephen23 on 20 Jan 2022
Edited: Stephen23 on 20 Jan 2022
"Addition is bafflingly treated as a concatenation operator for strings"
It is not very baffling:
• a character array in memory is really just an array of numbers that are interpreted as Unicode code points. Nothing more than that. Simply an array of numbers just like any other numeric array, onto which MATLAB basically hangs a note saying, "oh by the way, these are characters".
• a string is a container class, as the documentation explicitly states: "A string array is a container for pieces of text." source: https://www.mathworks.com/help/matlab/characters-and-strings.html
Arithmetic on an array of numbers is clearly trivially defined by the fact that they are numbers. MATLAB does not even have to do anything: they are already numbers! In an array!
But what does arithmetic on containers mean? Containers are not numbers. The meaning of PLUS is only due to that particular operator being overloaded for the STRING class, not due to any inherent property of how STRING arrays are stored in memory.
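The array-versus-container distinction shows up in a few basic queries. A minimal sketch (variable names are illustrative): a char vector has one element per character, while a string scalar is a single element whatever the length of its text.

```matlab
c = '110';                    % char vector: three elements
s = "110";                    % string scalar: one container element
charCount   = numel(c)        % 3
stringCount = numel(s)        % 1
textLength  = strlength(s)    % 3 characters inside the container
charCodes   = double(c)       % code points: [49 49 48]
parsedValue = double(s)       % text parsed as a number: 110
```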
One of my favorite things about MATLAB is the convenience of operating on actual arrays in memory without having to dig into C/whatever, as the character class neatly demonstrates. I suspect that older users appreciate this more.
DGM on 20 Jan 2022
Edited: DGM on 20 Jan 2022
That's basically why I think it is confusing. Let me clarify. I don't think it's confusing that addition doesn't add strings. I think it's confusing that addition concatenates instead of throwing an error. It's been almost 15 years since I touched a language that did this, so it seems wrong to "add" words. Maybe that's just a demonstration of how quickly and thoroughly I forget things.


Paul on 20 Jan 2022
Not directly on point to the question, but a related "feature" is that char vectors can be used directly to index into arrays:
data = rand(1,150);
isequal(data('a'),data(double('a')))
ans = logical
1
isequal(data('abc'),data(double('abc')))
ans = logical
1
However, there is an exception for ':'
isequal(data(':'),data(double(':')))
ans = logical
0
Because indexing with ':' is the same as indexing with a bare colon (:)
isequal(data(':'),data(:))
ans = logical
1
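The colon exception can be made more concrete by looking at what each form returns. A small sketch, with illustrative variable names: double(':') is the code point 58, so data(double(':')) picks one element, whereas the char ':' selects everything, even when it is held in a variable.

```matlab
data = rand(1,150);
colonCode   = double(':')        % 58
allElems    = data(':');         % 150x1 column, same as data(:)
oneElem     = data(colonCode);   % just data(58)
idx         = ':';               % still the char ':' when stored in a variable
viaVariable = data(idx);         % again all 150 elements
```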
Comments: 8 (6 older comments not shown)
Walter Roberson on 21 Jan 2022
There is no restriction that says that if you index by character that the array must be character itself.
map('s') = '1'
map = ' 1'
map('f') = '2'
map = ' 2 1'
% and then for example
code = 'sf'
code = 'sf'
decode = map(code)-'0'
decode = 1×2
1 2
map2('s') = 1
map2 = 1×115
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
map2('f') = 2
map2 = 1×115
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
% and then for example
code = 'sf'
code = 'sf'
decode = map2(code)
decode = 1×2
1 2
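The same lookup-table idea extends to whole character ranges. As a sketch only (the table name lut and the hexadecimal use case are assumptions added for illustration, not part of the comment above), a table indexed by character can decode lowercase hex digits:

```matlab
lut = zeros(1, 128);             % one slot per ASCII code point
lut('0':'9') = 0:9;              % digit characters map to 0..9
lut('a':'f') = 10:15;            % lowercase hex letters map to 10..15
hexDigits = lut('1f')            % [1 15]
value     = hexDigits * [16; 1]  % 1*16 + 15 = 31
```

Preallocating the table avoids the implicit growth seen with map2 above, at the cost of choosing the table size up front.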
Jon on 21 Jan 2022
Aha! Thanks so much.



R2021b
