Write unicode flag to text file
Using the HTML decimal codes, it is possible to write Unicode characters to a text file. For example,
writecell({['down ' char(8595) ' arrow']}, 'a_filename','Encoding','UTF-8')
creates a text file containing the string 'down ↓ arrow'.
I'm trying to do the same but with national flags instead of arrows.
For example, the HTML decimal codes I found for the Italian flag 🇮🇹 are 58639, 127481 and 127470, but when I plug them into the previous command, the flag is not saved in the text file.
Is this because flags are not supported by MATLAB or because there are some errors in the code?
Accepted Answer
Rik
20 Aug 2020
MATLAB stores each character internally as a uint16, so a single char can only hold values up to 65535. That means only your first character is supported:
isvalidchar = double(uint16(inf)) > [58639, 127481, 127470]
% 1 0 0
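A quick way to see the same limit is a round trip through char. This is only a sketch; the variable names are mine, and MATLAB may print a warning about out-of-range values when it converts the two large codes.

```matlab
% Round-trip the three codes from the question through char.
% A char holds 16 bits, so anything above 65535 cannot come back unchanged.
codes = [58639, 127481, 127470];
roundtrip = double(char(codes));   % may warn about out-of-range values
survives = (roundtrip == codes);   % true only where the value fits in a char
disp(survives)
```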
As a workaround you can write the raw bytes yourself. You can read the Wikipedia page on UTF-8 for full details, but essentially you need to pick the shortest pattern below that has enough x positions to hold your character value in binary. Replace the x positions with the binary digits of your character value, padded with leading zeros.
%0xxxxxxx
%110xxxxx 10xxxxxx
%1110xxxx 10xxxxxx 10xxxxxx
%11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
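To make the patterns concrete, here is a small worked sketch for the code 127470 (hex 1F1EE), one of the two codes from the question that does not fit in a char. It is 65536 or larger, so it needs the four-byte pattern, which has 3 + 6 + 6 + 6 = 21 x positions. The variable names are only for illustration.

```matlab
% Worked example: fill the four-byte pattern
% 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx with the bits of 127470.
c = 127470;
bits = dec2bin(c, 21);                  % '000011111000111101110'
b = uint8([ ...
    bin2dec(['11110' bits(1:3)]), ...   % leading byte:  11110000 = 240
    bin2dec(['10' bits(4:9)]), ...      % continuation:  10011111 = 159
    bin2dec(['10' bits(10:15)]), ...    % continuation:  10000111 = 135
    bin2dec(['10' bits(16:21)])]);      % continuation:  10101110 = 174
disp(b)                                 % 240 159 135 174
```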
Here is a function you can use. Note that whatever file reader you use next will also have to support these newer constructs.
function b=char_to_UTF8_bin(c)
%equivalent to unicode2native(char(c),'UTF-8')
%for 55250:55295 and 57344:65535 (and 65536:2097151) the outputs don't match
c=double(c);
if ~isscalar(c)
    b=arrayfun(@char_to_UTF8_bin,c,'UniformOutput',0);
    b=horzcat(b{:});
    return
end
if c<128 %0xxxxxxx
    b=c;
elseif c<2048 %110xxxxx 10xxxxxx
    b=zeros(1,2);
    c=dec2bin(c,11);
    b(1)=bin2dec(['110' c(1:5)]);
    b(2)=bin2dec(['10' c(6:11)]);
elseif c<65536 %1110xxxx 10xxxxxx 10xxxxxx
    b=zeros(1,3);
    c=dec2bin(c,16);
    b(1)=bin2dec(['1110' c(1:4)]);
    b(2)=bin2dec(['10' c(5:10)]);
    b(3)=bin2dec(['10' c(11:16)]);
elseif c<2097152 %11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    b=zeros(1,4);
    c=dec2bin(c,21);
    b(1)=bin2dec(['11110' c(1:3)]);
    b(2)=bin2dec(['10' c(4:9)]);
    b(3)=bin2dec(['10' c(10:15)]);
    b(4)=bin2dec(['10' c(16:21)]);
else
    error('not a valid UTF-8 character')
end
b=uint8(b);
end
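A possible way to use the function for the flag is sketched below. It assumes char_to_UTF8_bin is saved on the MATLAB path, and the file name is only a placeholder. It also assumes the flag is the pair 127470 followed by 127481 (the regional indicator letters I and T, in that order). The code 58639 is left out because it falls in the Private Use Area (57344 to 63743), so it is probably not part of the standard sequence.

```matlab
% Assumes char_to_UTF8_bin from this answer is available on the path.
flag_bytes = char_to_UTF8_bin([127470 127481]);   % regional indicators I, T
txt_bytes  = [unicode2native('flag: ', 'UTF-8'), flag_bytes];

fid = fopen('a_filename.txt', 'w');   % placeholder file name
if fid < 0
    error('could not open the file for writing')
end
fwrite(fid, txt_bytes, 'uint8');      % write the bytes as-is, no re-encoding
fclose(fid);
```

Whether the result shows up as a flag depends on the editor or viewer used to open the file; some only display the two letters.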
Rik
17 Dec 2020
You're welcome.
And yes, I was aware, I just prefer to make it explicit with a direct call to either horzcat or vertcat. Not that it will actually matter, but that is what [] is calling under the hood, so there may be an imperceptible speed increase.
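For anyone reading along, the two spellings give the same result on a cell array of byte vectors. A tiny sketch with made-up values:

```matlab
% Both forms concatenate the cell contents into one row vector.
parts = {uint8([240 159]), uint8([135 174])};
with_brackets = [parts{:}];
with_horzcat  = horzcat(parts{:});
disp(isequal(with_brackets, with_horzcat))
```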