Hello everyone. I would like to know whether MATLAB can read an Arabic document. Arabic is installed on my computer, but when I try to read the file it gives me: {'المملكة' 'المغربية'}. Do you have any idea what is wrong?

Accepted Answer

Walter Roberson on 28 Apr 2011

0 votes

How are you reading the file, and how are you displaying it? What is your locale set to? What is your font set to?

Comments: 3

Walter Roberson on 28 Apr 2011
I was thinking that perhaps Arabic was outside the range of characters representable in MATLAB, but it is not. The chart is at http://en.wikipedia.org/wiki/Arabic_%28Unicode_block%29
Please try
dec2hex(0 + doc{1}{1})
and look to see if the values that were stored look appropriate for the text. If they do, then it is a problem with the displaying of the text rather than the reading of it. If the values look wrong, then you may need to explore the possibility that the file is not UTF-8 encoded.
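A minimal sketch of this check, assuming doc is the cell array returned by textscan(fid,'%s'), so that the first word is the char vector doc{1}{1}. To keep the example self-contained, a two-character stand-in word is built from code points instead of being read from arabe.txt.

```matlab
% Stand-in for doc{1}{1}: alif (U+0627) followed by lam (U+0644).
% In a real session this value would come from textscan (assumption).
word = char([hex2dec('0627') hex2dec('0644')]);

% One row of hex digits per character; compare these with the
% Arabic Unicode block chart.
codes = dec2hex(double(word))

% Values inside 0600-06FF suggest the text was read correctly and
% that any remaining problem is in the display, not the reading.
isArabic = all(double(word) >= hex2dec('0600') & ...
               double(word) <= hex2dec('06FF'))
```

If the values fall outside that range, the file was probably decoded with the wrong encoding.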
Walter Roberson on 28 Apr 2011
It doesn't help to say "Its not working". Please show the first line of the output of dec2hex(0 + doc{1}{1}) and indicate the Unicode code points for the first 16 or so characters you are expecting in the file. Also, please change your 'r' option to 'rt' so that you are working with text instead of binary.
Please also execute this and indicate the output:
fid = fopen('arabe.txt','r');
dec2hex(0 + fread(fid, 32, '*uint8'))
fclose(fid);
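The byte dump requested above can be tried on a throwaway file first. This is only a sketch: the temporary file name and the five sample bytes are assumptions chosen for illustration, not the contents of the real arabe.txt.

```matlab
% Write a small sample file so the sketch runs on its own
% (hypothetical temporary file, not the real arabe.txt).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, uint8([239 187 191 216 167]), 'uint8');   % EF BB BF D8 A7
fclose(fid);

% Read up to 32 raw bytes. The dec2hex line has no trailing
% semicolon, so the result is printed.
fid = fopen(fname, 'r');
bytes = fread(fid, 32, '*uint8');
fclose(fid);
hexBytes = dec2hex(double(bytes))

delete(fname);
```

Each row of hexBytes is one byte of the file, in file order.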
Walter Roberson on 28 Apr 2011
If that is the entire output, then your file is only 5 bytes long. I need a longer sample than that to debug this problem.
I also still need the first line of the output of dec2hex(0 + doc{1}{1}), and the first few Unicode code points of what you are expecting. Unfortunately this forum is not able to support posting Arabic directly, so you will have to look up the characters in the Wikipedia article I referenced and write them down manually.


More Answers (6)

Walter Roberson on 28 Apr 2011

1 vote

If I am correct about the file having been double-encoded, then:
fid = fopen('arabe.txt','r');
inputtext = char(native2unicode(fread(fid)));
fclose(fid)

Comments: 21

najmaf najma on 28 Apr 2011
This code gives me this result:
{
fid = fopen('arabe.txt','r');
inputtext = char(native2unicode(fread(fid)));
fclose(fid)
ans =
0
>> inputtext
inputtext =
ï
»
¿
Ø
§
Ø
§
Ø
.....
}
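The characters in this output are what UTF-8 bytes look like when each byte is shown as one Latin-1 character, and they appear one per line because fread returns a column vector. A small sketch of the effect, assuming the file starts with the bytes EF BB BF D8 A7 (a UTF-8 byte order mark followed by one two-byte character):

```matlab
bytes = uint8([239 187 191 216 167]);           % EF BB BF D8 A7

% Decoding byte-for-byte as ISO-8859-1 gives one character per byte,
% which is the kind of garbage shown in the output above.
asLatin1 = native2unicode(bytes, 'ISO-8859-1');
disp(double(asLatin1))

% Skipping the three BOM bytes and decoding the rest as UTF-8
% yields a single character.
asUtf8 = native2unicode(bytes(4:end), 'UTF-8');
disp(dec2hex(double(asUtf8)))
```

Which encoding name is right for a given file still has to be confirmed from a byte dump of that file.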
Walter Roberson on 29 Apr 2011
Then I need more of the file to go on. You can find my email address on my user profile by clicking on my name.
najmaf najma on 30 Apr 2011
I sent an Arabic document to your email address.
Please send me a confirmation of receipt.
Thank you.
Walter Roberson on 30 Apr 2011
Received. I'm looking at it now.
Walter Roberson on 30 Apr 2011
fid = fopen('arabe.txt','r');
inputtext = native2unicode(fread(fid,'*uint8'),'UTF-16') .';
fclose(fid);
The text can then be seen by looking at inputtext.
Note: you must be using a font that supports Arabic, such as Arial Regular.
Note: if applicable, your terminal must be set to decode UTF-8. For example, my terminal was set to interpret ISO-LATIN-1 by default and the characters did not come out right.
With the system I am using at the moment, the terminal automatically detected that the characters were Arabic and wrote them right to left.
I do not have a Windows system with MATLAB to test this out on; I am using a Linux-64 MATLAB displaying to Mac OS X.
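A variant of the read above with the encoding held in a variable and a leading byte order mark removed before decoding. This is a sketch under assumptions: the sample file, its contents, and the choice of 'UTF-8' are illustrative, and the right encoding name depends on how the real file was saved.

```matlab
% Hypothetical values; adjust the file name and encoding to your file.
fname = [tempname '.txt'];
encoding = 'UTF-8';

% Create a sample UTF-8 file with a BOM so the sketch runs on its own.
fid = fopen(fname, 'w');
fwrite(fid, uint8([239 187 191 216 167 217 132]), 'uint8');
fclose(fid);

% Read all bytes; fread returns a column, so transpose to a row.
fid = fopen(fname, 'r');
raw = fread(fid, inf, '*uint8').';
fclose(fid);

% Drop a leading UTF-8 byte order mark (EF BB BF) if present.
if numel(raw) >= 3 && isequal(raw(1:3), uint8([239 187 191]))
    raw = raw(4:end);
end

% Decode the remaining bytes into a row of characters.
inputtext = native2unicode(raw, encoding);
delete(fname);
```

Working on a row vector keeps inputtext printable on one line and lets inputtext(1) index the first character directly.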
najmaf najma on 1 May 2011
I sent an image showing the result of the code you suggested to your email address.
Walter Roberson on 1 May 2011
I looked at the image you sent. I cannot tell from that image which font you have used.
najmaf najma on 2 May 2011
I changed the encoding to ISO-8859-1, and I used
{fid = fopen('arabe.txt','r');
inputtext = native2unicode(fread(fid,'*uint8'),'UTF-8') .';
fclose(fid);
}
with UTF-8 instead of UTF-16, and I managed to read the file.
The problem I have now is going through this file:
when I enter inputtext or inputtext(1), it gives me nothing (empty).
najmaf najma on 3 May 2011
I sent an image of the code to your email address.
Walter Roberson on 3 May 2011
Please send a copy of the file with the changed encoding.
I do not have MATLAB for Windows, so I am not able to check using the same setup you are using.
najmaf najma on 3 May 2011
I sent the file to your email address.
Thank you very much for your help.
Walter Roberson on 3 May 2011
The command you used, slCharacterEncoding, is for Simulink; without Simulink, the technique is to exit MATLAB, change the encoding, and restart MATLAB.
http://www.mathworks.com/support/solutions/en/data/1-4TKQUB/index.html?solution=1-4TKQUB
Which locale are you normally in?
najmaf najma on 5 May 2011
I don't understand your question.
Walter Roberson on 5 May 2011
http://www.mathworks.com/help/techdoc/matlab_env/brj_w4w-2.html
najmaf najma on 5 May 2011
Thank you very much. I solved the problem.
I have another question: I have a Java class and I need to call it from MATLAB.
Do you have any idea how to do that?
Walter Roberson on 5 May 2011
Please start a new Question for that topic.
Also, I think people would appreciate it if you could post the solution you came up with for this one.
najmaf najma on 6 May 2011
Yes, that's true, I agree.
I solved the problem by changing the system settings as described here:
http://www.mathworks.com/help/techdoc/matlab_env/brj_w4w-2.html
Thank you very much for your help.
Walter Roberson on 6 May 2011
Which variable did you end up having to change, and what did you change it from and what did you change it to?
najmaf najma on 6 May 2011
I use the following code:
{
fid=fopen('arabe.txt','rt');
inputtext = native2unicode(fread(fid,'*uint8'),'UTF-8') .';
fclose(fid);
i=textscan(inputtext,'%s');
}
I sent an image showing the system setting change to your email address.
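Once the text is decoded, the words returned by textscan can be walked through one at a time. A minimal sketch, using a stand-in string built from code points in place of the decoded file contents; the variable names parts and words are illustrative.

```matlab
% Stand-in for the decoded file contents: two words separated by a space.
inputtext = char([1575 1604 32 1575]);

% textscan accepts a char vector as well as a file identifier.
parts = textscan(inputtext, '%s');
words = parts{1};                  % N-by-1 cell array of char vectors

% Visit each word; words{k} is a char vector, words{k}(1) its first character.
for k = 1:numel(words)
    fprintf('word %d has %d characters\n', k, numel(words{k}));
end
```

Note the two levels of indexing: parts{1} is the cell array of words, and words{k} is one word.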
Walter Roberson on 6 May 2011
It appears that najmaf changed the Windows Regional Language settings.
najmaf najma on 7 May 2011
Exactly.
When I change the Format setting to Arabic, the text is displayed.


najmaf najma on 28 Apr 2011

0 votes

I read the file with:
fid=fopen('arabe.txt','r','n','UTF-8');
doc=textscan(fid,'%s');
fclose(fid);
doc{1}
The result is:
'ااض'
'اي'
'يمضحث'
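When the encoding is passed to fopen as above, the fourth output of fopen(fid) reports which encoding the file identifier is actually using. A small sketch on a throwaway file; the temporary file name is an assumption.

```matlab
% Create an empty sample file (hypothetical temporary name).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fclose(fid);

% Open with an explicit encoding, then ask fopen what it is using.
fid = fopen(fname, 'r', 'n', 'UTF-8');
[~, ~, ~, enc] = fopen(fid);       % fourth output is the encoding name
fclose(fid);
disp(enc)

delete(fname);
```

If the reported name is not the one expected, the text functions will decode the bytes differently from what was intended.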
najmaf najma on 28 Apr 2011

0 votes

No, it's not working. I tried encodings other than 'UTF-8', but it still doesn't work. I'm really stuck at this point.
najmaf najma on 28 Apr 2011

0 votes

I used:
fid = fopen('arabe.txt','rt');
dec2hex(0 + fread(fid, 32, '*uint8'))
fclose(fid);
The result is:
{ ans =
EF BB BF D8 A7 }
This is the content of the arabe file: ??? ?? ????? ??? ???? ???? ???? ??? ???? ?????
Thank you.

Comments: 1

Walter Roberson on 28 Apr 2011
I needed you to use
fid = fopen('arabe.txt','r');
dec2hex(0 + fread(fid, 32, '*uint8'))
fclose(fid);
You used 'rt' instead. I don't know if that makes a difference.


najmaf najma on 28 Apr 2011

0 votes

Sorry, I wanted to send you the contents of my Arabic file, but it's not working.
najmaf najma on 28 Apr 2011

0 votes

You can use any document to test. As for the result, I sent you what it gives me: it gives me hexadecimal characters.

Comments: 2

Walter Roberson on 28 Apr 2011
Yes, and I need to see _what_ those hexadecimal values are.
Wait -- is the first character of the file 0x0627, 'alif'? If so, then the file appears to be a UTF-8 encoding of a UTF-16 byte stream. The file appears to have been encoded twice!
najmaf najma on 28 Apr 2011
Exactly, the first character is the 'alif'.
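One way to test the double-encoding idea is to encode the expected first character once as UTF-8 and compare the result with the bytes that follow the byte order mark in the dump. A sketch, assuming the expected character is alif (U+0627) as confirmed above:

```matlab
% Expected first character of the file: alif, U+0627.
alif = char(hex2dec('0627'));

% Encode it once as UTF-8 and show the bytes in hex.
utf8Bytes = unicode2native(alif, 'UTF-8');
hexBytes = dec2hex(double(utf8Bytes))
```

If these match the bytes after EF BB BF in the dump reported in this thread (D8 A7), then a single UTF-8 decode should be enough for that character.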
