
Parsing a Binary File (NOT using fread)

Paul Huter on 9 Nov 2012
I am trying to put together a generic binary parsing tool over the weekend that I can adapt on Monday at work to the specifics of a particular binary file. (If someone could explain why there are different "types" of binary files, I would appreciate it.) It is my understanding that parsing a binary file is faster than parsing plaintext, so to speed up my program I would like to parse the binary file directly, instead of first converting it to plaintext and parsing that, as I do now. I know about fread, but that seems to read binary by converting it to plaintext, which is not what I want to do.
Any help with parsing a binary file (i.e. looking for specific values or phrases, which I will likely have to convert from plaintext to binary) will be much appreciated.
2 Comments
Walter Roberson on 9 Nov 2012
What do you mean by "plaintext" in this situation?
Image Analyst on 9 Nov 2012
Edited: Image Analyst on 9 Nov 2012
What have you got against fread?
And do you know about endianness? http://en.wikipedia.org/wiki/Endian
An example of what you're starting with and what you want to end up with would help. Just one line or so.


Accepted Answer

Jan on 10 Nov 2012
The term "conversion from plaintext to binary" is unclear. fread() does not convert binary to plaintext.
The main difference is that "text" files store values as characters, e.g. in ASCII encoding:
3.14159265358979
These are 16 characters to store pi. Parsing this string and converting it to a double requires about 15 multiplications by 10, and an expensive power operation might also be needed for strings like '3.14e-15'.
In contrast, storing pi in binary format as a double uses 8 bytes:
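To make the cost of text parsing concrete, here is a minimal hand-rolled sketch of such a conversion. The variable names are illustrative only, and this is not how MATLAB's own str2double is implemented; it merely counts the multiplications by 10 spent on the digits.

```matlab
% Naive decimal parser, for illustration only (use str2double in practice).
str = '3.14159265358979';
value = 0;          % digits accumulated as one integer
scale = 1;          % power of 10 to divide by at the end
seenPoint = false;  % true once the decimal point has been passed
nMul = 0;           % number of "times 10" steps spent on digits
for k = 1:numel(str)
    c = str(k);
    if c == '.'
        seenPoint = true;
    else
        value = value * 10 + (c - '0');   % one multiplication per digit
        nMul = nMul + 1;
        if seenPoint
            scale = scale * 10;           % extra work for fractional digits
        end
    end
end
value = value / scale;
fprintf('%d multiplications by 10, result %.14f\n', nMul, value);
```

Every digit costs at least one multiplication and one addition, and an exponent such as e-15 would need further scaling on top of that.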
-DTû! @
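This looks strange, but it is just the stream of bytes [24, 45, 68, 84, 251, 33, 9, 64] displayed as characters (in little-endian byte order; bytes 24 and 9 are control characters and therefore not visible). These bytes can be copied directly into memory, and no further arithmetic is needed.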
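A short sketch of this reinterpretation in MATLAB, assuming a little-endian machine (the usual case on x86 hardware). typecast changes only how the bytes are interpreted and performs no numeric conversion:

```matlab
% The 8 bytes from above, assumed to be in little-endian order.
bytes = uint8([24 45 68 84 251 33 9 64]);

% Reinterpret the same 8 bytes as one double: no parsing, no arithmetic.
value = typecast(bytes, 'double');

% The reverse direction: the raw bytes that represent pi in memory.
back = typecast(pi, 'uint8');

fprintf('%.15g\n', value);
disp(back);
```

On a big-endian machine the byte order would be reversed, which is why the byte order of a binary file has to be known in advance.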
Binary files can have different types. For example, the byte sequence [24, 45, 68, 84, 251, 33, 9, 64] from above can be one double value, but it could just as well be two single values, [3.370281e+012, 2.142699], because a single uses 4 bytes per element. The bytes themselves do not reveal which is meant, so the software has to know the type of each variable stored in a binary file.
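The ambiguity can be reproduced with typecast. This is a sketch that again assumes a little-endian machine; the same 8 bytes yield different numbers depending on the type requested:

```matlab
bytes = uint8([24 45 68 84 251 33 9 64]);

asDouble = typecast(bytes, 'double');   % 1 element of 8 bytes
asSingle = typecast(bytes, 'single');   % 2 elements of 4 bytes each
asInt32  = typecast(bytes, 'int32');    % 2 elements, read as integers
asUint16 = typecast(bytes, 'uint16');   % 4 elements of 2 bytes each

fprintf('double: %.15g\n', asDouble);
fprintf('single: %e  %f\n', asSingle(1), asSingle(2));
```

Nothing in the byte stream marks where one value ends and the next begins, so a parser needs the file's format description (types, counts, byte order) to pick the right interpretation.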
I suggest making a copy of a JPEG file, which is an example of a binary format. Change the extension of the copy to ".txt" and open it in the editor. This does not convert anything: the byte sequences stored in the file are simply interpreted as characters now, while the same contents are handled differently when the computer assumes that they are a JPEG-encoded picture. Finally, you can change the extension to ".mp3", another example of a binary format. Of course your player will fail, because the file contents are malformed for that format.
Is the difference between "binary" and "text" clearer now?
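The same experiment can be sketched in code. The file written here is a made-up example (a hypothetical 3-byte 'HDR' marker followed by one double), not a real format. It shows that fread with the '*uint8' precision returns the raw bytes unchanged, and that those bytes can be viewed as characters to search for a marker or reinterpreted as numbers:

```matlab
% Write a small scratch file: a text-like marker plus 8 binary bytes.
fname = tempname;                      % temporary file name
fid = fopen(fname, 'w');
fwrite(fid, uint8('HDR'), 'uint8');    % hypothetical marker, 3 bytes
fwrite(fid, pi, 'double');             % 8 bytes in native byte order
fclose(fid);

% Read everything back as raw bytes; no conversion takes place.
fid = fopen(fname, 'r');
raw = fread(fid, Inf, '*uint8')';      % row vector of uint8
fclose(fid);
delete(fname);

asText = char(raw);                    % same bytes, viewed as characters
pos = strfind(asText, 'HDR');          % locate the marker
payload = raw(pos + 3 : pos + 10);     % the 8 bytes following the marker
value = typecast(payload, 'double');   % reinterpret them as a double
```

Searching for a value or phrase in a binary file therefore comes down to knowing its byte representation and looking for that byte sequence.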

More Answers (0)
