How to store sequence of bits as a bit stream and use the least possible memory ?

조회 수: 9 (최근 30일)
Teo Protoulis
Teo Protoulis 2020년 1월 10일
편집: Matt J 2020년 1월 10일
I am implementing a huffman encoding procedure which for example encodes the decimal number -5 as '100010'. Until now I have implemented some code which correctly produces the encoded sequence. The encoded sequence is stored as a string. However, I want the bit sequence to be stored as a bit stream and use only the neccesary amount of space for storage. For example, I want the encoded sequence for -5, which is '100010', to take 6 bits of memory. Is that possible ?
  댓글 수: 1
Guillaume
Guillaume 2020년 1월 10일
Note that in all modern processors and certainly all that run matlab (x86/x64) the minimum unit of memory is 8 bits and everything is in multiple of 8.

댓글을 달려면 로그인하십시오.

답변 (3개)

Matt J
Matt J 2020년 1월 10일
편집: Matt J 2020년 1월 10일
You could chain all your strings together as a logical column vector and use bwpack, see
Example:
>> A='10001010001010001011010100011000100010100010';
>> B=bwpack((A-'0').');
>> whos A B
Name Size Bytes Class Attributes
A 1x44 88 char
B 2x1 8 uint32

Stephane Dauvillier
Stephane Dauvillier 2020년 1월 10일
You can use the functions:
  1. dec2bin to tranform decimal number into binary,
  2. string to transform the output of the dec2bin from char to string
  3. regexprep to replace the first 0 by empty
  4. strjoin to join everything
For instance:
strjoin(regexprep(string(dec2bin([66,8,19,7])),"^0+","","once"),"")

Guillaume
Guillaume 2020년 1월 10일
A bit similar to Matt's suggestion, but may be slightly more memory efficient since bwpack uses multiples of 32-bit, is to:
  • split your stream into block of 8-bits
  • pad the last block to 8 bits if shorter than 8.
  • convert each block to uint8, the smallest unit of storage on x86/x64
eg.
bitstream = char(randi(double('01'), 1, 19)) %random stream of 19 0-1 characters
encoded = bitstream - '0'; %convert to binary vector. Uses 4 times as much memory as bitstream
encoded = [encoded, zeros(1, mod(-numel(bitstream), 8))]; %pad with zeros at the to multiple of 8
encoded = uint8(sum(reshape(encoded, 8, []) .* 2.^(7:-1:0)')) %convert each sequence of 8 bits into 8-bit integer. 1st bit is MSB
You can easily convert back to the (padded) bitstream:
padded_origstream = reshape(dec2bin(encoded, 8)', 1, [])

카테고리

Help CenterFile Exchange에서 Large Files and Big Data에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by