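Hello,
I have a binary sequence from which I build substrings of various lengths, e.g. length 4. Each substring takes 4 elements at a constant step (1, 2, 3, ...) from a given start position; the number in parentheses is the position in the sequence: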
Sequence
1(1), 0(2), 1(3), 1(4), 0(5), 0(6), 1(7), 0(8), 0(9), 1(10), 1(11), 1(12),
1(13), 0(14), 0(15), 0(16), 1(17), 1(18), 1(19), 0(20)
Substrings
01: 1(01), 0(02), 1(03), 1(04) -> [1,0,1,1],
02: 1(01), 1(03), 0(05), 1(07) -> [1,1,0,1],
03: 1(01), 1(04), 1(07), 1(10) -> [1,1,1,1],
04: 1(01), 0(05), 0(09), 1(13) -> [1,0,0,1],
05: 1(01), 0(06), 1(11), 0(16) -> [1,0,1,0],
06: 1(01), 1(07), 1(13), 1(19) -> [1,1,1,1],
07: 0(02), 1(03), 1(04), 0(05) -> [0,1,1,0],
08: 0(02), 1(04), 0(06), 0(08) -> [0,1,0,0],
09: 0(02), 0(05), 0(08), 1(11) -> [0,0,0,1],
10: 0(02), 0(06), 1(10), 0(14) -> [0,0,1,0],
11: 0(02), 1(07), 1(12), 1(17) -> [0,1,1,1],
12: 0(02), 0(08), 0(14), 0(20) -> [0,0,0,0],
13: 1(03), 1(04), 0(05), 0(06) -> [1,1,0,0],
14: 1(03), 0(05), 1(07), 0(09) -> [1,0,1,0],
15: 1(03), 0(06), 0(09), 1(12) -> [1,0,0,1],
16: 1(03), 1(07), 1(11), 0(15) -> [1,1,1,0],
17: 1(03), 0(08), 1(13), 1(18) -> [1,0,1,1],
18: 1(04), 0(05), 0(06), 1(07) -> [1,0,0,1],
19: 1(04), 0(06), 0(08), 1(10) -> [1,0,0,1],
20: 1(04), 1(07), 1(10), 1(13) -> [1,1,1,1],
21: 1(04), 0(08), 1(12), 0(16) -> [1,0,1,0],
22: 1(04), 0(09), 0(14), 1(19) -> [1,0,0,1],
23: 0(05), 0(06), 1(07), 0(08) -> [0,0,1,0],
24: 0(05), 1(07), 0(09), 1(11) -> [0,1,0,1],
25: 0(05), 0(08), 1(11), 0(14) -> [0,0,1,0],
26: 0(05), 0(09), 1(13), 1(17) -> [0,0,1,1],
27: 0(05), 1(10), 0(15), 0(20) -> [0,1,0,0],
28: 0(06), 1(07), 0(08), 0(09) -> [0,1,0,0],
29: 0(06), 0(08), 1(10), 1(12) -> [0,0,1,1],
30: 0(06), 0(09), 1(12), 0(15) -> [0,0,1,0],
31: 0(06), 1(10), 0(14), 1(18) -> [0,1,0,1],
32: 1(07), 0(08), 0(09), 1(10) -> [1,0,0,1],
33: 1(07), 0(09), 1(11), 1(13) -> [1,0,1,1],
34: 1(07), 1(10), 1(13), 0(16) -> [1,1,1,0],
35: 1(07), 1(11), 0(15), 1(19) -> [1,1,0,1],
36: 0(08), 0(09), 1(10), 1(11) -> [0,0,1,1],
37: 0(08), 1(10), 1(12), 0(14) -> [0,1,1,0],
38: 0(08), 1(11), 0(14), 1(17) -> [0,1,0,1],
39: 0(08), 1(12), 0(16), 0(20) -> [0,1,0,0],
40: 0(09), 1(10), 1(11), 1(12) -> [0,1,1,1],
41: 0(09), 1(11), 1(13), 0(15) -> [0,1,1,0],
42: 0(09), 1(12), 0(15), 1(18) -> [0,1,0,1],
43: 1(10), 1(11), 1(12), 1(13) -> [1,1,1,1],
44: 1(10), 1(12), 0(14), 0(16) -> [1,1,0,0],
45: 1(10), 1(13), 0(16), 1(19) -> [1,1,0,1],
46: 1(11), 1(12), 1(13), 0(14) -> [1,1,1,0],
47: 1(11), 1(13), 0(15), 1(17) -> [1,1,0,1],
48: 1(11), 0(14), 1(17), 0(20) -> [1,0,1,0],
49: 1(12), 1(13), 0(14), 0(15) -> [1,1,0,0],
50: 1(12), 0(14), 0(16), 1(18) -> [1,0,0,1],
51: 1(13), 0(14), 0(15), 0(16) -> [1,0,0,0],
52: 1(13), 0(15), 1(17), 1(19) -> [1,0,1,1],
53: 0(14), 0(15), 0(16), 1(17) -> [0,0,0,1],
54: 0(14), 0(16), 1(18), 0(20) -> [0,0,1,0],
55: 0(15), 0(16), 1(17), 1(18) -> [0,0,1,1],
56: 0(16), 1(17), 1(18), 1(19) -> [0,1,1,1],
57: 1(17), 1(18), 1(19), 0(20) -> [1,1,1,0],
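I do this with the following code:
% s is the binary sequence (row vector) of length N
N = 20;                               % sequence length
n = 4;                                % substring length
A = hankel(1:N-n+1,N-n+1:N);          % row jj holds the positions jj:jj+n-1
k = 0:n-1;                            % per-column increment for each larger step
c = ceil((N - A(:,end) + 1)/k(end));  % number of substrings per start position
i2 = cumsum(c);                       % last row of idx for each start position
i1 = i2 - c + 1;                      % first row of idx for each start position
idx = zeros(i2(end),n);
for jj = 1:N-n+1
    idx(i1(jj):i2(jj),:) = bsxfun(@plus,A(jj,:),(0:c(jj)-1)'*k);
end
[j1,~,j2] = unique(s(idx),'rows');    % j1: distinct patterns, j2: pattern index per substring
out = [j1, histc(j2,1:max(j2))/i2(end)]; % This row corrected
At the end I get the number of times each pattern occurs and its relative frequency: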
Pattern    Count     Relative frequency
0 0 0 0    161697    0,0606515378844711
0 0 0 1    163593    0,0613627156789197
0 0 1 0    164201    0,0615907726931733
0 0 1 1    166680    0,0625206301575394
0 1 0 0    164105    0,0615547636909227
0 1 0 1    166501    0,0624534883720930
0 1 1 0    167099    0,0626777944486122
0 1 1 1    168835    0,0633289572393098
1 0 0 0    164086    0,0615476369092273
1 0 0 1    166963    0,0626267816954239
1 0 1 0    166931    0,0626147786946737
1 0 1 1    169470    0,0635671417854464
1 1 0 0    166622    0,0624988747186797
1 1 0 1    169326    0,0635131282820705
1 1 1 0    169251    0,0634849962490623
1 1 1 1    170640    0,0640060015003751
The problem is that this approach can only handle about 4000 data points, and I need to process many more. I have 4 GB of RAM and MATLAB 2012. My idea is to assign each pattern an integer:
Pattern    Integer
0 0 0 0    1
0 0 0 1    2
0 0 1 0    3
0 0 1 1    4
0 1 0 0    5
0 1 0 1    6
0 1 1 0    7
0 1 1 1    8
1 0 0 0    9
1 0 0 1    10
1 0 1 0    11
1 0 1 1    12
1 1 0 0    13
1 1 0 1    14
1 1 1 0    15
1 1 1 1    16
and keep a counter of how many times each integer occurs. That way I might be able to process much more data. Thank you very much.

Answers (1)

Walter Roberson, 25 Oct 2013

If you are going to do that, consider using accumarray() to do the additions.
If B is the array of bits, such as
B = [0 0 0 0; 1 0 0 0; 0 1 0 0; 1 0 0 0]
then
counts = accumarray( B(:,1) * 8 + B(:,2) * 4 + B(:,3) * 2 + B(:,4) * 1 + 1, 1 );
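As a minimal sketch of the call above (the example rows of B are the ones from this answer; the names code and relfreq are only illustrative), passing a size as the third argument of accumarray() makes the result always have 16 entries, so patterns that never occur get a count of 0:

```matlab
% Example rows from the answer: each row of B is one 4-bit pattern.
B = [0 0 0 0; 1 0 0 0; 0 1 0 0; 1 0 0 0];

% Integer code 1..16 for each row (binary value of the row, plus 1).
code = B(:,1)*8 + B(:,2)*4 + B(:,3)*2 + B(:,4)*1 + 1;

% The third argument fixes the output size to 16-by-1.
counts = accumarray(code, 1, [16 1]);

% Relative frequency of each pattern.
relfreq = counts / sum(counts);
```

Without the size argument the result is only as long as the largest code that actually occurs, which makes results from different data sets harder to compare.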

Comments: 16

FRANCISCO, 25 Oct 2013
I do not quite understand what you mean. I am trying to map each of the 16 patterns to an integer so that I can process more data, and then compute the counts and relative frequencies from those integers. But I find it hard to create the substrings as integers in the established order.
You have some existing logic that can figure out the 1 0 0 0 part of your
1 0 0 0------ 164086-- 0,0615476369092273
line, for each combination you are trying to process. Convert that existing logic slightly to produce a row-oriented matrix (Samples by 4) of these decoded values. The accumarray() call that I showed will then convert the 4 bits into an integer subscript and accumarray() will do the totaling for you.
The result will be a vector of (probably) 16 elements, one count per element. The corresponding bit patterns are the binary representations of (the index minus 1): [0 0 0 0] for the first vector entry, [0 0 0 1] for the second vector entry, and so on.
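The index-to-pattern correspondence described above can be sketched as follows. This is only an illustration: the rows of B are the example from the answer, and the names patterns and result are assumptions, not part of the original code.

```matlab
n = 4;                                     % pattern length
B = [0 0 0 0; 1 0 0 0; 0 1 0 0; 1 0 0 0];  % example rows from the answer
counts = accumarray(B * (2.^(n-1:-1:0)).' + 1, 1, [2^n 1]);

% Row i of patterns is the n-bit binary representation of i-1.
patterns = dec2bin(0:2^n-1, n) - '0';

% One row per pattern: the n bits followed by the count.
result = [patterns, counts];
```

Building the pattern list this way avoids calling unique() on the full matrix of substrings.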
FRANCISCO, 27 Oct 2013
Edited: Walter Roberson, 27 Oct 2013
I tried it, but I want to verify that I understood correctly.
I have a long sequence of 1s and 0s, about 171000 values. Using the following code:
counts = accumarray(B(:,1)*8 + B(:,2)*4 + B(:,3)*2 + B(:,4)*1 + 1, 1);
I get the number of times each pattern occurs, where each pattern is represented by an integer.
If I wanted to build substrings of length 5, convert them to integers and count how often each one occurs, what would the expression above look like?
Thank you very much.
B(:,1) * 16 + B(:,2) * 8 + B(:,3) * 4 + B(:,4) * 2 + B(:,5) * 1 + 1
Notice the pattern, [8 4 2 1]: the weights are the powers of 2 from 2^(N-1) down to 2^0. You can calculate that pattern for substrings of length N, and do not need to represent it explicitly:
B * (2.^fliplr(0:N-1)).' + 1
Note: that is * and not .* because it is a matrix multiplication.
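A small sketch of the general form, under the assumption that the weights for length N must run from 2^(N-1) down to 2^0 (so that N = 4 gives [8 4 2 1] and N = 5 gives [16 8 4 2 1]). The rows of B here are made up for illustration:

```matlab
N = 5;                                   % substring length
B = [0 0 0 0 1; 1 0 1 1 0; 1 1 1 1 1];   % hypothetical rows, one substring each

w = 2.^(N-1:-1:0);                       % weights, most significant bit first
code = B * w.' + 1;                      % matrix product: one integer in 1..2^N per row
counts = accumarray(code, 1, [2^N 1]);   % one count per possible pattern
```

Because the codes stay within 1..2^N, the counts vector never needs more than 2^N entries, regardless of how many substrings are processed.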
Okay, I'm beginning to understand. For example, I have the following binary sequence:
s = [1 0 1 1 0 0 1 0 1 0 0 0];
and I want to count how many times each pattern of length 4 occurs in that sequence, using the numbering established at the beginning of the thread. For this I use:
counts = accumarray(s(:,1)*8 + s(:,2)*4 + s(:,3)*2 + s(:,4)*1 + 1, 1);
Here I would count:
1-0000
2-0001
.
.
16-1111.
To get the count of substrings of length 5 I apply:
s * (2.^fliplr(0:N-1)).' + 1
and to get the count for length 6:
s * (2.^fliplr(0:N-1)).' + 1
Walter Roberson, 27 Oct 2013
The s * (2.^fliplr(0:N-1)).' + 1 form can be used for N = 4 as well.
FRANCISCO, 28 Oct 2013
I arrived at the solution. What should I do so that, from a sequence of binary numbers such as
s = [0 1 0 0 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0];
I count the number of occurrences of the substrings of length 4, 5, 6, ..., 20? Because of the amount of data, I want to count the patterns only and not store the substrings; otherwise a total of 171000 values cannot be processed.
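One possible way to count without storing the substrings is sketched below. It assumes the substring scheme from the start of the thread (n elements at a constant step d from each start position) and uses the 20-element example sequence from there so the result can be checked against the 57 listed substrings. The loop structure and variable names are illustrative, not the original poster's code.

```matlab
s = [1 0 1 1 0 0 1 0 0 1 1 1 1 0 0 0 1 1 1 0];  % example sequence from the question
N = numel(s);
n = 4;                                % substring length
counts = zeros(2^n, 1);               % one counter per possible pattern

for d = 1:floor((N-1)/(n-1))          % step between the picked elements
    m = N - (n-1)*d;                  % number of valid start positions for this step
    code = zeros(1, m);
    for p = 0:n-1                     % append one bit at a time, most significant first
        code = 2*code + s((1:m) + p*d);
    end
    counts = counts + accumarray(code.' + 1, 1, [2^n 1]);
end

relfreq = counts / sum(counts);       % relative frequency of each pattern
```

Only one vector of at most N codes exists at any time, so memory use grows with N rather than with the total number of substrings; the price is a loop over all steps, which may be slow for long sequences.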
FRANCISCO, 28 Oct 2013
Any solution?
Walter Roberson, 28 Oct 2013
Edited: Walter Roberson, 28 Oct 2013
accumarray( (s(1:4:end) * 8 + s(2:4:end) * 4 + s(3:4:end) * 2 + s(4:4:end) * 1 + 1) .', 1)
FRANCISCO, 28 Oct 2013
I just applied it, but I do not get the same result as with the previous code. I think processing these 171000 values is impossible.
Walter Roberson, 28 Oct 2013
In your original code, how do you handle the boundary cases at the end, such as when there are only 3 bits left?
If you could upload a .txt file with your bit pattern, I will run it through a couple of different counting methods and see if I get agreement.
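For the non-overlapping blocks used in the accumarray() expression above, one way to handle the boundary case is simply to drop the incomplete block at the end. The sketch below assumes that choice, which is not necessarily what the original code does, and uses the 19-bit sequence posted earlier:

```matlab
s = [0 1 0 0 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0];  % 19 bits, from the earlier comment
n = 4;                                  % block length

nBlocks = floor(numel(s)/n);            % number of complete blocks
B = reshape(s(1:nBlocks*n), n, []).';   % one block per row; leftover bits are dropped

code = B * (2.^(n-1:-1:0)).' + 1;       % integer in 1..2^n for each block
counts = accumarray(code, 1, [2^n 1]);
```

Without the trimming step, the strided slices s(1:4:end), s(2:4:end) and so on have different lengths whenever numel(s) is not a multiple of 4, and adding them fails.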
FRANCISCO, 28 Oct 2013
I will send two. The binary sequences are approximately 200,000 values long. I will post one in the form s = [0 1 0 0 1 ....] and another in the form s = s'. From these I have to create the substrings of length 4, 5, 6, ..., 20 and count the patterns. The problem is that not all the data can be processed because of insufficient memory. If I could treat the data differently, I would not need to store the substrings, only the pattern counts. Thank you very much.
FRANCISCO, 29 Oct 2013
Any solution? Many thanks.
Walter Roberson, 29 Oct 2013
Sorry, I have been busy, and now I need to go sleep.
FRANCISCO, 29 Oct 2013
I tried several ways, but it is impossible. Maybe I should use C#.
FRANCISCO, 29 Oct 2013
Walter, do you know C#? I have the code in C#, but I would like to build it in MATLAB and I do not know if that is possible.

Asked: 25 Oct 2013
Commented: 29 Oct 2013
