Hello there, I have some trajectory data and am hoping to use a Markov model to make predictions. My next step is to convert each location point to a corresponding state, but I have no idea how to create a transition probability matrix from such a large data set (>10000). The following is the current form of my data struct. I would appreciate it if somebody could give some advice. Thank you.

create transition probability matrix from massive data

Walter Roberson on 30 Jul 2019

Where in that data structure are the states that you need to create the statistics for? Is the cluster number the state number? Are states 2 and 3 both terminal states, and is that why they have only 1 row and flag 0? If so, I would expect five clusters, not four: the five items labeled cluster 1 would indicate state 1 transitioning to state 1, to state 2, to state 3, to state 4, and to a state 5 that for some reason is not listed as a cluster number.

Manpeng He on 30 Jul 2019

The flag is used only for the clustering; it has no meaning after that. The cluster number shows which cluster a sequence of data belongs to. Each (x,y) pair will be converted to a state. A sequence of (x,y) pairs is then a sequence of states, and I wish to create a transition matrix based on the data in the same cluster.

Walter Roberson on 30 Jul 2019

Edited: Walter Roberson on 30 Jul 2019

So are there 5 states or 4? Are states 2 and 3 terminal, or do they always transition to state 1?

I do not understand what the x y pairs represent?

Walter Roberson on 30 Jul 2019

I also do not see where the 10000 comes from. I see at most 50*10+6+3 entries.

Manpeng He on 30 Jul 2019

x and y represent longitude and latitude, respectively. There are 16851 sequences of x y pairs of different lengths, and I picked the first 50 pairs from each sequence to run a test. As for the states, for example, (x1,y1) - (x2,y2) - (x3,y3) - (x4,y4) will become 2 - 4 - 3 - 1, showing the path taken. Sequences with the same cluster number have similar patterns. I am sorry if my earlier description was unclear.
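If each trajectory really is reduced to a sequence of state numbers such as 2 - 4 - 3 - 1, one common way to estimate a first-order transition matrix is to count consecutive state pairs and divide each row by its total. The sketch below is only an illustration under that assumption: the function name `transition_matrix`, the 1-based integer states, and the three sample sequences are made up for the example and are not taken from the data in this thread.

```python
def transition_matrix(sequences, n_states):
    """Estimate a first-order transition matrix from state sequences.

    Assumes states are integers 1..n_states (hypothetical encoding).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        # Count every consecutive pair (current state -> next state).
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur - 1][nxt - 1] += 1

    probs = []
    for row in counts:
        total = sum(row)
        # A state with no observed outgoing transition keeps an all-zero row;
        # how to treat such rows is a modelling choice, not settled here.
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Made-up sequences for one cluster, in the style of the 2 - 4 - 3 - 1 example.
sequences = [[2, 4, 3, 1], [2, 4, 1], [2, 3, 1]]
P = transition_matrix(sequences, n_states=4)
for row in P:
    print(row)
```

To get one matrix per cluster, the same function could be called once for each group of sequences that share a cluster number. Transitions are counted only within a sequence, never across the boundary between two sequences.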

Walter Roberson on 30 Jul 2019

What I am getting out of what you are saying is that you have a list of lat long pairs that are gathered together in the first group associated with state 1, and that when any given lat/long that is currently associated with state 1 is looked up then if it is found in the first subgroup then it is to remain in state 1 (and then because this is not probabilistic, it would get caught in state 1).

Likewise if the lat/long associated with state 1 is found in the second subgroup marked as state 1, then on the next step that pair is to be considered associated with state 2.

Then on the second step, with the pair now being associated with state 2, it would be looked up in the first (only) subgroup associated with state 2, and if it is found there then it should transition to being associated with state 1. But if it is not found there in the first subgroup, then because there are no further subgroups, that trajectory should stop??

And this process should continue for a fixed number of steps or until all of the entries have fallen out of the table??

And the end result desired is a probability matrix, as if the transitions were random, even though the model data is effectively fixed transition?

What is your intended way to handle the fact that your x y are floating point and that exact comparisons of floating point numbers are seldom a good idea?
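One possible way around exact floating-point comparison, offered here only as a sketch and not as what the original poster did, is to snap each longitude/latitude pair to a grid cell and treat the cell as the state. The cell size, the grid origin, and the names `to_cell` and `encode_states` are all assumptions chosen for illustration.

```python
import math


def to_cell(lon, lat, lon_min, lat_min, cell_size):
    """Map a (lon, lat) pair to an integer grid cell (col, row)."""
    col = math.floor((lon - lon_min) / cell_size)
    row = math.floor((lat - lat_min) / cell_size)
    return col, row


def encode_states(points, lon_min, lat_min, cell_size):
    """Turn a list of (lon, lat) points into state numbers starting at 1.

    Each distinct grid cell gets the next unused state number the first
    time it is visited, so nearby points share a state without any
    exact comparison of floating-point coordinates.
    """
    state_of_cell = {}
    states = []
    for lon, lat in points:
        cell = to_cell(lon, lat, lon_min, lat_min, cell_size)
        if cell not in state_of_cell:
            state_of_cell[cell] = len(state_of_cell) + 1
        states.append(state_of_cell[cell])
    return states, state_of_cell


# Made-up coordinates; a cell size of 0.25 degrees is an arbitrary choice.
points = [(0.12, 0.18), (0.13, 0.19), (0.57, 0.18), (0.12, 0.17)]
states, state_of_cell = encode_states(points, lon_min=0.0, lat_min=0.0, cell_size=0.25)
print(states)
```

To keep state numbers consistent across trajectories, the same `state_of_cell` mapping would have to be shared between them rather than rebuilt per sequence as in this sketch. The cell size controls the trade-off between the number of states and how much data supports each transition.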
