Generating two classes of points

조회 수: 2 (최근 30일)
Deepayan Bhadra
Deepayan Bhadra 2018년 2월 20일
댓글: Deepayan Bhadra 2018년 2월 20일
Hi,
I want to create a 100x2 feature vector set, with roughly half in class '1' and the rest in class '-1'. (As in, when I make a scatter plot of the x-y coordinates, the points should be almost linearly separable). I suppose rand or randn wouldn't help much here.
Thanks for your inputs.

채택된 답변

John D'Errico
John D'Errico 2018년 2월 20일
Why would rand or randn NOT be of use here?
For example, what do you know about a bivariate random sample? How far out can you expect points to deviate from the mean, if you know the variances? In the case of using randn, the default covariance matrix would be an identity matrix. So you would expect almost the entire distribution to lie within 3*sigma.
So if you adjust the mean of the two set of samples to be separate by a sufficient amount, they will be distinct. You even know how much to offset the two distributions, based on those variances. Adding a constant value to normally distributed samples just adjusts the mean. So WTP?
Likewise, had you used rand, it produces uniformly distributed samples. In 2-dimensions, they will thus live in a unit square. Again, WTP? add a constant to one set.
  댓글 수: 1
Deepayan Bhadra
Deepayan Bhadra 2018년 2월 20일
The idea is that the 100x2 matrix I am manually generating will reflect a real-world data set. Some of these 100 points will lie on one side of a line and the rest on the other. The problem will assign to each point a 'class' of +1 or -1, accordingly. Also, when a new 'point' is generated, we can classify which side of the line to assign it to. The 'random' data-generation with such strict separation is the focal issue here.

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Image Analyst
Image Analyst 2018년 2월 20일
No, but randperm() can help. Try this:
m = zeros(100, 2); % Initialize to all zero.
indexes = randperm(numel(m), numel(m)/2) % Get half of the indexes to change to 1.
m(indexes) = 1 % Change them to 1.
  댓글 수: 1
Deepayan Bhadra
Deepayan Bhadra 2018년 2월 20일
I ran this but it didn't really relate well. As I mentioned above, the idea is that the 100x2 matrix I am manually generating will reflect a real-world data set. Some of these 100 points will lie on one side of a line and the rest on the other.
The problem will assign to each point a 'class' of +1 or -1, accordingly. Also, when a new 'point' is generated, we can classify which side of the line to assign it to. The 'random' data-generation with such strict separation is the focal issue here.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Random Number Generation에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by