Plot Categorical Data

This example shows how to plot data from a categorical array.

Load Sample Data

Load sample data gathered from 100 patients. Display the data types and sizes of the arrays loaded from the patients MAT-file.

load patients
whos

  Name                            Size            Bytes  Class      Attributes

  Age                           100x1               800  double               
  Diastolic                     100x1               800  double               
  Gender                        100x1             11412  cell                 
  Height                        100x1               800  double               
  LastName                      100x1             11616  cell                 
  Location                      100x1             14208  cell                 
  SelfAssessedHealthStatus      100x1             11540  cell                 
  Smoker                        100x1               100  logical              
  Systolic                      100x1               800  double               
  Weight                        100x1               800  double

Create Categorical Arrays

The workspace variable Location lists three unique medical facilities where the patients were observed.

To make the data easier to access and compare, convert Location to a categorical array.

Location = categorical(Location);

Summarize the categorical array. The summary shows the number of times each category appears in Location.

summary(Location)

     County General Hospital       39 
     St. Mary's Medical Center      24 
     VA Hospital                   37

Thirty-nine patients were observed at County General Hospital, 24 at St. Mary's Medical Center, and 37 at VA Hospital.

The workspace variable SelfAssessedHealthStatus contains four unique values: Excellent, Fair, Good, and Poor.

Convert SelfAssessedHealthStatus to an ordinal categorical array whose categories have the mathematical ordering Poor < Fair < Good < Excellent.
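
As a minimal sketch of reading the same counts programmatically, the lines below use categories and countcats on the converted Location array. The variable names locNames and locCounts are illustrative and not part of the original example.

% Category names, in the same order that summary lists them
locNames = categories(Location)

% Number of patients per category, aligned with locNames
locCounts = countcats(Location)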

SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus,...
                           ["Poor","Fair","Good","Excellent"],"Ordinal",true);

Summarize the categorical array SelfAssessedHealthStatus.

summary(SelfAssessedHealthStatus)

     Poor           11 
     Fair           15 
     Good           40 
     Excellent      34
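
One way to confirm that the ordering took effect is to query the array directly. The following is a small sketch; the names lowestStatus and highestStatus are illustrative and do not appear in the original example.

% Returns logical 1 (true) when the categories have a mathematical ordering
isordinal(SelfAssessedHealthStatus)

% min and max follow the category order Poor < Fair < Good < Excellent
lowestStatus = min(SelfAssessedHealthStatus)
highestStatus = max(SelfAssessedHealthStatus)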

Plot Histogram

Create a histogram bar plot directly from SelfAssessedHealthStatus. This categorical array is ordinal. Its categories have the ordering Poor < Fair < Good < Excellent, which determines the order of the categories along the x-axis of the plot. The histogram function plots the category counts for each of the four categories.

figure
histogram(SelfAssessedHealthStatus)
title("Self Assessed Health Status From 100 Patients")

[Figure: categorical histogram titled "Self Assessed Health Status From 100 Patients"]

Create a histogram of the hospital location for only the patients who assessed their health as Fair or Poor.

figure
histogram(Location(SelfAssessedHealthStatus <= "Fair"))
title("Location of Patients in Fair or Poor Health")

[Figure: categorical histogram titled "Location of Patients in Fair or Poor Health"]
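
The bar heights in this histogram can also be obtained as numbers. The sketch below stores the logical mask in a variable first; the names isPoorOrFair and countsByLocation are assumptions made for illustration.

% Logical mask: true for patients who rated their health Poor or Fair
isPoorOrFair = SelfAssessedHealthStatus <= "Fair";

% Count those patients at each location, in the order given by categories(Location)
countsByLocation = countcats(Location(isPoorOrFair))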

Create Pie Chart

Create a pie chart directly from a categorical array.

figure
pie(SelfAssessedHealthStatus);
title("Self Assessed Health Status From 100 Patients")

The pie function accepts the categorical array SelfAssessedHealthStatus and plots a pie chart of the four categories.

Create Pareto Chart

Create a Pareto chart from the category counts for each of the four categories of SelfAssessedHealthStatus.
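
To see the share of each category as a number, the counts can be converted to percentages. This is a sketch only; the names statusCounts and statusPercent are illustrative.

% Counts per category, in category order (Poor, Fair, Good, Excellent)
statusCounts = countcats(SelfAssessedHealthStatus);

% Percentage of all patients in each category
statusPercent = 100*statusCounts/sum(statusCounts)

Because the data set holds exactly 100 patients, each percentage equals the corresponding count.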

figure
A = countcats(SelfAssessedHealthStatus);
C = categories(SelfAssessedHealthStatus);
pareto(A,C);
title("Self Assessed Health Status From 100 Patients")

[Figure: Pareto chart (bars and cumulative line) titled "Self Assessed Health Status From 100 Patients"]

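The first input argument to pareto must be a vector. If a categorical array is a matrix or multidimensional array, reshape it into a vector before calling countcats and pareto.

Create Scatter Plot

Determine whether self-assessed health status is related to blood pressure readings. Create a scatter plot of the Diastolic and Systolic readings for two groups of patients.

First, create x- and y-arrays of blood pressure readings for the two groups. The first group consists of patients who assess their own health as Poor or Fair. The second group consists of patients who assess their own health as Good or Excellent.

You can use the categorical array SelfAssessedHealthStatus to create logical indices. Use the logical indices to extract values from Diastolic and Systolic into separate arrays.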
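
The reshaping step can be sketched as follows. The 10-by-10 matrix statusMatrix is a hypothetical arrangement of the same 100 values, built only to illustrate the matrix case; it is not part of the original example.

% Hypothetical matrix form of the data (10-by-10 categorical array)
statusMatrix = reshape(SelfAssessedHealthStatus,10,10);

% Flatten to a 100-by-1 vector before counting
statusVector = statusMatrix(:);
counts = countcats(statusVector);

figure
pareto(counts,categories(statusVector))

Without the flattening step, countcats would count categories separately along each column of the matrix and return a matrix of counts rather than a vector.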

X1 = Diastolic(SelfAssessedHealthStatus <= "Fair");
Y1 = Systolic(SelfAssessedHealthStatus <= "Fair");

X2 = Diastolic(SelfAssessedHealthStatus >= "Good");
Y2 = Systolic(SelfAssessedHealthStatus >= "Good");

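X1 and Y1 are 26-by-1 numeric arrays containing data for patients whose health status is Poor or Fair.

X2 and Y2 are 74-by-1 numeric arrays containing data for patients whose health status is Good or Excellent.

Create a scatter plot of the blood pressure readings for the two groups of patients. The plot shows no meaningful difference between the groups, which might indicate that blood pressure is not related to how patients assess their own health.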
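
A quick check of these sizes, written as a sketch with the illustrative names nPoorOrFair and nGoodOrExcellent:

% Number of patients in each group
nPoorOrFair = nnz(SelfAssessedHealthStatus <= "Fair")
nGoodOrExcellent = nnz(SelfAssessedHealthStatus >= "Good")

% The two groups together should cover every patient
isequal(nPoorOrFair + nGoodOrExcellent, numel(Diastolic))

The two counts match the category totals in the summary: 11 + 15 = 26 and 40 + 34 = 74.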

figure
h1 = scatter(X1,Y1,"o");
hold on
h2 = scatter(X2,Y2,"x");

title("Blood Pressure for Groups of Patients Assessing Self Health");
xlabel("Diastolic (mm Hg)")
ylabel("Systolic (mm Hg)")
legend("Poor or Fair","Good or Excellent")

[Figure: scatter plot titled "Blood Pressure for Groups of Patients Assessing Self Health"; x-axis Diastolic (mm Hg), y-axis Systolic (mm Hg); legend entries Poor or Fair, Good or Excellent]
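
As a rough numeric companion to the visual comparison, the group means can be placed side by side. This is only a sketch, and the name groupMeans is illustrative; a comparison of means is not a formal statistical test.

% Rows: Poor or Fair, Good or Excellent
% Columns: mean diastolic, mean systolic (mm Hg)
groupMeans = [mean(X1) mean(Y1); mean(X2) mean(Y2)]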
