removeLongWords

Remove long words from documents or a bag-of-words model

Syntax

newDocuments = removeLongWords(documents,len)

newBag = removeLongWords(bag,len)

Description

newDocuments = removeLongWords(documents,len) removes words with len or more characters from documents.

newBag = removeLongWords(bag,len) removes words with len or more characters from the bagOfWords object bag.

Examples

Remove Long Words from Documents

Remove words with seven or more characters from a document.

document = tokenizedDocument("An example of a short sentence");
newDocument = removeLongWords(document,7)

newDocument = 
  tokenizedDocument:

   4 tokens: An of a short

Remove Long Words from Bag-of-Words Model

Remove words with seven or more characters from a bag-of-words model.

documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"]);
bag = bagOfWords(documents);
newBag = removeLongWords(bag,7)

newBag = 
  bagOfWords with properties:

          Counts: [2x5 double]
      Vocabulary: ["an"    "of"    "a"    "short"    "second"]
        NumWords: 5
    NumDocuments: 2
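
To see which vocabulary entries were dropped, one option is to compare the vocabularies before and after the call. The sketch below rebuilds the same model as the example above and uses the base MATLAB function setdiff; the variable name removedWords is chosen here for illustration and is not part of the toolbox. It assumes Text Analytics Toolbox is installed.

```matlab
% Rebuild the bag-of-words model from the example above.
documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"]);
bag = bagOfWords(documents);
newBag = removeLongWords(bag,7);

% Words present in the original vocabulary but absent from the new one.
% removedWords is an illustrative name, not a toolbox property.
removedWords = setdiff(bag.Vocabulary,newBag.Vocabulary);
disp(removedWords)
```

Both removed words have at least seven characters, while "second" (six characters) stays in the vocabulary.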

Input Arguments

`documents` — Input documents
`tokenizedDocument` array

Input documents, specified as a tokenizedDocument array.
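
When documents is an array, the function filters each document separately and returns an array of the same size. The following sketch illustrates this with two short documents, using doclength to count the tokens before and after; the variable names lengthsBefore and lengthsAfter are chosen here for illustration. It assumes Text Analytics Toolbox is installed.

```matlab
% Two-element tokenizedDocument array.
documents = tokenizedDocument([ ...
    "an example of a short sentence"
    "a second short sentence"]);

% Remove words with seven or more characters from every document.
newDocuments = removeLongWords(documents,7);

% Token counts per document, before and after filtering.
lengthsBefore = doclength(documents);
lengthsAfter = doclength(newDocuments);
disp([lengthsBefore(:) lengthsAfter(:)])
```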

`bag` — Input bag-of-words model
`bagOfWords` object

Input bag-of-words model, specified as a bagOfWords object.

`len` — Minimum length of words to remove
positive integer

Minimum length of words to remove, specified as a positive integer. The function removes words with len or more characters.
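
Because the threshold is inclusive, a word with exactly len characters is removed as well. The following is a minimal sketch of that boundary case; the sample sentence and the threshold of 4 are made up for illustration, and it assumes Text Analytics Toolbox is installed.

```matlab
% Illustrative input: words of 1 to 5 characters (not from this page's examples).
document = tokenizedDocument("a bb ccc dddd eeeee");
len = 4;

newDocument = removeLongWords(document,len);

% Convert the scalar tokenizedDocument to a string array of its words
% and measure each remaining word.
words = string(newDocument);
wordLengths = strlength(words);
disp(words)
```

Every remaining word is shorter than len characters, so "dddd" is removed along with "eeeee".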

Output Arguments

`newDocuments` — Output documents
`tokenizedDocument` array

Output documents, returned as a tokenizedDocument array.

`newBag` — Output bag-of-words model
`bagOfWords` object

Output bag-of-words model, returned as a bagOfWords object.

Version History

Introduced in R2017b
