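Modeling and Prediction

Develop predictive models using topic models and word embeddings

To find clusters and extract features from high-dimensional text datasets, you can use machine learning techniques and models such as LSA, LDA, and word embeddings. You can combine features created with Text Analytics Toolbox™ with features from other data sources. With these combined features, you can build machine learning models that take advantage of textual, numeric, and other types of data.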
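
As a rough illustration of combining text-derived features with features from another source, the following minimal sketch fits a small LDA model and appends a numeric column to the resulting topic mixtures. The toy corpus, the numeric values, the variable names, and the choice of two topics are all hypothetical and are not taken from this page. The sketch also assumes that tokenizedDocument, a toolbox function not listed on this page, is available.

```matlab
% Hypothetical toy corpus; replace with your own text data.
textData = [
    "the quick brown fox jumps over the lazy dog"
    "a fast brown fox leaps over a sleepy dog"
    "stock markets fell as investors sold shares"
    "investors bought shares as stock markets rose"];

% Tokenize the text and count word occurrences per document.
documents = tokenizedDocument(textData);
bag = bagOfWords(documents);

% Fit an LDA model. The number of topics (2) is an arbitrary choice
% made for this sketch, not a recommendation.
rng("default")  % fix the random seed so repeated runs match
numTopics = 2;
mdl = fitlda(bag, numTopics, 'Verbose', 0);

% Each row holds the topic probabilities of one document.
topicFeatures = transform(mdl, documents);

% Hypothetical numeric feature from another data source,
% one value per document.
otherFeature = [0.1; 0.4; 0.9; 0.7];

% Combine text-derived and numeric features into one predictor matrix.
X = [topicFeatures otherFeature];
```

The matrix X has one row per document and could then be passed to a classifier or regression model of your choice. For real data, the number of topics and the preprocessing steps would need to be chosen with more care.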

Functions

bagOfWords: Bag-of-words model
bagOfNgrams: Bag-of-n-grams model
addDocument: Add documents to bag-of-words or bag-of-n-grams model
removeDocument: Remove documents from bag-of-words or bag-of-n-grams model
removeInfrequentWords: Remove words with low counts from bag-of-words model
removeInfrequentNgrams: Remove infrequently seen n-grams from bag-of-n-grams model
removeWords: Remove selected words from documents or bag-of-words model
removeNgrams: Remove n-grams from bag-of-n-grams model
removeEmptyDocuments: Remove empty documents from tokenized document array, bag-of-words model, or bag-of-n-grams model
topkwords: Most important words in bag-of-words model or LDA topic
topkngrams: Most frequent n-grams
encode: Encode documents as matrix of word or n-gram counts
tfidf: Term Frequency-Inverse Document Frequency (tf-idf) matrix
join: Combine multiple bag-of-words or bag-of-n-grams models
vaderSentimentScores: Sentiment scores with VADER algorithm
ratioSentimentScores: Sentiment scores with ratio rule
fastTextWordEmbedding: Pretrained fastText word embedding
wordEncoding: Word encoding model to map words to indices and back
doc2sequence: Convert documents to sequences for deep learning
wordEmbeddingLayer: Word embedding layer for deep learning neural network
word2vec: Map word to embedding vector
word2ind: Map word to encoding index
vec2word: Map embedding vector to word
ind2word: Map encoding index to word
isVocabularyWord: Test if word is member of word embedding or encoding
readWordEmbedding: Read word embedding from file
trainWordEmbedding: Train word embedding
writeWordEmbedding: Write word embedding file
wordEmbedding: Word embedding model to map words to vectors and back
extractSummary: Extract summary from documents
rakeKeywords: Extract keywords using RAKE
textrankKeywords: Extract keywords using TextRank
bleuEvaluationScore: Evaluate translation or summarization with BLEU similarity score
rougeEvaluationScore: Evaluate translation or summarization with ROUGE similarity score
bm25Similarity: Document similarities with BM25 algorithm
cosineSimilarity: Document similarities with cosine similarity
textrankScores: Document scoring with TextRank algorithm
lexrankScores: Document scoring with LexRank algorithm
mmrScores: Document scoring with Maximal Marginal Relevance (MMR) algorithm
fitlda: Fit latent Dirichlet allocation (LDA) model
fitlsa: Fit LSA model
resume: Resume fitting LDA model
logp: Document log-probabilities and goodness of fit of LDA model
predict: Predict top LDA topics of documents
transform: Transform documents into lower-dimensional space
ldaModel: Latent Dirichlet allocation (LDA) model
lsaModel: Latent semantic analysis (LSA) model
wordcloud: Create word cloud chart from text, bag-of-words model, bag-of-n-grams model, or LDA model
textscatter: 2-D scatter plot of text
textscatter3: 3-D scatter plot of text

Topics

Classification and Modeling

Create Simple Preprocessing Function

This example shows how to create a function that cleans and preprocesses text data for analysis.

Create Simple Text Model for Classification

This example shows how to train a simple text classifier on word frequency counts using a bag-of-words model.

Analyze Text Data Using Multiword Phrases

This example shows how to analyze text using n-gram frequency counts.

Analyze Text Data Using Topic Models

This example shows how to use the Latent Dirichlet Allocation (LDA) topic model to analyze text data.

Choose Number of Topics for LDA Model

This example shows how to decide on a suitable number of topics for a latent Dirichlet allocation (LDA) model.

Compare LDA Solvers

This example shows how to compare latent Dirichlet allocation (LDA) solvers by comparing the goodness of fit and the time taken to fit the model.

Sentiment Analysis and Keyword Extraction

Analyze Sentiment in Text

This example shows how to perform sentiment analysis using the Valence Aware Dictionary and sEntiment Reasoner (VADER) algorithm.

Generate Domain Specific Sentiment Lexicon

This example shows how to generate a lexicon for sentiment analysis using 10-K and 10-Q financial reports.

Train a Sentiment Classifier

This example shows how to train a classifier for sentiment analysis using an annotated list of positive and negative sentiment words and a pretrained word embedding.

Extract Keywords from Text Data Using RAKE

This example shows how to extract keywords from text data using Rapid Automatic Keyword Extraction (RAKE).

Extract Keywords from Text Data Using TextRank

This example shows how to extract keywords from text data using TextRank.

Deep Learning

Classify Text Data Using Deep Learning

This example shows how to classify text data using a deep learning long short-term memory (LSTM) network.

Classify Text Data Using Convolutional Neural Network

This example shows how to classify text data using a convolutional neural network.

Classify Out-of-Memory Text Data Using Deep Learning

This example shows how to classify out-of-memory text data with a deep learning network using a transformed datastore.

Sequence-to-Sequence Translation Using Attention

This example shows how to convert decimal strings to Roman numerals using a recurrent sequence-to-sequence encoder-decoder model with attention.

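Generate Text Using Deep Learning (Deep Learning Toolbox)

This example shows how to train a deep learning long short-term memory (LSTM) network to generate text.

Pride and Prejudice and MATLAB

This example shows how to train a deep learning LSTM network to generate text using character embeddings.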

Word-By-Word Text Generation Using Deep Learning

This example shows how to train a deep learning LSTM network to generate text word-by-word.

Classify Text Data Using Custom Training Loop

This example shows how to classify text data using a deep learning bidirectional long short-term memory (BiLSTM) network with a custom training loop.

Generate Text Using Autoencoders

This example shows how to generate text data using autoencoders.

Define Text Encoder Model Function

This example shows how to define a text encoder model function.

Define Text Decoder Model Function

This example shows how to define a text decoder model function.

Language Support

Language Considerations

Information on using Text Analytics Toolbox features for other languages.

Japanese Language Support

Information on Japanese support in Text Analytics Toolbox.

Analyze Japanese Text Data

This example shows how to import, prepare, and analyze Japanese text data using a topic model.

German Language Support

Information on German support in Text Analytics Toolbox.

Analyze German Text Data

This example shows how to import, prepare, and analyze German text data using a topic model.