Applied Machine Learning, Part 3: Hyperparameter Optimization

From the series: Applied Machine Learning

Video length: 4:43

Description

Machine learning is all about fitting models to data. This process typically involves using an iterative algorithm that minimizes the model error. The parameters that control a machine learning algorithm’s behavior are called hyperparameters. Depending on the values you select for your hyperparameters, you might get a completely different model. So, by changing the values of the hyperparameters, you can find different, and hopefully better, models.

This video walks through techniques for hyperparameter optimization, including grid search, random search, and Bayesian optimization. It explains why random search and Bayesian optimization are superior to the standard grid search, and it describes how hyperparameters relate to feature engineering in optimizing a model.

Full Transcript

Machine learning is all about fitting models to data. The models consist of parameters, and we find the values of those parameters through the fitting process. This process typically involves some type of iterative algorithm that minimizes the model error. That algorithm has parameters that control how it works, and those are what we call hyperparameters.

In deep learning, we also use the term hyperparameters for the parameters that determine the layer characteristics. Today, we’ll be talking about techniques for both.

So, why do we care about hyperparameters?  Well, it turns out that most machine learning problems are non-convex. This means that depending on the values we select for the hyperparameters, we might get a completely different model. By changing the values of the hyperparameters, we can find different, and hopefully better, models.  

Ok, so we know that we have hyperparameters, and we know we want to tweak them, but how do we do that? Some hyperparameters are continuous, some are binary, and others might take on any number of discrete values. This makes for a tough optimization problem. It is almost always impossible to run an exhaustive search of the hyperparameter space, since it takes too long: with just five hyperparameters and ten candidate values each, an exhaustive search already means training 100,000 models.

So, traditionally, engineers and researchers have used techniques for hyperparameter optimization like grid search and random search. In this example, I’m using a grid search method to vary 2 hyperparameters – Box Constraint and Kernel Scale – for an SVM model.  As you can see, the error of the resulting model is different for different values of the hyperparameters. After 100 trials, the search has found 12.8 and 2.6 to be the most promising values for these hyperparameters.
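As a rough sketch of what that kind of grid search looks like in code (the dataset, grid resolution, and option values below are illustrative assumptions, not the exact setup from the video), the built-in hyperparameter optimization options in Statistics and Machine Learning Toolbox can run it directly:

    % Minimal sketch: grid search over BoxConstraint and KernelScale for an SVM.
    % The data and settings are illustrative, not the ones used in the video.
    load ionosphere          % example dataset shipped with the toolbox: X (features), Y (labels)
    rng(1)                   % fix the random seed for reproducibility
    mdl = fitcsvm(X, Y, ...
        'OptimizeHyperparameters', {'BoxConstraint','KernelScale'}, ...
        'HyperparameterOptimizationOptions', struct( ...
            'Optimizer','gridsearch', ...   % exhaustive grid over the two variables
            'NumGridDivisions',10, ...      % 10 x 10 = 100 candidate combinations
            'ShowPlots',true));             % plot the estimated error at each grid point

Each of the 100 evaluations trains and cross-validates one SVM, so the cost grows multiplicatively with the number of grid divisions per hyperparameter.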

Recently, random search has become more popular than grid search. 

 “How could that be?” you may be asking.

Wouldn’t grid search do a better job of evenly exploring the hyperparameter space?  

Let’s imagine you have 2 hyperparameters, “A” and “B”. Your model is very sensitive to “A,” but not sensitive to “B.”  If we did a 3x3 grid search, we would only ever evaluate 3 different values of “A.” But if we did a random search, we would probably get 9 different values of “A”, even though some may be close together. As a result, we have a much better chance of finding a good value for “A.”  In machine learning, we often have many hyperparameters. Some have a big influence over the results, and some don’t.  So random search is typically a better choice.
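With the same interface, switching from the grid to random sampling is essentially a one-option change. The sketch below reuses the assumptions from the grid-search example above and is only illustrative:

    % Same search space, but sampled randomly instead of on a grid (illustrative sketch).
    % Over 100 trials, BoxConstraint and KernelScale each take ~100 distinct values,
    % rather than being restricted to the 10 grid levels per hyperparameter above.
    mdl = fitcsvm(X, Y, ...
        'OptimizeHyperparameters', {'BoxConstraint','KernelScale'}, ...
        'HyperparameterOptimizationOptions', struct( ...
            'Optimizer','randomsearch', ...
            'MaxObjectiveEvaluations',100));   % same budget of 100 trials as the grid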

Grid search and random search are nice because it’s easy to understand what’s going on.  However, they still require many function evaluations. They also don’t take advantage of the fact that, as we evaluate more and more combinations of hyperparameters, we learn how those values affect our results. For that reason, you can use techniques that create a surrogate model – or an approximation of the error as a function of the hyperparameters.

Bayesian optimization is one such technique. Here we see an example of a Bayesian optimization algorithm running, where each dot corresponds to a different combination of hyperparameters. We can also see the algorithm’s surrogate model, shown here as the surface, which it is using to pick the next set of hyperparameters.

One other really cool thing about Bayesian optimization is that it doesn’t just look at how accurate a model is. It can also take into account how long it takes to train.  There could be sets of hyperparameters that cause the training time to increase by factors of 100 or more, and that might not be so great if we’re trying to hit a deadline. You can configure Bayesian optimization in a number of ways, including expected improvement per second, which penalizes hyperparameter values that are expected to take a very long time to train.
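In the same toolbox interface, Bayesian optimization with a time-aware acquisition function is one more option change. The snippet below is a hedged sketch; the evaluation budget and the specific acquisition function are assumptions chosen for illustration:

    % Illustrative sketch: Bayesian optimization with a time-aware acquisition function.
    mdl = fitcsvm(X, Y, ...
        'OptimizeHyperparameters', {'BoxConstraint','KernelScale'}, ...
        'HyperparameterOptimizationOptions', struct( ...
            'Optimizer','bayesopt', ...                                      % surrogate-model-based search
            'AcquisitionFunctionName','expected-improvement-per-second', ... % penalizes slow-to-train settings
            'MaxObjectiveEvaluations',30));    % Bayesian optimization typically needs fewer evaluations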

Now, the main reason to do hyperparameter optimization is to improve the model. And, although there are other things we could do to improve it, I like to think of hyperparameter optimization as a low-effort, high-compute type of approach. This is in contrast to something like feature engineering, where creating the new features takes more effort but less computational time. It’s not always obvious which activity is going to have the biggest impact, but the nice thing about hyperparameter optimization is that it lends itself well to “overnight runs,” so you can sleep while your computer works.

That was a quick explanation of hyperparameter optimization. For more information, check out the links in the description.

Related Products

  • Statistics and Machine Learning Toolbox

Learn More

  • Bayesian Optimization Workflow
  • Model Building and Assessment
  • Bayesian Optimization Documentation
  • What Is AutoML?

Related Information
MATLAB for Machine Learning


Up Next:

Part 4: Embedded Systems (2:30)
Walk through several key techniques and best practices for running your machine learning model on embedded devices.
View full series (4 Videos)

Related Videos:

Machine Learning Made Easy (34:34)
Machine Learning for Predictive Modelling (Highlights) (5:36)
Machine Learning for Predictive Modelling (44:37)
Machine Learning with MATLAB (41:25)
Machine Learning with MATLAB: Getting Started with... (34:31)

View more related videos
