Computer Vision Toolbox Model for Vision Transformer Network
Implementation of several variants of the vision transformer (ViT) model.
Downloads: 1.3K
Updated: 2025/9/17
The Vision Transformer (ViT) model is a pretrained transformer model for image classification. It is also used as a backbone for other computer vision tasks such as object detection. The support package consists of three variants of the ViT model:
- Base-16 model
- Small-16 model
- Tiny-16 model
Here, “base”, “small”, and “tiny” indicate the model architecture and size, and 16 is the patch size hyperparameter (each patch is 16-by-16 pixels). Each variant has been pretrained on the ImageNet data set at an input resolution of 384-by-384 pixels and is stored as a .MAT file.
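To make the relationship between patch size and input resolution concrete, the following is a minimal arithmetic sketch in Python. It is not part of the support package; the function name `patch_grid` is hypothetical and only illustrates how a 384-by-384 image divides into non-overlapping 16-by-16 patches.

```python
def patch_grid(resolution: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    Hypothetical helper for illustration only; it assumes the image is
    split into non-overlapping square patches, as in a standard ViT.
    """
    if resolution % patch_size != 0:
        raise ValueError("resolution must be divisible by patch_size")
    per_side = resolution // patch_size
    return per_side, per_side * per_side


# Values taken from the description above: 384 input resolution, patch size 16.
per_side, total = patch_grid(384, 16)
print(f"{per_side} x {per_side} grid, {total} patches per image")
```

All three variants share the same patch size and input resolution, so under this assumption they process the same number of patches per image and differ only in model architecture and size.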
MATLAB Release Compatibility
Created with R2023b
Compatible with R2023b to R2025b
Platform Compatibility
Windows, macOS (Apple Silicon), macOS (Intel), Linux