This educational Live Script explores the transformer, the neural network architecture behind current large language models. A language model assigns a probability to each character that could continue a piece of text, and generates text by repeatedly drawing the next character from that distribution. We build a complete character-level model, train it on Shakespeare's text, and examine the attention operation that lets each position in the text depend on the positions before it.
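The two ideas in the paragraph above, causal attention and generation by repeated sampling, can be sketched in a few lines. The following is a minimal illustration in plain Python, not the Live Script's MATLAB code: the function names (`causal_attention`, `generate`, `next_char_probs`) are hypothetical, it uses a single attention head with no learned weights, and `next_char_probs` stands in for whatever trained model supplies the next-character distribution.

```python
import math
import random

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    queries, keys, values: lists of T vectors (lists of floats).
    The output at position t is a weighted average of values[0..t] only.
    """
    d = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Score position t against positions 0..t; later positions are
        # never visited, which is what the causal mask amounts to here.
        scores = [sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        weights = softmax(scores)
        out = [sum(w * values[s][j] for s, w in enumerate(weights))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

def generate(next_char_probs, vocab, prompt, n_chars, seed=0):
    """Extend prompt by n_chars characters, one sampled character at a time.

    next_char_probs(text) is an assumed callable that returns one
    probability per entry of vocab, given the text so far.
    """
    rng = random.Random(seed)
    text = prompt
    for _ in range(n_chars):
        probs = next_char_probs(text)
        text += rng.choices(vocab, weights=probs, k=1)[0]
    return text
```

Because position t never looks at positions after t, changing a later character cannot alter an earlier output. That property is what allows the same network to be trained on every prefix of a text at once and then used to generate one character at a time.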
The model built here has roughly 1.1e5 adjustable parameters. Its architecture is similar to that of frontier models with order 1e11 to 1e12 parameters. The construction follows the nanoGPT model of A. Karpathy [1].
This script may interest students and instructors of physics and other fields. It is appropriate for a first course that includes neural networks and assumes familiarity with a basic classifier network of the kind developed in Identify Objects Acoustically with a Neural Network [2]. A Background Information section describes the transformer, and interactive 'Try this' suggestions, coding 'Challenges,' and references are included for further exploration. Additional educational Live Scripts by the author may be found here.
Cite As
Duncan Carlsmith (2026). nanoGPT Explorer (https://kr.mathworks.com/matlabcentral/fileexchange/183953-nanogpt-explorer), MATLAB Central File Exchange. Retrieved .
| Version | Published | Release Notes | Action |
|---|---|---|---|
| 1.0.0 |
