Performance

Troubleshoot code generation issues, improve code execution time, and reduce memory usage of generated code

Code generated by GPU Coder™ can fail to work as expected for several common reasons.

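The following topics describe the common causes of these symptoms in detail and explain how to use the built-in screener functions to detect them. They also describe how to work around these issues and generate more efficient CUDA code.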

Apps

GPU Coder	Generate CUDA code from MATLAB code
GPU Environment Check	Verify and set up the GPU code generation environment

`codegen`	Generate C/C++ code from MATLAB code
`gpucoder`	Open the GPU Coder app
`gpuPerformanceAnalyzer`	Analyze and optimize performance of the generated code (Since R2023a)
`gpuprofile`	Profile execution time for generated CUDA code (Since R2024a)

`coder.gpu.kernel`	Pragma that maps `for`-loops to GPU kernels
`coder.gpu.kernelfun`	Pragma that maps a function to GPU kernels
`coder.gpu.nokernel`	Pragma to disable kernel creation for loops
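
As a rough illustration of where these pragmas are typically placed, consider the sketch below. The entry-point function `scaleAndSum` is hypothetical and does not come from this page. The sketch assumes that the function-level pragma goes at the top of the function and that each loop-level pragma goes immediately before the `for`-loop it applies to. Check the pragma reference pages for the constraints that apply in your release.

```matlab
function [scaled, total] = scaleAndSum(x) %#codegen
% scaleAndSum  Hypothetical entry-point function, used only for illustration.

% Function-level pragma: ask GPU Coder to look for kernel opportunities
% in the computation inside this function.
coder.gpu.kernelfun;

scaled = zeros(size(x), 'like', x);

% Loop-level pragma: request a kernel for the loop that follows.
% The iterations are independent of each other.
coder.gpu.kernel();
for i = 1:numel(x)
    scaled(i) = 2 * x(i);
end

% Loop-level pragma: ask GPU Coder not to create a kernel for this
% short, sequential accumulation loop.
total = zeros(1, 'like', x);
coder.gpu.nokernel();
for i = 1:min(4, numel(x))
    total = total + scaled(i);
end
end
```

When the function runs in MATLAB rather than as generated code, the pragmas have no effect on the computed values. They only guide code generation.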

`coder.gpuConfig`	Configuration parameters for CUDA code generation from MATLAB code by using GPU Coder
`coder.CodeConfig`	Configuration parameters for C/C++ code generation from MATLAB code
`coder.EmbeddedCodeConfig`	Configuration parameters for C/C++ code generation from MATLAB code with Embedded Coder
`coder.gpuEnvConfig`	Configuration object for checking the GPU code generation environment
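
A minimal sketch of how these configuration objects might be combined in a script follows. It assumes a host machine with a supported GPU and toolchain. `myEntryPoint` is a hypothetical function on the MATLAB path that accepts a 1-by-1024 `single` vector. `coder.checkGpuInstall` is not listed on this page; it is used here on the assumption that it is the function that consumes the environment configuration object.

```matlab
% Describe which environment checks to run on the host GPU.
envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;   % check that basic CUDA code generation works
envCfg.Quiet = 1;          % suppress detailed command-line output

% Run the checks; results is a structure of pass/fail flags.
results = coder.checkGpuInstall(envCfg);

% Create a GPU code configuration object for a MEX target
% and request a code generation report.
cfg = coder.gpuConfig('mex');
cfg.GenerateReport = true;

% Generate CUDA MEX code for the hypothetical entry point.
codegen -config cfg -args {ones(1,1024,'single')} myEntryPoint
```

Running the environment check first tends to surface toolchain problems before they show up as harder-to-read code generation errors.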

Code Generation Reports
Create and view reports generated during code generation.
Trace Between Generated CUDA Code and MATLAB Source Code
Highlight sections of MATLAB® code that run on the GPU.
Generating a GPU Code Metrics Report for Code Generated from MATLAB Code
Create and explore GPU static code metrics report.
GPU Performance Analyzer
Visualize code metrics and identify optimization and tuning opportunities in your code.
Analyzing Network Performance Using the Deep Learning Dashboard
Investigate the performance of deep learning networks and layers in generated code using the Deep Learning Dashboard. (Since R2025a)
Kernel Analysis
Recommendations for generating efficient CUDA kernels.
Memory Bottleneck Analysis
Reduce memory bottleneck issues when using GPU Coder.
Optimize Kernels That Contain Loops
Rewrite loops in MATLAB to avoid generated code kernels that contain loops. (Since R2025a)
Prevent Kernel Launches Inside Loops
Parallelize loops that launch kernels to execute them on the GPU. (Since R2025a)
Minimize Memory Copy Events in Generated Code Loops
Rewrite loops to minimize the number of data transfers between the CPU and GPU in generated CUDA code. (Since R2025a)