Why does layerNormalizationLayer in Deep Learning Toolbox include the T dimension in the batch?

John Smith

13 Mar 2023

1 vote

Hello,

While implementing a ViT transformer in MATLAB, I found that layerNormalizationLayer includes the T dimension in the statistics calculated for each sample in the batch. This is problematic when implementing a transformer, since tokens correspond to the T dimension and reference implementations calculate the statistics separately for each token.

Thx
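
To make the difference concrete, here is a minimal sketch in plain MATLAB (no toolbox calls) that computes both kinds of statistics by hand for one observation. The sizes, the toy data, and the epsilon value are assumptions chosen for illustration; this is not the layer's actual implementation.

```matlab
% Toy sequence data for a single observation: C channels by T tokens.
C = 4;  T = 3;                      % illustrative sizes
X = reshape(1:C*T, C, T);           % column t holds the features of token t
epsilon = 1e-5;                     % stability constant (assumed value)

% Pooled statistics: one mean and variance per observation, over C and T.
muAll   = mean(X, "all");
varAll  = var(X, 1, "all");         % weight 1 selects the population variance
Ypooled = (X - muAll) ./ sqrt(varAll + epsilon);

% Per-token statistics: one mean and variance per column, over C only.
muTok     = mean(X, 1);             % 1-by-T
varTok    = var(X, 1, 1);           % 1-by-T, population variance along dim 1
YperToken = (X - muTok) ./ sqrt(varTok + epsilon);

disp(mean(Ypooled, 1))              % outer columns keep a nonzero mean
disp(mean(YperToken, 1))            % every column mean is approximately zero
```

With pooled statistics, a token's normalized values depend on every other token in the sequence, which is what differs from the reference transformer implementations.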

Accepted Answer

John Smith on 24 Mar 2023

0 votes

It seems MathWorks has listened and changed the behavior of layerNormalizationLayer in R2023a:

https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.layernormalizationlayer.html

Starting in R2023a, by default, the layer normalizes sequence data over the channel and spatial dimensions. In previous versions, the software normalizes over all dimensions except for the batch dimension (the spatial, time, and channel dimensions). Normalization over the channel and spatial dimensions is usually better suited for this type of data. To reproduce the previous behavior, set OperationDimension to "batch-excluded".
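
As a short sketch of what the quoted note implies, the two behaviors can be requested explicitly when creating the layer. This assumes Deep Learning Toolbox R2023a or later; the variable names are only illustrative.

```matlab
% Requires R2023a or later, where OperationDimension was introduced.

% Statistics over the channel dimension only, i.e. separately per token.
lnToken = layerNormalizationLayer(OperationDimension="channel-only");

% Pre-R2023a behavior: statistics over all dimensions except batch.
lnLegacy = layerNormalizationLayer(OperationDimension="batch-excluded");
```

For sequence data that has no spatial dimensions, the new default described above should already give per-token statistics, so setting "channel-only" mainly makes the intent explicit.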

More Answers (1)

Matt J on 13 Mar 2023

0 votes

Perhaps you can fold your T dimension into the C dimension and use a groupNormalizationLayer instead, with the groups defined so that different T belong to different groups.
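
A rough sketch of the indexing behind this idea, in plain MATLAB for a single observation. It assumes a fixed sequence length T and that the groups are contiguous blocks of the folded channels; the sizes and epsilon are illustrative.

```matlab
C = 4;  T = 3;                      % illustrative sizes; T must be fixed here
X = rand(C, T);                     % one observation, column t = token t
epsilon = 1e-5;                     % stability constant (assumed value)

% Fold T into C. MATLAB is column-major, so folded channels
% (t-1)*C+1 through t*C all come from token t.
Xfolded = reshape(X, C*T, 1);

% Group normalization by hand: T contiguous groups of C channels each.
numGroups = T;
G = reshape(Xfolded, C, numGroups); % one column per group
Ygroup = (G - mean(G, 1)) ./ sqrt(var(G, 1, 1) + epsilon);

% Direct per-token normalization for comparison.
Ytoken = (X - mean(X, 1)) ./ sqrt(var(X, 1, 1) + epsilon);

assert(max(abs(Ygroup - Ytoken), [], "all") < 1e-12)
```

In a network this would correspond to something like groupNormalizationLayer(T) applied to the folded data, provided the layer groups contiguous channels. Note that the learnable scale and offset would then have C*T elements, one per folded channel, rather than C elements shared across tokens.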

7 Comments (5 older comments not shown)

John Smith on 15 Mar 2023

Perhaps lamenting would cause someone from MathWorks to take notice and add the capability to the code base. Sigh ...

Matt J on 15 Mar 2023

That happens sometimes, but usually you have to submit a formal enhancement request.

Products

Deep Learning Toolbox

Release

R2022b
