Sai

Last seen: 9개월 전 | 2025년부터 활동

Followers: 0 Following: 0

통계

Feeds

질문

How do I apply a padding mask(B*T) variable into a training of a self-attention transformer decoder, from a 3D Matrix(B*T*C)
While attempting to train my neural network that uses a self attention layer in its transformer block. I have been struggling to...

12개월 전 | 답변 수: 1 | 0

1

답변