In `annotated_deep_learning_paper_implementations/labml_nn/normalization/deep_norm/__init__.py` (line 112, commit 90e21b5), DeepNorm is computed as

`return self.layer_norm(x + self.alpha * gx)`

Should this instead be

`return self.layer_norm(self.alpha * x + gx)`?
The suggested form is how it is implemented in the torchscale library:
https://github.com/microsoft/torchscale/blob/4d1e0e82e5adf86dd424f1463192635b73fc8efc/torchscale/architecture/decoder.py#L130
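
For context, the DeepNet paper ("DeepNet: Scaling Transformers to 1,000 Layers") defines DeepNorm as x_{l+1} = LayerNorm(α · x_l + G_l(x_l, θ_l)), i.e. α scales the residual stream x rather than the sub-layer output. Below is a minimal sketch of that formulation; the class and argument names (`DeepNormSketch`, `sublayer`, `d_model`) are placeholders for illustration and not the labml_nn API.

```python
import torch
import torch.nn as nn


class DeepNormSketch(nn.Module):
    """Sketch of DeepNorm as stated in the DeepNet paper:
    x_{l+1} = LayerNorm(alpha * x_l + G_l(x_l)).
    Not the labml_nn implementation; names are hypothetical."""

    def __init__(self, d_model: int, alpha: float, sublayer: nn.Module):
        super().__init__()
        self.alpha = alpha                  # residual scaling constant
        self.sublayer = sublayer            # G(x): e.g. attention or FFN block
        self.layer_norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gx = self.sublayer(x)               # sub-layer output G(x)
        # alpha scales the residual stream x, not the sub-layer output gx
        return self.layer_norm(self.alpha * x + gx)
```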