Frank Odom
Dec 23, 2021


Sorry for the late response here. Yes, I intentionally left off the masked attention for the sake of simplicity. In fact, *every* decoder layer would need to include masked attention -- not just the first one.
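For anyone following along, here is a minimal sketch of what that looks like in PyTorch. This is not the article's exact code; the class and parameter names (`DecoderLayer`, `causal_mask`, `dim`, `num_heads`) are illustrative, and the point is simply that the causal mask is applied in the self-attention of *every* decoder layer:

```python
import torch
import torch.nn as nn

def causal_mask(seq_len: int) -> torch.Tensor:
    # Boolean upper-triangular mask: True = position may NOT be attended to,
    # so token i can only attend to tokens 0..i.
    return torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)

class DecoderLayer(nn.Module):
    # Illustrative decoder layer: masked self-attention, cross-attention, feed-forward.
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(dim) for _ in range(3))

    def forward(self, tgt: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # Masked self-attention -- every decoder layer does this, not just the first.
        mask = causal_mask(tgt.size(1)).to(tgt.device)
        x, _ = self.self_attn(tgt, tgt, tgt, attn_mask=mask)
        tgt = self.norm1(tgt + x)
        # Cross-attention over the encoder output needs no causal mask.
        x, _ = self.cross_attn(tgt, memory, memory)
        tgt = self.norm2(tgt + x)
        return self.norm3(tgt + self.ff(tgt))
```

So the full decoder is just a stack of these layers, each one applying the same causal mask in its self-attention step.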
