Junior R F Junior

JoseRFJunior

AI & ML interests

https://github.com/JoseRFJuniorLLMs/Jumento-LLMs

Organizations

AI FILMS, video-p2p-library, Gradio-Themes-Party, scikit-learn, Open-Source AI Meetup, lora concepts library, Kornia AI, Tune a video concepts library, Keras Dreambooth Event, Stable Diffusion Dreambooth Concepts Library, Musika, Blog-explorers, ICCV2023, ICML2023, huggingPartyParis, The Collectionists, ZeroGPU Explorers, GooGolPlex, MLX Community, Narra, Social Post Explorers, Dev Mode Explorers

Posts 1

JoseRFJunior/TransNAR
https://github.com/JoseRFJuniorLLMs/TransNAR
https://arxiv.org/html/2406.09308v1
The TransNAR hybrid architecture. Similar to Alayrac et al., we interleave existing Transformer layers with gated cross-attention layers that enable information to flow from the NAR to the Transformer. Queries are generated from the tokens, while keys and values are obtained from the nodes and edges of the graph. The node and edge embeddings are produced by running the NAR on the graph version of the reasoning task to be solved. When experimenting with pre-trained Transformers, we initially close the cross-attention gate in order to fully preserve the language model's internal knowledge at the beginning of training.
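
As a rough illustration of the gating mechanism described above, here is a minimal PyTorch sketch of one gated cross-attention block. This is not the repository's actual code: all names (GatedCrossAttention, gate, etc.) are hypothetical, and it assumes the token stream and the NAR node/edge embeddings have already been projected to a shared model dimension. The tanh gate is initialized at zero, so the cross-attention path contributes nothing at the start of training, matching the "closed gate" behavior described for pre-trained Transformers.

```python
import torch
import torch.nn as nn


class GatedCrossAttention(nn.Module):
    """Illustrative gated cross-attention block (hypothetical names).

    Queries come from the Transformer's token embeddings; keys and
    values come from the NAR's node/edge embeddings. A tanh gate
    initialized at zero closes the block at the start of training.
    """

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        # Gate starts at 0, so tanh(gate) = 0 and the cross-attention
        # path is initially silent, preserving the pre-trained LM.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, tokens: torch.Tensor, nar_embeddings: torch.Tensor) -> torch.Tensor:
        # tokens:          (batch, seq_len, d_model) from the Transformer stream
        # nar_embeddings:  (batch, n_nodes + n_edges, d_model) from the NAR
        attended, _ = self.attn(
            query=self.norm(tokens),
            key=nar_embeddings,
            value=nar_embeddings,
        )
        # Residual connection scaled by the learnable gate.
        return tokens + torch.tanh(self.gate) * attended


# Toy usage with made-up shapes.
if __name__ == "__main__":
    d_model = 256
    block = GatedCrossAttention(d_model)
    tokens = torch.randn(2, 16, d_model)       # token stream
    graph_feats = torch.randn(2, 40, d_model)  # NAR node/edge embeddings
    out = block(tokens, graph_feats)
    print(out.shape)  # torch.Size([2, 16, 256])
```

The zero-initialized tanh gate follows the Flamingo-style recipe the post alludes to via Alayrac et al.: because the gated branch starts at exactly zero, the interleaved block is an identity function at initialization, and the model can learn gradually how much NAR information to admit.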