zolicsaki (Zoltan Csaki)

Reacted to kz919's post with 🚀 2 months ago

Post

1263

Just for the meme.

But the clear lesson I learnt from building these demos are, the more powerful the underlying base model is, the closer you will get to GPT4o1. CoT is nothing more than simply inducing the latent reasoning capability from the model.

kz919/GPT4-O1-Proximas

Reacted to KingNish's post with 👍 2 months ago

Post

3086

A super good and fast image inpainting demo is here.
Its' super cool and realistic.

Demo by @OzzyGT (Must try):
OzzyGT/diffusers-fast-inpaint

posted an update 2 months ago

Post

1247

We’ve open-sourced an app, powered by SambaNova Cloud and Llama 405B, that intelligently detects when a web search is needed—then answers directly or with RAG.

sambanovasystems/auto-web-search

🥚 A hidden Easter egg is that Auto Search detection is already trained into Llama 3.1 checkpoints. Simply use the tool usage system prompt below, and the model will either respond with a web search query if it deems necessary or respond to the query directly.🥚

Environment: IPython
Tools: Brave Search
Knowledge Cutoff Date: December 2023
Today's Date: September 2024
You are a helpful assistant. Reminder:
Search function calls MUST follow the specified format: "brave_search.call(query)"

You can see the documentation here
https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1#built-in-tooling
and read about how the tool usage was trained into Llama3.1 models in section 4.3.5 here https://arxiv.org/pdf/2407.21783

posted an update 3 months ago

Post

1288

Fast inference is no longer a nice-to-have demo; it will be the driving force behind future frontier models. Time to switch over to custom AI hardware and short Nvidia.

Try out SambaNova's lightning fast API for free at https://sambanova.ai/fast-api?api_ref=444868

Reacted to kz919's post with 🧠🤯🤗🔥🚀😎 3 months ago

Post

1584

Spent a few minutes to build an alternative to Character AI on top of llama3.1 405B through SambaNova's super fast inference API

Space: kz919/Persona-AI
API referral link: https://sambanova.ai/fast-api?api_ref=907266

3 replies

·

Reacted to their post with 🤗 3 months ago

Post

1809

You can run Llama405B at over 100 tokens per second for free using SambaNova's API! https://sambanova.ai/fast-api?api_ref=444868

I have been able to generate some high quality synthetic data and use it as an LLM as a judge instead of the slower and more expensive alternatives like openAI or Anthropic.

2 replies

·

replied to their post 3 months ago

@gghfez all you need is a valid email, I think they send out the API keys once a day when they approve you. They approve everyone unless they think its a spam trying to get more then one key

Reacted to their post with 🚀 3 months ago

Post

1809

You can run Llama405B at over 100 tokens per second for free using SambaNova's API! https://sambanova.ai/fast-api?api_ref=444868

I have been able to generate some high quality synthetic data and use it as an LLM as a judge instead of the slower and more expensive alternatives like openAI or Anthropic.

2 replies

·

posted an update 3 months ago

Post

1809

You can run Llama405B at over 100 tokens per second for free using SambaNova's API! https://sambanova.ai/fast-api?api_ref=444868

I have been able to generate some high quality synthetic data and use it as an LLM as a judge instead of the slower and more expensive alternatives like openAI or Anthropic.

2 replies

·

Reacted to akhaliq's post with 👍 6 months ago

Post

20839

Chameleon

Mixed-Modal Early-Fusion Foundation Models

Chameleon: Mixed-Modal Early-Fusion Foundation Models (2405.09818)

We present Chameleon, a family of early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training approach from inception, an alignment recipe, and an architectural parameterization tailored for the early-fusion, token-based, mixed-modal setting. The models are evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed modal generation. Chameleon demonstrates broad and general capabilities, including state-of-the-art performance in image captioning tasks, outperforms Llama-2 in text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro, and performs non-trivial image generation, all in a single model. It also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, where either the prompt or outputs contain mixed sequences of both images and text. Chameleon marks a significant step forward in a unified modeling of full multimodal documents.

posted an update 7 months ago

Post

889

SambaNova just released a revolutionary paper about how the SN40L AI chip can host many LLMs on a single node and run inference so efficiently that it enables running a "composition of experts." These experts can be interconnected via a router, resulting in remarkable accuracy. This method allows you to take open source expert models from HuggingFace and continuously build and integrate them into a composition of experts.

I am also super excited about the possibilities that SN40Ls unlock for LLM agent workflows and pipelined calls. With the release of GPT4o, it seems that monolithic LLMs are starting to reach a plateau, and I believe that the next wave of AI will be driven by pipelined LLM calls and agent workflows. Most pipelined LLM workflows are bottlenecked by prohibitively expensive compute and high latency, but the SN40L provides a one stop shop solution for this. We need to get the word out to the community that this hardware exists, because it will open up a realm of possibilities that developers working with Nvidia hardware did not know exist.

SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts (2405.07518)

Reacted to not-lain's post with 🔥👍 7 months ago

Post

2286

🔥 **transformers** 4.40.0 is out 🔥

⚙️ you can now push your custom pipelines easily to 🤗, allowing people to easily use your model in a more friendly, unified way.

🤓 I already updated my blog to match the new feature https://huggingface.co/blog/not-lain/custom-architectures-with-huggingface.

📃A list of some repos that have custom pipelines :
* briaai/RMBG-1.4
* p1atdev/siglip-tagger-test-3
* sgugger/test-dynamic-pipeline

Reacted to their post with 🚀 7 months ago

Post

2794

We posted new SOTA SambaLingo 70B parameter models for Arabic, Thai and Hungarian!

Check out the models here sambanovasystems/sambalingo-65e25770f2037c85ad35ca77

and our paper
https://arxiv.org/abs/2404.05829

posted an update 7 months ago

Post

2794

We posted new SOTA SambaLingo 70B parameter models for Arabic, Thai and Hungarian!

Check out the models here sambanovasystems/sambalingo-65e25770f2037c85ad35ca77

and our paper
https://arxiv.org/abs/2404.05829

Zoltan Csaki

AI & ML interests

Recent Activity

Organizations

zolicsaki's activity