Visualization of GPT-4o breaking away from the quality & speed trade-off curve that LLMs have followed thus far
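For readers who want to build this kind of chart themselves, here is a minimal matplotlib sketch of a quality-vs-throughput scatter. The model names are real, but every number below is an illustrative placeholder, not the measured benchmark data behind the actual visualization.

```python
import matplotlib.pyplot as plt

# Placeholder (model, quality index, tokens/s) points for illustration only;
# the real chart uses measured benchmark values.
models = [
    ("GPT-4 Turbo", 94, 20),
    ("Claude 3 Opus", 93, 25),
    ("Llama 3 70B", 83, 50),
    ("Mixtral 8x7B", 61, 90),
    ("Claude 3 Haiku", 74, 115),
    ("GPT-4o", 100, 110),  # off the curve: high quality AND high speed
]

fig, ax = plt.subplots(figsize=(7, 5))
for name, quality, speed in models:
    ax.scatter(speed, quality)
    ax.annotate(name, (speed, quality), textcoords="offset points", xytext=(5, 5))

ax.set_xlabel("Throughput (tokens/s, median across providers)")
ax.set_ylabel("Quality index")
ax.set_title("Quality vs. speed: GPT-4o breaks from the trade-off curve")
plt.tight_layout()
plt.show()
```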
Key GPT-4o takeaways:
‣ GPT-4o not only offers the highest quality, it also sits among the fastest LLMs
‣ For speed/latency-sensitive use cases, where Claude 3 Haiku or Mixtral 8x7B were previously the leaders, GPT-4o is now a compelling option (though significantly more expensive)
‣ Previously, Groq was the only provider to break from the curve, using its own LPU chips. OpenAI has done it on Nvidia hardware (one can imagine the potential for GPT-4o on Groq)
How did they do it? Will follow up with more analysis, but potential approaches include a very large but sparse MoE model (similar to Snowflake's Arctic) and improvements in data quality (likely a major driver of Llama 3's impressive quality relative to parameter count). See the sketch below for what "sparse MoE" means in practice.
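To make the "very large but sparse MoE" idea concrete, here is a minimal top-k routed mixture-of-experts layer in PyTorch: total parameters scale with the number of experts, but each token only pays the compute cost of k of them. This is an illustrative sketch under assumed dimensions and a simple top-2 router, not OpenAI's (undisclosed) architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Top-k routed MoE layer: parameter count grows with num_experts,
    while per-token compute stays at k experts' worth."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=64, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1) # each token picks its k experts
        weights = F.softmax(weights, dim=-1)       # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            # Run each selected expert only on the tokens routed to it.
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](x[mask])
        return out

x = torch.randn(10, 512)
y = SparseMoE()(x)  # only 2 of 64 experts run per token
```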
Notes: Throughput represents the median across providers over the last 14 days of measurements (8x per day)
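For clarity on how such a metric could be computed, here is a small sketch assuming per-provider medians are taken first, then the median across providers. Provider names and throughput values are invented placeholders, not real measurements.

```python
from statistics import median
import random

random.seed(0)

# Hypothetical data: 14 days x 8 measurements/day of throughput (tokens/s)
# per provider. Names and values are placeholders, not real data.
providers = {
    name: [random.gauss(mu, 5.0) for _ in range(14 * 8)]
    for name, mu in [("provider_a", 105), ("provider_b", 98), ("provider_c", 112)]
}

# Median within each provider's 112 samples, then the median across providers.
per_provider = {name: median(samples) for name, samples in providers.items()}
overall = median(per_provider.values())
print(f"Median throughput across providers: {overall:.1f} tokens/s")
```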