Friedrich Marty
Smorty100's activity
Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, JSON + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411
Allen AI Tülu 70B & 8B - competitive with Claude 3.5 Haiku, beats all major open models like Llama 3.1 70B, Qwen 2.5 and Nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372
LLaVA-o1 - VLM capable of spontaneous, systematic reasoning, similar to GPT-o1; the 11B model outperforms gemini-1.5-pro, gpt-4o-mini, and llama-3.2-90B-vision
Xkev/Llama-3.2V-11B-cot
Black Forest Labs Flux.1 Tools - four new state-of-the-art model checkpoints & 2 adapters for Fill, Depth, Canny & Redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817
Jina AI Jina CLIP v2 - general-purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, Matryoshka representations (1024 down to 64 dims; a truncation sketch follows this list)
jinaai/jina-clip-v2
Apple AIM v2 & CoreML MobileCLIP - large-scale vision encoders that outperform CLIP and SigLIP, plus CoreML-optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip
A lot more got released, like OpenScholar (OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), SmolTalk (HuggingFaceTB/smoltalk), Hymba (nvidia/hymba-673c35516c12c4b98b5e845f), the Open ASR Leaderboard (hf-audio/open_asr_leaderboard) and much more!
Can't wait for next week! 🤗
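Since the Jina CLIP v2 entry above mentions Matryoshka representations, here is a minimal numpy sketch of what those allow: truncating the 1024-dim embedding to a short prefix and re-normalizing. The random vector is only a stand-in for a real jina-clip-v2 embedding, not actual model output.

```python
import numpy as np

# Stand-in for a real 1024-dim jina-clip-v2 embedding (illustrative only).
embedding = np.random.randn(1024).astype(np.float32)

def truncate(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    head = vec[:dim]
    return head / np.linalg.norm(head)

small = truncate(embedding, 64)  # 16x smaller vectors for cheaper search
print(small.shape)               # (64,)
```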
@gabrycina @calebgcc and I competed with 200+ participants and 50+ teams in a 24-hour sprint centered around hacking for impact! We focused on applications of robotics for people in need of assisted living, aiming to enable greater autonomy and accessibility of robotics in everyday life.
Complete list of assets 👇
🤖 trained robotics policies
v1:
- fracapuano/moss-pills
- fracapuano/moss-cup
v2:
- fracapuano/meta-grasp
🤗 datasets
v1:
- fracapuano/pills
- fracapuano/cup
v2:
- fracapuano/cupim
You can find a live demo of our submission at: https://x.com/_fracapuano/status/1858102728691458554
If you want to know more about how we collected 100GB+ of data, trained multiple RL policies using @lerobot, and used Llama-3.2 models to handle user interactions and switch between tasks, go ahead and have a look! Also, don't be a stranger, and reach out 🦾
Our project is fully open-source, for the community (and ourselves 👨‍🍳) to build on! A huge thank you to @cadene for the help (and the robot 🤭), truly feeling these hugs vibes 🤗, and to @thomwolf and @clem for sharing our work around
Little extra:
➡️ Our 🧠 EEG-wave 🧠 based control of the 🦾 robotic arm 🦾
[MODELS] Discussion
Shouldn't the takeaway from scaling laws be mostly negative?
The fact that scaling compute by so much improves output quality by so little seems unintuitive.
One could argue that this is still positive, as there is still room to grow, but I find it much more exciting to see some new training technique in action, or good results from training with less compute.
📈 What are scaling laws? These are empirical laws that say "every time you increase the compute spent in training 10-fold, your LLM's performance will go up by a predictable tick". Of course, they apply only if you train your model with the right methods.
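To make the "predictable tick" concrete, here is a toy sketch assuming the commonly used power-law form loss(C) = L_inf + a * C^(-b); the constants below are made up for illustration, not taken from any paper:

```python
# Toy scaling-law curve: loss(C) = L_INF + A * C**(-B).
# All constants are illustrative, not fitted to real training runs.
L_INF, A, B = 1.7, 2.0, 0.05

def loss(compute_flops: float) -> float:
    """Hypothetical pretraining loss as a function of training compute."""
    return L_INF + A * compute_flops ** (-B)

for exp in range(20, 26):  # sweep compute from 1e20 to 1e25 FLOPs
    print(f"compute 1e{exp}: loss {loss(10.0 ** exp):.3f}")
```

Each 10x of compute multiplies the reducible part of the loss by the same constant factor (10^-0.05 ≈ 0.89 here): a predictable but small tick, which is why the gains per dollar can look underwhelming even while the law keeps holding.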
The image below illustrates this: it's from a Google paper, "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation", and it shows how quality and instruction-following improve as you scale the model up (which is equivalent to scaling up the compute spent in training).
➡️ These scaling laws have immense impact: they triggered the largest gold rush ever, with companies pouring billions into scaling up their training. Microsoft and OpenAI are pouring $100B into their "Stargate" mega training cluster, due to start running in 2028.
🤔 So, what about these reports of scaling laws slowing down?
If they are true, they would mean a gigantic paradigm shift, as the hundreds of billions poured by AI companies into scaling could be a dead-end. ⚠️
But I doubt it: until the most recent publications, scaling laws showed no signs of weakness, and the researchers at the higher end of the scale-up seem to imply that the scaling continues.
Wait and see!
This is not really a surprise.
Generations from big providers are somehow not as restricted as one would expect them to be.
Corporations tend to have way more money than open-source projects, which can lead to better performance. They also tend to have all the big GPUs, so I think this just makes sense.
If they (as in, the big tech companies) wanted to make generations safer, they would probably pass the prompt through a safety LLM first (a toy sketch follows below).
Most open source models are also tailored to local use "at home", meaning their sizes are usually on the smaller side.
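To make the safety-LLM idea concrete, here is a toy sketch of such a two-stage pipeline. Both "models" are trivial stand-ins (a keyword check and a stub generator), not any provider's actual API:

```python
# Toy two-stage pipeline: a guard model screens the prompt before
# the main model ever sees it.

BLOCKLIST = ("how to build a bomb",)  # a real guard would be an LLM classifier

def guard(prompt: str) -> bool:
    """Stand-in for a safety LLM that labels prompts safe/unsafe."""
    return not any(bad in prompt.lower() for bad in BLOCKLIST)

def main_model(prompt: str) -> str:
    """Stand-in for the actual generation model."""
    return f"(answer to: {prompt})"

def answer(prompt: str) -> str:
    if not guard(prompt):
        return "Sorry, I can't help with that."
    return main_model(prompt)

print(answer("Write a haiku about GPUs"))
```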
I'm sure this is nothing new, but I'll share it anyway.
I found it very useful to make Qwen come up with a general plan for a program and then have it ask me some questions and suggest features I might not have thought about.
I instruct it to respond in a JSON format for that, so I can parse the answer and display it in an interface. Here's an example of a prompt I like to use:
You are a coworker at HumbleBees, a small application development company.
{program_instruction}
Write this in python using these libraries
{installed_libraries}
Before you do that though, ask me about some features I might have forgotten to mention, and ask me questions about the details of how the program should work.
Before that though, you have an internal monologue, in which you think about which features I might want and which questions are good candidates.
Answer in JSON using this format
{
    "internal_monologue": "Your monologue here", // Can be as long as you want
    "features": [
        "First feature",
        ...
    ],
    "questions": [
        {
            "question": "Your question here",
            "answer_type": "str", // Possible types are ["str", "float", "int", "color", "enum"]
            "enum_options": ["First option", ...] // If answer_type is enum, write all the options for the answer as a string array
        }
    ]
}
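For the parsing side, here is a minimal Python sketch of how such a response can be consumed. The sample completion is made up, and `parse_plan` is a hypothetical helper, not part of any library; a real setup would also validate the schema and handle malformed output:

```python
import json
import re

def parse_plan(raw: str) -> dict:
    """Extract the first JSON object from a model completion."""
    # Models often wrap JSON in a fence or surrounding prose, so grab
    # everything between the first '{' and the last '}'.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Made-up completion standing in for real Qwen output:
raw = """{
  "internal_monologue": "A timer app probably needs presets and sounds...",
  "features": ["Pause/resume", "Sound on finish"],
  "questions": [
    {"question": "Which accent color?", "answer_type": "color"},
    {"question": "Max timer length in minutes?", "answer_type": "int"}
  ]
}"""

plan = parse_plan(raw)
for feature in plan["features"]:
    print("suggested feature:", feature)
for q in plan["questions"]:
    print(f'{q["question"]} (expects {q["answer_type"]})')
```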
[FEEDBACK] Inference Playground
New Copy Paste System Problems.
CohereForAI/aya-expanse-8b
CohereForAI/aya-expanse-32b
Try them out here
CohereForAI/aya_expanse