Friedrich Marty

Smorty100

AI & ML interests

I'm most interested in content rerouting between LLM and VLM agents for automation possibilities. Using templates for each agent, which are then filled in by another agent's inputs, seems really useful.
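A minimal sketch of that template-filling idea, with the agent calls stubbed out (all names here are hypothetical, not from any real framework):

```python
from string import Template

# Hypothetical two-stage pipeline: a "planner" agent's output fills the
# prompt template of a "worker" agent before the worker is invoked.
WORKER_TEMPLATE = Template(
    "You are a $role. Using the plan below, produce the final answer.\n"
    "Plan:\n$plan"
)

def planner_agent(task: str) -> str:
    # Stand-in for a real LLM call that drafts a plan for the task.
    return f"1. Analyse the task: {task}\n2. Draft a solution\n3. Review it"

def build_worker_prompt(task: str, role: str = "Python developer") -> str:
    plan = planner_agent(task)
    return WORKER_TEMPLATE.substitute(role=role, plan=plan)

print(build_worker_prompt("sort a CSV file by date"))
```

The same pattern chains further: the worker's output can in turn fill the template of a reviewer agent, and so on.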

Recent Activity

liked a model 3 days ago
Qwen/QwQ-32B-Preview
liked a Space 3 days ago
PR-Puppets/PR-Puppet-Sora
Reacted to reach-vb's post with ❤️ 6 days ago

Organizations

None yet

Smorty100's activity

Reacted to reach-vb's post with ❤️ 6 days ago
Massive week for Open AI/ML:

Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, JSON + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411

Allen AI Tülu 70B & 8B - competitive with Claude 3.5 Haiku, beats all major open models like Llama 3.1 70B, Qwen 2.5 and Nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372

LLaVA-o1 - VLM capable of spontaneous, systematic reasoning, similar to GPT-o1; the 11B model outperforms gemini-1.5-pro, gpt-4o-mini, and llama-3.2-90B-vision
Xkev/Llama-3.2V-11B-cot

Black Forest Labs FLUX.1 Tools - four new state-of-the-art model checkpoints & 2 adapters for fill, depth, canny & redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817

Jina AI Jina CLIP v2 - general-purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, Matryoshka representations (1024 down to 64)
jinaai/jina-clip-v2

Apple AIM v2 & CoreML MobileCLIP - large-scale vision encoders that outperform CLIP and SigLIP, plus CoreML-optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip

A lot more got released, like OpenScholar (OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), smoltalk (HuggingFaceTB/smoltalk), Hymba (nvidia/hymba-673c35516c12c4b98b5e845f), the Open ASR Leaderboard (hf-audio/open_asr_leaderboard) and much more.

Can't wait for the next week! 🤗
Reacted to fracapuano's post with ❤️👍 11 days ago
Sharing what we built over the course of the weekend at the @llamameta hackathon, by Cerebral Valley in London 🇬🇧 👇

@gabrycina @calebgcc and I competed with 200+ participants and 50+ teams in a 24-hour sprint centered around hacking for impact! We focused on applications of robotics for those in need of assisted living, aiming to enable greater autonomy and accessibility of robotics in everyday life.

complete list of assets 👇
🤗 trained robotics policies
v1:
- fracapuano/moss-pills
- fracapuano/moss-cup
v2:
- fracapuano/meta-grasp

🤗 datasets
v1:
- fracapuano/pills
- fracapuano/cup
v2:
- fracapuano/cupim


You can find a live demo of our submission at: https://x.com/_fracapuano/status/1858102728691458554

If you want to know more about how we collected 100GB+ of data, trained multiple RL policies using @lerobot, and used Llama-3.2 models to handle user interactions and switch between tasks, go ahead and have a look! Also, don't be a stranger, and reach out 🦾

Our project is fully open-source, for the community (and ourselves 👨‍🍳) to build on! A huge thank you to @cadene for the help (and the robot 🤭) - truly feeling these hugs vibes 🤗 - and to @thomwolf and @clem for sharing our work around

Little extra:
➡️ Our 🧠 EEG-wave 🧠 based control of the 🦾 robotic arm 🦾
New activity in huggingchat/chat-ui 16 days ago

[MODELS] Discussion

#372 opened 9 months ago by victor
liked a Space 16 days ago
replied to m-ric's post 18 days ago

Shouldn't the takeaway from scaling laws be mostly negative?
The fact that scaling compute so much improves output quality by so little seems unintuitive.

One could argue that this is still positive, as there is still room to grow, but I find it much more exciting to see some new training technique in action, or good results from training on smaller compute budgets.

Reacted to m-ric's post with 🔥 18 days ago
๐—”๐—ฟ๐—ฒ ๐˜€๐—ฐ๐—ฎ๐—น๐—ถ๐—ป๐—ด ๐—น๐—ฎ๐˜„๐˜€ ๐—ผ๐˜ƒ๐—ฒ๐—ฟ? ๐—” ๐—ฟ๐—ฒ๐—ฝ๐—ผ๐—ฟ๐˜ ๐—ณ๐—ฟ๐—ผ๐—บ ๐˜๐—ต๐—ฒ ๐—œ๐—ป๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐—ถ๐—ผ๐—ป ๐—ฎ๐—ป๐—ป๐—ผ๐˜‚๐—ป๐—ฐ๐—ฒ๐—ฑ ๐˜๐—ต๐—ฎ๐˜ ๐—ข๐—ฝ๐—ฒ๐—ป๐—”๐—œ ๐—ถ๐˜€ ๐˜€๐—ฒ๐—ฒ๐—ถ๐—ป๐—ด ๐—ฑ๐—ถ๐—บ๐—ถ๐—ป๐—ถ๐˜€๐—ต๐—ถ๐—ป๐—ด ๐—ฟ๐—ฒ๐˜๐˜‚๐—ฟ๐—ป๐˜€ ๐—ณ๐—ฟ๐—ผ๐—บ ๐˜€๐—ฐ๐—ฎ๐—น๐—ถ๐—ป๐—ด ๐˜‚๐—ฝ ๐˜๐—ต๐—ฒ ๐—ป๐—ฒ๐˜…๐˜ ๐—š๐—ฃ๐—ง ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น๐˜€.

📊 What are scaling laws? These are empirical laws that say "Every time you increase the compute spent in training 10-fold, your LLM's performance will go up by a predictable tick". Of course, they apply only if you train your model with the right methods.

The image below illustrates this: the curves are from a paper by Google, "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation", and they show how the quality and instruction-following of models improve as you scale the model up (which is equivalent to scaling up the compute spent in training).
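The "predictable tick per decade of compute" can be made concrete with a toy power-law curve (the constants below are made up for illustration, not taken from any paper):

```python
# Illustrative power-law scaling curve: loss(C) = a * C**(-b).
# With exponent b, every 10x increase in compute C multiplies the
# loss by the same constant factor 10**(-b) - the "predictable tick".
def predicted_loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    return a * compute ** (-b)

# The improvement ratio per decade of compute is constant:
ratio_1 = predicted_loss(1e22) / predicted_loss(1e21)
ratio_2 = predicted_loss(1e23) / predicted_loss(1e22)
print(round(ratio_1, 4), round(ratio_2, 4))  # both ≈ 0.8913
```

This also makes the "diminishing returns" debate concrete: a constant multiplicative tick means each extra decade of compute buys a smaller absolute loss reduction than the previous one.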

โžก๏ธ These scaling laws have immense impact: they triggered the largest gold rush ever, with companies pouring billions into scaling up theiur training. Microsoft and OpenAI spent 100B into their "Startgate" mega training cluster, due to start running in 2028.

🤔 So, what about these reports of scaling laws slowing down?

If they are true, they would mean a gigantic paradigm shift, as the hundreds of billions poured by AI companies into scaling could be a dead end. ⛔️

But I doubt it: up until the most recent publications, scaling laws showed no signs of weakness, and the researchers at the higher end of the scale-up seem to imply that the scaling up continues.

Wait and see!
replied to fdaudens's post 18 days ago

This is not really a surprise.
Generations from big providers are somehow not as restricted as one would expect them to be.
Corporations tend to have way more money than open-source projects, which can lead to better performance. They also tend to have all the big GPUs, so I think this just makes sense.

If they (as in, the big tech companies) wanted to make generations safer, they would probably pass the prompt through a safety LLM.

Most open-source models are also tailored to local use "at home", meaning their sizes are usually on the smaller side.
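The "safety LLM" gating mentioned above could be sketched as follows; here the safety model is stubbed with a keyword check, and both function names are hypothetical (a real deployment would call an actual moderation model):

```python
# Hypothetical prompt-gating pipeline: a second "safety" model screens
# the user prompt before the main model ever sees it.
BLOCKED_TOPICS = ("weapon", "malware")

def safety_model(prompt: str) -> bool:
    # Stand-in for an LLM classifier; returns True when the prompt is safe.
    return not any(topic in prompt.lower() for topic in BLOCKED_TOPICS)

def guarded_generate(prompt: str) -> str:
    if not safety_model(prompt):
        return "Request refused by safety filter."
    # Stand-in for the actual main-model generation call.
    return f"[main model answers: {prompt!r}]"

print(guarded_generate("how do I bake bread?"))
```

The same gate can also be applied a second time to the model's output before it is returned to the user.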

Reacted to alielfilali01's post with ➕ 18 days ago
Unpopular opinion: o1-preview is more stupid than 4o, and Qwen2.5-72B-Instruct is extremely underrated!
replied to fdaudens's post 18 days ago

I'm sure this is nothing new, but I'll share it anyway.

I found it very useful to have Qwen come up with a general plan for a program, then ask me some questions and suggest features I might not have thought about.
I instruct it to respond in a kind of JSON format so I can parse that and display it in an interface. Here's an example of a prompt I like to use:

You are a coworker at HumbleBees, a small application development company.
{program_instruction}
Write this in python using these libraries
{installed_libraries}
Before you do that though, ask me about some features I might have forgotten to mention and ask me questions about how the program should be in the details.
Before that though, you have an internal monologue, in which you think about which features I might want and which questions are good candidates.
Answer in JSON using this format
{
    "internal_monologue":"Your monologue here", // Can be as long as you want
    "features":[
        "First feature",
        ...
    ],
    "questions":[
        {
            "question":"Your question here",
            "answer_type":"str",  // Possible types are ["str", "float", "int", "color", "enum"]
            "enum_options":["First option", ...] // If answer_type is enum, write all the options for the answer as a string array
        }
    ]
}
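A possible way to parse replies in that format on the interface side (the example reply string below is invented for illustration):

```python
import json

ALLOWED_TYPES = {"str", "float", "int", "color", "enum"}

def parse_reply(raw: str) -> dict:
    """Parse the model's JSON reply and validate each question's answer_type."""
    data = json.loads(raw)
    for q in data["questions"]:
        if q["answer_type"] not in ALLOWED_TYPES:
            raise ValueError(f"unknown answer_type: {q['answer_type']}")
    return data

# Made-up example of a reply following the format above.
reply = """{
  "internal_monologue": "The user likely wants colour options.",
  "features": ["Dark mode", "CSV export"],
  "questions": [
    {"question": "Which accent colour?", "answer_type": "color"},
    {"question": "Max rows?", "answer_type": "int"}
  ]
}"""

parsed = parse_reply(reply)
print(len(parsed["questions"]))  # 2
```

Note that real model output sometimes includes `//` comments as in the prompt's format spec; those are not valid JSON, so stripping them (or asking the model to omit them) may be needed before `json.loads`.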
Reacted to fuzzy-mittenz's post with 😔 22 days ago
So... finally getting a pitch kit ready for Intelligent Estate, a DC-MD-VA-based AI firm offering air-gapped, on-site secure services for small businesses and families, and looking for motivated team members. Other businesses under the holdings can also make use of scientific/mathematics or sales skills. Virtual or flexible positions are available, and great people from all walks of life are welcome to apply at intelligentestate@gmail.com or join the Intelligent Estate group if you have any questions. Work on a contract or paid basis, with shares available as well for partners. The frontier is here and we're fighting to emancipate the power of AI. @fuzzy-mittenz
New activity in huggingface/inference-playground 24 days ago

[FEEDBACK] Inference Playground

#1 opened 2 months ago by victor
New activity in huggingchat/chat-ui 27 days ago

New Copy Paste System Problems.

#603 opened about 1 month ago by M-I-O-H
Reacted to Alignment-Lab-AI's post with 🧠 27 days ago
Remember boys and girls: always keep all your data, it's never a waste of time!
Reacted to Muhammadreza's post with 🤗 27 days ago
Hey guys.
This is my first post here on Hugging Face. I'm glad to be a part of this amazing community!
Reacted to ariG23498's post with 👀 about 1 month ago