Maxi PRO

maxiw

AI & ML interests

Computer Agents | VLMs

maxiw's activity

Reacted to merve's post with 👀 1 day ago
Small yet mighty! 💫

We are releasing SmolVLM: a new small 2B vision language model made for on-device use, fine-tunable on a consumer GPU, and immensely memory efficient 🤠

We release three checkpoints under Apache 2.0: SmolVLM-Instruct, SmolVLM-Synthetic and SmolVLM-Base (collection: HuggingFaceTB/smolvlm-6740bd584b2dcbf51ecb1f39)

Learn more from our blog here: huggingface.co/blog/smolvlm
This release comes with a demo, fine-tuning code, MLX integration and TRL integration for DPO
Try the demo: HuggingFaceTB/SmolVLM
Fine-tuning Recipe: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
Also TRL integration for DPO 💗
Reacted to luigi12345's post with 👀 1 day ago
MinimalScrap
Only free dependencies. Save it. It is quite useful.


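# In a notebook, install the dependencies first:
#   !pip install googlesearch-python requests
from googlesearch import search
import requests

query = "Glaucoma"
# Search nih.gov for PDFs on the topic and download every direct .pdf link.
for url in search(f"{query} site:nih.gov filetype:pdf", num_results=20):
    if url.endswith(".pdf"):
        filename = url.split("/")[-1]
        response = requests.get(url, timeout=30)
        response.raise_for_status()  # fail on HTTP errors instead of saving an error page
        with open(filename, "wb") as f:
            f.write(response.content)
        print("✅ " + filename)
print("Done!")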
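
The snippet above names each file with `url.split("/")[-1]` and only accepts URLs that literally end in `.pdf`, so a link with a query string or fragment would be skipped or saved under an odd name. A minimal sketch of a more tolerant helper, using only the standard library, is shown here. The function name `pdf_filename_from_url` and the fallback name `download.pdf` are hypothetical and not part of the original post.

```python
import os
from urllib.parse import urlparse, unquote


def pdf_filename_from_url(url: str, default: str = "download.pdf") -> str:
    """Return a local filename for a PDF URL, ignoring query strings and fragments."""
    # urlparse separates the path from "?query" and "#fragment";
    # unquote decodes percent-escapes such as %20.
    path = unquote(urlparse(url).path)
    # Keep only the last path component.
    name = os.path.basename(path)
    # Fall back to a default when the path does not point at a .pdf file.
    if not name.lower().endswith(".pdf"):
        return default
    return name
```

In the loop above, this could replace both the `url.endswith(".pdf")` check and the `url.split("/")[-1]` expression, for example by skipping any URL for which the helper returns the default name.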

posted an update 1 day ago
You can now try out computer use models from the hub to automate your local machine with https://github.com/askui/vision-agent. 💻

import time
from askui import VisionAgent

with VisionAgent() as agent:
    # Open Google in the browser and give the page a moment to load.
    agent.tools.webbrowser.open_new("http://www.google.com")
    time.sleep(0.5)
    # Each click can use a different hub model to locate the described element.
    agent.click("search field in the center of the screen", model_name="Qwen/Qwen2-VL-7B-Instruct")
    agent.type("cats")
    agent.keyboard("enter")
    time.sleep(0.5)
    agent.click("text 'Images'", model_name="AskUI/PTA-1")
    time.sleep(0.5)
    agent.click("second cat image", model_name="OS-Copilot/OS-Atlas-Base-7B")


These models are currently integrated through the Gradio Spaces API. Local inference is planned soon!

Currently supported:
- Qwen/Qwen2-VL-7B-Instruct
- Qwen/Qwen2-VL-2B-Instruct
- AskUI/PTA-1
- OS-Copilot/OS-Atlas-Base-7B
Reacted to prithivMLmods's post with 🔥 2 days ago
HF Posts Receipts 🏆🚀

[ HF POSTS RECEIPT ] : prithivMLmods/HF-POSTS-RECEIPT

The one thing you need to remember is the 'username'.

And yeah, thank you, @maxiw, for creating the awesome dataset and sharing it here! 🙌

[ Dataset ] : maxiw/hf-posts

.
.
.
@prithivMLmods
New activity in onnx-community/Florence-2-base 3 days ago

Export code

#6 opened 5 months ago by Vasanth
New activity in OS-Copilot/ScreenSpot-v2 8 days ago

Dataset Format

#1 opened 8 days ago by maxiw