Mohamed Salama

Salama1429

AI & ML interests

NLP

Organizations

Social Post Explorers, Hugging Face Discord Community

Salama1429's activity

Reacted to their post with 👀🤝👍🧠🤗❤️🚀😎🔥 6 months ago
📺 Introducing the YouTube-Commons Dataset 📺

🌐 Overview: The YouTube Commons Dataset is a comprehensive collection of 30 billion words from 15,112,121 original and automatically translated transcripts, drawn from 2,063,066 videos on YouTube.

🔗 License: All videos are shared under the CC-BY license, with the majority (71%) in English.

🤖 Applications: The dataset is well suited to training speech-to-text (ASR) and translation models.

📊 Utilization: The text can be used for model training and is republishable for reproducibility purposes.

🤝 Collaboration: This dataset is the result of a collaboration between the state start-up LANGU:IA, the French Ministry of Culture, and DINUM. It will be expanded in the coming months.

🔗 Explore the dataset here: https://lnkd.in/d_paWKFE

#YouTubeCommons #AIResearch #MachineLearning #OpenData #ArtificialIntelligence #NLP #Dataset #TechCollaboration #Innovation #DigitalTransformation
Reacted to their post with 🧠🤝👍🔥❤️😎🤗 6 months ago
Cohere's Aya 8B & 35B 🔥
> Multilingual (23 languages), beats Mistral 7B and Llama3 8B in preference, open weights.

Capabilities:

🌍 **Multilingual Mastery**: Supports 23 languages, including Arabic!

🏆 **Top Performer**: Outperforms Mistral 7B and Llama3 8B in user preference.

🔍 **Open Weights**: Open weights are available for your research and projects.

🔗 **License**: CC-BY-NC, with adherence to C4AI's Acceptable Use Policy.

💼 **Developed by**: Cohere For AI and Cohere.

Check out Aya 23 on Hugging Face; the link is in the comments.

#AI #MachineLearning #NLP #Multilingual #Arabic #TechInnovation #OpenSource #CohereAI #AyaModel
Reacted to their post with 🧠 6 months ago
Loving the new ChatGPT Mac app.

You can now turn a drawing into a working app in less than a minute!