Yikang Shen

YikangS · yikangshen

YikangS's activity

liked a model about 1 month ago

ibm-granite/granite-3.0-8b-instruct

Text Generation • Updated Oct 23 • 50.8k • 180

upvoted a collection about 1 month ago

Granite 3.0 Language Models

A series of language models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated 24 days ago • 92

updated a model 2 months ago

ibm/PowerLM-3b

Text Generation • Updated Sep 16 • 13.8k • 17

updated a collection 3 months ago

Power-LM

Dense & MoE LLMs trained with power learning rate scheduler. • 4 items • Updated Oct 17 • 15

authored 4 papers 3 months ago

The infrastructure powering IBM's Gen AI model development

Paper • 2407.05467 • Published Jul 7 • 2

Scaling Granite Code Models to 128K Context

Paper • 2407.13739 • Published Jul 18 • 19

FlexAttention for Efficient High-Resolution Vision-Language Models

Paper • 2407.20228 • Published Jul 29 • 1

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23 • 22

upvoted a collection 3 months ago

Power-LM

Dense & MoE LLMs trained with power learning rate scheduler. • 4 items • Updated Oct 17 • 15

upvoted a paper 3 months ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23 • 22
