	Sharding strategy
	FSDP offers a number of sharding strategies to select from:

	FULL_SHARD - shards model parameters, gradients and optimizer states across workers; select 1 for this option
	SHARD_GRAD_OP - shards gradients and optimizer states across workers; select 2 for this option
	NO_SHARD - don't shard anything (this is equivalent to DDP); select 3 for this option
	HYBRID_SHARD - shards model parameters, gradients and optimizer states within each node, while each node holds a full copy of the model; select 4 for this option
	HYBRID_SHARD_ZERO2 - shards gradients and optimizer states within each node, while each node holds a full copy of the model; select 5 for this option

	The strategy is selected with the fsdp_sharding_strategy flag.
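
	As a rough illustration of the numbering above, the sketch below maps a selection number or a strategy name to the strategy name. It is not part of any library: `SHARDING_STRATEGIES`, `resolve_sharding_strategy` and the `fsdp_config` dictionary are hypothetical names used only for this example, and only the `fsdp_sharding_strategy` key is taken from the text.

```python
# Selection numbers and strategy names exactly as listed above.
SHARDING_STRATEGIES = {
    1: "FULL_SHARD",
    2: "SHARD_GRAD_OP",
    3: "NO_SHARD",
    4: "HYBRID_SHARD",
    5: "HYBRID_SHARD_ZERO2",
}


def resolve_sharding_strategy(choice):
    """Hypothetical helper: accept a selection number or a name, return the name."""
    text = str(choice).strip()
    if text.isdigit():
        number = int(text)
        if number not in SHARDING_STRATEGIES:
            raise ValueError(f"unknown sharding strategy number: {number}")
        return SHARDING_STRATEGIES[number]
    name = text.upper()
    if name not in SHARDING_STRATEGIES.values():
        raise ValueError(f"unknown sharding strategy name: {choice!r}")
    return name


# Hypothetical config fragment (assumption): a plain dict holding the flag value.
fsdp_config = {"fsdp_sharding_strategy": 1}
selected = resolve_sharding_strategy(fsdp_config["fsdp_sharding_strategy"])
print(selected)
```

	Accepting both forms is a convenience for this sketch only; whether a given tool expects the number or the name for fsdp_sharding_strategy may depend on its version, so check its documentation.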