The two optimizations in the fastpath execution are:

1. fusion, which combines multiple sequential operations into a single "kernel" to reduce the number of computation steps
2. skipping the inherent sparsity of padding tokens to avoid unnecessary computation with nested tensors (see the sketch after this list)
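As a minimal sketch (assuming PyTorch 2.x), a nested tensor stores sequences of different lengths without materializing padding tokens, which is what lets the fastpath skip that wasted computation; the shapes below are illustrative only:

```python
import torch

# Two sequences of different lengths, hidden size 64 (illustrative shapes).
seq_a = torch.randn(5, 64)   # 5 tokens
seq_b = torch.randn(9, 64)   # 9 tokens

# A nested tensor keeps each sequence at its true length,
# so no padding positions exist to compute over.
batch = torch.nested.nested_tensor([seq_a, seq_b])
print(batch.is_nested)  # True
```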
BetterTransformer also converts all attention operations to use the more memory-efficient scaled dot product attention (SDPA).
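A minimal sketch of the SDPA primitive this maps onto, assuming PyTorch 2.x with illustrative tensor shapes:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: batch of 2, 8 heads, 16 tokens, head dimension 64.
query = torch.randn(2, 8, 16, 64)
key = torch.randn(2, 8, 16, 64)
value = torch.randn(2, 8, 16, 64)

# scaled_dot_product_attention computes softmax(QK^T / sqrt(d))V in one call
# and dispatches to a memory-efficient or flash kernel when one is available.
out = F.scaled_dot_product_attention(query, key, value, is_causal=False)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```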