Loading Model weights with fastsafetensors

The fastsafetensors library loads model weights directly into GPU memory by leveraging GPUDirect Storage. See the project's GitHub repository for more details. To enable this feature, set the environment variable USE_FASTSAFETENSOR to true.
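As a minimal sketch of setting the variable from Python, the snippet below builds an environment with USE_FASTSAFETENSOR set to true and hands it to a child process. The helper names `env_with_fastsafetensors` and `child_sees_flag` are hypothetical, and the child here only echoes the variable back; in practice the command would be whatever process loads the model, which this document does not specify.

```python
import os
import subprocess
import sys


def env_with_fastsafetensors(base_env=None):
    """Return a copy of the environment with USE_FASTSAFETENSOR set to "true".

    Hypothetical helper for illustration; only the variable name and value
    come from the text above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["USE_FASTSAFETENSOR"] = "true"
    return env


def child_sees_flag():
    """Start a child process with the flag set and return what it observes.

    The child is a stand-in that prints the variable; replace the command
    with the process that actually loads the model weights.
    """
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('USE_FASTSAFETENSOR', ''))"],
        env=env_with_fastsafetensors(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees_flag())
```

The variable has to be present in the environment of the process that loads the weights, so it is set before that process starts rather than partway through loading. Exporting it in the shell before launching has the same effect.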
