TIL - How to run Hugging Face models with Ollama
You can now run any GGUF model from the Hugging Face Hub with Ollama, with a single command!
You can use any GGUF quants created by the community on Hugging Face directly with Ollama, without needing to create a new Modelfile. It works like a charm with all llama.cpp-compatible models, at any size from 0.1B up to 405B parameters.
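Per Ollama's Hugging Face integration, the command follows this general pattern (the {username}, {repository}, and {quantization} placeholders here are just illustrative):

ollama run hf.co/{username}/{repository}:{quantization}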
Simply filter for GGUF models on the Hub, select the quant type that fits your needs, and you're done:
ollama run hf.co/bartowski/Qwen2.5.1-Coder-7B-Instruct-GGUF:Q5_K_L
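If you leave off the quantization tag, Ollama picks a default for you (Q4_K_M when it's present in the repo, otherwise a reasonable quant it finds there):

ollama run hf.co/bartowski/Qwen2.5.1-Coder-7B-Instruct-GGUF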