Article: Transformers v5: Simple model definitions powering the AI ecosystem • lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311
Article: Groq on Hugging Face Inference Providers 🔥 • benank-groq, hozen, celinah, Wauplin, sbrandeis • Jun 16, 2025 • 44
Article: Cohere on Hugging Face Inference Providers 🔥 • reach-vb, burtenshaw, merve, celinah, alexrs, julien-c, sbrandeis • Apr 16, 2025 • 129
deepseek-ai/DeepSeek-V3-0324 • Text Generation • 685B • Updated Mar 27, 2025 • 607k • 3.12k
The Ultra-Scale Playbook 🌌 • Running • 3.86k • The ultimate guide to training LLM on large GPU Clusters
Article: Finally, a Replacement for BERT: Introducing ModernBERT • bwarner, NohTow, bclavie, orionweller, ohallstrom, staghado, alexisgallagher, rbiswasfc, fladhak, tomaarsen, ncoop57, griffin, jph00, johnowhitaker, iacolippo • Dec 19, 2024 • 743