Udayan Kumar My public notes

Small Models, Big Savings!

TL;DR: For narrow, repetitive NLP tasks at high volume, a small fine-tuned model can be much cheaper than a frontier LLM (a 10x cost reduction is possible). The savings can be large, but mostly when the task is constrained and frequent enough to justify the extra engineering work.