Why small models work: They're faster and cheaper than large models and can run on consumer hardware (laptops, or even Raspberry Pi-class devices with optimisations such as quantisation), while giving acceptable quality for narrow, well-defined tasks. Pairing them with a vector store (e.g., Chroma, Weaviate, Milvus) for retrieval-augmented generation (RAG) can make them considerably more useful without increasing model size.
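To make the retrieval idea concrete, here is a minimal sketch of the RAG pattern in plain Python. It is an illustration under stated assumptions, not how any particular vector store works: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list `DOCS` stands in for a store such as Chroma, and every name and example document is hypothetical.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: a real setup would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_rag_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Put the retrieved passages in front of the question for the small model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Hypothetical local knowledge base, invented for illustration.
DOCS = [
    "Pump P-101 must be isolated before replacing the seal.",
    "The canteen opens at 08:00 on weekdays.",
    "Seal replacement on pumps requires a torque wrench.",
]

if __name__ == "__main__":
    print(build_rag_prompt("How do I replace a pump seal?", DOCS))
```

The resulting prompt would then be passed to the small model. The model stays the same size; only the context it sees changes.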
Use cases
Text Processing & Automation
- Template filling – e.g., generating structured responses, filling in report fields.
- Summarisation – condensing meeting transcripts or local documents without sending data to the cloud.
- Classification – tagging or categorising requests, tickets, or files.
- Text cleaning – grammar correction, standardising language for logs or reports.
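As one example of these tasks, classification with a small model usually comes down to a constrained prompt plus defensive parsing of the output. The sketch below assumes a hypothetical label set and a `generate` callable that wraps whatever local model is in use; `fake_generate` is a stand-in so the example runs without a model.

```python
from typing import Callable

# Hypothetical label set; replace with the categories your tickets actually use.
LABELS = ["billing", "outage", "access", "other"]

def build_classification_prompt(ticket: str) -> str:
    """Ask for exactly one label from a fixed list to keep the task narrow."""
    options = ", ".join(LABELS)
    return (
        f"Classify the ticket into exactly one of: {options}.\n"
        f"Ticket: {ticket}\n"
        "Label:"
    )

def parse_label(raw: str) -> str:
    """Map free-form model output onto a known label, falling back to 'other'."""
    cleaned = raw.strip().lower()
    for label in LABELS:
        if cleaned.startswith(label):
            return label
    return "other"

def classify(ticket: str, generate: Callable[[str], str]) -> str:
    """`generate` is any function that sends a prompt to a local model and returns its text."""
    return parse_label(generate(build_classification_prompt(ticket)))

def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; always answers as if it saw an outage ticket."""
    return " Outage.\n"

if __name__ == "__main__":
    print(classify("The VPN has been down since 9am.", fake_generate))
```

Restricting the output to a short fixed vocabulary and falling back to a default label keeps downstream automation predictable even when a small model drifts off format.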
Domain-Specific Models
- Fine-tune a small LLM for:
  - Industry jargon translation (e.g., maintenance logs → plain English).
  - Technical troubleshooting guides.
  - Incident classification in operations or engineering.
- Fine-tuned models work well when paired with RAG (Retrieval-Augmented Generation) from a local knowledge base.
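Fine-tuning for a task like jargon translation typically starts with a file of input/output pairs. The sketch below writes such pairs as JSON Lines. The `prompt` and `completion` field names and the instruction text are assumptions, since the expected format depends on the fine-tuning tool, and the example pair is invented for illustration.

```python
import json

# Hypothetical (jargon, plain English) pair, invented for illustration.
PAIRS = [
    (
        "Repl. brg on P-101, realigned cplg.",
        "Replaced the bearing on pump P-101 and realigned the coupling.",
    ),
]

def to_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialise pairs as one JSON object per line (field names are an assumption)."""
    lines = []
    for source, target in pairs:
        record = {
            "prompt": f"Rewrite in plain English: {source}",
            "completion": target,
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Print the dataset; in practice it would be written to a .jsonl file.
    print(to_jsonl(PAIRS), end="")
```

Using the same instruction wording in the training data and at inference time gives a small model less to generalise over.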
Edge & Offline Scenarios
- Field work in remote areas (e.g., utilities, scientific expeditions).
- IoT devices with natural language interfaces.
- Portable knowledge assistants for technicians, inspectors, or surveyors.
Educational & Training Tools
- Interactive Q&A tutors for company onboarding.
- Scenario-based training simulations where the model plays a role.