Choosing the right modelling strategies for your AI applications

  • joydeepml2020
  • Dec 30, 2025
  • 3 min read

We are witnessing a rapid surge in the adoption of large language model (LLM)-based applications, commonly referred to as Generative AI. There is no doubt that LLMs are powerful technologies with tremendous potential to create competitive advantages in today’s business landscape. Organisations are already integrating them to align products with AI strategies or to enhance traditional offerings with AI-driven capabilities. The LLM revolution is fundamentally transforming natural language processing.

To gain a competitive edge, many organisations are actively developing LLM- and AI-powered applications, often relying on leading platforms such as Google Gemini, OpenAI GPT, Meta LLaMA, and Anthropic Claude. These models deliver exceptional performance and support a wide range of use cases. However, to ensure your AI strategy truly benefits your business, it is important to ask a critical question:

Do you really need these highly capable, multitask LLMs for your specific application?

A useful analogy is large-scale construction. A 30-story condominium project may require a cantilever tower crane — a highly powerful and specialised tool. But if you’re constructing a single-family home, do you still need such heavy machinery? Would it be practical — or financially sensible — to use the same crane for a small row-house project?

Likewise, AI should be viewed through the lens of fitness for purpose. The right strategy depends on application needs, budget, infrastructure constraints, organisational capability, and — most importantly — data availability.

Before choosing a modelling approach, organisations should thoughtfully evaluate:

  • Do we have sufficient unstructured or proprietary business data?

  • Do we possess the required technical expertise internally?

  • Do we have the infrastructure and engineering bandwidth to support development and operations?

  • What level of budget can we commit — both initially and on an ongoing basis?

  • What is our core objective — to build long-term AI capability or to enable targeted business outcomes such as automation or insights?

Based on these considerations, the optimal application development strategy should be selected. Some of the most common strategic approaches include:

1. API-Based Access to Hosted LLMs

In this approach, applications consume APIs from state-of-the-art hosted LLMs (such as GPT-4, GPT-3.5 Turbo, Gemini Pro, or Claude Sonnet). Responses are then wrapped with ethical AI controls, guardrails, validation layers, and business-specific formatting.

This approach enables rapid time-to-value, as most of the technical complexity is handled by the platform provider. It requires comparatively little in-house AI expertise, making it suitable for organisations that prioritise speed and ease of integration.
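To make this concrete, here is a minimal sketch of the pattern in Python using the OpenAI SDK (other providers work analogously). The model name, blocked-term list, and validation step are illustrative placeholders, not production-grade guardrails:

    # Minimal sketch: call a hosted LLM and wrap it in simple guardrails.
    # Assumes the openai package is installed and OPENAI_API_KEY is set.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    BLOCKED_TERMS = {"ssn", "password"}  # toy input guardrail for demonstration

    def answer(question: str) -> str:
        # Input validation before the request leaves our boundary.
        if any(term in question.lower() for term in BLOCKED_TERMS):
            return "Request rejected by input guardrail."

        response = client.chat.completions.create(
            model="gpt-4o-mini",  # any hosted chat model could be substituted
            messages=[
                {"role": "system", "content": "Answer concisely and factually."},
                {"role": "user", "content": question},
            ],
        )
        # Business-specific output checks and formatting would sit here.
        return response.choices[0].message.content.strip()

    print(answer("Summarise our refund policy in one sentence."))

The key design point is that every provider call passes through a single boundary where input checks, output validation, and business formatting live, so the underlying provider can be swapped without touching the rest of the application.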

2. Building LLMs from the Ground Up

Developing a custom LLM is a massive undertaking that demands substantial investment, deep research capability, large-scale datasets, and significant compute infrastructure. It is typically feasible only when there is a very strong, strategic business incentive.

Due to the high cost, risk, and complexity, this approach is largely limited to a handful of global technology leaders and research institutions.

3. Retrieval-Augmented Generation (RAG)

RAG offers a balanced middle ground. Instead of relying solely on the LLM’s internal knowledge, the model retrieves information from a local or domain-specific knowledge base to produce grounded, organisation-relevant responses — similar to taking an open-book exam.

This approach is highly effective when domain accuracy, compliance, and business context are critical. I will discuss the step-by-step process of building RAG pipelines from first principles in an upcoming article.
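To illustrate the mechanics ahead of that article, the sketch below implements the retrieve-then-generate core in Python with the sentence-transformers library. The tiny in-memory corpus, embedding model, and prompt template are illustrative assumptions; a real pipeline would add document chunking, a vector store, and source citation:

    # Minimal retrieve-then-generate sketch. Corpus, embedding model, and
    # prompt template are illustrative, not a production RAG pipeline.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    documents = [
        "Our warranty covers manufacturing defects for 24 months.",
        "Support hours are 9am-5pm GMT, Monday to Friday.",
        "Returns are accepted within 30 days with proof of purchase.",
    ]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vectors = embedder.encode(documents, normalize_embeddings=True)

    def retrieve(query: str, k: int = 2) -> list[str]:
        # Cosine similarity reduces to a dot product on normalised vectors.
        q = embedder.encode([query], normalize_embeddings=True)[0]
        scores = doc_vectors @ q
        top = np.argsort(scores)[::-1][:k]
        return [documents[i] for i in top]

    query = "How long is the warranty?"
    context = "\n".join(retrieve(query))

    # The grounded prompt an LLM (hosted or local) would then receive:
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    print(prompt)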

4. Fine-Tuning Pre-Trained Models

Fine-tuning adapts an existing LLM to perform better on domain-specific tasks by training it further on organisational data. Rather than training a model from scratch, existing parameters are refined to specialise the model’s behaviour.

Both supervised and reinforcement-based fine-tuning approaches exist. Modern techniques such as LoRA, QLoRA, and efficient frameworks like Unsloth significantly reduce cost and compute requirements — in some cases enabling fine-tuning on a single GPU.

Fine-tuning can be thought of as studying past exam papers before a test — focusing on relevant knowledge to perform better in a specific context. This approach works particularly well because of the low intrinsic dimensionality of large language models; see the research on intrinsic dimensionality for further background. We will explore different supervised fine-tuning techniques in subsequent articles.
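As a preview, here is a minimal LoRA setup using the Hugging Face transformers and peft libraries. The base checkpoint, rank, and target modules are illustrative assumptions chosen to show the shape of the configuration, not recommended values:

    # Minimal LoRA configuration sketch with transformers + peft.
    # Base model and hyperparameters are illustrative; tune for your
    # own model family and data.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # any causal LM works here
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)

    lora = LoraConfig(
        r=8,                    # low-rank update dimension
        lora_alpha=16,          # scaling factor for the adapter weights
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # attention projections only
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)

    # Only the small adapter matrices are trainable; the base weights
    # stay frozen.
    model.print_trainable_parameters()

Because only the low-rank adapter matrices are trained while the base weights stay frozen, memory requirements drop dramatically, which is what makes single-GPU fine-tuning feasible.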

5. Small Language Models (SLMs)

An emerging area of exploration is the development of small language models — typically ranging from roughly 10 million to a few hundred million parameters — capable of generating coherent and contextually relevant text.

Research such as “TinyStories: How Small Can Language Models Be and Still Speak Coherent English?” demonstrates the potential of compact models, particularly for use cases involving proprietary content such as market research reports, financial documents, or legal datasets.

For organisations with rich domain-specific knowledge, SLMs can provide a cost-efficient and highly differentiated foundation for AI capability — enabling deeper control, privacy, and strategic advantage.
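As a quick illustration, a published TinyStories checkpoint (roughly 33 million parameters) can be run with the Hugging Face pipeline API. The model ID below refers to one of the publicly released TinyStories models; an organisation would typically train or adapt its own compact model on proprietary data instead:

    # Generating text with a compact model via the transformers pipeline.
    # The TinyStories checkpoint is one published example of an SLM.
    from transformers import pipeline

    generator = pipeline("text-generation", model="roneneldan/TinyStories-33M")
    print(generator("Once upon a time", max_new_tokens=50)[0]["generated_text"])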

Conclusion

There is no one-size-fits-all AI strategy. Just as construction projects require different cranes depending on scale and purpose, AI applications demand modelling choices aligned with business goals, data readiness, budget, and capability maturity.

The organisations that succeed will be those that choose not the most powerful technology, but the most appropriate one.
