When building products with Large Language Models (LLMs) or integrating them into your existing products, one of the first things you will have to do is pick a model. It’s useful to know what your options are and which one fits your use case, company resources, and product requirements.
Options
Here are three different options that we will further detail below:
Use a Hosted LLM - such as OpenAI GPT, Google Bard, or AWS Bedrock through their APIs
Use/Fine-Tune a pre-trained LLM* - choose a pre-trained LLM (like Meta LLaMA2). This option also allows you to improve the LLM with your domain-specific data through a process called fine-tuning
Create your own LLM* - develop and train it from scratch
* For the options marked with a star, you will need to operate the LLM in your own data center or on a public cloud provider such as AWS, Google Cloud, or Microsoft Azure, which is an important aspect to consider
1. Use a Hosted LLM
Overview
Hosted LLMs offer a convenient way to leverage powerful language models without managing infrastructure or training data and thus require a much smaller team. Popular hosted LLMs include OpenAI GPT, Google Bard, and AWS Bedrock. These services provide APIs that allow developers to quickly integrate LLM capabilities into their applications and get the fastest time to market with minimum total cost of ownership.
Simplified Architecture
AI Skills Required
This option requires the least AI skills, and the people filling these roles can be existing software developers on your team who grow and upskill into them:
Prompt Engineers: Responsible for integrating the hosted LLM's APIs and refining interactions with it by continuously monitoring and improving prompts and their performance.
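To make the integration work concrete, here is a minimal sketch of calling a hosted LLM over HTTP. The endpoint, model name, and `OPENAI_API_KEY` environment variable follow OpenAI's chat completions API; treat them as assumptions to adapt to whichever provider you pick:

```python
import json
import os
import urllib.request

# Assumed: OpenAI-style chat completions endpoint; other providers differ.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(system_prompt: str, user_message: str,
                  model: str = "gpt-3.5-turbo") -> dict:
    """Assemble the request body; this is where prompt engineering happens."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.2,  # lower = more deterministic answers
    }

def ask_hosted_llm(payload: dict) -> str:
    """POST the payload to the hosted API (requires an API key in the env)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Build a request without sending it -- the network call is ask_hosted_llm().
payload = build_payload("You are a concise support assistant.",
                        "How do I reset my password?")
```

A production integration would add retries, timeouts, and logging of prompts and responses so the prompt engineer can iterate on what works.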
2. Use/Fine-Tune a Pre-trained LLM
Overview
Pre-trained LLMs offer a balance between convenience and customization. Companies can select a pre-trained model that aligns with their task requirements and fine-tune it further if needed. Fine-tuning a pre-trained LLM is the AI equivalent of forking a software repo: it involves adjusting the model's parameters to better suit your specific use case and data.
Simplified Architecture
Note: If you do fine-tune, the architecture will look closer to Option 3's, but the expertise and costs required will be much lower.
AI Skills Required
The AI Skills required would be the ones from Option 1 plus the following:
DevOps/ML Ops: To manage the infrastructure required for fine-tuning and deploying the LLM, including cloud services.
Data Scientists/Machine Learning Engineers*: These professionals will need skills in machine learning to understand how to fine-tune the model with domain-specific data.
Data Analysts*: To help in understanding the domain-specific data, cleaning it, and preparing it for fine-tuning.
*The last two are only required if you want to fine-tune the pre-trained LLM with your domain-specific data
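To make the data-preparation roles concrete, here is a minimal sketch of turning raw domain records into the JSONL format commonly used for fine-tuning. The `prompt`/`completion` field names and the `question`/`answer` input schema are assumptions; the exact format depends on the model and training framework you choose:

```python
import json

def to_finetune_jsonl(records: list[dict]) -> str:
    """Convert raw Q&A records into one JSON object per line (JSONL).

    Skips records with missing fields -- a basic data-cleaning step
    of the kind data analysts would own.
    """
    lines = []
    for rec in records:
        question = (rec.get("question") or "").strip()
        answer = (rec.get("answer") or "").strip()
        if not question or not answer:
            continue  # flag/clean these rather than training on them
        lines.append(json.dumps({"prompt": question, "completion": answer}))
    return "\n".join(lines)

raw = [
    {"question": "What is our refund window?", "answer": "30 days from purchase."},
    {"question": "  ", "answer": "orphan answer"},  # dropped by cleaning
]
jsonl = to_finetune_jsonl(raw)
```

Garbage in, garbage out applies doubly to fine-tuning, which is why the data-focused roles above exist even when the modeling itself is off the shelf.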
3. Create your own LLM
Overview
Creating an LLM from scratch allows maximum customization and control over the model's architecture, training parameters, and training data for specific needs that cannot be met by pre-existing models. This approach demands significant capital, resources, unique datasets, and expertise in machine learning and natural language processing.
Simplified Architecture
AI Skills Required
The AI Skills required would be the ones from Option 2 with Fine-Tuning plus the following:
Research Scientists (NLP and Machine Learning): Deep expertise in NLP and machine learning is required to develop a language model from scratch. This includes knowledge of model architectures, training algorithms, and optimization techniques.
Data Engineers: To handle large-scale data processing and pipeline management, which is critical for training a custom LLM.
High-Performance Computing Specialists: Given the computational demands of training a custom LLM, specialists in high-performance computing are needed to manage and optimize compute resources.
Ethical AI and Bias Mitigation Experts: To ensure the model is developed responsibly, addressing issues of bias, fairness, and ethical use.
So which one is right for my Product?
The choice among these options depends on your specific requirements, resources, and the level of control you need over the language model, but here is some guidance for each option:
Use a Hosted LLM: This is the right option if you want to quickly integrate natural language processing capabilities into your product without the cost and time required for extensive training, infrastructure management, and hiring a dedicated AI team. It's convenient and suitable for a wide range of general tasks. However, keep in mind that you may have limited customization and control over the model's behavior. If your customer contracts legally prohibit you from sharing some of the data you gather with a third-party provider, you will need to use one of the other two options. Once your MVP North Star Metrics are green, if the cost of using a third-party provider is too high or its lack of flexibility and customization endangers your product's defensive moat, you may want to consider moving to the next option.
Use/Fine-tune a Pre-trained LLM: This option strikes a balance between the two others. Pre-trained models (e.g., Meta LLaMA2, Mistral 7B) have already learned a lot about language and can be fine-tuned on your specific domain or task using your data. This is often more practical than training from scratch, as it requires less data, fewer computational resources, and fewer highly specialized profiles. It's a good choice for applications that need some degree of customization but don't require a completely new model.
Create Your Own LLM: Training a language model from scratch is a resource-intensive task that requires a large amount of data, computational power, and a very skillful team. Given the current state of the art, unless you have at least half a million dollars in the bank, a tremendous amount of data (which can include open-source datasets), and a use case profitable enough to justify the investment, this option is probably not worth considering. If Option 2 can't fit your needs, first consider contributing to an open-source model; if that's not possible or not enough, and the model itself is a differentiating factor for your business, this option provides the maximum customization.
What’s Up Next?
In the next article, we will continue to stay high level and look at the techniques for integrating the chosen LLM into your product, covering concepts such as Prompt Engineering, Vector Embeddings, Vector Databases, and Retrieval-Augmented Generation.
Please subscribe below so you don’t miss the next article (unless you’re already subscribed ;))