In this article we delve into essential considerations that companies should weigh to capitalize on the potential of AI today. The considerations that follow are by no means exhaustive, nor do they follow a particular order. Smaller companies like startups may approach this list differently than larger enterprises. Each will prioritize factors such as cost, time to market, privacy and regulation according to their business objectives and policies.
General vs Specialized
With the release of ChatGPT and the underlying large language models, the idea of one large model to rule them all entered the conversation. Only a handful of companies have the resources required to develop and run proprietary large language models, and they currently provide controlled access that businesses can integrate with. Over the last few months, that thinking has evolved, thanks in large part to the release of many open source, commercially licensed models by the likes of Meta with Llama 2, Stability AI with Stable Diffusion and MosaicML with MPT-7B, to name a few.
The conversation has now shifted to the possibility of building smaller, highly specialised models, trained on domain specific data. There is growing evidence that these models can perform just as well as larger language models in the domain they operate in, without the overhead of infrastructure required for training and inference.
When considering these options, companies will need to evaluate cost, technical capabilities, scale, privacy, intellectual property and more. Not surprisingly, each approach continues to evolve to match the capabilities of competing models. OpenAI is actively expanding the options for augmenting its models for a particular business domain or use case. In the remainder of the article, we will explore these factors in more detail.
Training
As LLMs evolve, so do the tooling and services offered by cloud vendors and emerging startups. The three most commonly used methods for modifying the behaviour of a large language model are pre-training, fine-tuning and in-context learning.
Pre-training
Pre-training is a rather significant commitment of financial and technical resources. However, the hardware space continues to evolve at an incredibly fast pace. In August 2023, Nvidia announced its latest-generation GH200 superchip, which promises to significantly reduce the cost of training and inference. Such developments should bring capital infrastructure costs down.
Bloomberg is a prominent example of a company that made this investment in order to capitalise on a key differentiator: its extensive archive of English financial documents. Combining this comprehensive 363-billion-token dataset with a 345-billion-token public dataset, Bloomberg created a 50-billion-parameter language model. This new, streamlined model outperformed existing open models of a similar size on financial tasks by large margins, while still performing on par or better on general NLP benchmarks.
Bloomberg spent close to 1.3 million hours of training time for BloombergGPT on Nvidia A100 GPUs in Amazon's AWS cloud. The training ran on a cluster of 64 nodes, each with eight Nvidia A100 GPUs.
Fine-tuning
Another option is to fine-tune a pre-trained model. OpenAI evolved their GPT-3 model by fine-tuning it for code and instruction following, improving the base model's zero-shot performance. In that setting, no prior examples of a task have to be provided to the model. Using its own dataset of zero-shot inputs and high-quality outputs, the models were tuned from text completion to instruction following (https://openai.com/research/instruction-following).
The ChatGPT model was further improved through reinforcement learning from human feedback (RLHF) on conversations.
Language models are most commonly shared through model hubs such as Hugging Face, which at the time of writing hosted close to 200k models. With anywhere from hundreds to thousands of domain-specific documents, businesses can fine-tune these base models on their own domain knowledge, significantly shortening the time and lowering the compute cost required compared to training from scratch.
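As a rough illustration, the sketch below shows what fine-tuning a hub-hosted model on a file of domain documents might look like with the Hugging Face transformers and datasets libraries. The base model name, the domain_docs.jsonl file and the hyperparameters are illustrative placeholders; a real run would also involve evaluation, checkpointing and appropriately sized hardware.

```python
# Minimal fine-tuning sketch using Hugging Face transformers and datasets.
# The base model name, data file and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "your-org/your-base-model"  # any causal LM hosted on the hub
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # many causal LMs ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Domain-specific documents, one {"text": ...} record per line (hypothetical file).
dataset = load_dataset("json", data_files="domain_docs.jsonl", split="train")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=2e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
trainer.save_model("finetuned-model")
```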
MosaicML has been leading the way in finding compute-efficient recipes that significantly reduce the cost of fine-tuning on private or cloud infrastructure.
Their open-source Composer training library is purpose-built to make it simple to apply state-of-the-art algorithms that speed up model training, lower infrastructure cost and help improve model quality. Another notable aspect of MosaicML's offering is that its algorithms work across cloud vendors, avoiding vendor lock-in and allowing training to run across multiple cloud providers, which can become a factor in an era of high demand for GPUs. This flexibility in such a fast-developing space is invaluable for businesses seeking to be economically effective.
Cloud vendors have also built services for fine-tuning. The Azure OpenAI service contains a broad array of base models to select from, train and validate using specialised data. The training datasets are in JSON Lines format, with tools available for preparing, validating and improving their quality. The customised model can then be deployed and made available to users for testing, either through the chat playground or by integrating with an external interface.
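As a simple illustration, the snippet below prepares and sanity-checks a small fine-tuning dataset in JSON Lines form. The "prompt"/"completion" field names follow one common schema; the exact fields expected vary by service and base model, so the vendor's documentation should be the reference.

```python
# Sketch: preparing and validating a fine-tuning dataset in JSON Lines (JSONL) format.
# The "prompt"/"completion" field names follow one common schema; the exact schema
# depends on the service and base model, so always check the vendor's documentation.
import json

examples = [
    {"prompt": "Summarize the claim below:\n<claim text>", "completion": "<gold summary>"},
    {"prompt": "Classify the ticket priority:\n<ticket text>", "completion": "high"},
]

with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for record in examples:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Basic validation: every line must parse as JSON and contain non-empty expected fields.
with open("training_data.jsonl", encoding="utf-8") as f:
    for line_number, line in enumerate(f, start=1):
        record = json.loads(line)
        assert record.get("prompt") and record.get("completion"), f"Bad record on line {line_number}"
```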
In-context learning
The third option is in-context learning. At its core, this approach provides the model with sample prompts it can learn from. It might be especially appealing to businesses with smaller training datasets, lower budgets or shorter timeframes, but it is not limited to those situations: with an expanding array of prompt engineering techniques, it can address many viable use cases. While the context window length can be a limiting factor, it has been expanding rapidly over recent months and across models. GPT-4 has an 8k-token context window, while its highly anticipated GPT-4-32k variant (available on request in some regions) extends it to 32k. For perspective, 8k tokens can fit a lengthy, multi-page newspaper article, while a 32k context translates to roughly 48-50 pages. While these improvements are significant, they might not be enough for many use cases; businesses often keep far more data in external systems, such as databases or indexes, that they would like the model to search over.
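To make the idea concrete, the sketch below builds a few-shot prompt: a handful of labeled examples are placed directly in the prompt so the model can infer the task without any weight updates. The reviews, labels and the send_to_model call are hypothetical placeholders.

```python
# In-context (few-shot) learning: the model infers the task from examples in the prompt itself.
# The example reviews, labels and send_to_model() helper are hypothetical placeholders.
few_shot_examples = [
    ("The delivery arrived two weeks late and the box was damaged.", "negative"),
    ("Setup took five minutes and support answered right away.", "positive"),
]

def build_prompt(new_review: str) -> str:
    lines = ["Classify the sentiment of each customer review as positive or negative.", ""]
    for review, label in few_shot_examples:
        lines += [f"Review: {review}", f"Sentiment: {label}", ""]
    lines += [f"Review: {new_review}", "Sentiment:"]
    return "\n".join(lines)

prompt = build_prompt("The dashboard is confusing and crashes constantly.")
# response = send_to_model(prompt)  # call whichever hosted or self-hosted LLM is in use
```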
To perform this task at scale, services such as Azure Machine Learning prompt flow or the highly popular open-source LangChain library provide the ability to create and evaluate prompt tuning strategies, co-author within larger teams, and deploy and monitor customized models. To address the limitation of context length and keep costs under control (as most hosted APIs charge by the token), a common technique is to use compact vector representations, or embeddings, to find relevant content that can be injected into the prompt. This approach can represent significant cost savings.
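A minimal sketch of this embedding-based retrieval pattern follows: document vectors are compared to the question vector by cosine similarity, and only the closest matches are injected into the prompt. The embed() function is a stand-in for whatever embedding model or hosted API a business actually uses; its dummy implementation exists only so the snippet runs end to end.

```python
# Retrieval via embeddings: find the documents most similar to the question and inject
# only those into the prompt, keeping token usage (and therefore cost) under control.
# embed() is a placeholder; in practice it would call an embedding model or hosted API,
# and document vectors would live in a vector index rather than an in-memory array.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Dummy, deterministic stand-in so the sketch runs without an external service.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

documents = ["<policy document 1>", "<product manual excerpt>", "<support FAQ entry>"]
doc_vectors = np.array([embed(d) for d in documents])  # usually precomputed and stored

def top_k(question: str, k: int = 2) -> list:
    q = embed(question)
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

question = "What is the refund window for annual plans?"
context = "\n\n".join(top_k(question))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
```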
As the tooling evolves and context windows grow, in-context learning can provide an effective, and often much cheaper, alternative to fine-tuning for model augmentation, empowering businesses to apply LLMs to their use cases more broadly.
Having an accurate, up-to-date understanding of the way each method works is imperative to deciding which way forward is most economically appropriate for any given business.
Privacy
For many businesses, data has become an expression of their intellectual property. A future where companies monetize their domain-specific data, either to build a competitive advantage or by licensing a specialised model, looks increasingly likely.
The key question then becomes how to utilize this technology with private data such that intellectual property and confidential information are sufficiently protected from both competitive and legal standpoints.
Companies will need to navigate and understand the regulatory landscape they operate in, as the process of creating regulations and standards for trustworthy AI systems is underway in many regions around the world and will remain a moving target for some time.
To comply with privacy laws, businesses, especially those operating in highly regulated industries, may be motivated to train and deploy their own model to maintain full control over its training and use. At the same time, cloud vendors such as Microsoft are actively responding to the regulatory concerns of their clients. The Azure OpenAI service provides guarantees that customers' data will not be used for training. In addition, Microsoft hosts OpenAI models within Azure infrastructure, without the need to send data to services operated by OpenAI. This was made possible by Microsoft's long-term partnership with OpenAI, formed through a multibillion-dollar investment, which allowed OpenAI's models to be deployed on an enterprise-grade platform trusted by businesses.
Furthermore, techniques such as encryption, anonymization or differential privacy can be applied to transform the data before it’s sent to a model. With anonymization, personally identifiable information (PII) can be masked or removed. Differential privacy adds noise or randomness to the data or the outputs of the AI models, preventing inference of individual information from the aggregated results.
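As a simplified illustration of the last two techniques, the snippet below masks common PII patterns with regular expressions and adds Laplace noise to an aggregate count. Production systems would rely on dedicated PII-detection and differential-privacy libraries; both the regex patterns and the epsilon value here are purely illustrative.

```python
# Simplified sketches of anonymization and differential privacy.
# Production systems would use dedicated PII-detection and DP libraries;
# the regex patterns and epsilon value here are purely illustrative.
import re
import numpy as np

def anonymize(text: str) -> str:
    """Mask common PII patterns (emails, phone-like numbers) before sending text to a model."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", text)
    return text

def private_count(true_count: int, epsilon: float = 1.0) -> float:
    """Differentially private count: Laplace noise scaled to 1/epsilon hides any single record."""
    return true_count + np.random.default_rng().laplace(scale=1.0 / epsilon)

print(anonymize("Contact jane.doe@example.com or +1 (555) 010-2233 about the claim."))
print(private_count(42))
```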
Governance
Data governance is a critical aspect of modern organizations, as their value increasingly relies on the effective management and utilization of data. An effective governance strategy encompasses best practices and tools according to the business objectives and strategic goals of an organization. Furthermore, it helps protect data from unauthorized access and comply with regulatory requirements.
Businesses today collect data from a wide variety of sources: cloud storage solutions, databases, message queues and APIs, at ever-increasing volumes and in both structured and unstructured forms. While unstructured data presents unique challenges in collecting, storing and analyzing it at scale, businesses that invest in this process see a tremendous amount of value being unlocked.
Data quality is paramount in data governance. Poor-quality data can lead to inaccurate analytics, compromised decision-making, and significant financial costs. An effective data governance strategy must prioritize data quality, ensuring that data provenance is well-documented, rules are enforced, and changes are tracked. Addressing data security concerns is also integral to data governance, as it helps organizations achieve compliance, control data access, and track data usage.
The concept of "data lineage" is central to data governance, as it encompasses the entire journey of data from its source to its ultimate insights. This includes capturing metadata, transformation history, data set sources, creators, and more. With proper data management tools, organizations gain a comprehensive view of how data is transformed and flows through the data ecosystem, both internally and externally with their customers and partners.
As data governance has evolved, it has extended into the realm of machine learning (ML) governance. ML governance establishes policies and procedures for the development and application of ML models within an organization. Proper ML governance ensures that models comply with regulatory and ethical standards, are reproducible, transparent, monitored, and properly documented. This governance framework is essential for unlocking consistent value from ML investments while mitigating risks associated with regulatory compliance and model performance.
A wide variety of tools exist that provide solutions for managing aspects of data governance. An organization utilizing Azure services might look at Azure tools and services for ease of access and integration. At the same time, Databricks provides vendor-agnostic solutions that integrate with all major cloud providers. As organizations assess how well their data and technology stack measures against the requirements of the industry or region they operate in, steps can be taken to realign towards compliance and a data-driven culture. This will position the company well as Generative AI applications and solutions continue their fast evolution.
Explore to Stay Ahead
In the world of innovation and efficiency, the integration of AI, particularly Large Language Models (LLMs), has emerged as a transformative force.
This article presents a few considerations businesses should evaluate to harness the potential of AI. While risks must be acknowledged, they should be weighed against the risk of falling behind in a landscape reshaped by technological advancements.
With proper oversight, companies can now undertake steps to explore, evaluate and in many cases deploy compliant AI solutions.