Decoding the total cost of ownership for OpenAI: an examination of AI costs.
As artificial intelligence (AI) continues to reshape industries, OpenAI has established itself as a leading AI provider with products such as ChatGPT and DALL-E. However, to fully integrate OpenAI's services into business operations, an in-depth understanding of the total cost of ownership (TCO) of these AI initiatives is essential. To maximize return on investment (ROI), organizations must consider a range of factors, including both subscription and usage-based costs.
In this article, we will delve into the various components that contribute to the total cost of ownership (TCO) associated with OpenAI. The article presents both a conceptual framework and a practical case study estimating the TCO of an edtech company's chatbot. In particular, we will present a detailed cost structure for an AI assistant designed to embed user queries, retrieve relevant past chats from Pinecone's vector database, and then feed the retrieved data, chat history, and user query to the generative model to produce a response.
According to Sam Altman, OpenAI is on track to hit $1 billion in revenue by 2024. If you’re reading this article, chances are you’re already contributing to that target and wondering if you can optimize your spend.
OpenAI has a comprehensive product suite, one of the standout offerings being ChatGPT. This solution, based on the GPT framework, is designed for engaging and interactive conversations. It has been instrumental in powering chatbots, enabling them to support context-aware dialogues, understand complex prompts, and deliver relevant responses.
The impact of ChatGPT has been profound, as it reached over 100 million active users in January '23, barely two months after launch. This makes it the fastest-adopted consumer application ever, surpassing previous records set by platforms such as Instagram (2.5 years) and Facebook (4.5 years).
Beyond its conversational models, OpenAI has expanded its offerings with tools such as Whisper and DALL-E. Whisper is an automatic speech recognition system that transcribes spoken language into written text, while DALL-E generates images from text descriptions.
Total Cost of Ownership (TCO) provides a complete view of all costs associated with a product or system, making it a valuable tool for benchmarking projects or services that may not appear comparable at first glance.
For instance, let's say you're considering which of two projects to pursue. The first project costs a lot upfront but is cheaper to maintain, while the other one is inexpensive at the start but then requires expensive regular updates. TCO helps you compare these projects by looking beyond upfront costs, which can be deceiving.
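The comparison above can be sketched in a few lines of arithmetic. The figures below are purely illustrative, not taken from any real project:

```python
# Illustrative TCO comparison of two hypothetical projects over a
# three-year horizon. All figures are invented for the example.

def tco(upfront: float, annual_recurring: float, years: int) -> float:
    """Total cost of ownership: upfront spend plus recurring costs."""
    return upfront + annual_recurring * years

# Project A: expensive to build, cheap to run.
project_a = tco(upfront=200_000, annual_recurring=30_000, years=3)
# Project B: cheap to build, expensive to maintain.
project_b = tco(upfront=50_000, annual_recurring=90_000, years=3)

print(f"Project A TCO: ${project_a:,.0f}")  # $290,000
print(f"Project B TCO: ${project_b:,.0f}")  # $320,000
```

Despite its higher upfront cost, Project A is cheaper over three years; looking at the first year alone would have suggested the opposite.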
In the AI space, TCO is more than just the visible costs of subscription and usage fees. In fact, it includes other expenses such as AI system integration, staff training, software maintenance, and future training and upgrades. By taking into account all these factors, organizations can make well-informed decisions about their AI investments and effectively track costs over time.
OpenAI's pricing is easy to understand and flexible to meet different needs. OpenAI uses a pay-as-you-go system with costs varying from service to service, so that users pay only for actual consumption. Upon signing up, users get a $5 credit valid over the first three months.
Prices vary by service. Language APIs are priced based on the model selected (larger models = higher price) and on the number of input and output tokens. For reference, one token is roughly equivalent to four characters or 0.75 words in English.
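These rules of thumb make back-of-the-envelope cost estimates easy. The per-token prices below are illustrative assumptions for the sketch, not official figures; always check OpenAI's pricing page for current rates:

```python
# Rough token and cost estimate using the rules of thumb above
# (1 token ~ 4 characters ~ 0.75 words). Prices are assumed for
# illustration only.

def approx_tokens(text: str) -> int:
    """Rough token estimate from character count (1 token ~ 4 chars)."""
    return max(1, round(len(text) / 4))

def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float = 0.0015,    # assumed $/1k tokens
                 output_price_per_1k: float = 0.002) -> float:  # assumed
    """Cost of one API call given token counts and per-1k-token prices."""
    return (input_tokens * input_price_per_1k
            + output_tokens * output_price_per_1k) / 1000

prompt = "Explain total cost of ownership in one paragraph."
print(approx_tokens(prompt))               # ~12 tokens for this 49-char prompt
print(f"${request_cost(1000, 1000):.4f}")  # $0.0035 for a 1k-in / 1k-out call
```

For precise counts, OpenAI's open-source `tiktoken` library tokenizes text exactly as the models do.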
Image generation models such as DALL-E are instead priced by image resolution (higher resolution = higher price). Finally, audio models such as Whisper are priced based on the length of the audio being processed (longer audios = higher price).
OpenAI also lets users fine-tune, i.e. customize, language models with their own data. While fine-tuning can deliver better, differentiated performance, it comes at a significant cost: the price per token can be up to 6x higher than for the corresponding base models.
Upon signing up, users are given a spending limit, or quota, that can be raised as their applications demonstrate reliability over time. Users who need additional tokens should request a higher quota well before reaching the limit, as the approval process takes time. Full details can be found on OpenAI's pricing webpage.
To paint a more detailed picture of the financial impact of using OpenAI, let's dive into a real use case. We will consider an edtech company - LearnOnline - planning to implement a chatbot to support student Q&A. Since every student can benefit from other students' previous questions, a simple chatbot is not sufficient. LearnOnline needs its chatbot to remember previous conversations and therefore designed its AI assistant in the following way:
As a reminder, embeddings are semantic maps of customer inquiries. Stored in a vector database, embeddings allow the system to remember the context from past chats, overcoming one of the most troublesome limits of conventional chatbots. While there are lots of vector databases (OpenAI cookbook for vector databases), Pinecone’s managed service is a good way to get started as it cuts down time allocation to database setup and upkeep.
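The retrieval step can be sketched as follows. This is a toy illustration only: the stub embedding function and in-memory store stand in for the OpenAI embeddings API and Pinecone, which a production system would call instead:

```python
import math

# Toy sketch of the assistant's retrieval step. The embed() stub and
# ToyVectorDB stand in for the OpenAI embeddings API and Pinecone.

def embed(text: str) -> list[float]:
    """Stub embedding: normalized character-frequency vector (placeholder)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; vectors are already normalized, so a dot product."""
    return sum(x * y for x, y in zip(a, b))

class ToyVectorDB:
    """Minimal in-memory stand-in for a managed vector database."""
    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def upsert(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def query(self, text: str, top_k: int = 2) -> list[str]:
        qv = embed(text)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]),
                        reverse=True)
        return [t for t, _ in ranked[:top_k]]

db = ToyVectorDB()
db.upsert("Q: How do I reset my password? A: Use the account settings page.")
db.upsert("Q: When is the exam? A: The exam is on June 12.")

# The retrieved chats are prepended to the prompt sent to the generative model.
context = db.query("I forgot my password", top_k=1)
print(context)
```

In the real pipeline, each stored chat costs one embedding call at write time, and each user query costs one embedding call plus one vector-database query at read time - both of which show up in the TCO below.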
In terms of volumes, LearnOnline projects around 40,000 user chats per month. Given an average chat size of 1k tokens (around 750 words, or 1.5 pages), this translates into a total consumption of around 40 million tokens per month. Based on empirical evidence, model-generated tokens account for 63% of the mix (roughly 25 million per month), with user text tokens accounting for the difference (roughly 15 million per month). While software costs are expected to be the largest cost driver, as a US-based company LearnOnline expects to incur significant personnel costs too. Annual salaries are estimated as follows: $150k for a senior software engineer, $100k for a junior software engineer, and $70k for other IT staff.
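The volume figures above reduce to a few lines of arithmetic:

```python
# LearnOnline's projected monthly token volume, from the figures above.
# The 63% output-token share is the article's empirical estimate.

monthly_chats = 40_000
tokens_per_chat = 1_000                      # ~750 words, about 1.5 pages

total_tokens = monthly_chats * tokens_per_chat
output_tokens = round(total_tokens * 0.63)   # model-generated tokens
input_tokens = total_tokens - output_tokens  # user text tokens

print(f"Total:  {total_tokens:,} tokens/month")   # 40,000,000
print(f"Output: {output_tokens:,} tokens/month")  # 25,200,000 (~25M)
print(f"Input:  {input_tokens:,} tokens/month")   # 14,800,000 (~15M)
```

Multiplying these volumes by the applicable per-token prices yields the monthly generative API cost, the largest item in the budget below.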
With this context in mind, let’s drill down into the various cost items:
All-in, LearnOnline needs to invest $640K, most of which will recur in subsequent years. As shown in the waterfall chart below, OpenAI's generative API represents the largest single cost item (65%). Nevertheless, other costs - including embeddings and Pinecone - are far from negligible, and overlooking them would put costs 50% above budget. The TCO approach helps overcome this blind spot, allowing businesses to make better-informed decisions about their AI investments.
However, this analysis also shows that identifying and measuring each cost item is not straightforward. This is precisely where Nebuly comes into play, offering an all-encompassing solution for monitoring, managing, and optimizing AI project costs. Nebuly lets businesses collect all their AI-related costs in one place, providing full visibility into every type of AI cost. In turn, this historical data is instrumental in budgeting future projects more effectively and ensuring high-ROI AI investments.