Scaling RAW Labs "API OS" using SquirroGPT

April 19, 2024


In our previous discussion, we outlined the development of a conversational interface at RAW Labs, designed to respond to real-time inquiries across all aspects of our business. This advisor is equipped to handle a broad spectrum of topics, including real-time user activity, development ticket tracking, user feedback, operational issues, sales and marketing activities, HR inquiries, and more.

We initially constructed this advisor by integrating ChatGPT with our operational systems and the SaaS products we use, in a setup we affectionately termed the RAW Labs "API Operating System" (API OS). This framework encompasses all our critical business functions. However, as our API OS expanded in capabilities and scale, we encountered limitations with our initial ChatGPT-dependent implementation. Fortunately, we were able to overcome these constraints using a different LLM-based product called SquirroGPT.

Before delving into these limitations, let's first explore what the RAW Labs "API OS" is.

Why APIs Are the Optimal Mechanism to Integrate Real-Time Operational Data with an LLM

Teaching an LLM about your company's data is the key enabler for a useful business advisor. However, most examples focus on "unstructured data sources" such as PDF documents, Word documents, or PowerPoint presentations. In reality, your company's most valuable data and insights are likely housed in its operational systems: the CRM systems, HR systems, relational databases, and data lakes that power your products and services. To provide a comprehensive, real-time view of your business, you need to tap into that operational data in real time and make it accessible to the LLM.

At RAW Labs, we developed a data virtualization platform that allows you to access any data directly from any location in real time and transform it into coherent "data products" that you can then safely use or share with others. To expose our internal data to an LLM, we built a set of data products that cover the most critical parts of our business. Each data product consists of a set of well-defined and curated APIs that roughly maps to a business area such as marketing, HR, or user activity. These data products are built as HTTP REST APIs, which are developed and hosted as web services using our own RAW platform. Building them may seem a daunting task, but it is in fact surprisingly easy, because the RAW platform provides pre-existing templates and examples to get started. Moreover, RAW keeps these data products up to date, including their metadata and documentation, a feature that proves crucial for LLM integration, as we shall see below.

Now that we have a set of well-defined data products, the next step is to configure an advanced LLM such as ChatGPT or SquirroGPT to be aware of their existence and to use them to answer questions as needed. This relies on a feature that LLMs commonly offer, called "function calling", which allows an LLM to call HTTP REST APIs to retrieve new information on demand. It is one of the most reliable ways to "feed" new data into LLMs; in fact, it is one of the most common ways OpenAI itself recommends for extending ChatGPT.
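As a rough illustration of function calling, the sketch below declares one data product as a tool in the JSON-schema style commonly used for LLM function calling, and shows how an application might turn a model's tool call into a REST request URL. The tool name new_customers, the since parameter, and the base URL https://api.example.com are hypothetical and are not actual RAW Labs endpoints.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; a real deployment would point at its own API host.
BASE_URL = "https://api.example.com"

# Tool definition in the JSON-schema style used for LLM function calling.
# The name, description, and parameters are illustrative assumptions.
NEW_CUSTOMERS_TOOL = {
    "type": "function",
    "function": {
        "name": "new_customers",
        "description": "Count customers who registered since a given ISO date.",
        "parameters": {
            "type": "object",
            "properties": {
                "since": {"type": "string", "description": "ISO 8601 date"},
            },
            "required": ["since"],
        },
    },
}


def build_request_url(tool_name: str, arguments_json: str) -> str:
    """Translate a model's tool call into the GET URL of the matching API."""
    arguments = json.loads(arguments_json)  # tool arguments arrive as a JSON string
    return f"{BASE_URL}/{tool_name}?{urlencode(arguments)}"


# Example: the model decided to call new_customers with since=2024-04-19.
url = build_request_url("new_customers", '{"since": "2024-04-19"}')
print(url)
```

The application, not the model, performs the HTTP request; the model only chooses the tool and its arguments, and the response is passed back to it as additional context.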

Linking operational data with an LLM can present specific challenges, as operational data often requires additional "context" for accurate interpretation. For instance, a simple value such as "sales: 4000" lacks context: sales of what product or over what period? To address this, operational data is best linked with an LLM by creating a data product, such as a REST API built in RAW, which provides additional context in the form of metadata. This metadata serves as a form of documentation for the data product, enabling the LLM to understand and interpret the data correctly. It plays a crucial role in mitigating the risk of hallucination by the LLM, as it provides the necessary context for accurate data interpretation.
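To make the "sales: 4000" example concrete, here is a minimal sketch of a value that travels together with its metadata. The Metric class and all field values are invented for illustration; they do not describe the actual schema of any RAW data product.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """A value plus the metadata an LLM needs to interpret it."""
    name: str
    value: float
    unit: str
    product: str
    period: str

    def describe(self) -> str:
        """Render the metric as a self-explanatory sentence for the prompt."""
        return (
            f"{self.name} of {self.product} for {self.period}: "
            f"{self.value:g} {self.unit}"
        )


bare = "sales: 4000"  # ambiguous: which product, which period, which unit?

# All field values below are invented for illustration.
metric = Metric(name="sales", value=4000, unit="USD",
                product="Product A", period="2024-03")
print(metric.describe())
```

With the product, period, and unit attached, the model no longer has to guess what the number refers to.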

Finally, since the REST API serves data directly from the source, the AI system consumes the most up-to-date information. Our initial implementation involved integrating over 40 operational datasets into ChatGPT, enabling us to start a GPT session, pose questions, and receive answers based on live data. Questions such as “How many customers registered this morning?” or “What is the development team working on this week?” are answered instantly based on the latest data. We can even delve deeper, assessing whether our development roadmap aligns with the board's priorities, or having ChatGPT suggest features by comparing our existing products in RAW with those of our competitors.

Hitting the Limits of ChatGPT

However, as the API OS expanded, we encountered a significant technical limitation concerning the scalability of ChatGPT. Initially, we built an OpenAI GPT to enable this connectivity between ChatGPT and our data. This feature uses a mechanism known as the OpenAPI specification, which informs ChatGPT about the HTTP REST APIs - the "data products" - that exist within our organization and can be used to learn new information.
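For readers unfamiliar with OpenAPI, the following sketch shows a minimal specification expressed as a Python dictionary and a helper that flattens it into one entry per operation, which is roughly the information an LLM needs in order to know which APIs exist. The paths, operation IDs, and summaries are hypothetical, and the helper is an illustration rather than how ChatGPT processes a specification internally.

```python
# A minimal OpenAPI 3 document as a Python dict.
# Paths, operation IDs, and summaries are hypothetical.
SPEC = {
    "openapi": "3.0.0",
    "info": {"title": "Example data products", "version": "1.0.0"},
    "paths": {
        "/marketing/campaigns": {
            "get": {
                "operationId": "list_campaigns",
                "summary": "List active marketing campaigns.",
            }
        },
        "/hr/headcount": {
            "get": {
                "operationId": "get_headcount",
                "summary": "Current headcount by team.",
            }
        },
    },
}

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(spec: dict) -> list[dict]:
    """Flatten an OpenAPI spec into one entry per operation."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operations.append({
                "name": op["operationId"],
                "method": method.upper(),
                "path": path,
                "description": op.get("summary", ""),
            })
    return operations


operations = list_operations(SPEC)
for op in operations:
    print(f'{op["name"]}: {op["method"]} {op["path"]} - {op["description"]}')
```

Every operation listed this way has to be described to the model, so the amount of text grows with the number of APIs.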

Unfortunately, this feature struggles to scale beyond approximately 30 APIs. ChatGPT's "context window" is too limited for it to take in all the APIs that exist within our organization. In practice, this means we cannot consolidate all types of inquiries into a single agent, and having only 30 "data sources" proved far too restrictive. Our only alternative was to create separate GPTs, for example for marketing, sales, or product development. Although this segmented approach can work, the ability for C-level executives to connect all business components is crucial, since the most insightful questions often span multiple organizational areas.

SquirroGPT to the Rescue!

SquirroGPT is an enterprise-ready alternative to ChatGPT that offers several distinguishing features, including enterprise-grade security and a focus on evidence-based answers. One of the features we appreciate is that each response from SquirroGPT is annotated with links that trace back to the source information from which the data was retrieved. However, the defining characteristic of SquirroGPT is its integration with a knowledge graph, which sets it apart from ChatGPT.

SquirroGPT lets users define and build a "knowledge graph", which gives a more controlled and structured approach to data organization. In practice, the knowledge graph lets us map our data products to nodes within the graph, with each node representing, for example, a specific business area. In our case, the leaf nodes of the graph correspond to the various business areas we discussed earlier, such as marketing, HR, user activity, and more. When a user question is received, SquirroGPT transparently matches it to the most relevant nodes of this graph, ensuring that the AI system processes the query in the appropriate context.

On the other side, each node in the knowledge graph is associated with a specific set of APIs defined in RAW. This allows SquirroGPT to manage a vast number of APIs efficiently and scale well beyond what ChatGPT can, while still providing precise, evidence-based answers across a multitude of business functions. As a result, the AI-driven querying process becomes vastly more scalable and reliable with SquirroGPT.
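The idea of routing a question through graph nodes to a subset of APIs can be sketched as follows. How SquirroGPT actually matches questions to nodes is not described here, so this sketch substitutes a naive keyword overlap; the graph contents, keywords, and API names are all hypothetical.

```python
import re

# Hypothetical graph: each leaf node is a business area with keywords and APIs.
KNOWLEDGE_GRAPH = {
    "marketing": {
        "keywords": {"campaign", "campaigns", "leads", "marketing"},
        "apis": ["list_campaigns", "count_leads"],
    },
    "hr": {
        "keywords": {"headcount", "hiring", "vacation", "hr"},
        "apis": ["get_headcount"],
    },
    "user_activity": {
        "keywords": {"customers", "registered", "signups", "users"},
        "apis": ["new_customers", "active_users"],
    },
}


def route(question: str, graph: dict) -> list[str]:
    """Return the APIs of every node whose keywords appear in the question.

    Naive keyword overlap, used only as a stand-in for real node matching.
    """
    words = set(re.findall(r"[a-z]+", question.lower()))
    selected = []
    for node in graph.values():
        if words & node["keywords"]:
            selected.extend(node["apis"])
    return selected


apis = route("How many customers registered this morning?", KNOWLEDGE_GRAPH)
print(apis)

# A question spanning two areas selects the APIs of both nodes.
cross = route("Compare hiring with campaign results", KNOWLEDGE_GRAPH)
print(cross)
```

Only the APIs attached to the matched nodes need to be presented to the model for a given question, so the total number of APIs in the graph can grow without every one of them competing for the same context.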

Once it's all connected, the "magic happens". Take a look!

Scaling Advisors with Knowledge Graphs

We at RAW Labs are tremendously excited by the possibilities of scaling advisors with more and more information. SquirroGPT demonstrates how integrating knowledge graphs with data products allows us to scale the amount of data available to an advisor. It is just another example showing that the integration of AI with real-time operational data is set to transform the way businesses interact with their data and extract valuable insights. With advanced systems like SquirroGPT and the RAW Labs API OS, companies are now equipped to unlock groundbreaking insights and make more informed, data-driven decisions than ever before. As we continue to enhance and broaden our API OS, we look forward to embracing the exciting opportunities that lie ahead in this new era of AI integration.

Click here to learn more about our API OS and its construction, or here to request a demo, as we are eager to assist you in developing your own system. And go here for more information on SquirroGPT.

Learn more about ChatGPT prompt engineering to create APIs.

Miguel Branco

CEO & Co-Founder
