Wrapping up Large Language Models

Content Copyright © 2024 Bloor. All Rights Reserved.
Also posted on: Bloor blogs


During 2023 and 2024 there has been intense interest in, and investment in, how to deploy large language models (LLMs) in corporations, driven by a huge wave of excitement generated by ChatGPT and its rivals, such as Claude, Perplexity, Gemini and Llama. Companies have, for example, tried to deploy LLMs as automated chatbots for customer service, often with mixed results. Air Canada had to pay damages after its chatbot misled a customer about discount fares for bereavement. McDonald's gave up on an AI chatbot for its drive-through orders in June 2024, while DPD generated the wrong kind of publicity with its customer service chatbot.

There are several reasons for such missteps. To begin with, LLMs “hallucinate” regularly, producing plausible but wrong answers. Also, to be useful as chatbots, they need to be aware of the specifics of the company and the customer, for example by having access to customer order history, company policies and so on. This is done by supplementing the raw LLM with additional company-specific datasets such as policy documents or product specifications, a process known as retrieval augmented generation (RAG). However, RAG has its own issues. Care needs to be taken with data security, as the chatbot may inadvertently reveal sensitive data that it has access to, such as personally identifiable information or trade secrets. There may also be semantic gaps between the context of a customer’s question and that of the LLM, with domain-specific terminology that the LLM is unfamiliar with, or that is ambiguous. In the case of structured data, the LLM needs to be aware of the database schema it is accessing, so that it knows which tables and columns to query, yet this in itself can be far from trivial with complex real-world corporate database schemas.
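To make the RAG idea concrete, here is a minimal sketch of the retrieve-then-prompt loop. It uses a deliberately toy bag-of-words similarity in place of a real embedding model, and the function and variable names are illustrative, not any vendor’s API:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; production RAG uses dense vector models.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank the company-specific documents by similarity to the query
    # and keep the top k as context.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    # Prepend the retrieved context so the LLM answers from company data
    # rather than only from what it absorbed in training.
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The security point in the text applies directly here: whatever lands in `documents` can surface in an answer, so sensitive records must be filtered before, not after, retrieval.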

Data fabric vendor Fluree has recently released an interesting new feature that aims to address this problem. Fluree’s technology is based on a semantic graph database, in which relationships between data and domain-specific terminology are defined and managed. Fluree puts an additional wrapper around a natural language interface. When a user types in a query, the software uses the semantic database to enrich or modify the query with any domain-specific terminology, and may even determine that the query is best handled by a regular SQL database query rather than by the vector search used by LLMs with RAG. In this way the accuracy of the results is much improved, with early customer testing showing considerable improvements in the quality of the responses. As a bonus, the Fluree database has security policy support built in, so it can determine whether the user query would touch sensitive data before it is passed to the LLM.
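The interception pattern described above can be sketched in a few lines. To be clear, the glossary, the policy check and the routing heuristic below are invented for illustration; this is the general pattern, not Fluree’s actual implementation or API:

```python
# Domain glossary and blocked terms are stand-ins for a semantic graph
# and a real security policy engine.
GLOSSARY = {"net-net": "final price after all discounts"}
BLOCKED_TERMS = {"ssn", "salary"}

def enrich(query: str) -> str:
    # Expand domain-specific jargon so the downstream LLM or database
    # sees unambiguous wording.
    for term, meaning in GLOSSARY.items():
        if term in query.lower():
            query += f" ({term} means: {meaning})"
    return query

def check_policy(query: str) -> bool:
    # Reject queries touching sensitive fields before the LLM sees them.
    return not any(t in query.lower() for t in BLOCKED_TERMS)

def route(query: str) -> tuple[str, str]:
    if not check_policy(query):
        return ("refused", query)
    # Crude heuristic: aggregate-style questions go to SQL; everything
    # else goes to vector search over the RAG document store.
    if any(w in query.lower() for w in ("how many", "total", "average")):
        return ("sql", enrich(query))
    return ("vector", enrich(query))
```

The value of the wrapper is that each decision, enrich, reroute or refuse, happens before any tokens reach the LLM, which is where both the accuracy and the security benefits come from.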

This approach, effectively intercepting the user inquiry, interpreting it, and either rerouting it or adding context for an LLM search, shows considerable promise. It is a practical way of helping enterprises adopting LLMs, who all too often find that the wave of enthusiasm for the new technology runs up against the cold reality of that same technology’s limitations.
