Knowledge graphs: RAG is NOT all you need
Docusign distinguished engineer Dan Selman runs test queries to see which tools a large language model (LLM) uses to answer questions against a knowledge graph, showing you there's more to it than Retrieval-Augmented Generation (RAG).
Over the past few weeks I’ve been researching and building a framework that combines the power of large language models for text parsing and transformation with the precision of structured data queries over knowledge graphs for explainable data retrieval.
In this fourth article of the series (one, two, three) I will show a generic web interface that helps explain how the LLM is using tools and graph queries to answer a wide variety of structured and unstructured questions.
I want to explore how the LLM uses automatically generated knowledge graph tools to answer a variety of questions. In this video I pose questions, review which tools the LLM chose and how it sequenced them, and informally gauge the quality of the answers.
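The video doesn't reproduce the framework's code, but the underlying pattern is easy to sketch. Below is a minimal, hypothetical example of exposing a single knowledge graph query as an LLM tool: it assumes a Neo4j-backed graph and the OpenAI Python SDK, and the tool name find_related_entities, the connection details, and the Cypher query are illustrative stand-ins rather than the framework's actual API.

```python
# A minimal sketch of exposing a knowledge graph query as an LLM tool.
# Assumptions (not from the article): a Neo4j-backed graph, the OpenAI
# Python SDK, and an illustrative tool named find_related_entities.
import json

from neo4j import GraphDatabase
from openai import OpenAI

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The tool schema the LLM can choose to call. It is written by hand here;
# the framework described in this series generates these from the model.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "find_related_entities",
        "description": "Return entities related to a named entity in the knowledge graph.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Entity name to look up."}
            },
            "required": ["name"],
        },
    },
}]

def find_related_entities(name: str) -> list[dict]:
    """Run a parameterized Cypher query, keeping retrieval explainable."""
    with driver.session() as session:
        records = session.run(
            "MATCH (e {name: $name})-[r]-(related) "
            "RETURN type(r) AS relation, related.name AS name LIMIT 25",
            name=name,
        )
        return [dict(record) for record in records]

def answer(question: str) -> str:
    """One round of tool use: the model picks a tool, we run it, it answers."""
    messages = [{"role": "user", "content": question}]
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=TOOLS
    )
    message = response.choices[0].message
    if message.tool_calls:
        messages.append(message)
        for call in message.tool_calls:
            if call.function.name == "find_related_entities":
                args = json.loads(call.function.arguments)
                result = find_related_entities(**args)
                messages.append({
                    "role": "tool",
                    "tool_call_id": call.id,
                    "content": json.dumps(result),
                })
        response = client.chat.completions.create(model="gpt-4o", messages=messages)
    return response.choices[0].message.content
```

A production version would loop until the model stops issuing tool calls, which is how multi-step query plans like the ones discussed below emerge.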
Conclusions
The general quality of the answers was high, with most grounded in the knowledge graph's data.
GPT-4o does a good job of choosing an appropriate tool and sequencing tool usage into a query plan.
When data is missing from the knowledge graph (or invalid queries are generated), the LLM degrades gracefully, falling back on its world knowledge.
It remains to be seen how well this will generalize to richer domain models (ontologies).
Automatic, model-driven creation and running of tools over the knowledge graph means experiments can be performed very quickly (see the sketch after this list).
More work needs to be done to summarize and explain how an answer was generated.
Adding streaming responses would make the user experience more engaging.
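To make the model-driven point concrete, here is a minimal sketch of deriving one tool definition per node type from a declared graph model. The GRAPH_MODEL dictionary is a hypothetical stand-in for the framework's actual ontology format; the real framework generates both the tool schemas and the queries behind them from the model.

```python
# A minimal sketch of model-driven tool generation: derive one tool
# definition per node type in a declared graph model. GRAPH_MODEL is a
# hypothetical stand-in for the framework's actual ontology format.
import json

GRAPH_MODEL = {
    "Person": ["name", "birthYear"],
    "Company": ["name", "founded"],
}

def tools_from_model(model: dict[str, list[str]]) -> list[dict]:
    """Generate an OpenAI-style tool spec for each node type in the model."""
    tools = []
    for node_type, properties in model.items():
        tools.append({
            "type": "function",
            "function": {
                "name": f"get_{node_type.lower()}",
                "description": f"Look up a {node_type} node by one of its properties.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        prop: {"type": "string", "description": f"Match on {prop}."}
                        for prop in properties
                    },
                },
            },
        })
    return tools

# Because the tool list is derived from the model, adding a node type or
# property immediately changes what the LLM can query.
print(json.dumps(tools_from_model(GRAPH_MODEL), indent=2))
```

On the streaming point, with the OpenAI SDK this is a small change: pass stream=True and print content deltas as they arrive. Again, this is a sketch, not the framework's implementation.

```python
# A short sketch of streaming the final answer with the OpenAI SDK:
# pass stream=True and print content deltas as they arrive.
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize how the answer was produced."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```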
Dan Selman has over 25 years of experience in IT, creating software products for BEA Systems, ILOG, IBM, Docusign, and more. He was a co-founder and CTO of Clause, Inc., acquired by Docusign in June 2021.