Delphi at 100% 🦾
Delphi produces the correct answer on the data.world benchmark 100% of the time
In short, the benchmark showed that the more context and constraints the LLM is given when interfacing with the data, the better the results.
Text-to-SQL (16.7% accuracy) was inferior to using a knowledge graph with the data (54.2% accuracy), which was inferior to interfacing with a semantic layer (83% accuracy).
So how does Delphi compare? The answer was in the title of this post. Delphi is able to answer the questions in this benchmark with 100% accuracy.
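
To illustrate why more constraint tends to help, here is a minimal sketch of the general semantic-layer idea. It is not Delphi's implementation and not the benchmark's schema: the measure, dimension, table and column names are all invented for this example. Instead of writing free-form SQL, the LLM only has to pick names from a fixed vocabulary, and the layer validates them and supplies the joins.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical semantic layer. Every name and SQL fragment below is an
# assumption made up for illustration, not taken from the benchmark.
MEASURES = {"claim_count": "COUNT(DISTINCT claim.claim_id)"}
DIMENSIONS = {"policy_holder_name": "policy_holder.full_name"}
FROM_CLAUSE = (
    "claim JOIN policy_holder "
    "ON claim.policy_holder_id = policy_holder.policy_holder_id"
)


@dataclass(frozen=True)
class SemanticQuery:
    """The structured request an LLM would emit instead of raw SQL."""
    measure: str
    dimension: str
    equals: str


def compile_query(q: SemanticQuery) -> Tuple[str, Tuple[str, ...]]:
    """Validate the request against the layer, then build parameterized SQL."""
    if q.measure not in MEASURES:
        raise ValueError(f"unknown measure: {q.measure}")
    if q.dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {q.dimension}")
    sql = (
        f"SELECT {MEASURES[q.measure]} AS {q.measure} "
        f"FROM {FROM_CLAUSE} "
        f"WHERE {DIMENSIONS[q.dimension]} = ?"
    )
    # The filter value is passed as a bound parameter, never spliced into SQL.
    return sql, (q.equals,)


sql, params = compile_query(
    SemanticQuery("claim_count", "policy_holder_name", "Peyton Manning")
)
print(sql)
print(params)
```

In this sketch a hallucinated measure or column is rejected before any SQL runs, and the join logic lives in the layer rather than in the model's output, which is one plausible reason the constrained approaches score higher.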

You might say this sounds unrealistic - no one gets 100% on anything. While the data model in this benchmark is complex (3NF, multi-hop joins), the truth is that it is actually a very straightforward data model for Delphi. We’ve built Delphi to handle semantic layers with much higher levels of complexity and lower levels of cleanliness than the one in this benchmark - which is how every semantic layer we’ve seen in production looks after being deployed for a number of years.
Here are the details of the results, including generated queries.
Also, in reply to Jason’s post: Delphi is able to handle questions that contain entity names and proper nouns, like “List all the claims that were filed by Peyton Manning”.
If you’d like a demo or to speak to us, sign up at delphihq.com.