Voyage AI is building RAG tools to make AI hallucinate less

Artificial intelligence (AI) often generates inaccurate information, a problem for businesses that rely on its outputs for decisions, since errors can hurt their bottom line. According to a Salesforce survey, half of employees expressed concern over the accuracy of answers generated by their company’s AI systems. While no solution completely prevents these inaccuracies, certain methods can mitigate them. For instance, retrieval-augmented generation (RAG) pairs an AI model with a knowledge base that supplies additional information to ground its answers, so the knowledge base serves as a kind of fact-checking tool.

The demand for more reliable AI systems has led entire businesses to focus on RAG technology. One such company is Voyage AI, founded by Stanford professor Tengyu Ma in 2023. Voyage AI builds RAG systems for companies including Harvey, Vanta, Replit, and SK Telecom. As Ma explained to TechCrunch, Voyage aims to improve search and retrieval accuracy in enterprise AI by offering solutions tailored to specific domains such as coding, finance, legal, and multilingual applications, and to each company’s own data.

Voyage implements RAG systems by training AI models to convert text and documents into numerical representations known as vector embeddings. These embeddings capture the meanings of data points and the relationships between them in a compact form, which makes them well suited to search-related tasks, particularly within RAG. Voyage employs contextual embeddings, which capture not only the semantic meaning of data but also the context in which it appears. For instance, the word "bank" would be represented differently in "I sat on the bank of the river" versus "I deposited money in the bank," with vectors reflecting the distinct meanings in context.
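To make the idea concrete, here is a minimal sketch in Python of how two embeddings can be compared. The three-dimensional vectors are hand-picked illustrative values, not output from any Voyage model, and the `cosine_similarity` helper is a hypothetical name introduced for this example. Real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumption: made-up numbers chosen for illustration only).
# A contextual model would give "bank" a different vector in each sentence.
bank_river = [0.9, 0.1, 0.0]   # "I sat on the bank of the river"
bank_money = [0.1, 0.9, 0.2]   # "I deposited money in the bank"
query_loan = [0.0, 0.8, 0.3]   # a query about taking out a loan

sim_river = cosine_similarity(query_loan, bank_river)
sim_money = cosine_similarity(query_loan, bank_money)

# The finance query sits closer to the financial sense of "bank".
print(f"river sense: {sim_river:.2f}, money sense: {sim_money:.2f}")
```

With these toy values, the loan query scores much higher against the financial sense of "bank" than against the riverside sense, which is the distinction contextual embeddings are meant to preserve.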

Voyage offers its models for on-premises, private cloud, or public cloud deployment and fine-tunes them for clients willing to pay for enhanced performance. This approach isn’t unique, since OpenAI also offers an embedding service, but Ma asserts that Voyage’s models stand out by delivering superior performance at lower cost. "In RAG," he stated, "we retrieve relevant information from a knowledge base, akin to a librarian selecting books from a library. Conventional RAG often loses context during encoding, impairing retrieval accuracy. Voyage’s models excel in accurate retrieval, enhancing overall RAG response quality."
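The "librarian" step Ma describes can be sketched roughly as follows. This is a simplified illustration, not Voyage’s implementation: the `retrieve` and `build_prompt` functions, the sample passages, and their vectors are all hypothetical, and in a real system the vectors would come from an embedding model rather than being typed in by hand.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, knowledge_base, k=2):
    """Return the k passages whose vectors are most similar to the query."""
    ranked = sorted(
        knowledge_base,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy knowledge base (assumption: passages and vectors are invented).
knowledge_base = [
    ("Refunds are issued within 14 days of purchase.", [0.9, 0.1, 0.1]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.1]),
    ("Refund requests require an order number.", [0.8, 0.2, 0.3]),
]
query_vec = [0.85, 0.1, 0.2]  # stand-in for an embedded refund question

top_passages = retrieve(query_vec, knowledge_base, k=2)
prompt = build_prompt("How do I get a refund?", top_passages)
print(prompt)
```

The prompt that reaches the language model contains only the two refund passages, so the quality of the final answer depends heavily on how well the embeddings rank relevant text above irrelevant text.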

Lending some support to Ma’s bold claims, Anthropic, a notable rival of OpenAI, described Voyage’s models as "state of the art" in a support document. Ma added that Voyage’s approach uses vector embeddings trained on a company’s specific data, which makes retrieval context-aware and, he says, greatly improves accuracy. Asked about financial performance, Ma said that Palo Alto-based Voyage has more than 250 clients but declined to disclose revenue. In September, the company, which has about a dozen employees, closed a $20 million Series A round led by CRV, with participation from Wing VC, Conviction, Snowflake, and Databricks. The round brought its total funding to $28 million and is intended to support the release of new embedding models and a doubling of the company’s headcount.
