Published by Invisible Technologies on May 23, 2024
Retrieval-Augmented Generation (RAG) – a technique that enhances the reliability of generative AI – is set to transform how knowledge workers access, use, and share information.
In the process, it will likely level the playing field in many highly skilled sectors, giving smaller and mid-sized professional services enterprises the potential to compete with their much larger counterparts, says Rui Bai, Product Manager of AI and Machine Learning at Invisible Technologies.
“For knowledge-based workers, RAG can help overcome many of the challenges associated with AI, especially when it comes to relevance and accuracy,” Bai says. “Best of all, because it can leverage existing large language models (LLMs), it’s accessible, cost-effective, and easily adapted to business uses.”
In the past 18 months, LLMs such as OpenAI’s ChatGPT, Google’s Gemini, and Meta’s Llama 3 have transformed our capacity to research, write, and generate ideas. However, many professionals in sensitive industries have been reluctant to build them into their work processes, given the accuracy and privacy their work demands.
“If you’re an investment bank or professional services firm, you operate in a ‘zero-error’ environment,” Bai explains. “You can’t have an LLM hallucinate and give you an unfactual answer when millions of dollars are at stake.”
Bai says these inaccuracies often happen because one of the greatest strengths of an LLM is also one of its greatest weaknesses.
“LLMs are trained on an incredible amount of data. This means they can answer questions about almost anything, but it also means the information they base those answers on will be of varying quality. Some of it is likely to be out of date, and some of it will be out of context.”
Bai also explains that another reason LLMs sometimes provide inaccurate answers is that they recognize patterns based on statistics and probability rather than fact. This can lead them to make suggestions that are statistically plausible, even if they’re factually incorrect.
Bai says this is where RAG makes an impact. Instead of relying solely on the broad and variable dataset an LLM was trained on, RAG combines the model’s generative capabilities with the retrieval of specific, verified information from internal or external databases, a synthesis that makes outputs both contextually relevant and more accurate.
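To make the pattern concrete, here is a minimal sketch of that retrieve-then-augment loop in Python. The word-overlap scoring, sample documents, and prompt format are illustrative assumptions; a production system would use vector embeddings, a real document store, and an actual LLM call.

```python
# Minimal RAG sketch: retrieve the most relevant passages, then augment the
# prompt with them. Word overlap stands in for the vector-similarity search
# a production system would use; the documents below are hypothetical.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many query terms they share with the question."""
    query_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the LLM in retrieved, verified passages instead of letting it
    answer from its training data alone."""
    sources = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only the sources below.\nSources:\n{sources}\n\nQuestion: {query}"

docs = [
    "The 2023 master services agreement caps liability at 12 months of fees.",
    "Precedent NDA v4 uses a mutual two-year confidentiality term.",
    "Deal memo Q3: valuation based on 8x forward EBITDA.",
]

query = "What is the liability cap in the MSA?"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)  # this augmented prompt would then be sent to any LLM
```

Because the model only has to synthesize the passages it is handed, its answer is anchored to the organization’s own verified material rather than to whatever it absorbed during training.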
For instance, a corporate law firm will have much of its intellectual capital tied up in precedents, which are essentially past or model contracts, letters, and written advice. Its lawyers will already be taking shortcuts and ensuring the quality of their work by drawing on these and using them to form the basis of new client work.
RAG could take this several steps further, improving the quality of answers while delivering them much faster. Drawing on the firm’s entire body of work, it could answer questions about contract drafting, specific legal points, or even which arguments have succeeded in the past, helping lawyers quickly surface relevant cases or get immediate, high-quality answers during due diligence.
Similarly, an investment bank could use RAG to ground an AI assistant in past financial reports, market analyses, and financial models, providing instant opinions and documentation for new deals or pitches.
“There is an obvious role for technology to play here in improving the quality and efficiency of professional advice,” Bai says. “Most professional and financial services firms recognize this and many have already attempted to build some kind of document synthesis platform on their own—often using predictive analytics, a forerunner to generative AI.”
Bai says Invisible has already helped an investment bank implement a proto-version of RAG using a model trained on an initial test set of the firm’s documents. Bankers can use a chat interface to get instant answers to any questions based on these documents.
“People are using it to get answers between their meetings. They want to go in fully informed or find a key piece of information. So they ask questions about it within the few minutes they have,” Bai explains.
“Unlike some LLM tools, our prototype also provides links to original sources so that users can verify the outputs and ensure the information is always correct.”
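In a sketch, that source-linking might look like the snippet below; the Chunk record, llm_generate stub, and intranet URL are hypothetical stand-ins, not the prototype’s actual design.

```python
# Sketch of a RAG answer that carries citations back to its sources.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str        # retrieved passage
    source_url: str  # link back to the original document

def llm_generate(prompt: str) -> str:
    # Placeholder for a call to any LLM.
    return "The 2023 MSA caps liability at 12 months of fees."

def answer_with_citations(question: str, chunks: list[Chunk]) -> dict:
    # The chunks would come from retrieval; their provenance travels with
    # them, so every answer can point users to the original documents.
    context = "\n".join(c.text for c in chunks)
    answer = llm_generate(f"Sources:\n{context}\n\nQuestion: {question}")
    return {"answer": answer, "citations": [c.source_url for c in chunks]}

result = answer_with_citations(
    "What is the liability cap in the MSA?",
    [Chunk("The 2023 MSA caps liability at 12 months of fees.",
           "https://intranet.example.com/msa-2023")],  # hypothetical URL
)
print(result["answer"], result["citations"])
```

Returning citations alongside the answer is what lets users verify outputs rather than take them on faith.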
Bai says this has already significantly reduced the administrative and research time bankers would usually have to put in. However, it is still just the tip of the iceberg when it comes to RAG’s capabilities.
“If we were to include all of an organization’s historical data stretching back decades, RAG’s ability to recognize patterns within it and connect the dots would make it extraordinarily powerful, enabling it to uncover hidden insights or provide strategic recommendations that are deeply informed by the organization’s comprehensive historical context.”
Bai says that the path to this reality is already within the reach of more organizations than people might expect.
“There are many different models for implementing RAG at different price tags,” she explains.
“Some of the larger corporations have already teamed up with Anthropic, OpenAI, and other organizations to build their own systems from the ground up,” she adds. “But that may not be within the reach of some businesses, and it’s probably not really necessary.”
Instead, Bai says that most companies, provided they have an ML team, could use RAG in combination with an open-source LLM, fine-tuning it on their proprietary data and information.
“Open-source LLMs are also formidable, and the newer models are already streets ahead of those produced a year or two ago. The providers now also offer the opportunity to deploy these in a fully secure environment, so an organization’s sensitive information stays protected.”
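As a rough illustration of that self-hosted path, the sketch below runs an open model locally with retrieved context prepended to the prompt. It assumes the Hugging Face transformers library and uses the small public gpt2 model purely as a stand-in for a stronger open model such as Llama 3; a firm would host the weights inside its own secure environment.

```python
# Self-hosted RAG sketch: an open-source model runs entirely on local
# infrastructure, so proprietary context never leaves the organization.
# Requires: pip install transformers torch
from transformers import pipeline

# gpt2 is a small public stand-in; swap in any locally hosted open model.
generator = pipeline("text-generation", model="gpt2")

retrieved = "Precedent NDA v4 uses a mutual two-year confidentiality term."
prompt = (
    f"Context: {retrieved}\n"
    "Question: What confidentiality term do we usually use?\nAnswer:"
)

result = generator(prompt, max_new_tokens=40)
print(result[0]["generated_text"])
```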
Bai also says that the cost-effectiveness of this approach, and the efficiency with which a RAG-based model can be built, will mean that most knowledge-based organizations will move to RAG very quickly.
“RAG lets an organization fully exploit its untapped data and quickly draw out information and insights that previously would have required a significant investment of time and money,” she explains.
“For these reasons, we’re going to see RAG quickly adopted by tools that professionals use to work, with it eventually not just used to answer questions but also informing the way letters, advice, reports, and contracts are constructed.”
“I believe very soon they will be using RAG-enabled applications as an everyday tool in their work, just as accessible as Google Mail or the Microsoft suite of products.”