Tech Deep Dive: Getting answers with references with the Chat tool

Welcome to another article in our Tech Deep Dive series, where we unravel the technology powering our tools. In this one, we take a closer look at the technology behind our newest tool, the Chat. We will see how the tool generates answers, how we deal with hallucinations, and what makes this solution unique.

What is the Chat tool?

Let’s start with the basics: what is the Chat tool? It is the newest module in the Researcher Workspace platform. It leverages the capabilities of a Large Language Model (LLM) and is designed to answer user questions about information in the dataset with precision, backed by references, with the ability to tap into general knowledge when needed.

Large Language Models

Large Language Models, or LLMs, are the backbone of natural language processing in the field of artificial intelligence. These models are designed to understand, generate, and manipulate human-like text, which makes them valuable for a wide range of applications. From chatbots and virtual assistants to content generation and language translation, LLMs play a crucial role in making human-computer interactions more intuitive and efficient. The Iris.ai Chat tool is based on an open-source model specialized in scientific text.

So how exactly does it work?

At the heart of the Chat tool lies the Large Language Model, a large neural network with billions of parameters that enables the tool to understand and respond to user queries. This technology allows the tool to interpret questions, analyze context, and generate accurate responses in real time. The Chat tool consists of two components, retrieval and generation, based on the RAG architecture with unique adjustments.

RAG architecture

In a nutshell, the Retrieval-Augmented Generation (RAG) architecture enhances the ability of LLMs to answer user questions by supplying them with external data that is relevant to the user query. To do this, it usually represents documents and queries as numeric vectors and uses efficient algorithms for computing vector similarity.
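To make the conventional approach concrete, here is a minimal sketch that ranks documents by cosine similarity between vectors. The toy vectors, the document ids and the function names are made up for illustration; a real system would obtain the vectors from an embedding model and use an approximate nearest-neighbour index rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors (hypothetical); real embeddings are much larger.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.7, 0.7, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
ranking = rank_by_similarity([1.0, 0.1, 0.0], doc_vecs)
```

In this toy example the query points almost entirely along the first axis, so doc_a ranks first and doc_c, which shares no direction with the query, ranks last.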

Our approach

Our current implementation differs from the commonly known RAG architecture. Instead of retrieving information using vector representations of documents and queries, we use fingerprinting and the RV coefficient to rank documents by how well they match the context of the user’s intent.
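For readers unfamiliar with it, the RV coefficient is a standard statistic that measures how similar two data matrices with the same number of rows are, giving a value between 0 and 1. The sketch below computes the textbook definition in plain Python. How fingerprints are encoded as matrices is not described here, so the inputs are generic placeholders and the function names are hypothetical; this is not a description of the production ranking code.

```python
import math

def center_columns(m):
    """Subtract each column's mean so every column has mean zero."""
    n = len(m)
    means = [sum(row[j] for row in m) / n for j in range(len(m[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in m]

def gram(m):
    """Return M * M^T, the n x n matrix of row-by-row inner products."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]

def trace_of_product(a, b):
    """trace(A * B) for two square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def rv_coefficient(x, y):
    """RV coefficient between two matrices with the same number of rows.

    RV = trace(Sx Sy) / sqrt(trace(Sx Sx) * trace(Sy Sy)),
    where Sx and Sy are the Gram matrices of the column-centered inputs.
    """
    sx = gram(center_columns(x))
    sy = gram(center_columns(y))
    denom = math.sqrt(trace_of_product(sx, sx) * trace_of_product(sy, sy))
    return trace_of_product(sx, sy) / denom if denom else 0.0

# Placeholder matrices: 3 rows (shared observations), 2 columns each.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]]
same_structure = rv_coefficient(x, [[2 * v for v in row] for row in x])
```

Because the coefficient is normalized, rescaling a matrix does not change the score, so comparing x with a doubled copy of itself still yields a value of 1 up to floating-point error.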

On top of that, by default the Chat tool’s knowledge is limited to the documents in the dataset. This deliberate constraint keeps the responses grounded in the provided documents and acts as a safeguard, reducing the risk of hallucinations.

When the user submits a query, the machine finds the most relevant articles in the provided dataset. This approach is similar to our Explore tool and context filters. After ranking the documents, we pass the top results to the model, which then generates the answer based on these articles and presents it to the user.
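The flow described above can be sketched roughly as follows. The relevance scores are assumed to come from an earlier ranking step, and the function names, the value of k and the prompt wording are all hypothetical choices for this example rather than the actual implementation.

```python
def top_k(scored_docs, k=3):
    """Keep the k highest-scoring (doc_id, score) pairs."""
    return sorted(scored_docs, key=lambda pair: pair[1], reverse=True)[:k]

def build_prompt(question, documents, selected):
    """Assemble a prompt that restricts the model to the selected documents."""
    context = "\n\n".join(
        f"[{doc_id}] {documents[doc_id]}" for doc_id, _ in selected
    )
    return (
        "Answer the question using only the documents below. "
        "Cite document ids in square brackets.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

# Made-up dataset and scores, for illustration only.
documents = {
    "d1": "Graphene conducts heat well.",
    "d2": "Perovskites absorb light.",
    "d3": "Unrelated text.",
}
scored = [("d3", 0.12), ("d1", 0.82), ("d2", 0.61)]
selected = top_k(scored, k=2)
prompt = build_prompt("Which material conducts heat?", documents, selected)
```

Only the selected documents end up in the prompt, which is what keeps the generated answer tied to the dataset.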

References

What sets the Chat tool apart is its commitment to transparency and credibility. With each response, the tool provides sources and references, empowering users to delve deeper into the subject matter. This feature not only aids verification but also promotes a culture of academic rigor and accountability. The references are the top relevant articles found through fingerprinting. Each reference has a percentage that reflects its relevance score for the query. This percentage does not indicate the extent to which the document was used to answer the question; it is the criterion by which the document was selected for use by the chatbot.
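As a small illustration of that distinction, the sketch below builds a reference list directly from the ranking scores, without looking at the generated answer at all. The titles, scores and function name are invented for this example.

```python
def format_references(selected, titles):
    """Turn (doc_id, relevance) pairs into numbered reference lines.

    The percentage is the relevance score from ranking, not a measure of
    how much of the document ended up in the answer.
    """
    return [
        f"{rank}. {titles[doc_id]} (relevance: {score:.0%})"
        for rank, (doc_id, score) in enumerate(selected, start=1)
    ]

# Hypothetical titles and scores.
titles = {"d1": "Thermal transport in graphene", "d2": "Perovskite solar cells"}
references = format_references([("d1", 0.82), ("d2", 0.61)], titles)
```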

Using general knowledge

A document is selected for the answer only if its relevance is at least 50%. If the dataset doesn’t contain the necessary information, the Chat tool asks for permission to tap into general knowledge. This approach ensures that users receive comprehensive answers, even when the information lies beyond the curated dataset.
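One way to picture this decision is the sketch below. It assumes that "the dataset doesn’t contain the necessary information" means no document reaches the 50% threshold; that link, the function names and the returned labels are assumptions made for the example.

```python
RELEVANCE_THRESHOLD = 0.5  # the 50% cut-off mentioned in the text

def select_documents(scored_docs, threshold=RELEVANCE_THRESHOLD):
    """Keep only (doc_id, score) pairs at or above the threshold."""
    return [(doc_id, score) for doc_id, score in scored_docs if score >= threshold]

def plan_answer(scored_docs, general_knowledge_allowed=False):
    """Decide how to answer: from the dataset, from general knowledge,
    or by first asking the user for permission (assumed behaviour)."""
    selected = select_documents(scored_docs)
    if selected:
        return ("answer_from_dataset", selected)
    if general_knowledge_allowed:
        return ("answer_from_general_knowledge", [])
    return ("ask_permission", [])

# Made-up scores for illustration.
with_match = plan_answer([("d1", 0.82), ("d2", 0.31)])
without_match = plan_answer([("d1", 0.42), ("d2", 0.31)])
after_consent = plan_answer([("d1", 0.42)], general_knowledge_allowed=True)
```

Keeping the permission step explicit means the tool never silently leaves the dataset, which matches the safeguard against hallucinations described earlier.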

Why the Chat tool stands out

Reliability: Users can trust the responses generated by the Chat tool, thanks to its reliance on a curated dataset and the transparency of sourced references.
Versatility: The LLM technology enables the tool to understand nuances in language and provide precise, contextually relevant answers.
Adaptability: The Chat tool’s ability to tap into general knowledge makes it a universal tool to answer your questions.

Future developments

Our key future development revolves around expanding the context, that is, the amount of text (number of words) the model can process at once. This requires partial retraining of the model. Expanding the context goes hand in hand with our goal of facilitating smoother and more natural dialogues between users and the AI system, as well as increasing the accuracy of the answers.

Further in the future, we are considering creating our own Large Language Model trained specifically on scientific documents.

Conclusion

In summary, the Chat tool, powered by a Large Language Model, is a cutting-edge solution built on a RAG architecture with unique adjustments. Its precision and reliability stem from document ranking through fingerprinting and the RV coefficient, which keeps answers contextually grounded. Transparency is prioritized, with sourced references provided for verification. The deliberate dataset limitation guards against hallucinations. The tool’s ability to tap into general knowledge, coupled with plans for future developments such as handling full-text documents and a specialized scientific model, cements its position as a dynamic and essential resource for diverse inquiries.