

Your Research


Smart AI/NLP tools for all your research document processing. 

Trained on your research field

With no human effort. (No, really.)

Full specialization of the Workspace on your exact field of research, without any human labeling, taxonomies or training. 

Adaptable tools for your process

Centered around your content, the tools allow for a wide variety of processes – on your own or in collaboration.

Any type of documents

The Researcher Workspace allows the addition of a variety of datasets, including research papers, patents, internal research documentation and much more.


Content-based search

Bypass keywords and explore across disciplines with a content-based recommendation engine.

Context filtering

Filter down a document set based on criteria you can explain in a sentence, but not put into one keyword.

Data filtering

Filter based on information extracted from your documents like specific entities, data points or data ranges. 

Extracting and systematizing data

Automatically extract and systematize any key data points from text and tables into a table layout of your own design.

Document set analysis

Analyze a large set of documents to form an overview of the content – and decide what to include and exclude. 



Machine-generated summaries

Ask the machine to summarize single or multiple documents to get a quick overview, or to kickstart your writing.

Chatting with your dataset

Interact with your datasets to distill meaningful insights and summarize findings.

New challenges?

Have a specific scientific or medical text-related problem you’re not seeing the solution to? Get in touch – we love a challenge!


We have spent the last 8 years building an award-winning AI engine for scientific text understanding. Our algorithms for text similarity, tabular data extraction, domain-specific entity representation learning, and entity disambiguation and linking measure up to the best in the world. On top of that, our machine builds a comprehensive knowledge graph containing all entities and their linkages, so humans can learn from it, use it, and give feedback to the system. Applying these techniques to scientific and technical text is a complicated challenge few others can meet.

Watch the recent webinar introducing the Researcher Workspace 1.0


The Researcher Workspace

The Researcher Workspace is a powerful software suite that puts your research content at the center. You can upload any collection of research documents, or connect directly to a live proxy dataset such as a publisher, a patent authority, an internal repository, or any other source relevant to your research.

Once the content is added, you have access to a broad variety of smart tools that you can apply and combine as needed. Every research process is a little different, and your Researcher Workspace will enable any workflow.

Reinforced on your domain

The core machine learning engine is trained on a large set of scientific articles. Domain adaptation then increases the quality of results in narrower domains by reinforcing the meaning of the respective words (“tokens”) within the particular domain. This happens with minimal to no human interaction.

All you need to do is provide 10-20 representative documents relevant to your field of research. These can represent anything from a broad field to a narrow subdomain. The machine expands that seed set with Open Access literature into a dataset of about 2,000 documents, builds a new vocabulary specific to your field, and reinforces itself on this new vocabulary.

This means your version of the Researcher Workspace now has a clearer understanding of the terminology you and your fellow researchers are using, specifically in your context of research. 
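The actual reinforcement pipeline is proprietary, but the core idea – surfacing terms that are far more frequent in your seed documents than in general literature – can be sketched in a few lines. This is a toy stand-in, not the product's method; the example documents, the frequency-ratio heuristic, and the `min_ratio` threshold are all illustrative assumptions.

```python
from collections import Counter

def domain_terms(seed_docs, general_docs, min_ratio=2.0):
    """Toy sketch: find terms markedly more frequent in the seed corpus
    than in general literature -- a crude stand-in for the vocabulary
    the engine builds from your 10-20 representative documents."""
    def freqs(docs):
        counts = Counter(w.lower() for d in docs for w in d.split())
        total = sum(counts.values()) or 1
        return {w: c / total for w, c in counts.items()}

    seed, general = freqs(seed_docs), freqs(general_docs)
    # Unseen words get a small floor frequency instead of zero.
    floor = 1.0 / (sum(len(d.split()) for d in general_docs) + 1)
    return sorted(
        w for w, f in seed.items()
        if f / general.get(w, floor) >= min_ratio
    )

seed = ["perovskite solar cell efficiency", "perovskite film stability"]
general = ["the study of cell biology", "film review of the year"]
print(domain_terms(seed, general))  # ['perovskite']
```

Words like "cell" and "film" appear in both corpora and are filtered out, while the genuinely domain-specific term survives – the same intuition, at miniature scale, behind reinforcing a field-specific vocabulary.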



In the Researcher Workspace, you can work with almost any type of scientific and research content – open access or paywalled research papers, patents, internal documents, grey papers, white papers, tech specs – you name it. Running federated searches across a wide variety of sources saves an enormous amount of time.

The Researcher Workspace comes with a variety of datasets you can load directly into your own Workspace. These include most of the world’s Open Access content, in addition to PubMed, the US Patent Office, and CORDIS (all EU-funded research projects). You can choose to integrate further proxy datasets (live collections of content), for example paywalled subscription content or your own store of internal or external research documentation. These proxy datasets are then regularly updated in your Workspace, and you can choose to be alerted to new documents that fit your criteria for each project.

Each user can also simply upload a list of documents – directly from their computer, through personal cloud storage, or via a file (e.g. BibTeX) exported from their reference manager. These datasets are static, and can be processed in the tools in the same way as a connected database of content.



Context filtering

Some inclusion/exclusion criteria are straightforward: they can easily be expressed with a keyword or three. Unfortunately, this is not always the case – more often, the criteria are much more easily expressed as a description of a context. Examples include the context in which a chemical is applied, or the intended use of a drug.

With the Workspace’s context filters, you can write your own context descriptions of roughly 50-100 words, each of which is matched against every article in your content list. You can add as many as you want, and use them either for inclusion or exclusion. Think of a Venn diagram of contexts applied to your reading list, allowing you to rapidly filter down.
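Matching a free-text context description against articles amounts to scoring text similarity and keeping documents above a threshold. The sketch below uses a naive bag-of-words cosine similarity purely for illustration – the Workspace's learned representations are far richer – and the documents, context text, and `threshold` value are invented for the example.

```python
import math
from collections import Counter

def bow_cosine(a, b):
    """Cosine similarity of two texts as bags of words (a crude
    stand-in for a learned text representation)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def context_filter(docs, include, exclude=(), threshold=0.2):
    """Keep docs matching any inclusion context and no exclusion context."""
    kept = []
    for doc in docs:
        inc = any(bow_cosine(doc, c) >= threshold for c in include)
        exc = any(bow_cosine(doc, c) >= threshold for c in exclude)
        if inc and not exc:
            kept.append(doc)
    return kept

docs = [
    "ibuprofen used as an anti inflammatory drug for arthritis",
    "ibuprofen synthesis route and industrial production yield",
]
ctx = ["clinical use of a drug to treat inflammatory disease"]
print(context_filter(docs, include=ctx))
```

Both abstracts mention ibuprofen, but only the first matches the clinical-use context – exactly the kind of distinction a single keyword cannot make.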

Data filtering

For some filtering tasks, detailed specificity is needed. Whether you need to filter on a named entity, a specific data point, or a data range, the advanced filtering extracts and identifies the exact information from the documents, and you can then use the identified variables to filter down the list.

Need to know which of the 500 documents in front of you report on steel with a tensile strength between 600 and 650 MPa? Or whether any of the papers in PubMed report on ibuprofen with a prevalence of nausea as an adverse effect above 5%? This is what the advanced filtering is there for.
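In its simplest form, the tensile-strength example boils down to extracting a numeric value with its unit and applying a range check. The regex below is a deliberately simplistic sketch of what an extraction-backed filter does; the example sentences and the fixed phrasing the pattern expects are assumptions, not the product's extraction logic.

```python
import re

def tensile_strength_mpa(text):
    """Pull a reported tensile strength (MPa) out of free text, if any."""
    m = re.search(r"tensile strength of (\d+(?:\.\d+)?)\s*MPa", text)
    return float(m.group(1)) if m else None

def filter_by_range(docs, lo, hi):
    """Keep documents whose extracted value falls inside [lo, hi]."""
    return [d for d in docs
            if (v := tensile_strength_mpa(d)) is not None and lo <= v <= hi]

docs = [
    "Sample A: steel with a tensile strength of 620 MPa after quenching.",
    "Sample B: aluminium alloy with a tensile strength of 310 MPa.",
    "Sample C: no mechanical testing was performed.",
]
print(filter_by_range(docs, 600, 650))  # keeps only Sample A
```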


Data extraction

Manually extracting – and linking – the data you need from a PDF of free text, tables, graphs, figures, and a plethora of layouts requires major effort from highly skilled people. The Extract tool fetches and links all the key data from these documents into a tabular, machine-readable, systematic format. A full month of data extraction work can be done in minutes, at 90% accuracy.

The PDF containing the relevant data points to be extracted is sent to the system. This PDF can be a patent, a clinical trial report, a research paper or any other relevant type of scientific or technical content. It can be one simple document at a time, or hundreds or thousands of them in a batch.

The engine extracts the text and identifies all the domain-specific entities, locates the tables and extracts the data from rows and columns, and links the data between text and tables. It then populates a pre-defined output in a machine-readable format: an Excel sheet, an integrated lab tool, a database, or anywhere else you require.
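The final step – pouring linked entity/value pairs into a layout of your own design – can be pictured as writing records into a fixed column schema. The records and column names below are hypothetical; CSV stands in here for whichever machine-readable target (Excel, database, lab tool) you configure.

```python
import csv
import io

# Hypothetical output of the extraction step: entity/value pairs
# already linked between the document text and its tables.
records = [
    {"sample": "A36 steel", "property": "tensile strength", "value": 620, "unit": "MPa"},
    {"sample": "A36 steel", "property": "yield strength", "value": 250, "unit": "MPa"},
]

def to_table(records, columns):
    """Populate a pre-defined, machine-readable layout (here: CSV)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

print(to_table(records, ["sample", "property", "value", "unit"]))
```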


Machine-generated summaries

The Workspace comes with a configurable summarization engine. It can rapidly produce summaries of multiple abstracts, a single full-text document, or multiple full-text documents. These summaries are great for rapidly reviewing larger sets of similar documents – or for kickstarting your scientific writing.

The summary tool does abstractive summarization – meaning the tool actually writes its own summary, as opposed to copy-pasting sentences together (called extractive summarization). This means the summary flows quite well and contains the most important bits the document(s) have in common. The summaries can also be configured – is it a short, two-sentence summary of 10 documents you’re after, or a one-page summary of a 20-page document? 

Want summaries focused on a specific topic, or specific sections of your documents? Get in touch to discuss our next release of the summarization tool. 

Document set analysis

When you’re confronted with a set of search results – through the Explore tool, or in another search tool before importing the content – it can be daunting to know what to filter for, either to include or exclude. Dealing with unknown unknowns can demand a lot of trial and error, and a lot of title reading. With the Workspace’s document set analysis, you will quickly get an overview of the content of a document set.

After analyzing a document set – anything from a handful of documents up to 20,000 is supported – you will be presented with a variety of results:

You will see the topic groups of the literature list, both from a global perspective (what topics these articles fall within at an overall scientific level) and a specific one (within this reading list, what topics the articles fall within). As one article can be part of multiple topic groups, this is a helpful way to select groups for inclusion and exclusion without missing any relevant documents.

You will also be able to explore the most meaning-bearing words of the document set, the rare words which may carry special meaning in this context – and all their related synonyms.
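A miniature version of this word analysis can be built on document frequency: words shared across many documents characterize the set, while words confined to a single document are candidates for special meaning. The documents and the one-document cutoff for "rare" are illustrative assumptions; the product's actual analysis (and its synonym expansion) is far more sophisticated.

```python
from collections import Counter

def word_stats(docs, rare_max_df=1):
    """Sketch of a document-set analysis: rank words by how many
    documents they appear in; words in only one document are 'rare'."""
    df = Counter()
    for d in docs:
        df.update(set(d.lower().split()))  # count each word once per doc
    common = [w for w, c in df.most_common() if c > rare_max_df]
    rare = sorted(w for w, c in df.items() if c <= rare_max_df)
    return common, rare

docs = [
    "graphene oxide membranes for water filtration",
    "graphene electrodes for supercapacitors",
    "water desalination with polymer membranes",
]
common, rare = word_stats(docs)
print(common)  # words shared across documents
print(rare)    # candidate special-meaning words
```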


Chatting with your dataset

After filtering your dataset, you can use the LLM-based Chat tool to distill insights.

Interact with your datasets to distill meaningful insights, summarize findings, translate text, and perform other operations. Because the machine answers your questions based on the documents in the dataset, the possibility of hallucination is minimized and factual accuracy is strengthened. If the answer to your question lies outside the provided dataset, the machine can fall back on general knowledge.
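Grounding an LLM's answers in a dataset typically means retrieving the most relevant documents and placing them in the prompt. The sketch below shows that pattern up to the prompt-building step, using toy word-overlap retrieval; the dataset, the ranking heuristic, and the prompt wording are all assumptions, and the LLM call itself is deliberately omitted.

```python
import re

def retrieve(question, docs, k=2):
    """Rank dataset documents by word overlap with the question
    (a toy stand-in for the Workspace's retrieval step)."""
    tokens = lambda t: set(re.findall(r"\w+", t.lower()))
    q = tokens(question)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, docs):
    """Ground the (hypothetical) LLM in retrieved passages so answers
    come from the dataset rather than the model's imagination."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs))
    return ("Answer using only the context below. If the context is "
            f"insufficient, say so.\n\nContext:\n{context}\n\n"
            f"Question: {question}")

dataset = [
    "Ibuprofen reduced reported nausea in 4% of trial participants.",
    "Steel sample A reached a tensile strength of 620 MPa.",
]
print(build_prompt("What adverse effects were reported for ibuprofen?", dataset))
```

Instructing the model to answer only from the supplied context – and to admit when the context is insufficient – is the mechanism behind the lower hallucination rate described above.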
