Iris.ai presents data extraction:
The Extract tool
Manually extracting and linking the data you need from PDFs that mix free text, tables, graphs, figures and a plethora of layouts takes major effort from highly skilled staff. Iris.ai’s Extract tool fetches and links all the key data from these documents into a tabular, machine-readable, systematic format. A full month of data extraction work can be done in minutes, at 90% accuracy.
HOW EXTRACTING DATA WORKS
The PDF containing the data points to be extracted is sent to the Iris.ai system. This PDF can be a patent, a clinical trial report, a research paper or any other relevant type of scientific or technical content. It can be a single document at a time, or hundreds or thousands of them in a batch.
The Iris.ai engine extracts the text and identifies all the domain-specific entities, then locates the tables, extracts the data from their rows and columns, and links the data between the text and the tables. Graphs, figures and other elements go through the same process.
The engine then populates a pre-defined output in a machine-readable format: an Excel sheet, an integrated lab tool, a database or anywhere else your researchers require.
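To make the steps above concrete, here is a minimal Python sketch of the same flow: find entities in the text, read rows from a table, link the two, and write a machine-readable output. It is a toy illustration and not Iris.ai's implementation. The function names, the fixed entity vocabulary, and the sample text and table values are all assumptions made up for this example; a real engine would learn the entities for each domain rather than look them up in a hard-coded set.

```python
import csv
import io
import re

# Hypothetical domain vocabulary (assumption); a real system would learn this.
DOMAIN_ENTITIES = {"aspirin", "ibuprofen"}


def find_entities(text):
    """Return the known domain entities mentioned in the free text, sorted."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & DOMAIN_ENTITIES)


def link_rows(entities, table):
    """Keep the table rows whose first cell names an entity found in the text."""
    header, *rows = table
    linked = []
    for row in rows:
        if row[0].lower() in entities:
            record = dict(zip(header, row))
            record["mentioned_in_text"] = "yes"
            linked.append(record)
    return linked


def to_csv(records, fieldnames):
    """Serialise the linked records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative sample input (made up for this sketch).
text = "Aspirin showed a higher melting point than the reference compound."
table = [
    ["compound", "melting_point_c"],
    ["Aspirin", "135"],
    ["Paracetamol", "169"],
]

entities = find_entities(text)
records = link_rows(entities, table)
csv_output = to_csv(records, ["compound", "melting_point_c", "mentioned_in_text"])
print(csv_output)
```

Only the table row that matches an entity found in the text reaches the output, which is the linking step in miniature. Swapping `to_csv` for a database or spreadsheet writer would change the destination without touching the extraction steps.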
The tool can easily be custom-trained on your domain, so it can be used across a range of fields such as chemistry, materials science, pharmaceuticals, medical science and many others.
Iris.ai has spent the last 5 years building an award-winning AI engine for scientific text understanding. Our algorithms for text similarity, tabular data extraction, domain-specific entity representation learning, and entity disambiguation and linking measure up to the best in the world. On top of that, our engine builds a comprehensive knowledge graph containing all entities and their linkages, so that humans can learn from it, use it and give feedback to the system. Applying these techniques to scientific and technical text is a complex challenge that few others can meet.