OCR & RPA: In-depth guide to data extraction with RPA

Hypatos Team
March 28, 2019

A survey of public RPA case studies revealed that more than half of all processes automated with RPA involve documents. Augmenting your RPA tool with specialized, deep-learning-based data extraction tools

  • reduces development effort, enabling faster delivery
  • increases automation level

Key Questions Answered

  • What are OCR alternatives in RPA tools?
  • How does Hypatos compare against RPA tools’ OCR offering?
  • How to integrate Hypatos into RPA?

What are OCR alternatives in RPA tools?

Although modern RPA tools provide the functionality needed to automate SME-level document processing, they lack the features required for enterprise-level automation. Hypatos offers enterprise-level automation functionality and can easily be integrated with any RPA tool.

RPA tools traditionally offered only legacy OCR technology that converted images into text. RPA consultants then wrote rule-based code to extract data from that text. This approach was fragile and time-consuming. However, it enabled automation in an area where companies had relied entirely on manual processing. It was also flexible, enabling the automation of various processes.
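
To illustrate why rule-based extraction is fragile, here is a minimal sketch of the kind of hand-written rule it involves. The sample OCR text, the field label, and the function name are all hypothetical, invented for illustration rather than taken from any RPA product.

```python
import re
from typing import Optional

# Hand-written rule: look for the label "Invoice No" followed by an identifier.
INVOICE_NUMBER_RULE = re.compile(r"Invoice No[.:]?\s*([A-Z0-9-]+)")


def extract_invoice_number(ocr_text: str) -> Optional[str]:
    """Return the invoice number if the rule matches, otherwise None."""
    match = INVOICE_NUMBER_RULE.search(ocr_text)
    return match.group(1) if match else None


# Layout the rule was written for (hypothetical OCR output).
known_layout = "ACME GmbH\nInvoice No: INV-2019-0042\nTotal: 119.00 EUR"
# Another supplier uses a different label, so the same rule silently fails.
new_layout = "Globex Ltd\nInvoice # INV-7781\nAmount due: 59.50 EUR"

print(extract_invoice_number(known_layout))  # INV-2019-0042
print(extract_invoice_number(new_layout))    # None
```

Every new supplier layout needs another rule of this kind, which is where the development and maintenance effort accumulates.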

Modern RPA tools like UiPath offer improved OCR technology that returns structured data. While this can serve the needs of an SME, it is unlikely to support enterprise-grade automation, because no RPA OCR product that we reviewed included:

  • the ability to add further fields to extract from supported documents
  • human-in-the-loop functionality that enables continuous learning
  • document processing features beyond data extraction
  • support for a diverse set of documents beyond invoices and receipts
  • on-premise deployment

Hypatos is a specialized, deep-learning-based, enterprise-grade document automation tool. Although our API covers data extraction only, we also provide further document processing services (e.g. validation, enrichment) to our customers.

How does Hypatos compare against RPA tools’ OCR offering?

We shaped the Hypatos offering while working with Fortune 500 clients, supporting the Big 4 in their client engagements. Both these projects and the Big 4’s experience of how enterprises evaluate document automation technology informed what we built.

In technology procurement, enterprises look for highly effective, future-proof, enterprise-ready solutions with a low total cost of ownership (TCO) and a fast implementation time. Let’s break those down:


Highly effective

Software effectiveness is hard to measure. It is difficult to quantify how much time an accountant saves thanks to Excel without having that accountant work without Excel for a test period. In document automation, however, there are some relatively simple metrics:

  • No-touch automation rate: The share of documents that are processed completely automatically. This can be measured for a single step in the process, such as data extraction, or for the whole process. It sounds simple, but the industry uses many different names for it, which causes confusion. Straight-through processing (STP) rate and zero-touch rate mean the same thing.
  • Workforce required to process the company’s document load: An automation solution is not doing its job if it does not allow employees to focus on higher-value tasks. With solutions that provide limited automation, enterprises are discovering that they still need to employ the same number of employees on repetitive back-office tasks.
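
The first metric can be computed directly from processing records. The sketch below assumes a hypothetical record type, `ProcessedDocument`, that stores how many fields a person had to correct; the names and the sample batch are illustrative only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ProcessedDocument:
    document_id: str
    manual_corrections: int  # number of fields a person had to fix


def no_touch_rate(documents: Iterable[ProcessedDocument]) -> float:
    """Share of documents processed without any human intervention."""
    docs = list(documents)
    if not docs:
        raise ValueError("no documents to measure")
    untouched = sum(1 for doc in docs if doc.manual_corrections == 0)
    return untouched / len(docs)


# Hypothetical batch: one of four documents needed manual corrections.
batch = [
    ProcessedDocument("inv-001", 0),
    ProcessedDocument("inv-002", 2),
    ProcessedDocument("inv-003", 0),
    ProcessedDocument("inv-004", 0),
]
print(f"No-touch rate: {no_touch_rate(batch):.0%}")  # No-touch rate: 75%
```

A document counts as touched as soon as one field is corrected, so the rate measures whole documents, not field-level accuracy.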

On these two metrics, Hypatos is the best-performing automation tool according to our clients. Several factors explain this:

  • Human-in-the-loop functionality that enables continuous learning: Continuous learning is essential for effectiveness, as pre-trained solutions rarely exceed a 60-70% no-touch rate.
  • Document processing features beyond document extraction: Data extraction is one step in several multi-step processes such as purchase-to-pay or travel and expense management. With the exception of some BPO operations, almost no employee spends their time solely on data extraction; employees also support several reconciliation activities, such as Invoice Received Goods Received Clearing. Focusing on data extraction alone rarely frees any employees to focus on higher-value tasks.

Future proof

No company that we have met considers stopping its finance transformation or automation journey after automating accounts payable. Every CEO and CFO wants ambitious automation initiatives that enable a significant percentage of their employees to focus on higher-value tasks. They needed such initiatives yesterday but would settle for initiatives that deliver such results within a year.

At Hypatos, we focus on building custom models for enterprises that can process a wide variety of documents like receipts, offers, delivery notes, payment forms, tax forms, CVs etc. in addition to invoices.


Enterprise ready

Enterprise readiness involves support with SLAs and monitoring tools, observability via detailed log files, scalability, and the ability to deploy to the client’s preferred environment. While RPA solutions deliver all of these features, some of their OCR tools require a public cloud connection, which limits their enterprise acceptance. Hypatos delivers solutions both in the cloud and on-premise, enabling companies to comply with their specific data privacy policies.


Low TCO

As EY points out, maintenance is one of enterprises’ biggest concerns about RPA. These concerns are not unfounded. Most RPA installations rely on custom-built code that must change as market conditions, internal processes, and external laws and regulations change. Hypatos provides a maintenance-free solution, since all technology maintenance is handled by the Hypatos team as part of our license fee.

This allows us to have the most transparent pricing model: with a simple price per document, enterprises are able to automate their document-based processes. They don’t have to consider additional costs such as training, consulting, or implementation.

Fast implementation

RPA is renowned for fast deployment. It is faster still when enterprises use it for its original area of focus: automating structured data operations between software systems. Automating document processing with RPA takes longer, as developers need to build custom logic.

For example, invoices need to be checked for VAT compliance. In partnership with Deloitte, we built these VAT rules into our product so enterprises can check for VAT compliance while completing data extraction. VAT compliance rules change as regulation changes, and we keep them up to date with support from Deloitte. In an RPA deployment without Hypatos, such rules would either need to be hand-coded or be left out. Coding these rules is time-consuming, while omitting them exposes companies to significant financial and legal risk.
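
To give a sense of what hand-coding even one such rule involves, the sketch below checks only the arithmetic of an invoice's VAT amounts. It is an illustrative assumption and not one of the rules built with Deloitte: the function name, the 19% example rate, and the rounding mode are chosen for the example, and real VAT compliance covers far more than arithmetic.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import List

CENT = Decimal("0.01")


def check_vat_arithmetic(net: str, vat_rate: str, vat: str, gross: str) -> List[str]:
    """Return a list of problems found; an empty list means the amounts agree."""
    net_d, rate_d, vat_d, gross_d = (Decimal(x) for x in (net, vat_rate, vat, gross))
    problems = []
    # Expected VAT is net times rate, rounded to whole cents (assumed rounding mode).
    expected_vat = (net_d * rate_d).quantize(CENT, rounding=ROUND_HALF_UP)
    if vat_d != expected_vat:
        problems.append(f"VAT amount {vat_d} does not match expected {expected_vat}")
    if net_d + vat_d != gross_d:
        problems.append(f"net {net_d} + VAT {vat_d} does not equal gross {gross_d}")
    return problems


consistent = check_vat_arithmetic("100.00", "0.19", "19.00", "119.00")
inconsistent = check_vat_arithmetic("100.00", "0.19", "16.00", "119.00")
print(consistent)         # []
print(len(inconsistent))  # 2
```

Amounts are passed as strings and handled as `Decimal` values to avoid binary floating-point rounding errors in currency arithmetic.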

Hypatos, with its document specific functionality, enables a faster RPA deployment.

How to integrate Hypatos into RPA?

We provide an easy-to-integrate API endpoint that takes an image and returns structured data in JSON format. In addition, our UI for manual corrections can be integrated into any RPA project. Users can log in to Hypatos studio with their credentials to find extracted data points next to the document images and correct any extraction errors.

To integrate Hypatos into your RPA project, get your API credentials.
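
As a rough sketch of what calling such an endpoint from a script in an RPA workflow could look like, the example below prepares an image upload and parses a JSON reply. The URL, the bearer-token header, the content type, and the response field names are all assumptions made for illustration; consult the documentation supplied with your credentials for the actual endpoint and payload format.

```python
import json
import urllib.request
from typing import Any, Dict

# Placeholder endpoint (hypothetical, not the real service URL).
EXTRACTION_URL = "https://api.example.com/extract"


def build_extraction_request(image_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Prepare a POST request carrying the document image (auth scheme is assumed)."""
    return urllib.request.Request(
        EXTRACTION_URL,
        data=image_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "image/png",
        },
    )


def parse_extraction_response(body: bytes) -> Dict[str, Any]:
    """Decode the JSON body returned by the extraction service."""
    return json.loads(body.decode("utf-8"))


def extract(image_bytes: bytes, api_key: str) -> Dict[str, Any]:
    """Send the image and return the structured data (performs a network call)."""
    request = build_extraction_request(image_bytes, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_extraction_response(response.read())


# Offline demonstration with a made-up response body (field names are hypothetical).
sample_body = b'{"invoice_number": "INV-2019-0042", "total": "119.00", "currency": "EUR"}'
fields = parse_extraction_response(sample_body)
print(fields["invoice_number"])  # INV-2019-0042
```

Keeping request construction and response parsing separate from the network call lets a bot developer test the mapping into downstream systems without sending real documents.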
