Ashley Keil, IBML VP sales, EMEA/APAC, discusses how artificial intelligence platforms can radically change how documents are handled and processed
Don’t panic! You might recall the famous inscription on the cover of The Hitch Hiker’s Guide to the Galaxy, the classic sci-fi book by the late Douglas Adams published in 1979, which charts the adventures of Arthur Dent and Ford Prefect after Vogons demolish Earth to make way for a new hyperspace bypass.
What might not be quite so well known is just how prescient Adams was in referencing technology that has subsequently been developed. The fictitious Hitch Hiker’s Guide was itself almost a precursor to the Kindle – a handheld electronic book able to serve a million pages via a four-inch square screen. The information stored in it is user-generated and constantly updated, which is exactly the approach adopted by Wikipedia. Then there is Babel Fish, which introduced the idea of putting something in your ear to translate languages – a concept brought to market in 2016 by Waverly Labs with the Pilot Smart Earbuds.
Talking of Babel Fish, rapid developments in self-learning artificial intelligence (AI) platforms are now enabling organisations to gain real insight from documents irrespective of the language in which they are written, the computer file format used or whether they contain machine print, cursive handwriting or both.
In particular, they radically change how organisations recognise and classify millions of documents and extract and validate information without any manual intervention, thereby increasing productivity, improving accuracy and saving money.
A vast amount has been invested in deploying traditional recognition technologies like OCR, ICR and intelligent word recognition to analyse the content of documents and boost automation.
However, these still have limitations. Many ICR/OCR engines struggle to process a mix of documents encompassing structured, semi-structured and unstructured data, along with cursive handwriting and old documents with poor legibility. The situation is exacerbated when volumes are high. And no traditional ICR/OCR engine can seamlessly process a variety of languages – jumping from documents in English to ones in Chinese, then German and so on.
With such variability, it’s tough to get more than 90-95% accuracy, forcing staff to rekey information manually. This is time-consuming and costly, and it requires trained employees. One possible workaround is crowd-sourcing, where snippets of data are sent to online entry clerks, logged into an internet-based system, to check and input.
Could AI be the answer? Utilising neural networks, AI-driven document processing platforms have the potential to leapfrog traditional recognition technologies.
At the outset, a system is ‘trained’ so that a consolidated core knowledge base is created about a particular language, form and/or document type. In AI jargon, this knowledge base is the trained ‘model’, and applying it to new documents is known as ‘inference’. The model expands and grows over time as more and more information is fed into the system and as it self-learns through a ‘re-training loop’. When errors in the system are corrected, the model (and the metadata underlying it) updates itself so that it is able to deal with similar situations when they next occur.
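The correction step in that re-training loop can be pictured with a deliberately simple sketch. It assumes a plain lookup table in place of a neural network, and the class and method names are hypothetical, so it illustrates only the feedback idea rather than how any real platform is built.

```python
class CorrectionMemory:
    """Toy stand-in for a self-learning knowledge base (hypothetical design).

    A real platform would retrain a neural network; here a dictionary
    simply remembers each correction an operator has made.
    """

    def __init__(self):
        # Maps a misread string to the value an operator confirmed.
        self.learned = {}

    def recognise(self, raw_text: str) -> str:
        # Apply a remembered correction if one exists, else pass through.
        return self.learned.get(raw_text, raw_text)

    def correct(self, raw_text: str, confirmed_text: str) -> None:
        # Feed the operator's fix back so the same error is handled next time.
        self.learned[raw_text] = confirmed_text


memory = CorrectionMemory()
first_pass = memory.recognise("Sm1th")   # no correction known yet
memory.correct("Sm1th", "Smith")         # operator fixes the misread
second_pass = memory.recognise("Sm1th")  # the fix is now applied
print(first_pass, second_pass)
```

The point of the sketch is the order of events: recognise, correct, then recognise again with the benefit of the correction.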
AI-based systems can be trained automatically to recognise specific forms; review specific content and its layout on the page; and convert cursive handwriting into standard electronic formats, such as PDF or JSON, for analysis or workflow purposes, with validation and verification.
This can also be done at field-level for key value extraction – admittedly, this is something ICR/OCR systems can also do, though they struggle to recognise cursive handwriting and require complex algorithms to find the fields. Key value extraction on a form could be a generic box for ‘name’ or ‘age’ (the key) and the specific values of ‘Mr John Smith’ and ‘50’. On an invoice, the keys would be items purchased and the values the prices paid for each one.
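As a rough illustration of key value extraction, the sketch below assumes the form has already been converted into text lines of the shape "Key: value". The function name and the list of known keys are hypothetical, and a simple pattern match stands in for whatever an AI platform actually does to locate fields.

```python
import re

# Hypothetical field labels; a real system would learn or configure these.
KNOWN_KEYS = ("name", "age")


def extract_key_values(recognised_text: str) -> dict[str, str]:
    """Pair each known key with the value found on the same line."""
    fields = {}
    for line in recognised_text.splitlines():
        # Expect "Key: value"; anything else on the form is ignored.
        match = re.match(r"\s*([A-Za-z ]+?)\s*:\s*(.+?)\s*$", line)
        if match and match.group(1).lower() in KNOWN_KEYS:
            fields[match.group(1).lower()] = match.group(2)
    return fields


form_text = "Name: Mr John Smith\nAge: 50\nSigned: yes"
print(extract_key_values(form_text))
# {'name': 'Mr John Smith', 'age': '50'}
```

The "Signed" line is dropped because it is not one of the configured keys, which is the behaviour a fixed-format questionnaire would want.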
The benefits are clear for governments, healthcare providers, banks and insurance firms that process vast numbers of handwritten forms with identical formats, such as questionnaires, applications and claims. Retrieving handwritten information and converting it into a digital format without human intervention reduces manual errors, lowers cost, allows big data analytics and delivers a faster turnaround. Anywhere up to 50,000 pages per hour can be processed using a single server, with bigger deployments and cloud delivery possible with added compute power.
One German insurance firm is planning to use an AI-powered system to process all claims under a certain value with no human involvement at all, with name, address, insurance number and other details about a given incident captured and checked automatically.
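A rule of that kind might look something like the following sketch. The threshold figure, the field names and the function are all hypothetical, chosen only to show how a value limit and an automatic completeness check could decide whether a claim needs a person to look at it.

```python
from dataclasses import dataclass

# Hypothetical limit; the firm's actual threshold is not stated.
AUTO_PROCESS_LIMIT = 1_000.00

# Hypothetical names for the details captured from each claim form.
REQUIRED_FIELDS = ("name", "address", "insurance_number")


@dataclass
class Claim:
    fields: dict   # details captured from the form
    amount: float  # value claimed


def route(claim: Claim) -> str:
    """Send complete, low-value claims straight through; others to staff."""
    complete = all(claim.fields.get(key) for key in REQUIRED_FIELDS)
    if complete and claim.amount < AUTO_PROCESS_LIMIT:
        return "automatic"
    return "manual review"


small_claim = Claim(
    fields={"name": "A. Example", "address": "1 High St", "insurance_number": "X1"},
    amount=250.0,
)
print(route(small_claim))  # automatic
```

A claim above the limit, or one with a missing detail, falls back to manual review in this sketch.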
One of the consequences of the Covid-19 pandemic and its economic fallout is that many companies will want to improve efficiency in a bid to save money. Those that carry a significant cost and operational overhead in processing forms and other documentation may feel a sense of corporate anxiety, or even alarm, about how to do this.
As The Hitch Hiker’s Guide helpfully advised on its cover, don’t panic. AI is now a reliable, real-world option that performs well for companies grappling with millions of paper documents.