The Imperative Need for Machine Learning in the Public Sector


The sheer scale of backlogs in the public sector is troubling for a sector designed to serve its constituents. Last summer, four-month waits to receive passports made headlines, a substantial increase from the pre-pandemic turnaround of six to eight weeks. More recently, the Internal Revenue Service (IRS) announced that it entered the 2022 tax season with roughly 15 times its usual backlog, along with a plan for working through it.

These frequently publicized backlogs do not exist because of a lack of effort. The sector has made strides through technological advancements over the past decade. Yet legacy technology and outdated processes still plague some of our country’s most important departments. Agencies today must embrace digital transformation efforts designed to reduce data backlogs, improve citizen response times, and drive better agency results.

By adopting machine learning (ML) solutions and integrating advances in natural language processing (NLP), agencies can make backlogs a thing of the past.

How ML and AI can connect the physical and digital worlds

From tax documents to passport applications, manually processing items is time-consuming and prone to error on both the sender and receiver side. For example, a sender may mistakenly check an incorrect box or the recipient may interpret the number “5” as the letter “S”. This creates unforeseen processing delays or, worse, inaccurate results.
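One simple way to guard against this kind of character confusion is a field-aware cleanup pass: if a field is known to be numeric, letters that OCR commonly mistakes for digits can be mapped back. The sketch below is purely illustrative (the confusion table and function are my own, not from any specific product):

```python
# Common OCR character confusions when a field is expected to be numeric.
# This mapping is a hypothetical example, not an exhaustive or standard table.
OCR_CONFUSIONS = {"S": "5", "O": "0", "I": "1", "B": "8", "l": "1"}

def normalize_numeric(raw: str) -> str:
    """Replace commonly confused letters in a field expected to hold digits."""
    return "".join(OCR_CONFUSIONS.get(ch, ch) for ch in raw)

normalize_numeric("1S0O")  # -> "1500"
```

Rules like this are only a first line of defense; production systems pair them with confidence scores and human review for low-confidence fields.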

But managing the growing problem of the government’s document and data backlog isn’t as simple and straightforward as uploading information to processing systems. The large number of documents and citizen information entering agencies in various formats and unstructured data states, often with poor readability, makes it nearly impossible to reliably and efficiently extract data for downstream decision-making.

Embracing artificial intelligence (AI) and machine learning in daily government operations, just as other industries have done in recent years, can provide the intelligence, agility, and advantage needed to streamline processes and enable end-to-end automation of document-centric processes.

Government agencies need to understand that real change and lasting success will not come with fast patchworks based on legacy Optical Character Recognition (OCR) or alternative automation solutions, given the large amount of incoming data.

Bridging the physical and digital worlds is possible through Intelligent Document Processing (IDP), which leverages proprietary ML models and human intelligence to classify and convert complex, human-readable document formats. PDFs, images, emails, and scanned forms can all be converted into structured, machine-readable information using IDP. It does this more accurately and efficiently than traditional alternatives or manual approaches.

In the case of the IRS, inundated with millions of documents such as 1099 forms and individual W-2s, sophisticated ML and IDP models can automatically identify the scanned document, extract printed and handwritten text, and structure it into a machine-readable format. This automated approach speeds processing times, incorporates human assistance when needed, and is highly efficient and accurate.
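The classify-extract-structure flow described above can be sketched in a few lines. This is a toy stand-in, not Hyperscience's implementation: real IDP systems use trained ML models where this sketch uses keyword matching and regular expressions, but the pipeline shape is the same.

```python
import re

# Hypothetical keyword table mapping form types to identifying phrases.
FORM_KEYWORDS = {
    "W-2": ["wage and tax statement", "w-2"],
    "1099": ["miscellaneous income", "1099"],
}

def classify(text: str) -> str:
    """Step 1: assign a document type (here, via simple keyword matching)."""
    lowered = text.lower()
    for form_type, keywords in FORM_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return form_type
    return "unknown"

def extract_fields(text: str) -> dict:
    """Step 2: pull labeled lines like 'Wages: 52000.00' into key/value pairs."""
    fields = {}
    for match in re.finditer(r"(?m)^(?P<key>[\w ]+):\s*(?P<value>.+)$", text):
        fields[match.group("key").strip().lower()] = match.group("value").strip()
    return fields

def process_document(text: str) -> dict:
    """End to end: human-readable text in, structured record out."""
    return {"type": classify(text), "fields": extract_fields(text)}

sample = "Form W-2 Wage and Tax Statement\nEmployee: Jane Doe\nWages: 52000.00"
record = process_document(sample)
# record -> {"type": "W-2", "fields": {"employee": "Jane Doe", "wages": "52000.00"}}
```

In a real deployment, the classifier and extractor are learned models operating on page images, and records below a confidence threshold are routed to a human reviewer rather than passed straight downstream.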

Advancing ML Efforts with NLP

Along with automation and IDP, the introduction of ML and NLP technologies can significantly support the industry’s quest to improve processes and reduce backlogs. NLP is a field of computer science that processes and understands text and spoken words as humans do, traditionally grounded in computational linguistics, statistics, and data science.

The field has seen significant advances, such as the introduction of large language models containing over 100 billion parameters. These models can power many complex language processing tasks, such as classification, speech recognition, and machine translation. These advancements could support even greater data mining in a document-ridden world.
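To make the classification task concrete, here is a deliberately tiny sketch (my own example, far simpler than the billion-parameter models the article describes): incoming text is routed to a category by bag-of-words cosine similarity against labeled reference examples.

```python
import math
from collections import Counter

# Hypothetical labeled examples for routing citizen correspondence.
EXAMPLES = {
    "passport": "passport application renewal travel document",
    "tax": "tax return refund 1099 w-2 filing",
}

def vectorize(text: str) -> Counter:
    """Bag-of-words: count each lowercased token."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def route(text: str) -> str:
    """Pick the label whose reference example is most similar to the text."""
    vec = vectorize(text)
    return max(EXAMPLES, key=lambda label: cosine(vec, vectorize(EXAMPLES[label])))

route("I need to renew my passport")  # -> "passport"
```

Modern NLP models replace the word-count vectors with learned embeddings that capture meaning rather than surface overlap, which is what lets them handle phrasing the reference examples never mention.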

NLP is well on its way to matching the text-comprehension ability of a human knowledge worker, thanks to technological advances driven by deep learning. Similar advances in deep learning also allow computers to understand and process other human-readable content, such as images.

For the public sector in particular, this may include images included in disability applications or other forms or applications that contain more than text. These advances could also improve downstream stages of public sector processes, such as ML-based decision-making for agencies determining unemployment assistance, Medicaid insurance, and other invaluable government services.

Not upgrading is no longer an option

While we’ve seen a handful of promising digital transformation improvements, the call for systemic change has yet to be fully answered.

Moving forward requires that agencies go beyond patching and reinvesting in legacy systems. Patchwork fixes and investments in outdated processes fail to support new use cases, are fragile in the face of change, and cannot handle unexpected spikes in volume. Instead, introducing a flexible solution that can take the most complex, hard-to-read documents from input to output should be a no-brainer.

Why? Citizens deserve more from the organizations that serve them.

CF Su is Vice President of Machine Learning at Hyperscience.

