
Documents Anonymisation

Electronic deletion of sensitive data

The anonymization module is a tool that is used to automate the hiding of personal data and other sensitive information from document images.

Since the EU General Data Protection Regulation (GDPR) came into force, data anonymization has become a well-known process. Because of the volume of personal data that many companies process, automating it is crucial. For this reason we have created an anonymizer designed for fast and efficient data anonymization.

How does anonymisation work?

Competing solutions on the market can anonymize documents, but they usually require the documents to be in an editable format. Someone also has to read the entire document and manually rewrite every word that needs to be anonymized.

The anonymization module for JobRouter® created by e-MSI also anonymizes documents in image formats (scans, photos). Using intelligent search algorithms and data type recognition, the system automatically recognizes and conceals sensitive data.

The process

The document that needs to be anonymized is scanned and sent to a designated e-mail address.

The system monitors the e-mail inbox, divides the scanned documents based on their assigned barcodes, and then runs OCR (Optical Character Recognition) and other intelligent algorithms that help identify sensitive data, such as:

  • Names and Surnames (supported by name database),
  • Identity numbers, tax identification numbers and other numbers with a fixed structure configured by the administrator (data in a specific format and/or with a checksum),
  • Addresses,
  • Other data stored in the client's domain systems (any database can be supplied to the system as a source of patterns).
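
As an illustration of how a "fixed structure plus checksum" rule from the list above might be evaluated on OCR text, here is a minimal sketch. It is not the module's actual implementation: the `FixedPattern` class, the `find_candidates` function and the sample text are hypothetical, and the Polish PESEL number is used only as one well-known example of an identifier with a check digit.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


def pesel_checksum_ok(digits: str) -> bool:
    """Validate the check digit of an 11-digit Polish PESEL number."""
    weights = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)
    total = sum(int(d) * w for d, w in zip(digits[:10], weights))
    return (10 - total % 10) % 10 == int(digits[10])


@dataclass
class FixedPattern:
    """Hypothetical administrator-defined rule: a format and an optional checksum."""
    label: str
    regex: re.Pattern
    checksum: Optional[Callable[[str], bool]] = None


def find_candidates(text: str, patterns: List[FixedPattern]) -> List[Tuple[str, int, int]]:
    """Return (label, start, end) for every match that also passes its checksum."""
    hits = []
    for pattern in patterns:
        for match in pattern.regex.finditer(text):
            if pattern.checksum is None or pattern.checksum(match.group()):
                hits.append((pattern.label, match.start(), match.end()))
    return hits


# Assumed configuration: one rule for 11-digit numbers with a PESEL check digit.
PATTERNS = [FixedPattern("PESEL", re.compile(r"\b\d{11}\b"), pesel_checksum_ok)]

sample = "Applicant 44051401359, reference 12345678901."
print(find_candidates(sample, PATTERNS))
```

Both numbers in the sample match the 11-digit format, but only the first passes the checksum, so only it is proposed for anonymization. Combining a format with a checksum in this way reduces false positives compared with matching on the format alone.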

All recognised words for potential anonymisation are marked up and submitted for final user verification.

  • The system generates a task for the user with marked areas.
  • The user verifies the suggested data and marks additional content to be hidden (if there is any).
  • The user approves the final scope of anonymization and submits the document to the next step.

Afterwards, the document can go through one or more approval stages.

In the final anonymized document, all verified sensitive data is blacked out. The file is then exported in an irreversible form and, depending on the client's needs, a copy is placed in a designated folder or in a target system.
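
One way to make blacking out irreversible is to overwrite the pixel values themselves, rather than drawing a removable layer on top of the original image. The sketch below shows that idea on a plain grayscale pixel grid. The `redact` function and the box format are assumptions made for illustration; the source does not describe how the module produces its output.

```python
def redact(pixels, boxes, fill=0):
    """Return a copy of a grayscale image with every box overwritten by `fill`.

    pixels: list of rows, each a list of 0-255 integers.
    boxes:  (left, top, right, bottom) tuples; right and bottom are exclusive.
    """
    out = [row[:] for row in pixels]          # leave the input untouched
    height = len(out)
    width = len(out[0]) if out else 0
    for left, top, right, bottom in boxes:
        # Clamp each box to the image so out-of-range coordinates are harmless.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                out[y][x] = fill              # the original value is discarded
    return out


page = [[255] * 6 for _ in range(4)]          # a blank white 6x4 "scan"
page[1][2] = 90                               # pretend this pixel is part of a name
redacted = redact(page, [(1, 1, 4, 3)])       # one verified area to hide
```

Because the redacted copy no longer contains the original pixel values inside the boxes, the hidden content cannot be recovered from the exported file alone.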


Currently, most documents are anonymized manually: they are scanned, and personal details are then erased by hand in graphics programs. Especially in public institutions, this process consumes a great deal of employees' time. The system is implemented to eliminate manual erasure of data from documents and so reduce the cost of anonymization. It also eliminates the errors that arise when large numbers of documents are anonymized by hand. Unlike the human eye, the system does not get tired, and its effectiveness does not decrease over time.

The anonymization module is always tailored to the individual client's needs. On request, it can be extended to substitute expressions instead of hiding them (e.g. replacing a first name and surname with initials).
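
The substitution mentioned above can be pictured with a small sketch. The `to_initials` function is hypothetical, and the way it handles hyphenated surnames is an assumption rather than documented behaviour of the module.

```python
def to_initials(full_name: str) -> str:
    """Replace each part of a name with its upper-case initial and a period."""
    # Assumption: hyphenated surnames are treated as separate parts.
    parts = full_name.replace("-", " ").split()
    return " ".join(f"{part[0].upper()}." for part in parts)


print(to_initials("John Smith"))            # J. S.
print(to_initials("anna Kowalska-Nowak"))   # A. K. N.
```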


  • Integration with scanners by automatically downloading scans from the e-mail box and network folders.
  • Dividing packages of scanned documents based on barcodes (mass file handling).
  • Automated search and recognition of personal data in a document.
  • Suggesting areas to black out by analysing the content of the document with a built-in content recognition system based on the OCR engine.
  • Word dictionaries for anonymization, e.g. an English database of names and surnames.
  • Option to add internal databases, e.g. a residents database, to the list of words to be anonymized.
  • Learning algorithms: the system learns the words that users mark for anonymization and can keep different words for different types of documents. For example, if the system did not flag "John" as a first name, then once a user marks it, the name will be anonymized in subsequent documents.
  • Option to define multi-stage verification of the anonymization before the final version is created. The verification path may depend on data in the system, e.g. on the type of document.
  • Intuitive interface of the verifier - marking words for anonymization by drawing black squares on the document.
  • No restrictions on the number of documents.
  • Up to 500 verifying users can work at the same time.
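
The learning behaviour described in the list above (words marked by a user are remembered per document type) could be modelled roughly as follows. This is a simplified sketch under stated assumptions: the `LearnedDictionary` class and the document type names are hypothetical, and the real module may learn in a different way.

```python
from collections import defaultdict


class LearnedDictionary:
    """Base word list plus words learned from user corrections, per document type."""

    def __init__(self, base_words=()):
        self._base = {word.lower() for word in base_words}
        self._learned = defaultdict(set)      # document type -> learned words

    def learn(self, doc_type: str, word: str) -> None:
        """Remember a word that a user marked manually in this type of document."""
        self._learned[doc_type].add(word.lower())

    def should_anonymize(self, doc_type: str, word: str) -> bool:
        """True if the word is in the base list or was learned for this type."""
        key = word.lower()
        return key in self._base or key in self._learned.get(doc_type, set())


words = LearnedDictionary(base_words=["Smith"])
before = words.should_anonymize("invoice", "John")    # not known yet
words.learn("invoice", "John")                        # the user marks "John" once
after = words.should_anonymize("invoice", "John")     # now suggested automatically
other = words.should_anonymize("contract", "John")    # other types are unaffected
```

Keeping the learned words separate for each document type mirrors the statement that the system can indicate different words for different types of documents.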


e-MSI Sp. z o.o.

  • Microsoft SQL Server
  • MySQL/MariaDB
  • English
  • German
  • Polish
Supported JobRouter® version
from version 5.0

Note: Please contact your support partner to check whether the product is compatible with your existing IT landscape.


  • DMS
  • Document Recognition