Content Discovery

Whatever your purpose, we find everything inside your documents

We find and capture the text content of relevant documents, images and files so they can be searched, analysed, understood and acted upon. Examples include identifying personal data for GDPR, uncovering hidden items (e.g. passport and credit card numbers), and finding key attributes to trigger workflows or generate metadata such as document type. Whatever your purpose, we find the right content to satisfy it.


Content Discovery is concerned with the text inside each document, image or file, and with finding whatever you wish to find within that text, for whatever purpose. It covers scanned paperwork, all existing digital documents & files, and all incoming scanning output and ‘born digital’ documents: Content Discovery uncovers everything and then analyses it. Beyond the initial findings, it continues to monitor and report on any content changes, updates or deletions.

Various automated technologies & techniques fall under the term ‘discovery’. OCR (Optical Character Recognition) is applied to scanned paper so that the text content becomes machine readable, and likewise to image files such as TIFF or JPEG, which commonly derive from previously scanned paperwork. Similarly, there are sophisticated techniques for capturing text content from within digital documents and files, whatever their format.

Importantly, discovery extends into ‘content analytics’: the identification, interpretation, classification and reporting of the existence and nature of any text content, to satisfy any investigative purpose. This involves a powerful range of techniques and technologies, from item & attribute dictionaries, taxonomy & metadata analysis, complex regular expressions and detailed pattern, context and proximity rules through to advanced areas of NLP (Natural Language Processing), Machine Learning, and Probabilistic Neural Topic Modelling using Deep Neural Networks (i.e. ‘Artificial Intelligence’), as appropriate to the purpose.
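As a rough illustration of how regular expressions, pattern rules and proximity rules can work together, the minimal Python sketch below looks for candidate credit card numbers, filters them with the standard Luhn checksum, and flags whether a context term appears nearby. It is not a description of the product's implementation: the names `CARD_PATTERN`, `CONTEXT_TERMS`, `luhn_valid` and `find_card_numbers`, the term list and the 40-character window are all assumptions made for this example.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Assumed context dictionary; a real dictionary would be far larger.
CONTEXT_TERMS = ("card", "visa", "mastercard", "payment")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str, window: int = 40) -> list:
    """Find Luhn-valid card-like numbers and note nearby context terms."""
    findings = []
    lowered = text.lower()
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if not luhn_valid(digits):
            continue  # pattern matched, but the checksum rules it out
        # Proximity rule: look for a context term within `window` characters.
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        has_context = any(term in lowered[start:end] for term in CONTEXT_TERMS)
        findings.append(
            {"match": m.group(), "start": m.start(), "has_context": has_context}
        )
    return findings


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real card.
    print(find_card_numbers("Card on file: 4111 1111 1111 1111, expires soon."))
```

The checksum removes most digit runs that merely look like card numbers, while the context flag lets a later step rank findings rather than discard them outright.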

The results of all of the above are reported through user-engaging presentation facilities. Whilst these facilities provide tabular results as a baseline, they automatically extend into graphical representations, charts & similar for information insight. All results can also be exported on demand at the click of a button (e.g. as CSV files) for in-house analysis using BI (Business Intelligence) tools.

In addition to all the automated processing and reporting, Content Discovery incorporates an ‘on demand’ or ad hoc capability (instant Find, Retrieve & View) to respond to specific cases as they arise from enquiries, requests, compliance sampling and similar. For example, this provides the critical ability to respond efficiently and effectively to the GDPR legal requirement for time-limited disclosure (one month from receipt of the request) to an individual enquiring about their personal data (Subject Access Request).


Whatever you want from a content investigation, if it exists anywhere we will find it, analyse it, classify it and report on the results, and provide direct access to each relevant document or file, wherever that source exists across the organisation.
