File Content Extractor

Specify key words, phrases, or patterns of text to locate in files
Set input settings to filter which files are searched
Customise the output to appear on the screen, or output structured XML files for further analysis or ingestion into other systems via your own integration
Let the software scan through your files to locate the information you’re looking for

Potential Use-Cases Across Many Industries

Searching for evidence in legal cases
Finding medical history in patient documentation
Verifying information for auditing
Digital forensic investigations in policing
Locating information for remediation programmes
Classifying files for business processes
Many more!

Our solution is highly configurable and not tethered to a particular industry or use-case.

Support for Lots of Common File Types

PDFs, both structured text and scanned photocopies via optical character recognition
Microsoft Office, including Word, Excel, and PowerPoint
Emails, including attachments within the emails
Common image files for scanned documents, such as BMP, JPG, PNG, TIF, and GIF
Text files, including code files, web pages, XML, JSON, and CSV

Faster Than Any Human

Why assign a dozen staff members to spend weeks searching through files when you can licence software to do it for you at a fraction of the cost?
QWERTY Software Solutions’ File Content Extractor multi-threads file searches to efficiently read documents, turning a single desktop computer into a dedicated team of virtual file reviewers
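
For readers curious what multi-threaded file searching looks like in general, here is a minimal Python sketch. It is not File Content Extractor's implementation. The function names (file_contains, search_files) and the worker count are hypothetical, and it handles plain text files only.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def file_contains(path, term):
    """Return True if the file's text contains the term (case-insensitive)."""
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    return term.lower() in text.lower()


def search_files(paths, term, workers=4):
    """Check several files concurrently and return the paths that matched."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input paths.
        hits = pool.map(lambda p: file_contains(p, term), paths)
        return [p for p, hit in zip(paths, hits) if hit]
```

Threads help most when the work is dominated by waiting on disk or network reads, which is typical when scanning many files.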

No Model Training Required

Unlike solutions advertised as “AI” (Artificial Intelligence), File Content Extractor does not require training, and it does not need thousands of files pre-reviewed by people before it can provide the expected output
This solution is designed to locate text based on “actuals”, not “assumptions”
This isn’t AI. It’s better.

Security of your Data Comes First

Unlike some providers, where you must move your data to their processing centre to work with documents, you leave your documents in your business domain and run File Content Extractor in your own network, eliminating the data leakage and privacy risks of transferring files to a third party
File Content Extractor makes no API calls, sends no data externally, and only saves information you configure it to save.

Pay for a Licence, Get Free Access to Expertise

Having helped many business projects and operational teams use our unique file extraction method, QWERTY Software Solutions can help you navigate implementation to minimise risk and maximise results
You also get free support for using the software, including configuration and training

Flexible Licensing Options

Each licence purchased is tied to a single computer and user account (no registration is required, and no Internet access is required to apply the licence)
Licences are available in 1-, 3-, 6-, and 12-month options
As a special promotion until the end of December 2025, any 1-, 3-, or 6-month licence purchased will automatically have its length doubled for FREE (e.g. pay for a 6-month licence, get a 12-month licence)!

Free Demonstration Before Deciding to Purchase

We tailor our solution package for every customer, so you cannot directly purchase a licence and download the software from the website at this time.
Contact us for a free demonstration online
We can even demonstrate using sample files you provide to us in advance, so you can see the results to expect when the software runs within your environment

Frequently Asked Questions

Can I extract text from a PDF?

Yes, PDF files are supported, and text is extracted based on simple search terms (e.g. “Tax Invoice”) or pattern matching (e.g. “Dear [Customer Name],”).
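
To illustrate the difference between the two kinds of search, here is a minimal Python sketch using regular expressions. It assumes the text has already been extracted from the PDF. It does not show File Content Extractor's own search syntax. The function names and the GREETING pattern are hypothetical.

```python
import re


def simple_search(text, term):
    """Simple search: does the literal term occur anywhere (case-insensitive)?"""
    return term.lower() in text.lower()


def pattern_search(text, pattern):
    """Pattern search: return the captured value for every match of the pattern."""
    return re.findall(pattern, text)


sample = "Tax Invoice\nDear Jane Citizen,\nThank you for your payment."

# Capture whatever sits between "Dear " and the following comma.
GREETING = r"Dear ([^,\n]+),"

print(simple_search(sample, "tax invoice"))  # True
print(pattern_search(sample, GREETING))      # ['Jane Citizen']
```

A simple search answers whether a phrase is present, while a pattern search can also pull out the variable part of the text, such as the customer name.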

Does it work if documents are photocopied?

Yes, optical character recognition is included and can convert photocopied documents to text. The reliability of the conversion depends on the quality of the scanned document.

Does any of our data leave our network or organisation?

No. Instead of moving large amounts of data outside your organisation, our solution leaves the data where it is and brings the computing resources to it. Rather than exporting potentially gigabytes of data, you import only a small application that fits on a standard CD or USB drive.

How do I know it will work? Can I try it for free?

Before purchasing, we offer free demonstrations online and can include any test files you’re willing to share prior to the demonstration.

Do we need to train it to work for our files?

No, our solution is not based on training AI models. If your file is supported by the software and readable (i.e. not encrypted or corrupt) then it will work out-of-the-box.

What is the process for purchasing?

Contact us for a demonstration and to help us understand your use-case, so we can advise whether and how this technology can assist you
If you’re happy with the demonstration, send us a list of computer usernames and computer names so we can generate licences for you and send you an invoice
When the invoice is paid, we send you the licences and a copy of the application for installation on your computers.

We don’t want a desktop application. Do you have a web service instead?

No, we don’t support this, for two reasons. The first is efficiency: it is faster to process potentially thousands of files or many gigabytes of data where the data is hosted than to transfer it to an external service. The second is the security of your data. Because the application runs in your environment, the data stays as safe as it already is there, making the software easier to use from a risk, privacy, legal, and operational point of view.

How long does it take to process files?

There is no single answer, because the time taken depends on how many files there are, how many pages each file contains, the file content itself (images versus text), the speed of your network (where applicable), the power of your computer, and the number of computers you run it on. As a guide, we have profiled and stress-tested the application and were able to process 200,000 files (Excel workbooks with one worksheet per file) in under 4 minutes on one computer, with the files read from a local hard drive, while searching for 2 pieces of information (one simple search and one pattern search).
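
For a sense of scale, the benchmark figures can be turned into a minimum throughput with a little arithmetic, shown here as a short Python sketch. The variable names are illustrative. The result applies only to that specific test (small Excel workbooks on a local drive) and should not be read as a prediction for other file types or environments.

```python
# Figures taken from the benchmark described in the answer.
files_processed = 200_000
max_seconds = 4 * 60  # "under 4 minutes"

# Because the run finished in under 4 minutes, this is a lower bound.
min_files_per_second = files_processed / max_seconds

print(f"At least {min_files_per_second:.0f} files per second")
```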

How do we define the search criteria? What governance is recommended over the output?

Customers get free support to help write and test their search criteria. We can help you design governance and workflow based on our experience with previous use-cases and your specific needs. We generally advise three partial test runs to fine-tune the search criteria before you run your content extractions and searches in bulk.
