Let’s Learn: Document Indexing
The “Let’s Learn” series makes navigating the document scanning and document management space a little easier, so you can find the best solution for your business challenge.
What is indexing?
Document indexing is the identification of specific attributes of a document to simplify and expedite its accurate retrieval. This is accomplished with an index, a system that uses descriptive data to make information easier to find. Document indexing must be done accurately; otherwise, it is difficult, if not impossible, to get back to a scanned document.
What do I need to know about indexing?
Here are some common document indexing terms you should know and understand:
- A database is an electronic collection of records stored in a central file and accessible by many users for many applications. It can also be a structured collection of records or data that is stored in a computer so that a program can consult it to answer queries quickly and flexibly.
- A relational database management system (RDBMS) is a database management system in which data and the relationships among the data are stored in the form of tables.
- Index fields, or key fields, are database fields used to categorize and organize documents. They are usually user-defined and can be used to search and retrieve documents. Some examples include first name, last name, address, invoice number and date.
- A barcode is a machine-readable array of vertical lines and spaces representing data.
- Match and merge uses index information that already exists in other systems to populate index fields. It allows you to index one or more unique fields and populate the remaining fields automatically with matching data from a text file or table lookup provided by another system, such as an accounting or human resources system.
- Metadata is the data associated with documents that provides information on their contents, context and use, such as date created and file name.
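To make match and merge more concrete, here is a minimal Python sketch. It assumes a hypothetical accounting export keyed by invoice number; the field names, the sample rows and the `match_and_merge` function are illustrative only and do not describe any particular product.

```python
import csv
import io

# Hypothetical export from an accounting system (assumed layout for illustration).
ACCOUNTING_EXPORT = """invoice_number,vendor,invoice_date,amount
INV-1001,Acme Supply,2023-04-12,250.00
INV-1002,Birch Paper Co,2023-04-15,1320.50
"""

def load_lookup(text, key_field):
    """Build a dictionary from the export, keyed by the unique index field."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key_field]: row for row in reader}

def match_and_merge(keyed_fields, lookup, key_field):
    """Fill the remaining index fields from the lookup.

    Returns None when no matching record exists, so the document
    can be routed to manual review.
    """
    match = lookup.get(keyed_fields[key_field])
    if match is None:
        return None
    merged = dict(match)
    merged.update(keyed_fields)  # values keyed by the operator take precedence
    return merged

lookup = load_lookup(ACCOUNTING_EXPORT, "invoice_number")
record = match_and_merge({"invoice_number": "INV-1002"}, lookup, "invoice_number")
print(record)
```

Only the invoice number is keyed by hand in this sketch; the vendor, date and amount come from the lookup.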
Types of indexing
Indexing comes in many different flavors, from manual to mostly automated. One indexing method may be better than another depending on the location of data on a document, its legibility and what data needs to be captured. Let’s take a look at three different kinds of indexing.
Double key indexing
Double key indexing, or double blind verification, is the most traditional way of indexing. Three double key indexing methods are in use today, involving either two keying operators or one keying operator paired with an automated indexing method.
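The comparison step behind double keying can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each pass is a dictionary of index fields, and the `double_key_verify` function and sample values are hypothetical. The second pass could just as well come from an automated method instead of a second operator.

```python
def double_key_verify(first_pass, second_pass):
    """Compare two independent keyings of the same document.

    Returns (accepted, mismatches): fields where both passes agree,
    and fields that disagree or are blank and need review.
    """
    accepted = {}
    mismatches = {}
    for field in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(field, "").strip()
        b = second_pass.get(field, "").strip()
        if a == b and a != "":
            accepted[field] = a
        else:
            mismatches[field] = (a, b)
    return accepted, mismatches

operator_a = {"last_name": "Nguyen", "invoice_number": "INV-1002"}
operator_b = {"last_name": "Nguyen", "invoice_number": "INV-1008"}

accepted, mismatches = double_key_verify(operator_a, operator_b)
print(accepted)    # {'last_name': 'Nguyen'}
print(mismatches)  # {'invoice_number': ('INV-1002', 'INV-1008')}
```

Fields that match are accepted automatically, and only the disagreements go to a person for review.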
Full-text indexing
Full-text indexing and search enable document retrieval by searching for a word or phrase found within a document. Every word in the document is indexed into a master word list with pointers to the documents and pages where each occurrence of the word appears.
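The master word list described above is commonly built as an inverted index. The following Python sketch shows the idea under simple assumptions: documents are already available as lists of page text, and the document names and the `search` helper are hypothetical. The search here finds pages containing every query word, which is simpler than exact phrase matching.

```python
import re
from collections import defaultdict

def build_full_text_index(documents):
    """Map each word to the set of (document id, page number) pairs where it occurs."""
    index = defaultdict(set)
    for doc_id, pages in documents.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((doc_id, page_number))
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

documents = {
    "contract-17": ["Lease agreement for office space", "Signed by tenant and landlord"],
    "invoice-42": ["Invoice for office supplies"],
}
index = build_full_text_index(documents)

print(sorted(search(index, "office")))        # [('contract-17', 1), ('invoice-42', 1)]
print(sorted(search(index, "office space")))  # [('contract-17', 1)]
```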
Variable lookup indexing
Variable lookup indexing (VLI) minimizes exceptions by using multiple databases to populate index fields. The lower the exception rate, the more complete and accurate the indexed fields, which ensures faster and more dependable retrieval of your scanned documents.
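One way to picture how multiple databases reduce exceptions is a lookup that falls through several sources in turn. The Python sketch below is an interpretation for illustration, not a description of any specific VLI product; the source names, keys and the `variable_lookup` function are all assumptions.

```python
def variable_lookup(key, sources):
    """Try each lookup source in order; return the first match, or None."""
    for name, table in sources:
        if key in table:
            return {"source": name, **table[key]}
    return None  # no source matched: this document is an exception

# Hypothetical lookup tables from two different systems.
sources = [
    ("accounting", {"C-100": {"customer": "Acme Supply"}}),
    ("crm", {"C-200": {"customer": "Birch Paper Co"}}),
]

keys = ["C-100", "C-200", "C-300"]
results = {key: variable_lookup(key, sources) for key in keys}

exceptions = [key for key, value in results.items() if value is None]
exception_rate = len(exceptions) / len(keys)

print(exceptions)  # ['C-300']
print(f"Exception rate: {exception_rate:.0%}")
```

With only the accounting table, two of the three keys would be exceptions; adding the second source leaves one.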
When choosing a document scanning company, ask how it will index your scanned images. Not every document scanning company indexes digital images with the high level of accuracy and quality control necessary to ensure you will be able to get back to your scanned documents.