|
Our patent-pending document conversion process includes document preparation, scanning, OCR and image enhancement, indexing, quality control, and packaging. PTFS has significant expertise in handling and digitizing fragile and historic original materials, and can provide an archive-quality original in addition to multiple client-specified resolutions.
Image Enhancement
Image enhancement, which typically includes cropping, despeckling, and deskewing, is performed before Optical Character Recognition (OCR) processing. Enhancement produces a higher quality "clean" image, which is then used during the OCR process to increase accuracy. Image enhancement for photographs can also include color correction, contrast adjustment, and sharpening.
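As a rough illustration of what despeckling does, the sketch below applies a 3x3 median filter to a tiny grayscale page held as a list of lists. This is a generic textbook technique, not PTFS's enhancement process. The function name `despeckle` and the sample data are hypothetical, and a production workflow would normally rely on a dedicated imaging library.

```python
from statistics import median


def despeckle(image):
    """Apply a 3x3 median filter to a 2D list of grayscale values.

    Border pixels are copied unchanged; this is a simplification.
    """
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]  # work on a copy, leave the input intact
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the pixel and its eight neighbours.
            window = [
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            cleaned[y][x] = median(window)
    return cleaned


# Hypothetical sample: a white page (255) with one isolated black speck (0).
page = [
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
    [255, 255, 0, 255, 255],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
]
cleaned = despeckle(page)
```

Because the speck is outnumbered by its white neighbours, the median replaces it with the background value, while larger connected strokes such as printed characters would mostly survive the filter.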
Optical Character Recognition
PTFS' multi-pass voting OCR conversion technology is applied to complete the digitization process. A level 5 voting OCR process is estimated to increase OCR accuracy, reducing error rates by up to 80% compared with single-pass OCR. On a 2,000-character page, for example, OCR errors can be reduced to as few as 8, rather than the 40 errors typical of conventional single-pass OCR. Any image that fails to meet set parameters is auto-rejected, and the entire folder is sent to image repair.
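The idea behind voting OCR, and the arithmetic behind the 80% figure, can be sketched as follows. The source does not describe PTFS' actual voting algorithm or define "level 5", so this example assumes five independent readings that are already aligned character by character and combines them with a simple per-character majority vote. The names `majority_vote` and `error_reduction` and the sample readings are hypothetical.

```python
from collections import Counter


def majority_vote(readings):
    """Combine pre-aligned, equal-length OCR readings by per-character vote."""
    if not readings:
        raise ValueError("at least one reading is required")
    length = len(readings[0])
    if any(len(reading) != length for reading in readings):
        raise ValueError("readings must be aligned to the same length")
    voted = []
    for chars in zip(*readings):
        # Pick the most frequent character at this position; on a tie,
        # Counter keeps the character that was encountered first.
        voted.append(Counter(chars).most_common(1)[0][0])
    return "".join(voted)


def error_reduction(single_pass_errors, voted_errors):
    """Fraction of single-pass errors removed by the voting process."""
    return (single_pass_errors - voted_errors) / single_pass_errors


# Assumption: five passes, three of which misread a different character.
readings = [
    "digitization",
    "digit1zation",
    "digitization",
    "dlgitization",
    "digitizatlon",
]
result = majority_vote(readings)

# Figures quoted in the text: 40 errors reduced to 8 on a 2,000-character page.
reduction = error_reduction(40, 8)  # (40 - 8) / 40
```

Each misreading is outvoted because the passes err in different places, which is the property a voting scheme depends on. Real OCR outputs differ in length and need alignment before voting, which this sketch deliberately leaves out.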
Metadata & Re-Keying
PTFS offers low-cost metadata creation and re-keying. Documents with small-point type, cursive handwriting, or poor-quality second-generation copies that cannot be OCRed accurately may require re-keying. PTFS provides accurate, clean text for search and display, and for creating full-text documents and metadata records. Single- or double-key services can be combined with customized spell checking, stop words, and special dictionaries. Descriptive information can be captured during digitization to populate metadata fields.
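Double-keying generally means that two operators type the same text independently and the two versions are compared so that disagreements can be reviewed. The sketch below shows one way such a comparison might look, using Python's standard `difflib` module on a word level. It is an illustration only, not a description of PTFS' workflow, and the function name `double_key_discrepancies` and the sample sentences are hypothetical.

```python
import difflib


def double_key_discrepancies(first_keying, second_keying):
    """Return (first, second) word spans where two keyings disagree."""
    a = first_keying.split()
    b = second_keying.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    discrepancies = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Anything other than "equal" (replace, insert, delete) needs review.
        if tag != "equal":
            discrepancies.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return discrepancies


# Hypothetical sample: the two operators disagree on one surname.
first = "Received of John Smith ten dollars"
second = "Received of John Smyth ten dollars"
flagged = double_key_discrepancies(first, second)
```

Only the flagged spans need a human decision, which is why double-keying tends to yield cleaner text than a single keying pass on material that OCR cannot handle.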
Electronic Text Archives
PTFS creates electronic archives from documents produced in a wide variety of applications. Existing electronic documents (reports, theses, correspondence, etc.) can be indexed and added to a digital repository. This allows users to search across both newly digitized and born-digital materials through a single search interface.
File Formats
PTFS has extensive experience delivering digital images in numerous file formats, including JPEG 2000, TIFF, DjVu, PDF, and many more. PTFS builds compound documents that pair the digital image with a hidden text layer. When indexed with PTFS' ArchivalWare™, these document types (PDF, DjVu) allow the end user to search the full text and view the original image with its graphics or embedded images. In addition, text "hits" are displayed on the image.

|