by chris

"As a NYC Firefighter, we have many volumes of bulletins to assist us in safety and training. I was wondering if there was a way I could transfer these books onto a CD, or DVD with hyperlinks for cross-referencing?"
- K.C. - Queens, N.Y.

What you ask for is certainly possible, but the answer depends on what form these bulletins are in. If these are paper documents, they will need to be scanned into the computer before you can use them. This means you would need a scanner, and the software to go with it, which will vary by manufacturer. Most scanners will come with at least a basic scanning utility, but some will include additional software that may do some, or all, of what you want.

If you just scan the documents, you will end up with image files, each one a "picture" of a page. This is similar to receiving a fax on your computer: it is a picture of what the page looked like, not the actual text data. These picture files will also be much larger than the text file or word processing document that was used to produce the page. Because each file is a picture, you would have no simple way to search its contents. To find anything, you would need some sort of database program that stores the directory path to each file, which means a lot of manual typing and additional data entry to summarize or associate keywords with the image files.
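To give a feel for what that keyword database involves, here is a minimal sketch in Python. The file names, keywords, and the helper names `load_catalog` and `find_images` are all made up for illustration; a real catalog would live in its own file or in a proper database program.

```python
import csv
import io

# Hypothetical catalog: one row per scanned page image, with hand-typed keywords.
CATALOG_CSV = """path,keywords
scans/bulletin_001.tif,ladder safety training
scans/bulletin_002.tif,hose maintenance
scans/bulletin_003.tif,ladder inspection
"""

def load_catalog(text):
    """Return a list of (path, set_of_keywords) pairs read from CSV text."""
    rows = csv.DictReader(io.StringIO(text))
    return [(row["path"], set(row["keywords"].lower().split())) for row in rows]

def find_images(catalog, keyword):
    """Return the paths of images whose keyword list contains the given word."""
    word = keyword.lower()
    return [path for path, words in catalog if word in words]

catalog = load_catalog(CATALOG_CSV)
print(find_images(catalog, "ladder"))
# prints ['scans/bulletin_001.tif', 'scans/bulletin_003.tif']
```

Notice that the search only finds what somebody typed into the keyword column, which is exactly the data entry burden described above.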

To be able to search the documents themselves, they will need to be processed with OCR (Optical Character Recognition) software. This software usually operates the scanner, captures the image, and then attempts to detect patterns of light and dark in the image that translate into text characters. Most OCR products will do a reasonable job of analyzing the results, usually with better than 90% character recognition, as long as the text is reasonably clear and similar to one of a number of common fonts. What the character recognition software doesn't catch, built-in spell check software usually will. Some scanners may include "lite" versions of OCR software, which should at least get the document into a text file or a word processor file.
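It helps to put that 90% figure in perspective. The short Python calculation below assumes a page holds about 2,500 characters, which is only an illustrative number and not a measurement of your bulletins; `expected_errors` is a hypothetical helper name.

```python
def expected_errors(chars_per_page, accuracy_percent):
    """Rough count of misread characters per page at a given recognition rate."""
    return chars_per_page * (100 - accuracy_percent) / 100

# Assumed page size of 2,500 characters, for illustration only.
for accuracy in (90, 99):
    print(accuracy, expected_errors(2500, accuracy))
# prints:
# 90 250.0
# 99 25.0
```

Even at a good recognition rate, a long bulletin can leave a fair number of characters for the spell checker, or for you, to clean up.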

Once you have the documents in any electronic form, whether it is a word processor document, ASCII text or a grayscale bitmap image, you can copy them to CD for distribution or backup. Sorting and indexing on a CD could be a challenge unless you save or convert the files into some common format like HTML web pages. You could easily create HTML index pages that link to the individual files. Because HTML is a text-based format, it can be read, viewed, or searched on just about any computer system. (Macintosh, Windows, Linux, etc. all should be able to read a standard ISO CD/DVD format disk, and all have browsers to view the files with.) Both Mac and Windows systems have a directory search capability to search for files containing specified text. Most other search features, however, will require having the documents or web page versions on a web server with some sort of search engine installed or enabled.
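If you are comfortable running a small script, building the index page can be automated. The Python sketch below assumes your converted bulletins sit together in one folder as .html, .htm, or .txt files. The function names `build_index` and `search_text` and the sample file names are invented for this example, and the demo runs against a temporary folder rather than your real files.

```python
import html
import tempfile
from pathlib import Path
from urllib.parse import quote

DOC_TYPES = {".html", ".htm", ".txt"}

def list_docs(folder):
    """Return the bulletin files in `folder`, sorted by name, skipping index.html."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.suffix.lower() in DOC_TYPES and p.name != "index.html")

def build_index(folder, title="Bulletin Index"):
    """Write index.html in `folder` with one hyperlink per bulletin file."""
    items = "\n".join(
        f'<li><a href="{quote(p.name)}">{html.escape(p.stem)}</a></li>'
        for p in list_docs(folder)
    )
    page = (f"<html><head><title>{html.escape(title)}</title></head>\n"
            f"<body><h1>{html.escape(title)}</h1>\n<ul>\n{items}\n</ul></body></html>\n")
    out = Path(folder) / "index.html"
    out.write_text(page, encoding="utf-8")
    return out

def search_text(folder, phrase):
    """Return names of bulletin files containing `phrase`, ignoring case."""
    needle = phrase.lower()
    return [p.name for p in list_docs(folder)
            if needle in p.read_text(encoding="utf-8", errors="ignore").lower()]

# Demo with two made-up bulletins in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ladder_safety.txt").write_text("Inspect ladders before each use.", encoding="utf-8")
    Path(tmp, "hose_care.html").write_text("<p>Drain hoses after use.</p>", encoding="utf-8")
    index_text = build_index(tmp).read_text(encoding="utf-8")
    matches = search_text(tmp, "ladders")

print(index_text)
print(matches)
```

The links are relative file names, so the index keeps working after the whole folder is copied to a CD or DVD. The `search_text` function is a simple stand-in for the operating system's own "files containing text" search, not a replacement for a real search engine.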

