Begin by creating a folder to contain the PDFs you want to index. All PDFs should be complete in both content and electronic features, such as links, bookmarks, and form fields. If the files to be indexed include scanned documents, make sure that the text is searchable. Break long documents into smaller, chapter-sized files to improve search performance. You can also add information to a file’s document properties to improve the file’s searchability.
Before you index a document collection, it’s essential that you set up the document structure on the disk drive or network server volume and verify cross-platform file names. File names may become truncated and hard to retrieve in a cross-platform search. To prevent this problem, consider these guidelines:
Rename files, folders, and indexes using the MS-DOS file-naming convention (eight characters or fewer followed by a three-character file extension), particularly if you plan to deliver the document collection and index on an ISO 9660-formatted CD-ROM disc.
Remove extended characters, such as accented characters and non-English characters, from file and folder names. (The font used by the Catalog feature does not support character codes 133 through 159.)
Don’t use deeply nested folders or path names that exceed 256 characters for indexes that will be searched by Mac OS users.
If you use Mac OS with an OS/2 LAN server, configure LAN Server Macintosh (LSM) to enforce MS-DOS file-naming conventions, or index only FAT volumes. (HPFS volumes may contain long file names that cannot be retrieved.)
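The naming guidelines above can be checked before indexing with a short script. The following Python sketch is only an illustration and is not part of Acrobat or the Catalog feature. The function names `check_name` and `check_tree` are hypothetical, the 8.3 pattern is a simplified approximation (ASCII letters, digits, underscore, and hyphen, with the extension treated as optional so folder names pass), and "extended characters" is assumed to mean any character code above 127.

```python
import os
import re

# Simplified 8.3 pattern (an assumption, not the full MS-DOS or ISO 9660 rule):
# 1 to 8 characters, optionally followed by a dot and 1 to 3 characters.
DOS_83 = re.compile(r"^[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9]{1,3})?$")

# Path-length limit taken from the guideline for Mac OS users.
MAX_PATH = 256


def check_name(name):
    """Return a list of guideline problems for one file or folder name."""
    problems = []
    # Assumption: treat every character code above 127 as "extended".
    if any(ord(ch) > 127 for ch in name):
        problems.append("extended characters")
    if not DOS_83.match(name):
        problems.append("not 8.3")
    return problems


def check_tree(root):
    """Walk root and return {path: [problems]} for each offending entry."""
    report = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            problems = check_name(name)
            if len(path) > MAX_PATH:
                problems.append("path too long")
            if problems:
                report[path] = problems
    return report
```

Calling `check_tree` on the collection folder returns a dictionary that maps each offending path to the guidelines it violates, so you can rename those files or folders before you build the index.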
If the document structure includes subfolders that you don’t want indexed, you can exclude them during the indexing process.