In addition to specifying sort orders, space (blank) handling, and upper/lowercase folding, you can also use the character map files to make Zebra ignore leading articles when sorting records or when doing complete-field searching.
This is done using the map directive in the character map file. In a nutshell, you map certain sequences of characters, when they occur at the beginning of a field, to a space. Assuming that the character "@" is defined as a space character in your file, you can do:
    map (^The\s) @
    map (^the\s) @
The effect of these directives is to map either 'the' or 'The', followed by a space character, to a space. The hat ^ character denotes beginning-of-field only when complete-subfield indexing or sort indexing is taking place; otherwise, it is treated just as any other character.
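The behaviour described above can be imitated in a few lines of Python. This is only an illustrative sketch, not Zebra's implementation: the function name `map_leading_article` and the `complete_field` flag are hypothetical, and Python's `re` module stands in for the character map pattern syntax.

```python
import re

# Hypothetical stand-ins for the two map directives shown above.
ARTICLES = ["The", "the"]


def map_leading_article(field: str, complete_field: bool) -> str:
    """Map a leading article plus one whitespace character to a space.

    complete_field=True imitates complete-subfield or sort indexing, where
    '^' means beginning-of-field. Otherwise '^' is matched as an ordinary
    character, so plain text containing 'The ' is left alone.
    """
    for article in ARTICLES:
        if complete_field:
            # '^' acts as an anchor: only the start of the field can match.
            pattern = r"^" + re.escape(article) + r"\s"
        else:
            # '^' is a literal character that must appear in the text.
            pattern = re.escape("^" + article) + r"\s"
        field = re.sub(pattern, " ", field)
    return field


print(repr(map_leading_article("The Science Journal", True)))
print(repr(map_leading_article("The Science Journal", False)))
```

In this sketch the first call yields the title with the article replaced by a single space, while the second call returns the field unchanged, because no literal "^" occurs in it.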
Because the default.idx file can be used to associate different character maps with different indexing types (and you can create additional indexing types, should the need arise), it is possible to specify that leading articles should be ignored in sorting, in complete-field searching, or both.
If you ignore certain prefixes in sorting, they are eliminated from the index, and sorting takes place as if they weren't there. However, if you set the system up to ignore certain prefixes in searching, they are deleted both from the indexes and from query terms when the client specifies complete-field searching. As a result, searches for 'the science journal' and 'science journal' produce the same results.
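The following Python sketch illustrates why stripping the article from both the index and the query makes the two searches equivalent. It is a toy model under stated assumptions, not Zebra code: `normalize`, `build_index`, and `search` are hypothetical names, and the whitespace trimming and lowercasing stand in for whatever space handling and case folding the rest of the character map file defines.

```python
import re

# Assumed equivalent of the two map directives for leading articles.
LEADING_ARTICLE = re.compile(r"^(?:The|the)\s")


def normalize(term: str) -> str:
    """Apply the same mapping to indexed fields and to query terms."""
    mapped = LEADING_ARTICLE.sub(" ", term)
    # Assumption: spaces are collapsed and case is folded elsewhere
    # in the character map; imitate that here.
    return " ".join(mapped.split()).lower()


def build_index(titles):
    """Build a toy complete-field index: normalized title -> record ids."""
    index = {}
    for record_id, title in enumerate(titles):
        index.setdefault(normalize(title), []).append(record_id)
    return index


def search(index, query):
    """Complete-field search: the query is normalized like the index."""
    return index.get(normalize(query), [])


titles = ["The Science Journal", "Science Journal", "The Art Review"]
index = build_index(titles)
print(search(index, "the science journal"))
print(search(index, "science journal"))
```

Both searches return the same record ids here, because the article is removed on both the indexing side and the query side before the comparison.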