This is an overview of some of Zebra's most important features:
Very large databases: logical files can be automatically partitioned over multiple disks.
Arbitrarily complex records. The internal data format is a structured format conceptually similar to XML or GRS-1, which allows lists, nested structured data elements and variant forms of data.
Robust updating: records can be added and deleted "on the fly" without rebuilding the index from scratch. Records can be safely updated even while users are accessing the server. The update procedure tolerates crashes and hard interrupts during database updating, and data can be reconstructed following a crash.
Configurable to understand many input formats. A system of input filters driven by regular expressions allows most ASCII-based data formats to be easily processed. SGML, XML, ISO2709 (MARC), and raw text are also supported.
Searching supports a combination of boolean queries and relevance-ranked (free-text) queries. Truncation, masking, full regular expression matching and "approximate matching" (e.g. tolerating spelling mistakes) are all handled.
Index-only databases: data can be, and usually is, imported into Zebra's own storage, but Zebra can also refer to external files, building and maintaining indexes of "live" collections.
Zebra is written in portable C, so it runs on most Unix-like systems as well as Windows (NT/2000/2003). A binary distribution for Windows is available at http://ftp.indexdata.com/pub/zebra/win32/, and pre-built packages are available for Debian GNU/Linux at http://ftp.indexdata.com/pub/zebra/debian/.
Z39.50 protocol support:
Protocol facilities: Init, Search, Present (retrieval), Segmentation (support for very large records), Delete, Scan (index browsing), Sort, Close, and support for the "update" Extended Service to add a new XML record or replace an existing one.
Piggy-backed presents are honored in the search request - that is, a subset of the found records can be returned directly with a search response, enabling search and retrieval to happen in a single round-trip.
Named result sets are supported.
Easily configured to support different application profiles, with tables for attribute sets, tag sets, and abstract syntaxes. Additional tables control facilities such as element mappings to different schemas (e.g., GILS-to-USMARC).
Complex composition specifications using Espec-1 (partial support). Element sets are defined using the Espec-1 capability, and are specified in configuration files as simple element requests (and, optionally, variant requests).
Multiple record syntaxes for data retrieval: GRS-1, SUTRS, XML, ISO2709 (MARC), etc. Records can be mapped between record syntaxes and schemas on the fly.
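The piggy-backed present mentioned above can be illustrated with a small model of how a server decides how many records to attach to a search response. This is a sketch of the rule as commonly described for the Z39.50 search request parameters (small-set upper bound, large-set lower bound, medium-set present number), not Zebra's actual code; the function and argument names are hypothetical.

```python
def piggyback_count(hits, small_set_upper_bound, large_set_lower_bound,
                    medium_set_present_number):
    """Return how many records would accompany the search response.

    Assumption: this follows the usual reading of the Z39.50 search
    parameters; it is an illustration, not Zebra's implementation.
    """
    if hits <= small_set_upper_bound:
        # Small result set: every record is returned with the response.
        return hits
    if hits >= large_set_lower_bound:
        # Large result set: no records are returned; the client must
        # issue a separate Present request.
        return 0
    # Medium result set: return up to the requested number of records.
    return min(hits, medium_set_present_number)


if __name__ == "__main__":
    for hits in (3, 7, 50, 250):
        print(hits, "hits ->", piggyback_count(hits, 5, 100, 10), "records")
```

A client that wants search and retrieval in a single round-trip would, under these assumptions, choose bounds so that its typical result sizes fall in the small or medium range.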
SRU Web Service support:
The protocol operations explain, searchRetrieve and scan are supported.
Conversion from CQL to the internal RPN query model is supported.
Multiple XML record formats for data retrieval are supported, modelled on the GRS-1, SUTRS and MARC record formats. Records can be mapped between record schemas on the fly. Arbitrarily complex XSLT transformations can be applied during record retrieval if one uses the alvis filter module.
Extended RPN queries for search/retrieve and scan are supported, for controlling approximate hit counts, etc.
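To make the three SRU operations listed above concrete, the following sketch builds request URLs for explain, searchRetrieve and scan using the standard SRU parameter names (operation, version, query, startRecord, maximumRecords, scanClause, maximumTerms). The base URL, the database name, the index name `title` and the choice of version 1.1 are assumptions made for illustration; the helper functions are hypothetical and not part of Zebra.

```python
from urllib.parse import urlencode


def sru_url(base_url, operation, **params):
    """Build an SRU GET URL. The version value is an assumption."""
    query = {"operation": operation, "version": "1.1"}
    query.update(params)
    # urlencode percent-encodes values, so CQL operators such as "="
    # and spaces are safe to pass in unescaped.
    return base_url + "?" + urlencode(query)


def explain_url(base_url):
    return sru_url(base_url, "explain")


def search_retrieve_url(base_url, cql, start_record=1, maximum_records=10):
    return sru_url(base_url, "searchRetrieve", query=cql,
                   startRecord=start_record, maximumRecords=maximum_records)


def scan_url(base_url, scan_clause, maximum_terms=10):
    return sru_url(base_url, "scan", scanClause=scan_clause,
                   maximumTerms=maximum_terms)


if __name__ == "__main__":
    base = "http://localhost:9999/mydb"  # hypothetical server and database
    print(explain_url(base))
    print(search_retrieve_url(base, "title = zebra"))
    print(scan_url(base, "title = zebra"))
```

The CQL string passed as `query` is what the server would convert to its internal RPN query model before searching.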