Release 1.4: document Indexing - 2010-05-11 17:17 - administrator
Indexing may be as simple as keeping track of unique document identifiers; but often it takes a more complex form, providing classification through the document metadata or even through word indexes extracted from the document contents. Indexing exists mainly to support retrieval.
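
As a rough illustration of the "word index" idea mentioned above, here is a minimal Python sketch of an inverted index that maps each word to the documents containing it. It is not how Oberon or Apache Tika implement indexing; the function names (build_word_index, search) and the sample document ids are hypothetical and exist only for this example.

```python
import re
from collections import defaultdict


def build_word_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Split the text into word tokens, ignoring case and punctuation.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of the documents that contain the given word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data: document id -> extracted text content.
documents = {
    "doc-1": "Quarterly report on indexing performance",
    "doc-2": "Meeting notes: retrieval and indexing",
}
index = build_word_index(documents)
print(search(index, "indexing"))
print(search(index, "retrieval"))
```

Looking a word up is then a single dictionary access rather than a scan of every document, which is why this kind of index supports retrieval well.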

The new release 1.4, together with the Apache Tika module, provides object retrieval through the metadata and/or contents of attached files.

When you attach a file to an Oberon object, you may choose to index the file content and metadata simply by using the "indexed" keyword in the OOQL command. For example:

object fileput <OBJECT_ID> indexed filetype "MS_Word" name "/home/myname/mydocument.docx";

The file content and metadata will be extracted from the file, then stored and indexed in the Oberon DB. With a command like the following:

query immediate object <ClassPattern> <NamePattern> <RevisionPattern> filter ( filecontent[MS_Word] ~= "<PatternToSearch>" );

you can retrieve all objects whose attached files match the <PatternToSearch> pattern in their content or metadata (MS_Word filetype only).
