public class LuceneDocumentBase extends SDXDocumentBase
SDXDocumentBaseTarget.ConfigurationNode
DocumentBase.ConfigurationNode

| Modifier and Type | Field and Description |
|---|---|
| protected FieldList | _fieldList - The (Lucene) fields that are to be handled by the index. |
| protected java.util.HashMap | _xmlFieldList - The list of fields with an XML type. |
| static java.lang.String | DBELEM_ATTRIBUTE_REMOTE_ACCESS - The implied attribute stating whether this document base is to be exposed to remote access or not. |
| static java.lang.String | ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS - The element used to define system fields in sdx.xconf. |
| protected java.lang.String | INDEX_DIR_CURRENT - Directory names for indexes. |
| protected java.lang.String | INDEX_DIR_MAIN |
| protected long | lastDocCount - Number of documents indexed since the last split. |
| protected LuceneIndex | luceneActiveIndex - The active index for this document base. |
| protected LuceneIndex | luceneCurrentIndex - The temporary index for this document base. |
| protected java.util.Vector | luceneSearchIndexList - The sub-indexes for this document base (the first entry is the active index). |
| protected java.lang.String | SEARCH_INDEX_DIRECTORY_NAME - The directory name for the index that stores the documents' indexation. |
| protected int | subIndexCount - Number of sub-indexes. |

_configuration, _documentAdditionStatus, _ilevel, _ilogger, _isIndexOptimized, autoOptimize, baseIndexDir, DOC_ADD_STATUS_ADDED, DOC_ADD_STATUS_FAILURE, DOC_ADD_STATUS_IGNORED, DOC_ADD_STATUS_REFRESHED, DOC_ADD_STATUS_REPLACED, DOC_URL, ELEMENT_NAME_DEFAULT_HPP, ELEMENT_NAME_DEFAULT_MAXSORT, isDatadirShared, keepOriginalDocuments, scheduler, SDX_DATABASE_FORMAT, SDX_DATABASE_VERSION, SDX_DATABASE_VERSION_2_3, SDX_DATE, SDX_DATE_MILLISECONDS, SDX_ISO8601_DATE, SDX_USER, splitActive, splitDoc, splitSize, splitUnit, useCompoundFiles

_indexationPipeline, _oaiHarv, ATTRIBUTE_AUTO_OPTIMIZE, ATTRIBUTE_COMPOUND_FILES, ATTRIBUTE_SPLIT_DOC, ATTRIBUTE_SPLIT_SIZE, ATTRIBUTE_SPLIT_UNIT, DBELEM_ATTRIBUTE_DEFAULT, DBELEM_ATTRIBUTE_HPP, DBELEM_ATTRIBUTE_KEEP_ORIGINAL, DBELEM_ATTRIBUTE_MAXSORT, defaultHitsPerPage, defaultMaxSort, defaultRepository, ELEMENT_NAME_INDEX_SPLIT, ELEMENT_NAME_OPTIMIZE, INTERNAL_FIELD_NAME_SDX_OAI_DELETED_RECORD, INTERNAL_FIELD_NAME_SDXALL, INTERNAL_FIELD_NAME_SDXAPPID, INTERNAL_FIELD_NAME_SDXCONTENTLENGTH, INTERNAL_FIELD_NAME_SDXDBID, INTERNAL_FIELD_NAME_SDXDOCID, INTERNAL_FIELD_NAME_SDXDOCTYPE, INTERNAL_FIELD_NAME_SDXMODDATE, INTERNAL_FIELD_NAME_SDXREPOID, INTERNAL_SDXALL_FIELD_VALUE, isDefault, locale, oaiRepo, oaiRepositories, PROPERTY_NAME_ATTACHED, PROPERTY_NAME_CONTENT_LENGTH, PROPERTY_NAME_DOCTYPE, PROPERTY_NAME_MIMETYPE, PROPERTY_NAME_ORIGINAL, PROPERTY_NAME_PARENT, PROPERTY_NAME_REPO, PROPERTY_NAME_SUB, repoConnectionPool, repositories, useMetadata

_database, CLASS_NAME_SUFFIX, DATABASE_DIR_NAME, databaseConf, dbLocation, dbPath, DEFAULT_DATABASE_TYPE

_context, _description, _encoding, _id, _locale, _logger, _manager, _xmlizable_objects, _xmlLang, isToSaxInitialized

CLASS_NAME_SUFFIX, PACKAGE_QUALNAME

DEFAULT_ENCODING

ALL_SAVE_ATTRIB, PATH_ATTRIB, SAVE_DIRECTORY_PARAM

| Constructor and Description |
|---|
| LuceneDocumentBase() - Creates the document base. |

| Modifier and Type | Method and Description |
|---|---|
| protected void | addSubIndex() - Adds a split sub-index and updates the configuration afterwards. |
| protected void | addSubIndex(LuceneIndex index) - Adds a split sub-index and updates the configuration afterwards. |
| protected void | addToSearchIndex(java.lang.Object indexationDoc, boolean batchIndex) - Writes a document to the search index. |
| void | backup(SaveParameters save_config) - Saves the DocumentBase data objects. |
| protected void | backupIndexes(SaveParameters save_config) - Saves the index files. |
| protected void | backupTimeStamp(SaveParameters save_config) - Saves the timestamp files. |
| protected void | compactSearchIndex() |
| void | configure(org.apache.avalon.framework.configuration.Configuration configuration) - Sets the configuration options for this document base. |
| protected void | configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration) - Configures the Lucene document base. |
| protected void | configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration) - Configures the field list. |
| protected void | configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration) - Configures the OAI harvester of this Lucene document base. |
| protected void | configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration) - Configures one or more OAI repositories. |
| protected void | configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration) - Configures an OAIRepository based on the configuration element <oai-repository>. |
| protected void | configureSearchIndex() - Configures the Lucene search index. |
| OAIRepository | createOAIRepository() - Creates the default OAIRepository for the document base, using the older configuration. |
| OAIRepository | createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration) - Creates the OAIRepository for the document base, based on a configuration that must start with an <oai-repository> element. |
| OAIRepository | createOAIRepository(java.lang.String repoId) - Creates an OAIRepository for the document base, using the older configuration. |
| java.util.Date | creationDate() - Returns the creation date of the Lucene search index. |
| void | delete(Document[] docs, org.xml.sax.ContentHandler handler) - Deletes documents from this base. |
| protected void | deleteFromSearchIndex(java.lang.String docId) |
| int | docCount() - Returns the number of documents in all Lucene sub-indexes. |
| protected java.lang.String | getFormatedSubIndexId(int subIndexNumber) - Gets the formatted sub-index number (used for directory names). |
| Index | getIndex() - Gets the Index object for indexing and searching. |
| protected java.lang.Object | getIndexationDocument(IndexableDocument doc, java.lang.String storeDocId, java.lang.String repoId, IndexParameters params) |
| org.apache.lucene.index.IndexReader | getIndexReader() - Returns the Lucene index reader for all of this document base's indexes. |
| protected long | getIndexSize(LuceneIndex index) - Returns the index size. |
| LuceneIndex | getLuceneIndex() |
| org.apache.lucene.search.Searcher | getSearcher() - Returns the Lucene index searcher for all of this document base's indexes. |
| java.util.HashMap | getXMLFieldList() - Returns the list of XML type fields. |
| void | index(IndexableDocument[] docs, Repository repository, IndexParameters params, org.xml.sax.ContentHandler handler) - Adds one or more indexable documents to the Lucene search index. |
| void | indexModified() - Updates the last-modification timestamp file. |
| void | init() - Initializes the document base. |
| protected void | initializeVectorizedIndex() - Initializes the index vector by searching for all sub-indexes in its directory. NB: working as intended. |
| protected boolean | initToSax() - Initializes the LinkedHashMap _xmlizable_objects with the objects so that they can be described in XML. |
| protected void | initVolatileObjectsToSax() - Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects so that they can be described in XML. |
| java.util.Date | lastModificationDate() - Returns the last modification date of the Lucene search index. |
| void | mergeBatch() - Deprecated. This method is deprecated since SDX v. 2.3. Use mergeCurrentBatch() instead. |
| void | mergeCurrentBatch() - Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base). |
| void | optimize() - Optimizes the indexes, the repositories, and the system databases. |
| void | reloadFieldList(java.lang.String appConfString) - Reloads the fieldList of an application. |
| protected void | removeSubIndex() - Removes a split sub-index and updates the configuration afterwards. Currently unused, as there is no plan to do so; kept as a reminder for future functionality. |
| protected void | renewKeyIndex() - Refreshes data for the main and current indexes. |
| void | replaceFieldList(FieldList fieldList) - Replaces the current fieldList with the new one. |
| void | restore(SaveParameters save_config) - Restores the DocumentBase data objects. |
| protected void | restoreIndexes(SaveParameters save_config) - Restores the index files. |
| protected void | restoreTimeStamp(SaveParameters save_config) - Restores the timestamp files. |
| protected IndexParameters | setBaseParameters(IndexParameters params) - Sets the default pipeline parameters and ensures the params have a pipeline. |
| protected void | setSearchIndexParameters(LuceneIndexParameters params) - Sets the search index parameters for indexation performance. |
| boolean | splitCheck(boolean currentIndex) - Tests the splitting conditions. Returns true when the splitting conditions are reached. |
| void | splitIndex(boolean currentIndex) - Splits the current big index into two smaller ones. |
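
The method detail for splitCheck notes that a true result should be followed by a splitIndex() call. A minimal sketch of that calling pattern is given here, under stated assumptions: SplittableBase and ToyBase are hypothetical stand-ins that only mirror the two documented signatures, and the document-count threshold used as the split condition is an assumption made for illustration, not the rule SDX applies.

```java
public class SplitSketch {

    /** Hypothetical stand-in for the two documented methods; not an SDX type. */
    interface SplittableBase {
        boolean splitCheck(boolean currentIndex) throws Exception;
        void splitIndex(boolean currentIndex) throws Exception;
    }

    /** Calls splitIndex only when splitCheck reports that the conditions are reached. */
    static boolean splitIfNeeded(SplittableBase base, boolean currentIndex) throws Exception {
        if (base.splitCheck(currentIndex)) {
            base.splitIndex(currentIndex);
            return true;
        }
        return false;
    }

    /** Toy implementation: the threshold rule is an assumption made for this sketch. */
    static class ToyBase implements SplittableBase {
        final long threshold;
        long docsSinceSplit;
        int subIndexCount;

        ToyBase(long threshold) {
            this.threshold = threshold;
        }

        @Override
        public boolean splitCheck(boolean currentIndex) {
            return docsSinceSplit >= threshold;
        }

        @Override
        public void splitIndex(boolean currentIndex) {
            // Record one more sub-index and restart the counter.
            subIndexCount++;
            docsSinceSplit = 0;
        }
    }

    public static void main(String[] args) throws Exception {
        ToyBase base = new ToyBase(2);
        base.docsSinceSplit = 1;
        System.out.println("split performed: " + splitIfNeeded(base, true));
        base.docsSinceSplit = 2;
        System.out.println("split performed: " + splitIfNeeded(base, true));
        System.out.println("sub-indexes: " + base.subIndexCount);
    }
}
```

Keeping the check and the split in one helper avoids calling splitIndex when the conditions have not been reached; the same boolean argument is passed to both calls so that they refer to the same index.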
add, checkIntegrity, configureBase, configureIdGenerator, configureOAIComponents, configureOptimizeTriggers, configureRepositories, configureSplit, delete, deleteIndexableDocumentComponents, deleteRelationsToMastersFromDatabase, getByteSplitSize, getConfiguration, getDocument, getDocument, getDocument, getDocument, getIndexationInformations, getIndexationLogger, getOwners, getRelated, getRepositoryConfigurationList, getRepositoryForDocument, getRepositoryForStorage, getSplitDoc, getSplitSize, getSplitUnit, getUseCompoundFiles, handleParameters, index, index, isAutoOptimized, isIndexOptimized, rollbackIndexation, setConfiguration, targetTriggered

addOaiDeletedRecord, addOAIRepository, configurePipeline, createEntityForDocMetaData, delete, deletePhysicalDocument, getDefaultHitsPerPage, getDefaultMaxSort, getDefaultOAIRepository, getDefaultRepository, getIdGenerator, getIndexationPipeline, getMimeType, getOAIHarvester, getOAIRepositoriesSize, getOAIRepository, getOAIRepository, getPooledRepositoryConnection, getRepository, getSourceValidity, isDefault, isUseMetadata, managedOaiDeletedRecord, optimizeDatabase, optimizeRepositories, releasePooledRepositoryConnections, removeOaiDeletedRecord

configure, getClassNameSuffix, getDatabase

configureDescription, contextualize, enableLogging, getBaseAttributes, getContext, getDescription, getEncoding, getId, getLocale, getLog, getServiceManager, getXmlLang, service, setDescription, setEncoding, setId, setLocale, setUpSdxObject, setUpSdxObject, setXmlLang, toSAX, verifyConfigurationResources

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

getId, setId

getDescription, setDescription

getEncoding, setEncoding

getLocale, getXmlLang, setLocale, setXmlLang

getId

protected java.util.Vector luceneSearchIndexList
protected LuceneIndex luceneActiveIndex
protected LuceneIndex luceneCurrentIndex
protected FieldList _fieldList
protected java.util.HashMap _xmlFieldList
protected int subIndexCount
protected long lastDocCount
protected final java.lang.String INDEX_DIR_CURRENT
protected final java.lang.String INDEX_DIR_MAIN
protected final java.lang.String SEARCH_INDEX_DIRECTORY_NAME
public static final java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
public static final java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
public LuceneDocumentBase()
public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configure in interface org.apache.avalon.framework.configuration.Configurable
configure in class SDXDocumentBase
Parameters:
configuration - The configuration object from which to build a document base.
Sample configuration entry:
<sdx:documentBase sdx:id = "myDocumentBaseName" sdx:type = "lucene">
  <sdx:fieldList xml:lang = "fr-FR" sdx:variant = "" sdx:analyzerConf = "" sdx:analyzerClass = "">
    <sdx:field code = "fieldName" type = "word" xml:lang = "fr-FR" sdx:analyzerClass = "" sdx:analyzerConf = ""/>
    <sdx:field code = "fieldName2" type = "field" xml:lang = "fr-FR" brief = "true"/>
    <sdx:field code = "fieldName3" type = "date" xml:lang = "fr-FR"/>
    <sdx:field code = "fieldName4" type = "unindexed" xml:lang = "fr-FR"/>
  </sdx:fieldList>
  <sdx:index>
    <sdx:pipeline sdx:id = "sdxIndexationPipeline">
      <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step2" sdx:type = "xslt"/>
      <sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step3" sdx:type = "xslt" keep = "true"/>
    </sdx:pipeline>
  </sdx:index>
  <sdx:repositories>
    <sdx:repository baseDirectory = "blah4" depth = "3" extent = "100" sdx:type = "FS" sdx:default = "true" sdx:id = "blah4"/>
    <sdx:repository ref = "blah2"/>
  </sdx:repositories>
</sdx:documentBase>
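
As a rough illustration of what the field definitions in the sample entry carry, the sketch below parses a trimmed copy of the sample with the JDK's DOM parser and lists each field's code and type. It does not use SDX or Avalon classes and is not how configure() reads its Configuration object; the class name FieldListSketch and the method describeFields are hypothetical. The parser is deliberately left namespace-unaware, on the assumption that the sample fragment is read on its own without a declaration for the sdx prefix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FieldListSketch {

    // Trimmed copy of the sample entry: only the field list is kept.
    static final String SAMPLE =
          "<sdx:documentBase sdx:id = \"myDocumentBaseName\" sdx:type = \"lucene\">"
        + "<sdx:fieldList xml:lang = \"fr-FR\">"
        + "<sdx:field code = \"fieldName\" type = \"word\" xml:lang = \"fr-FR\"/>"
        + "<sdx:field code = \"fieldName2\" type = \"field\" xml:lang = \"fr-FR\" brief = \"true\"/>"
        + "<sdx:field code = \"fieldName3\" type = \"date\" xml:lang = \"fr-FR\"/>"
        + "<sdx:field code = \"fieldName4\" type = \"unindexed\" xml:lang = \"fr-FR\"/>"
        + "</sdx:fieldList>"
        + "</sdx:documentBase>";

    /** Returns one "code:type" entry per sdx:field element, in document order. */
    static List<String> describeFields(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace-unaware on purpose: "sdx:field" is matched as a plain tag name.
        factory.setNamespaceAware(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList fields = doc.getElementsByTagName("sdx:field");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            result.add(field.getAttribute("code") + ":" + field.getAttribute("type"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String entry : describeFields(SAMPLE)) {
            System.out.println(entry);
        }
    }
}
```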
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

protected void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureDocumentBase in class SDXDocumentBase
Parameters:
configuration - Configuration
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

protected void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
Parameters:
configuration -
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

public void reloadFieldList(java.lang.String appConfString)
throws SDXException
Parameters:
appConfString - The path of the configuration file which contains the new fieldList (e.g., file:///myFiles/application.xconf, cocoon://myApplication/conf/application.xconf)
Throws:
SDXException

public void replaceFieldList(FieldList fieldList) throws org.apache.avalon.framework.configuration.ConfigurationException
Parameters:
fieldList - The new fieldList which replaces the old one
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

protected void configureSearchIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

public OAIRepository createOAIRepository(java.lang.String repoId)
createOAIRepository in class AbstractDocumentBase
Parameters:
repoId - String The id of the repository to create

public OAIRepository createOAIRepository()
createOAIRepository in interface DocumentBase
createOAIRepository in class AbstractDocumentBase
See Also:
createOAIRepository(String)

public OAIRepository createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
Parameters:
configuration - The configuration

protected void configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIRepositories in class SDXDocumentBase
Parameters:
configuration -
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

protected void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIRepository in class SDXDocumentBase
Parameters:
configuration - The configuration
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
See Also:
SDXDocumentBase.configureOAIRepository(org.apache.avalon.framework.configuration.Configuration)

protected void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIHarvester in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

public void index(IndexableDocument[] docs, Repository repository, IndexParameters params, org.xml.sax.ContentHandler handler) throws SDXException, org.xml.sax.SAXException, org.apache.cocoon.ProcessingException
After adding the document to the search index, this method recycles the Lucene searcher if :
index in interface DocumentBase
index in class SDXDocumentBase
Parameters:
docs - The documents to add.
repository - The repository where to store the documents. If null is passed, the default repository will be used.
params - The parameters for this adding action.
handler - A content handler where to send information about the process (may be null)
TODO : what kind of "informations" ? -pb
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
See Also:
SDXDocumentBase.index(fr.gouv.culture.sdx.document.IndexableDocument[], fr.gouv.culture.sdx.repository.Repository, fr.gouv.culture.sdx.documentbase.IndexParameters, org.xml.sax.ContentHandler)

public void delete(Document[] docs, org.xml.sax.ContentHandler handler) throws SDXException, org.xml.sax.SAXException, org.apache.cocoon.ProcessingException
Deletes one or more documents from this LuceneDocumentBase and recycles the Lucene searcher if only one document is deleted or if the LuceneDocumentBase is not autoOptimize.
delete in interface DocumentBase
delete in class SDXDocumentBase
Parameters:
docs - The documents to delete.
handler - A content handler to feed with information.
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
See Also:
AbstractDocumentBase.delete(Document, ContentHandler)

protected IndexParameters setBaseParameters(IndexParameters params)
setBaseParameters in class SDXDocumentBase
Parameters:
params - The params object provided by the user at indexation time

public java.util.HashMap getXMLFieldList()
Description copied from class: SDXDocumentBase
getXMLFieldList in class SDXDocumentBase

public Index getIndex()
public LuceneIndex getLuceneIndex()
protected void setSearchIndexParameters(LuceneIndexParameters params)
Parameters:
params - The Lucene-specific params to use

protected void addToSearchIndex(java.lang.Object indexationDoc, boolean batchIndex)
throws SDXException
addToSearchIndex in class SDXDocumentBase
Parameters:
indexationDoc - The Document to add
batchIndex -
Throws:
SDXException

protected void deleteFromSearchIndex(java.lang.String docId)
throws SDXException
deleteFromSearchIndex in class SDXDocumentBase
Throws:
SDXException

protected void compactSearchIndex()
throws SDXException
compactSearchIndex in class SDXDocumentBase
Throws:
SDXException

protected java.lang.Object getIndexationDocument(IndexableDocument doc, java.lang.String storeDocId, java.lang.String repoId, IndexParameters params) throws SDXException
getIndexationDocument in class SDXDocumentBase
Throws:
SDXException

public java.util.Date lastModificationDate()
public java.util.Date creationDate()
public void init()
throws SDXException
Description copied from interface: DocumentBase
This method must be called after the super.getLog() has been set and the configuration is done.
init in interface DocumentBase
init in class SDXDocumentBase
Throws:
SDXException

protected boolean initToSax()
Description copied from class: AbstractSdxObject
initToSax in class SDXDocumentBase

protected void initVolatileObjectsToSax()
Some objects need to be refreshed each time toSAX is called.
initVolatileObjectsToSax in class SDXDocumentBase

public void optimize()
optimize in interface DocumentBase
optimize in class SDXDocumentBase

public void mergeCurrentBatch()
Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base).
mergeCurrentBatch in class SDXDocumentBase

public void indexModified()
indexModified in class SDXDocumentBase

public void splitIndex(boolean currentIndex)
throws java.io.IOException, SDXException
Splits the current big index into two smaller ones.
splitIndex in class SDXDocumentBase
Throws:
java.io.IOException
SDXException

protected void initializeVectorizedIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
Initializes the index vector by searching for all sub-indexes in its directory.
NB: working as intended.
Throws:
org.apache.avalon.framework.configuration.ConfigurationException

protected void addSubIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
SDXException - If it is impossible to configure or initialize the sub-index to add.
org.apache.avalon.framework.configuration.ConfigurationException

protected void removeSubIndex()
public boolean splitCheck(boolean currentIndex)
throws SDXException
Returns true when the splitting conditions are reached. If so, it should be followed by a splitIndex() call. Controls order:
splitCheck in class SDXDocumentBase
Parameters:
currentIndex - boolean indicating whether the test concerns the current index (true) or the active one (false)
Returns:
true when the splitting conditions are reached, false otherwise.
Throws:
SDXException

protected long getIndexSize(LuceneIndex index)
Parameters:
index - LuceneIndex

public org.apache.lucene.search.Searcher getSearcher()
throws SDXException
Returns the index searcher for all of this document base's indexes.
Throws:
SDXException - If it is not possible to build the MultiSearcher.
See Also:
ParallelMultiSearcher

public org.apache.lucene.index.IndexReader getIndexReader()
throws SDXException
Returns the index reader for all of this document base's indexes.
Throws:
SDXException - If it is not possible to build the MultiReader.
See Also:
MultiReader

protected java.lang.String getFormatedSubIndexId(int subIndexNumber)
Parameters:
subIndexNumber - int representing the number of the sub-index

protected void addSubIndex(LuceneIndex index) throws SDXException
Parameters:
index - LuceneIndex
Throws:
SDXException - If it is not possible to configure and initialize the sub-index.

protected void renewKeyIndex()
throws SDXException
Throws:
SDXException - If it is impossible to free resources or initialize the Lucene index.

public void backup(SaveParameters save_config) throws SDXException
backup in interface Saveable
backup in class SDXDocumentBase
Parameters:
save_config - SaveParameters
Throws:
SDXException
See Also:
Saveable.backup(fr.gouv.culture.sdx.utils.save.SaveParameters)

protected void backupIndexes(SaveParameters save_config) throws SDXException
backupIndexes in class SDXDocumentBase
Throws:
SDXException

protected void backupTimeStamp(SaveParameters save_config) throws SDXException
backupTimeStamp in class SDXDocumentBase
Throws:
SDXException

public void restore(SaveParameters save_config) throws SDXException
restore in interface Saveable
restore in class SDXDocumentBase
Throws:
SDXException
See Also:
Saveable.restore(fr.gouv.culture.sdx.utils.save.SaveParameters)

protected void restoreIndexes(SaveParameters save_config) throws SDXException
restoreIndexes in class SDXDocumentBase
Throws:
SDXException

protected void restoreTimeStamp(SaveParameters save_config) throws SDXException
restoreTimeStamp in class SDXDocumentBase
Throws:
SDXException

public int docCount()
public void mergeBatch()
throws SDXException
mergeBatch in class SDXDocumentBase
Throws:
SDXException

Copyright © 2000-2010 Ministere de la culture et de la communication / AJLSM. All Rights Reserved.