Note

This documentation is for OMERO 5.2. This version is now in maintenance mode and will only be updated in the event of critical bugs or security concerns. OMERO 5.3 is expected in the first quarter of 2017.

File parsers

File parsers extract text from various file types and provide it as a Reader to the FullTextBridge for use during search indexing. Plain text formats can use the default fileParser bean, but any specialized format, such as PDF or RTF, requires additional libraries and explicit registration.
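
The idea of mapping a file type to a parser that hands back a Reader can be sketched as follows. This is a minimal, self-contained illustration and not OMERO code: the names `FileParserSketch`, `TextParser`, `PlainTextParser`, `ParserRegistry`, `defaultRegistry`, `writeTemp` and `readAll` are all hypothetical, and the assumption that the three text formats share one plain-text parser is made only for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; none of these types are OMERO classes. */
final class FileParserSketch {

    /** Hypothetical stand-in for a parser: turns a file into a Reader over its text. */
    interface TextParser {
        Reader parse(File file);
    }

    /** Plain-text parser: the file contents are already the text to index. */
    static final class PlainTextParser implements TextParser {
        @Override
        public Reader parse(File file) {
            try {
                return Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Hypothetical registry mapping a MIME type to the parser registered for it. */
    static final class ParserRegistry {
        private final Map<String, TextParser> parsers = new HashMap<>();

        void register(String mimeType, TextParser parser) {
            parsers.put(mimeType, parser);
        }

        /** Returns the registered parser, or empty if the type has none. */
        Optional<TextParser> parserFor(String mimeType) {
            return Optional.ofNullable(parsers.get(mimeType));
        }
    }

    /** Assumption for this sketch: plain-text formats share a single parser. */
    static ParserRegistry defaultRegistry() {
        ParserRegistry registry = new ParserRegistry();
        TextParser plain = new PlainTextParser();
        registry.register("text/plain", plain);
        registry.register("text/csv", plain);
        registry.register("text/xml", plain);
        return registry;
    }

    /** Writes the given contents to a temporary UTF-8 file for the demonstration. */
    static File writeTemp(String contents) {
        try {
            Path path = Files.createTempFile("parser-sketch", ".txt");
            Files.write(path, contents.getBytes(StandardCharsets.UTF_8));
            path.toFile().deleteOnExit();
            return path.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Drains and closes a Reader, as an indexer might before tokenizing the text. */
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = reader) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ParserRegistry registry = defaultRegistry();
        File file = writeTemp("text destined for the search index");
        // Look up the parser by MIME type, then read the text it exposes.
        registry.parserFor("text/plain")
                .map(parser -> readAll(parser.parse(file)))
                .ifPresent(System.out::println);
    }
}
```

In this sketch a type with no registered parser, such as `application/pdf`, simply yields an empty `Optional`; a specialized parser backed by an external library would have to be registered separately, which mirrors the registration requirement described above.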

Configuration

Currently, configuration takes place solely in service-ome.api.Search.xml. Eventually, it should be possible to replace file parsers at configuration time or even at runtime.

Available parsers

File type	Parser
application/pdf	http://pdfbox.apache.org
text/xml	(internal)
text/plain	(internal)
text/csv	(internal)

The base class for file parsers is FileParser.java.
