Class PdfParser

  • All Implemented Interfaces:
    org.springframework.beans.factory.Aware, org.springframework.context.ApplicationContextAware

    public class PdfParser
    extends FileParser
    FileParser for "application/pdf" files using PDFBox.
    • Constructor Detail

      • PdfParser

        public PdfParser()
    • Method Detail

      • doParse

        public java.lang.Iterable<java.io.Reader> doParse(java.io.File file)
                                                   throws java.lang.Exception
        Description copied from class: FileParser
        Template method to parse a File into manageable chunks. The default implementation reads from the file lazily, with consecutive chunks overlapping at the final white space. For example, a file containing "The quick brown fox jumps over the lazy dog" might be parsed into "The quick brown fox jumps" and "jumps over the lazy dog". Receives a non-null, readable File instance from FileParser.parse(File) and may return a null Iterable or throw an Exception. In either of those unsuccessful cases, the FileParser.EMPTY Iterable is returned to the consumer.
        Overrides:
        doParse in class FileParser
        Throws:
        java.lang.Exception
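
The overlap behaviour described for doParse can be illustrated with a small, self-contained sketch. This is not the FileParser or PdfParser implementation: the class OverlappingChunks, the method chunk, and the chunkSize parameter are hypothetical names introduced here, and the sketch works eagerly on an in-memory String rather than lazily on a File. It assumes that each chunk is a fixed-size window and that the next chunk resumes just after the last white space of the previous one, which reproduces the "quick brown fox" example from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of FileParser or PdfParser.
final class OverlappingChunks {

    // Splits text into windows of at most chunkSize characters. Each new
    // window starts right after the last white space of the previous window,
    // so the trailing (possibly cut-off) word is repeated in the next chunk.
    public static List<String> chunk(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the whole input has been consumed
            }
            // Search backwards for the final white space inside this window.
            // The bound i > start guarantees the next window moves forward.
            int lastSpace = -1;
            for (int i = end - 1; i > start; i--) {
                if (Character.isWhitespace(text.charAt(i))) {
                    lastSpace = i;
                    break;
                }
            }
            // With no usable white space, continue without any overlap.
            start = (lastSpace >= 0) ? lastSpace + 1 : end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        for (String c : chunk(text, 25)) {
            System.out.println(c);
        }
    }
}
```

With a window of 25 characters, the first chunk ends exactly after "jumps", and the second chunk begins again at "jumps" because that word follows the last white space of the first window. A real implementation would read from a Reader incrementally instead of holding the whole text in memory.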