Interface TextAssembler

  • All Known Implementing Classes:
    MarkedUpTextAssembler

    public interface TextAssembler
    Processes a series of objects and text fragments, assembling them into one final text object that represents the whole content.
    • Method Detail

      • process

        void process​(FinalText completed,
                     java.lang.String contextName)
        Parameters:
        completed - a complete chunk; this subsection is simply added at the proper place.
        contextName - name of the element context we are in; null if it is an Artifact.
      • process

        void process​(Word completed,
                     java.lang.String contextName)
        Parameters:
        completed - a complete chunk; this subsection is simply added at the proper place.
        contextName - name of the element context we are in; null if it is an Artifact.
      • process

        void process​(ParsedText parsed,
                     java.lang.String contextName)
        Parameters:
        parsed - one of a number of raw PDF text chunks, with placement, font, etc.
        contextName - name of the element context we are in; null if it is an Artifact.
      • renderText

        void renderText​(FinalText completed)
        Parameters:
        completed - a complete chunk; this subsection is simply added at the proper place.
      • renderText

        void renderText​(ParsedTextImpl parsed)
        Parameters:
        parsed - one of a number of raw PDF text chunks, with placement, font, etc.
      • endParsingContext

        FinalText endParsingContext​(java.lang.String containingElementName)
        Parameters:
        containingElementName - name of the element that surrounds the extracted text
        Returns:
        the final text for the set of fragments and fully parsed items we were passed during processing.
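
The process and endParsingContext methods above suggest a collect-then-close flow: fragments are passed in one at a time, then the context is ended and a single text object comes back. The following is only a minimal sketch of that flow, not the actual implementation. The real FinalText, Word and ParsedText types are not shown in this section, so plain strings stand in for them, and the class name SketchTextAssembler is hypothetical. Skipping fragments whose contextName is null (Artifacts) is an assumption of this sketch, as is wrapping the result in angle-bracket markup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: plain strings replace the real FinalText, Word and
// ParsedText types, which are not shown in this section.
class SketchTextAssembler {
    private final List<String> fragments = new ArrayList<>();

    // Mirrors process(..., contextName): a null contextName marks an Artifact.
    // Assumption: this sketch skips artifacts; a real assembler may keep them.
    void process(String fragment, String contextName) {
        if (contextName == null) {
            return;
        }
        fragments.add(fragment);
    }

    // Mirrors endParsingContext: wraps everything collected so far in the
    // named element and clears the buffer for the next context.
    String endParsingContext(String containingElementName) {
        String body = String.join(" ", fragments);
        fragments.clear();
        return "<" + containingElementName + ">" + body
                + "</" + containingElementName + ">";
    }
}

class SketchTextAssemblerDemo {
    public static void main(String[] args) {
        SketchTextAssembler assembler = new SketchTextAssembler();
        assembler.process("Hello", "P");
        assembler.process("Page 3", null); // Artifact, skipped in this sketch
        assembler.process("world", "P");
        System.out.println(assembler.endParsingContext("P")); // prints <P>Hello world</P>
    }
}
```

Clearing the buffer inside endParsingContext is a design choice of the sketch so that one instance can serve several contexts in a row; whether the real interface behaves this way is not stated here.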
      • getWordId

        java.lang.String getWordId()
        The assembler can calculate an identifier for each word on a page, for use in markup.
        Returns:
        the new unique id.
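
One way to picture getWordId together with setPage is a per-page counter. The sketch below is illustrative only: the class name SketchWordIds and the id format "p<page>w<n>" are assumptions, and restarting the numbering on each setPage call is a guess at reasonable behavior, not something this section specifies.

```java
// Hypothetical sketch; the id format "p<page>w<n>" is an assumption,
// not the format used by any real implementation.
class SketchWordIds {
    private int page = 0;
    private int nextWord = 0;

    // Mirrors setPage: remember the page and restart word numbering.
    void setPage(int page) {
        this.page = page;
        this.nextWord = 0;
    }

    // Mirrors getWordId: each call returns a new id, unique within the page.
    String getWordId() {
        nextWord++;
        return "p" + page + "w" + nextWord;
    }
}

class SketchWordIdsDemo {
    public static void main(String[] args) {
        SketchWordIds ids = new SketchWordIds();
        ids.setPage(2);
        System.out.println(ids.getWordId()); // prints p2w1
        System.out.println(ids.getWordId()); // prints p2w2
    }
}
```

Including the page number in the id keeps ids distinct across pages even though the word counter restarts.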
      • setPage

        void setPage​(int page)
        Parameters:
        page - number of the page we are assembling
      • reset

        void reset()