Package com.lowagie.text.pdf.parser
Class Word
java.lang.Object
  com.lowagie.text.pdf.parser.ParsedTextImpl
    com.lowagie.text.pdf.parser.Word
All Implemented Interfaces:
TextAssemblyBuffer
public class Word extends ParsedTextImpl
Field Summary
Fields
private boolean breakBefore
If this word or fragment was preceded by a space or a line break, it should never be merged into a preceding word.
private boolean shouldNotSplit
Whether this is an indivisible fragment, because it contained a space or was split from a space-containing string.
-
Method Summary
Methods
void accumulate(TextAssembler p, java.lang.String contextName)
Accept a visitor that is assembling text.
void assemble(TextAssembler p)
Accept a visitor that is assembling text.
boolean breakBefore()
private static java.lang.String escapeHTML(java.lang.String s)
private static java.lang.String formatPercent(float f)
FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
boolean shouldNotSplit()
java.lang.String toString()
private java.lang.String wordMarkup(java.lang.String text, PdfReader reader, int page, TextAssembler assembler)
Generate markup for this word.
-
Methods inherited from class com.lowagie.text.pdf.parser.ParsedTextImpl
getAscent, getBaseline, getDescent, getEndPoint, getSingleSpaceWidth, getStartPoint, getText, getWidth
-
Field Detail
-
shouldNotSplit
private final boolean shouldNotSplit
Whether this is an indivisible fragment, because it contained a space or was split from a space-containing string. Non-splittable words can still be merged (into new non-splittable words).
-
breakBefore
private final boolean breakBefore
If this word or fragment was preceded by a space, or a line break, it should never be merged into a preceding word.
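
The merging rule described by these two fields can be sketched as follows. This is a minimal, self-contained illustration and not the library's code: `FragmentSketch`, `canMergeInto`, and `merge` are hypothetical names, and the choice that a merged result is non-splittable when either part is non-splittable is an assumption drawn from the field descriptions above.

```java
/** Hypothetical stand-in for a parsed word; this is not the library's Word class. */
final class FragmentSketch {
    final String text;
    final boolean shouldNotSplit;
    final boolean breakBefore;

    FragmentSketch(String text, boolean shouldNotSplit, boolean breakBefore) {
        this.text = text;
        this.shouldNotSplit = shouldNotSplit;
        this.breakBefore = breakBefore;
    }

    /** A fragment may join the preceding one only if no space or line break came before it. */
    boolean canMergeInto(FragmentSketch preceding) {
        return preceding != null && !breakBefore;
    }

    /**
     * Assumption: the merged fragment keeps the left fragment's breakBefore flag
     * and is non-splittable if either input was non-splittable.
     */
    static FragmentSketch merge(FragmentSketch left, FragmentSketch right) {
        if (!right.canMergeInto(left)) {
            throw new IllegalArgumentException("right fragment starts a new word");
        }
        return new FragmentSketch(
                left.text + right.text,
                left.shouldNotSplit || right.shouldNotSplit,
                left.breakBefore);
    }

    public static void main(String[] args) {
        FragmentSketch hel = new FragmentSketch("Hel", false, true);
        FragmentSketch lo = new FragmentSketch("lo", false, false);
        FragmentSketch world = new FragmentSketch("world", false, true);

        System.out.println(lo.canMergeInto(hel));    // "lo" had no break before it
        System.out.println(world.canMergeInto(hel)); // "world" was preceded by a space
        System.out.println(merge(hel, lo).text);
    }
}
```

In this sketch the decision depends only on the flag of the fragment on the right, which mirrors the wording "should never be merged into a preceding word".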
-
-
Constructor Detail
-
Word
Word(java.lang.String text, float ascent, float descent, Vector startPoint, Vector endPoint, Vector baseline, float spaceWidth, boolean isCompleteWord, boolean breakBefore)
- Parameters:
text - text content
ascent - font ascent (e.g. height)
descent - how far below the baseline letters go
startPoint - first point of the text
endPoint - ending offset of the text
baseline - line along which the text is set
spaceWidth - how much space a space is supposed to take
isCompleteWord - word should never be split
breakBefore - word starts here; should never combine to the left
-
-
Method Detail
-
formatPercent
private static java.lang.String formatPercent(float f)
-
escapeHTML
private static java.lang.String escapeHTML(java.lang.String s)
-
accumulate
public void accumulate(TextAssembler p, java.lang.String contextName)
Accept a visitor that is assembling text.
- Parameters:
p - the assembler that is visiting us.
contextName - the name of the wrapping markup element, if any.
- See Also:
TextAssemblyBuffer.accumulate(com.lowagie.text.pdf.parser.TextAssembler, String)
-
assemble
public void assemble(TextAssembler p)
Accept a visitor that is assembling text.
- Parameters:
p - the assembler that is visiting us.
- See Also:
TextAssemblyBuffer.assemble(com.lowagie.text.pdf.parser.TextAssembler)
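
"Accept a visitor" refers to the visitor pattern: the element hands itself to the assembler, which decides what to do with it. The sketch below shows that double dispatch in isolation. All names in it (`VisitorSketch`, `Assembler`, `Buffer`, `visitWord`, `JoiningAssembler`) are hypothetical; this page does not show which methods the real TextAssembler exposes, so none of them should be read as the library's API.

```java
import java.util.Arrays;
import java.util.List;

final class VisitorSketch {
    /** Hypothetical visitor; stands in for the role TextAssembler plays. */
    interface Assembler {
        void visitWord(WordSketch word);
    }

    /** Hypothetical element interface; stands in for the role TextAssemblyBuffer plays. */
    interface Buffer {
        void assemble(Assembler p);
    }

    static final class WordSketch implements Buffer {
        final String text;

        WordSketch(String text) {
            this.text = text;
        }

        /** Double dispatch: the word passes itself back to the visiting assembler. */
        @Override
        public void assemble(Assembler p) {
            p.visitWord(this);
        }
    }

    /** A visitor that joins the words it is handed with single spaces. */
    static final class JoiningAssembler implements Assembler {
        private final StringBuilder out = new StringBuilder();

        @Override
        public void visitWord(WordSketch word) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word.text);
        }

        String result() {
            return out.toString();
        }
    }

    static String assembleAll(List<? extends Buffer> parts) {
        JoiningAssembler assembler = new JoiningAssembler();
        for (Buffer part : parts) {
            part.assemble(assembler);
        }
        return assembler.result();
    }

    public static void main(String[] args) {
        List<WordSketch> words = Arrays.asList(new WordSketch("Hello"), new WordSketch("world"));
        System.out.println(assembleAll(words));
    }
}
```

The benefit of this arrangement is that new kinds of text element can be added without changing the loop that walks them; each element chooses which assembler callback to invoke.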
-
wordMarkup
private java.lang.String wordMarkup(java.lang.String text, PdfReader reader, int page, TextAssembler assembler)
Generate markup for this word. Sends the assembler a string representing a CSS style that will format this word nicely.
- Parameters:
text - passed in because we may have wanted to alter it, e.g. by trimming white space or filtering characters.
reader - the file reader from which we are extracting
page - number of the page we are reading text from
assembler - object to assemble text from fragments and larger strings on a page.
- Returns:
markup to represent this one word.
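
To make the idea of per-word markup with a CSS style concrete, here is a small sketch of how an escaped word could be wrapped in a styled element. It is an illustration under stated assumptions only: this page does not show the bodies of escapeHTML, formatPercent, or wordMarkup, so the `span` element, the `left` and `width` properties, the two-decimal percent format, and every name in `MarkupSketch` are hypothetical.

```java
import java.util.Locale;

final class MarkupSketch {
    /** Escape the characters that are significant in HTML text and attribute values. */
    static String escapeHtml(String s) {
        // '&' must be replaced first so the entities added below are not escaped again.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Format a number as a CSS percentage with two decimals, independent of the default locale. */
    static String formatPercent(float f) {
        return String.format(Locale.ROOT, "%.2f%%", f);
    }

    /** Assumption: position and size are expressed as percentages of the page width. */
    static String wordMarkup(String text, float leftPercent, float widthPercent) {
        return "<span style=\"left: " + formatPercent(leftPercent)
                + "; width: " + formatPercent(widthPercent) + "\">"
                + escapeHtml(text)
                + "</span>";
    }

    public static void main(String[] args) {
        // prints <span style="left: 12.50%; width: 30.00%">a&lt;b</span>
        System.out.println(wordMarkup("a<b", 12.5f, 30f));
    }
}
```

Using `Locale.ROOT` keeps the decimal separator a period, which matters because CSS does not accept a comma in numeric values.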
-
getFinalText
public FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
- Parameters:
reader - the PdfReader that knows about our document (size, etc. available here).
page - the page we are extracting text from.
assembler - builds the result by accepting content from text components of various sorts.
useMarkup - whether to generate tagged text or just plain text.
- Returns:
the final text, ready to concatenate into the result string.
- See Also:
TextAssemblyBuffer.getFinalText(PdfReader, int, TextAssembler, boolean)
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
shouldNotSplit
public boolean shouldNotSplit()
- Specified by:
shouldNotSplit in class ParsedTextImpl
- Returns:
- true if this was extracted from a string containing spaces, in which case, we assume further splitting is not needed.
- See Also:
ParsedTextImpl.shouldNotSplit()
-
breakBefore
public boolean breakBefore()
- Specified by:
breakBefore in class ParsedTextImpl
- Returns:
- true if this was a space or other item that should force a space before it.
- See Also:
ParsedTextImpl.breakBefore()
-
-