Xml¶
New in v1.21.0
This represents an HTML or an XML node. It is a helper class intended to access the DOM (Document Object Model) content of a Story object.
There is no need to ever directly construct an Xml object: after creating a Story, simply take Story.body
– which is an Xml node – and use it to navigate your way through the story’s DOM.
Method / Attribute | Description |
---|---|
Xml.add_bullet_list() | Add a ul tag - bulleted list, context manager. |
Xml.add_codeblock() | Add a pre tag, context manager. |
Xml.add_description_list() | Add a dl tag, context manager. |
Xml.add_division() | Add a div tag (renamed from “section”), context manager. |
Xml.add_header() | Add a header tag (one of h1 to h6), context manager. |
Xml.add_horizontal_line() | Add an hr tag. |
Xml.add_image() | Add an img tag. |
Xml.add_link() | Add an a tag. |
Xml.add_number_list() | Add an ol tag, context manager. |
Xml.add_paragraph() | Add a p tag, context manager. |
Xml.add_span() | Add a span tag, context manager. |
Xml.add_subscript() | Add subscript text (sub tag) - inline element, treated like text. |
Xml.add_superscript() | Add superscript text (sup tag) - inline element, treated like text. |
Xml.add_code() | Add code text (code tag) - inline element, treated like text. |
Xml.add_var() | Add variable text (var tag) - inline element, treated like text. |
Xml.add_samp() | Add sample output text (samp tag) - inline element, treated like text. |
Xml.add_kbd() | Add keyboard input text (kbd tag) - inline element, treated like text. |
Xml.add_text() | Add a text string. Line breaks \n are honored as br tags. |
Xml.append_child() | Append a child node. |
Xml.clone() | Make a copy of this node. |
Xml.create_element() | Make a new node with a given tag name. |
Xml.create_text_node() | Create direct text for the current node. |
Xml.find() | Find a sub-node with given properties. |
Xml.find_next() | Repeat previous “find” with the same criteria. |
Xml.insert_after() | Insert an element after current node. |
Xml.insert_before() | Insert an element before current node. |
Xml.remove() | Remove this node. |
Xml.set_align() | Set the alignment using a CSS style spec. Only works for block-level tags. |
Xml.set_attribute() | Set an arbitrary key to some value (which may be empty). |
Xml.set_bgcolor() | Set the background color. Only works for block-level tags. |
Xml.set_bold() | Set bold on or off or to some string value. |
Xml.set_color() | Set text color. |
Xml.set_columns() | Set the number of columns. Argument may be any valid number or string. |
Xml.set_font() | Set the font-family, e.g. “sans-serif”. |
Xml.set_fontsize() | Set the font size. Either a float or a valid HTML/CSS string. |
Xml.set_id() | Set an id. A check for uniqueness is performed. |
Xml.set_italic() | Set italic on or off or to some string value. |
Xml.set_leading() | Set inter-block text distance (-mupdf-leading), only works on block-level nodes. |
Xml.set_lineheight() | Set height of a line. Float like 1.5, which sets to 1.5 * fontsize. |
Xml.set_margins() | Set the margin(s), float or string with up to 4 values. |
Xml.set_pagebreak_after() | Insert a page break after this node. |
Xml.set_pagebreak_before() | Insert a page break before this node. |
Xml.set_properties() | Set any or all desired properties in one call. |
Xml.add_style() | Set (add) a “style” that is not supported by its own set_ method. |
Xml.add_class() | Set (add) a “class” attribute. |
Xml.set_text_indent() | Set indentation for first textblock line. Only works for block-level nodes. |
Xml.tagname | Either the HTML tag name like p or None if a text node. |
Xml.text | Either the node’s text or None if a tag node. |
Xml.is_text | Check if the node is a text. |
Xml.first_child | Contains the first node one level below this one (or None). |
Xml.last_child | Contains the last node one level below this one (or None). |
Xml.next | The next node at the same level (or None). |
Xml.previous | The previous node at the same level. |
Xml.root | The top node of the DOM, which hence has the tagname html. |
Class API
- class Xml¶
-
- add_header(value)¶
Add a header tag (one of h1 to h6), context manager. See headings.
- Parameters:
value (int) – a value 1 - 6.
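As a rough sketch of how the context-manager form might be used, the following helper adds a heading followed by a paragraph. The helper name `add_titled_section` and its parameters are hypothetical; only `add_header()`, `add_paragraph()` and `add_text()` come from this page, and `body` is assumed to be `Story.body`.

```python
def add_titled_section(body, title, text, level=2):
    """Append a heading of the given level, then a paragraph (hypothetical helper)."""
    if not 1 <= level <= 6:
        raise ValueError("level must be between 1 and 6")
    with body.add_header(level) as header:  # h1 .. h6, depending on level
        header.add_text(title)
    with body.add_paragraph() as para:
        para.add_text(text)
```

A call such as `add_titled_section(story.body, "Introduction", "Some text.")` would then be expected to append an h2 and a p node to the body.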
- add_image(name, width=None, height=None)¶
Add an img tag. This causes the inclusion of the named image in the DOM.
- Parameters:
name (str) – the filename of the image. This must be the member name of some entry of the Archive parameter of the Story constructor.
width – if provided, either an absolute (int) value, or a percentage string like “30%”. A percentage value refers to the width of the specified where rectangle in Story.place(). If this value is provided and height is omitted, the image will be included keeping its aspect ratio.
height – if provided, either an absolute (int) value, or a percentage string like “30%”. A percentage value refers to the height of the specified where rectangle in Story.place(). If this value is provided and width is omitted, the image’s aspect ratio will be honored.
- add_link(href, text=None)¶
Add an a tag - inline element, treated like text.
- Parameters:
href (str) – the URL target.
text (str) – the text to display. If omitted, the href text is shown instead.
- add_number_list()¶
Add an ol tag, context manager.
- add_paragraph()¶
Add a p tag, context manager.
- add_subscript(text)¶
Add “subscript” text(sub tag) - inline element, treated like text.
- add_superscript(text)¶
Add “superscript” text (sup tag) - inline element, treated like text.
- add_code(text)¶
Add “code” text (code tag) - inline element, treated like text.
- add_var(text)¶
Add “variable” text (var tag) - inline element, treated like text.
- add_samp(text)¶
Add “sample output” text (samp tag) - inline element, treated like text.
- add_kbd(text)¶
Add “keyboard input” text (kbd tag) - inline element, treated like text.
- set_align(value)¶
Set the text alignment. Only works for block-level tags.
- Parameters:
value – either one of the Text Alignment or the text-align values.
- set_attribute(key, value=None)¶
Set an arbitrary key to some value (which may be empty).
- Parameters:
key (str) – the name of the attribute.
value (str) – the (optional) value of the attribute.
- get_attributes()¶
Retrieve all attributes of the current node as a dictionary.
- Returns:
a dictionary with the attributes and their values of the node.
- get_attribute_value(key)¶
Get the attribute value of key.
- Parameters:
key (str) – the name of the attribute.
- Returns:
a string with the value of key.
- remove_attribute(key)¶
Remove the attribute key from the node.
- Parameters:
key (str) – the name of the attribute.
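The attribute methods above could be combined as in the sketch below, which reads an attribute and sets a default when it is missing. The helper name `ensure_attribute` is hypothetical, and the sketch assumes that `get_attribute_value()` yields `None` or an empty string for an absent attribute.

```python
def ensure_attribute(node, key, default):
    """Return the node's value for key, setting it to default first if missing.

    Hypothetical helper; node is expected to be an Xml node.
    """
    value = node.get_attribute_value(key)
    if not value:  # assumption: None or "" means the attribute is absent
        node.set_attribute(key, default)
        value = default
    return value
```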
- set_bgcolor(value)¶
Set the background color. Only works for block-level tags.
- Parameters:
value – either an RGB value like (255, 0, 0) (for “red”) or a valid background-color value.
- set_bold(value)¶
Set bold on or off or to some string value.
- Parameters:
value – True, False or a valid font-weight value.
- set_color(value)¶
Set the color of the text following.
- Parameters:
value – either an RGB value like (255, 0, 0) (for “red”) or a valid color value.
- set_columns(value)¶
Set the number of columns.
- Parameters:
value – a valid columns value.
Note
Currently ignored - supported in a future MuPDF version.
- set_font(value)¶
Set the font-family.
- Parameters:
value (str) – e.g. “sans-serif”.
- set_fontsize(value)¶
Set the font size for text following.
- Parameters:
value – a float or a valid font-size value.
- set_id(unqid)¶
Set an id. This serves as a unique identification of the node within the DOM. Use it to easily locate the node to inspect or modify it. A check for uniqueness is performed.
- Parameters:
unqid (str) – id string of the node.
- set_italic(value)¶
Set italic on or off or to some string value for the text following it.
- Parameters:
value – True, False or some valid font-style value.
- set_leading(value)¶
Set inter-block text distance (-mupdf-leading), only works on block-level nodes.
- Parameters:
value (float) – the distance in points to the previous block.
- set_lineheight(value)¶
Set height of a line.
- Parameters:
value – a float like 1.5 (which sets to 1.5 * fontsize), or some valid line-height value.
- set_margins(value)¶
Set the margin(s).
- Parameters:
value – float or string with up to 4 values. See CSS documentation.
- set_pagebreak_after()¶
Insert a page break after this node.
- set_pagebreak_before()¶
Insert a page break before this node.
- set_properties(align=None, bgcolor=None, bold=None, color=None, columns=None, font=None, fontsize=None, indent=None, italic=None, leading=None, lineheight=None, margins=None, pagebreak_after=False, pagebreak_before=False, unqid=None, cls=None)¶
Set any or all desired properties in one call. Each argument has the same meaning as the value of the corresponding set_ method.
Note
The properties set by this method are directly attached to the node, whereas every set_ method generates a new span below the current node that has the respective property. So to e.g. “globally” set some property for the body, this method must be used.
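The difference described in the note can be sketched as follows. Both function names are hypothetical, and the chosen font and size are arbitrary illustration values; `body` is assumed to be `Story.body` and `para` a paragraph node below it.

```python
def apply_base_style(body):
    """Attach properties directly to the body node (hypothetical helper)."""
    # Set on the node itself, so all content below the body inherits it.
    body.set_properties(font="sans-serif", fontsize=11)


def add_emphasized(para, text):
    """Add italic text to a paragraph, then switch italic off again."""
    # set_italic() opens a span below para; only text added afterwards is affected.
    para.set_italic().add_text(text)
    para.set_italic(False)
```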
- add_style(value)¶
Set (add) some style attribute not supported by its own set_ method.
- Parameters:
value (str) – any valid CSS style value.
- add_class(value)¶
Set (add) some “class” attribute.
- Parameters:
value (str) – the name of the class. Must have been defined in either the HTML or the CSS source of the DOM.
- set_text_indent(value)¶
Set indentation for the first textblock line. Only works for block-level nodes.
- Parameters:
value – a valid text-indent value. Please note that negative values do not work.
- append_child(node)¶
Append a child node. This is a low-level method used by other methods like Xml.add_paragraph().
- Parameters:
node – the Xml node to append.
- create_text_node(text)¶
Create direct text for the current node.
- Parameters:
text (str) – the text to append.
- Return type:
Xml
- Returns:
the created element.
- create_element(tag)¶
Create a new node with a given tag. This is a low-level method used by other methods like Xml.add_paragraph().
- Parameters:
tag (str) – the element tag.
- Return type:
Xml
- Returns:
the created element. To actually bind it to the DOM, use Xml.append_child().
- insert_before(elem)¶
Insert the given element elem before this node.
- Parameters:
elem – some Xml element.
- insert_after(elem)¶
Insert the given element elem after this node.
- Parameters:
elem – some Xml element.
- clone()¶
Make a copy of this node, which then may be appended (using Xml.append_child()) or inserted (using one of Xml.insert_before(), Xml.insert_after()) in this DOM.
- Returns:
the clone (Xml) of the current node.
- remove()¶
Remove this node from the DOM.
- debug()¶
For debugging purposes, print this node’s structure in a simplified form.
- find(tag, att, match)¶
Under the current node, find the first node with the given tag, attribute att and value match.
- Parameters:
tag (str) – restrict search to this tag. May be None for unrestricted searches.
att (str) – check this attribute. May be None.
match (str) – the desired attribute value to match. May be None.
- Return type:
Xml.
- Returns:
None if nothing found, otherwise the first matching node.
- find_next(tag, att, match)¶
Continue a previous Xml.find() (or find_next()) with the same values.
- Return type:
Xml.
- Returns:
None if no further match is found, otherwise the next matching node.
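One way to collect every match is to loop over `find()` and `find_next()`, as in this sketch. The generator name `find_all` is hypothetical, and the sketch assumes that `find_next()` is called on the most recently found node with the same criteria.

```python
def find_all(node, tag=None, att=None, match=None):
    """Yield every node below node that matches the criteria (hypothetical helper)."""
    found = node.find(tag, att, match)
    while found is not None:
        yield found
        # Assumption: the search is continued from the node found last.
        found = found.find_next(tag, att, match)
```

For instance, `[p.tagname for p in find_all(story.body, "p")]` would be expected to list one entry per paragraph in the story.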
- tagname¶
Either the HTML tag name like p or None if a text node.
- text¶
Either the node’s text or None if a tag node.
- is_text¶
Whether this node is a text node.
- first_child¶
Contains the first node one level below this one (or None).
- last_child¶
Contains the last node one level below this one (or None).
- next¶
The next node at the same level (or None).
- previous¶
The previous node at the same level.
- root¶
The top node of the DOM, which hence has the tagname html.
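The navigation attributes above are enough to traverse a DOM without any further API. The following is a minimal sketch of a depth-first walk; the generator name `walk` is hypothetical and relies only on `first_child` and `next`.

```python
def walk(node, depth=0):
    """Yield (depth, node) for node and all its descendants, depth-first."""
    yield depth, node
    child = node.first_child
    while child is not None:
        yield from walk(child, depth + 1)
        child = child.next  # sibling at the same level, or None
```

Used as `for depth, node in walk(story.body): ...`, this would visit tag nodes and text nodes alike; `node.is_text` can be checked to tell them apart.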
Setting Text properties¶
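In HTML, tags can be nested such that the innermost text inherits properties from all tags enclosing it. For example, in <p><b>some bold text<i>this is bold and italic</i></b>regular text</p> the text inside the i tag is both bold and italic.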
To achieve the same effect, methods like Xml.set_bold() and Xml.set_italic() each open a temporary span with the desired property underneath the current node.
In addition, these methods return their parent node, so they can be chained with each other.
Context Manager support¶
The standard way to add nodes to a DOM is this:
body = story.body
para = body.add_paragraph() # add a paragraph
para.set_bold() # text that follows will be bold
para.add_text("some bold text")
para.set_italic() # text that follows will additionally be italic
para.add_text("this is bold and italic")
para.set_italic(False).set_bold(False) # all following text will be regular
para.add_text("regular text")
Methods that are flagged as “context managers” can conveniently be used in this way:
body = story.body
with body.add_paragraph() as para:
    para.set_bold().add_text("some bold text")
    para.set_italic().add_text("this is bold and italic")
    para.set_italic(False).set_bold(False).add_text("regular text")
    para.add_text("more regular text")