=encoding utf8
=head1 NAME
gizmo - Saxon command-line interactive or batch-mode utility
=head1 SYNOPSIS
gizmo [I<options>]
=head1 DESCRIPTION
Saxon (from 10.0) provides the Gizmo command line utility, which can be used
interactively or in batch mode to perform simple operations such as examining
the content of a document, renaming or deleting selected elements in the
document, or performing simple XSLT transformations and XSD validation.
The utility can be started from the command line using the Java entry-point
class C<net.sf.saxon.Gizmo>, and its actions are controlled by sub-commands
entered one per line, either on the standard input, or in a file named in the
B<-q> option.
=head1 OPTIONS
=over
=item B<-s>:I<source.xml>
Defines the source document supplied as the input to the pipeline of
sub-commands. If omitted, input is taken from standard input.
=item B<-q>:I<script.txt>
Identifies a filename containing sub-commands to be executed, one per line. By
default, sub-commands are taken from standard input.
=back
=head1 COMMANDS
Each line of input is a sub-command. The various sub-commands are listed in the
following sections.
At any point in the processing, there may or may not be a current document. On
entry, the current document is the source document identified in the B<-s>
option, if specified. Many of the sub-commands change the current document. The
current document may be inspected using the B<show> sub-command, and may be
saved to filestore using the B<save> sub-command.
Most of the sub-commands select nodes within the current document using an
XPath expression. If unprefixed element names appear within the expression,
they match nodes in the source document by local-name alone. (That is, I<X>
means C<*:>I<X>). If you only want to select no-namespace names, use the form
C<Q{}>I<X>.
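The matching rule for unprefixed names can be illustrated outside Gizmo. The following is only a sketch: it uses Python's standard C<xml.etree.ElementTree> module rather than Saxon, and the sample document and the helper C<local_name> are invented for illustration. It contrasts matching by local name (the behaviour of an unprefixed I<X>) with matching no-namespace names only (the behaviour of C<Q{}>I<X>).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one X in no namespace, one X in the namespace urn:a.
SOURCE = '<r xmlns:a="urn:a"><X/><a:X/><Y/></r>'

def local_name(tag):
    """Strip ElementTree's '{uri}' prefix, leaving the local part of the name."""
    return tag.rsplit('}', 1)[-1]

root = ET.fromstring(SOURCE)

# Like the unprefixed step X (that is, *:X): any namespace, local name X.
by_local_name = [e.tag for e in root.iter() if local_name(e.tag) == 'X']

# Like Q{}X: only elements named X that are in no namespace.
no_namespace = [e.tag for e in root.iter() if e.tag == 'X']

print(by_local_name)
print(no_namespace)
```

ElementTree writes a namespaced name as C<{uri}local>, which is why the first list contains two differently spelled tags.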
Within XPath expressions, content completion is available for (a) recognized
XPath keywords such as “C<following-sibling>”, and (b) the names of elements
and attributes appearing in the current source document when it was first
loaded.
Some of the sub-commands also have a second argument which is a query.
Generally this will be an XQuery element constructor such as
C<E<lt>aE<gt>Title: {string(.)}E<lt>/aE<gt>>. It is evaluated with the context
item set to each item selected by the first expression, in turn.
=over
=item B<copy> I<expression>
Deletes everything other than the nodes selected by the expression; returns a
new document containing only the selected nodes. Note that if the expression
selects multiple elements, the new document will not be a well-formed XML
document (it will be a "fragment").
The result of the operation can be inspected using the command B<show>.
Example
The command:
copy //img
returns a document containing as its children copies of all the img elements in
the source.
=item B<call> {I<filename>}
Executes the script contained in the specified file.
The script shares the context with the calling script. It has access to the
variables and namespaces declared in the calling script, and any variables and
namespaces that it declared are available to the caller on return.
=item B<delete> I<expression>
The selected nodes (elements, attributes, or namespaces) are deleted, along
with their children and descendants.
The result of the operation can be inspected using the command B<show>.
Example
Given a source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
delete //city[@name='Rome']
produces the document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Paris" country="FR"/>
</cities>
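For readers who want to see the effect of B<delete> spelled out procedurally, here is a rough emulation of the example above in Python's standard C<xml.etree.ElementTree> module. It is a sketch of the observable result only, not of how Gizmo is implemented.

```python
import xml.etree.ElementTree as ET

SOURCE = """<cities>
  <city name="Berlin" country="DE"/>
  <city name="Paris" country="FR"/>
  <city name="Rome" country="IT"/>
</cities>"""

root = ET.fromstring(SOURCE)

# Emulates: delete //city[@name='Rome']
# ElementTree has no parent pointers, so visit every parent and drop
# the matching children from it.
for parent in list(root.iter()):
    for child in list(parent):
        if child.tag == 'city' and child.get('name') == 'Rome':
            parent.remove(child)  # removes the element and its whole subtree

remaining = [city.get('name') for city in root.iter('city')]
print(remaining)
```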
=item B<follow> I<expression> B<with> I<query>
After each node I<N> selected by the expression, the query is evaluated (with
I<N> as the context item), and its result is inserted as an immediate following
sibling of I<N>. Both the expression and the query must return nodes that are
capable of having siblings: that is, element, text, comment, or processing
instruction nodes. But if the query returns an atomic value, it is treated as a
text node with the same string value.
The result of the operation can be inspected using the command B<show>.
Examples
The command:
follow //img with <caption/>
inserts an empty C<E<lt>caption/E<gt>> element after every C<img> element in the
document.
Given a source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
follow //city[@name='Paris'] with <city name="Lyon" country="{@country}"/>
produces the document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
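The same insertion can be sketched in Python with the standard C<xml.etree.ElementTree> module. This is an illustration of the result only, under the assumption that the new sibling is built once per selected node with that node as the context; it is not Gizmo's own code.

```python
import xml.etree.ElementTree as ET

SOURCE = """<cities>
  <city name="Berlin" country="DE"/>
  <city name="Paris" country="FR"/>
  <city name="Rome" country="IT"/>
</cities>"""

root = ET.fromstring(SOURCE)

# Emulates: follow //city[@name='Paris'] with <city name="Lyon" country="{@country}"/>
for parent in list(root.iter()):
    # Walk backwards so an insertion never shifts an index still to be visited.
    for index in range(len(parent) - 1, -1, -1):
        node = parent[index]
        if node.tag == 'city' and node.get('name') == 'Paris':
            # The "query": evaluated with the selected node as its context.
            new = ET.Element('city', {'name': 'Lyon', 'country': node.get('country')})
            new.tail = node.tail  # keep the surrounding whitespace layout
            parent.insert(index + 1, new)

names = [city.get('name') for city in root.iter('city')]
print(names)
```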
=item B<help> [I<keyword>]
If the I<keyword> is the name of a command, a brief summary of the syntax and
purpose of the command is displayed.
If the I<keyword> is omitted, a list of available commands is displayed.
=item B<list> I<expression>
Outputs a representation of the result of the expression.
If a node is selected, it is shown as a path (for example
C<DOC/SECTION[3]/PARA[2]>), preceded by a line number if one is available.
If an atomic value is selected, its string value is displayed.
Line numbers are available in documents loaded directly from filestore, but not
in documents constructed using commands such as B<rename> or B<delete>.
Example
Given a source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
list //city[@name='Paris']
displays:
Line 3: /cities/city[2]
=item B<load> I<filename>
The current document is discarded, and a new current document is loaded from
the specified file.
If the filename is relative, it is taken as being relative to the current
working directory.
The user's home directory can be referred to as C<~>.
Content completion is available: use the tab key to suggest possible names at
each level.
=item B<namespace> {I<prefix>} {I<uri>}
Binds a namespace prefix to a URI. QNames using this prefix may be used in
subsequent XPath expressions within the script.
Note that the conventional prefixes C<xml>, C<xsl>, C<saxon>, C<xs>, C<xsi>,
C<fn>, C<math>, C<map>, and C<array> are already bound to the conventional
namespace URIs.
In path expressions, unprefixed names match by local name only, so it is
usually possible to select elements without binding a namespace prefix.
=item B<paths>
Outputs a list of the distinct element paths that exist in the current source
document. They are output in order of first appearance, when scanning the tree
in document order.
This command is useful to get an overview of the structure of the source
document, especially if it is too large to display comfortably.
Example
For example, given the F<books.xml> data file included with the Saxon resources
download, the output would be:
Found 16 items
/BOOKLIST
/BOOKLIST/BOOKS
/BOOKLIST/BOOKS/ITEM
/BOOKLIST/BOOKS/ITEM/TITLE
/BOOKLIST/BOOKS/ITEM/AUTHOR
/BOOKLIST/BOOKS/ITEM/PUBLISHER
/BOOKLIST/BOOKS/ITEM/PUB-DATE
/BOOKLIST/BOOKS/ITEM/LANGUAGE
/BOOKLIST/BOOKS/ITEM/PRICE
/BOOKLIST/BOOKS/ITEM/QUANTITY
/BOOKLIST/BOOKS/ITEM/ISBN
/BOOKLIST/BOOKS/ITEM/PAGES
/BOOKLIST/BOOKS/ITEM/DIMENSIONS
/BOOKLIST/BOOKS/ITEM/WEIGHT
/BOOKLIST/CATEGORIES
/BOOKLIST/CATEGORIES/CATEGORY
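The idea behind this listing (each distinct element path reported once, in order of first appearance) can be sketched in a few lines of Python. The sample document below is a hypothetical cut-down stand-in, not the real F<books.xml>, and the function name C<distinct_paths> is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical cut-down sample, not the real books.xml from the Saxon resources.
SOURCE = (
    "<BOOKLIST>"
    "<BOOKS>"
    "<ITEM><TITLE/><AUTHOR/></ITEM>"
    "<ITEM><TITLE/><PRICE/></ITEM>"
    "</BOOKS>"
    "<CATEGORIES><CATEGORY/></CATEGORIES>"
    "</BOOKLIST>"
)

def distinct_paths(root):
    """Return each distinct element path once, in order of first appearance."""
    seen = {}  # dict keys keep insertion order (Python 3.7+)

    def walk(element, parent_path):
        path = parent_path + "/" + element.tag
        seen.setdefault(path, True)
        for child in element:  # children are visited in document order
            walk(child, path)

    walk(root, "")
    return list(seen)

paths = distinct_paths(ET.fromstring(SOURCE))
print("Found", len(paths), "items")
for path in paths:
    print(path)
```

The second C<ITEM> contributes only C</BOOKLIST/BOOKS/ITEM/PRICE>, because its other paths have already been seen.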
=item B<precede> I<expression> B<with> I<query>
Before each node I<N> selected by the expression, the query is evaluated (with
I<N> as the context item), and its result is inserted as an immediate preceding
sibling of I<N>. Both the expression and the query must return nodes that are
capable of having siblings: that is, element, text, comment, or processing
instruction nodes. But if the query returns an atomic value, it is treated as a
text node with the same string value.
The result of the operation can be inspected using the command B<show>.
Examples
The command:
precede //img with <caption/>
inserts an empty C<E<lt>caption/E<gt>> element before every C<img> element in
the document.
Given a source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Munich" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
precede //city[not(@country=preceding-sibling::*[1]/@country)] with <country name="{@country}"/>
produces the document:
<cities>
<country name="DE"/>
<city name="Berlin" country="DE"/>
<city name="Munich" country="DE"/>
<country name="FR"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<country name="IT"/>
<city name="Rome" country="IT"/>
</cities>
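The grouping example above depends on the selection being made against the document as it stood before any insertion. A minimal Python sketch of that two-phase behaviour, using the standard C<xml.etree.ElementTree> module (an emulation of the result, not Gizmo's implementation), might look like this:

```python
import xml.etree.ElementTree as ET

SOURCE = """<cities>
  <city name="Berlin" country="DE"/>
  <city name="Munich" country="DE"/>
  <city name="Paris" country="FR"/>
  <city name="Lyon" country="FR"/>
  <city name="Rome" country="IT"/>
</cities>"""

root = ET.fromstring(SOURCE)
cities = list(root)

# Phase 1, like //city[not(@country=preceding-sibling::*[1]/@country)]:
# select each city whose country differs from that of the sibling just before it.
selected = [
    city for i, city in enumerate(cities)
    if i == 0 or cities[i - 1].get('country') != city.get('country')
]

# Phase 2, like: with <country name="{@country}"/>
for city in selected:
    marker = ET.Element('country', {'name': city.get('country')})
    marker.tail = city.tail                    # reuse the whitespace layout
    root.insert(list(root).index(city), marker)  # immediately before the city

result = [(e.tag, e.get('name')) for e in root]
print(result)
```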
=item B<prefix> I<expression> B<with> I<query>
For every node I<N> selected by the expression, the query is evaluated (with
I<N> as the context item), and its result is inserted as the first child of
I<N>. The expression must return nodes that are capable of having children
(that is, document or element nodes). The query must return nodes that are
capable of having siblings (that is, element, text, comment, or processing
instruction nodes). But if the query returns an atomic value, it is treated as
a text node with the same string value.
As a special case, if the query returns an attribute node, it is added as an
attribute of the selected element, replacing any existing attribute with the
same name.
The result of the operation can be inspected using the command B<show>.
Examples
The command:
prefix //section with <head>{data(@title)}</head>
copies the value of the title attribute of every C<section> element into a new
C<head> element appearing as the first child of the C<section>.
Given a source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Munich" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
prefix //city with <country name="{@country}"/>
produces the document:
<cities>
<city name="Berlin" country="DE">
<country name="DE"/>
</city>
<city name="Munich" country="DE">
<country name="DE"/>
</city>
<city name="Paris" country="FR">
<country name="FR"/>
</city>
<city name="Lyon" country="FR">
<country name="FR"/>
</city>
<city name="Rome" country="IT">
<country name="IT"/>
</city>
</cities>
With the same document, the command:
prefix //city with attribute{'id'}{count(preceding-sibling::*)+1}
produces:
<cities>
<city name="Berlin" country="DE" id="1"/>
<city name="Munich" country="DE" id="2"/>
<city name="Paris" country="FR" id="3"/>
<city name="Lyon" country="FR" id="4"/>
<city name="Rome" country="IT" id="5"/>
</cities>
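The attribute special case can be mimicked in Python with the standard C<xml.etree.ElementTree> module. This is a sketch of the result only; it relies on the fact that, in this sample, C<count(preceding-sibling::*)+1> is simply the position of each C<city> among its parent's element children.

```python
import xml.etree.ElementTree as ET

SOURCE = """<cities>
  <city name="Berlin" country="DE"/>
  <city name="Munich" country="DE"/>
  <city name="Paris" country="FR"/>
  <city name="Lyon" country="FR"/>
  <city name="Rome" country="IT"/>
</cities>"""

root = ET.fromstring(SOURCE)

# Emulates: prefix //city with attribute{'id'}{count(preceding-sibling::*)+1}
for parent in root.iter():
    for position, child in enumerate(parent, start=1):
        if child.tag == 'city':
            # An attribute result is attached to the selected element,
            # replacing any existing attribute of the same name.
            child.set('id', str(position))

ids = [city.get('id') for city in root.iter('city')]
print(ids)
```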
=item B<quit> [B<now>]
Quits the application.
If “B<now>” is specified, the command exits immediately. Otherwise, if the
current document has been modified since it was read from filestore, the
command prompts the user asking whether the document should first be saved.
(See the B<save> command).
=item B<rename> I<expression> B<as> I<expression>
Nodes (elements or attributes) selected by the first I<expression> are renamed;
the new name is given by computing the expression in the second argument.
The second I<expression> is evaluated with the existing node as the context
item. If the result of the expression is a string, then this string is used as
the new name of the node (if it contains a prefix, this must first be declared
using the B<namespace> command). If the result of the expression is a QName,
then it defines the expanded name of the new element or attribute in its
entirety. New namespace declarations will be added to the output document if
required; existing namespace declarations are not removed, unless new bindings
are defined for existing prefixes.
Examples
rename //NOTE as "COMMENT"
renames all NOTE elements as COMMENT elements.
rename //@* as lower-case(name())
renames all attributes by converting the existing name to lower-case.
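Both renaming examples can be emulated in Python with the standard C<xml.etree.ElementTree> module. The sample document here is invented for illustration, and the sketch assumes that no attribute is in a namespace (ElementTree would otherwise lower-case the namespace URI as well).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SOURCE = '<DOC><NOTE Lang="en">a</NOTE><PARA ID="p1">b</PARA></DOC>'

root = ET.fromstring(SOURCE)

# Emulates: rename //NOTE as "COMMENT"
for note in list(root.iter('NOTE')):
    note.tag = 'COMMENT'

# Emulates: rename //@* as lower-case(name())
for element in root.iter():
    renamed = {name.lower(): value for name, value in element.attrib.items()}
    element.attrib.clear()
    element.attrib.update(renamed)

print(ET.tostring(root, encoding='unicode'))
```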
=item B<replace> I<expression> B<with> I<query>
The nodes (elements or attributes) selected by the given I<expression> are
replaced by the result of evaluating the I<query>.
Note that it is the entire node that is replaced, not just its content. To
replace the content of an element or attribute node, rather than replacing the
entire node, use the command B<update>.
The result of the operation can be inspected using the command B<show>.
Examples
Given a source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Munich" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
replace //city[@country="DE"] with <stadt Name="{@name}"/>
produces the document:
<cities>
<stadt Name="Berlin"/>
<stadt Name="Munich"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
Given a source document:
<cities>
<city><name>Berlin</name><country>DE</country></city>
<city><name>Munich</name><country>DE</country></city>
<city><name>Paris</name><country>FR</country></city>
<city><name>Lyon</name><country>FR</country></city>
<city><name>Rome</name><country>IT</country></city>
</cities>
The command:
replace //city/name[.="Munich"]/text() with "München"
produces the document:
<cities>
<city><name>Berlin</name><country>DE</country></city>
<city><name>München</name><country>DE</country></city>
<city><name>Paris</name><country>FR</country></city>
<city><name>Lyon</name><country>FR</country></city>
<city><name>Rome</name><country>IT</country></city>
</cities>
With the same source document, the command:
replace //name with <cityName>{.}</cityName>
produces the document:
<cities>
<city><cityName><name>Berlin</name></cityName><country>DE</country></city>
<city><cityName><name>Munich</name></cityName><country>DE</country></city>
<city><cityName><name>Paris</name></cityName><country>FR</country></city>
<city><cityName><name>Lyon</name></cityName><country>FR</country></city>
<city><cityName><name>Rome</name></cityName><country>IT</country></city>
</cities>
Note that in this expression, C<{.}> follows the XQuery rules rather than the
XSLT rules: it copies the whole element, rather than extracting its string value.
=item B<save> I<filename> [I<output-param>=I<value>]…
Saves the current document to filestore, with the serialization parameters
specified.
If the file already exists, Gizmo asks “C<Overwrite existing file? (Y|N)>” and
proceeds only if the answer is “C<Y>” or “C<y>”.
If the I<filename> is relative, it is taken as being relative to the current
working directory.
The user's home directory can be referred to as C<~>.
Content completion is available: use the tab key to suggest possible names at
each level.
Examples
The command:
save updated-data.xml method=xml indent=yes
saves the document to filestore using the XML output method with indentation.
=item B<schema> I<filename>
Requires Saxon-EE
Loads an XSD schema from the specified location.
The I<filename> is handled in the same way as the B<load> command.
The schema definitions are available for use in B<validate> commands issued
subsequently in the session. The command is additive; the schema components are
added to the collection of schema components that are already loaded, which
means an error will be reported if the definitions conflict. It is not possible
to unload schema definitions once loaded, except by closing the session and
starting a new one.
=item B<set> I<name> = I<expression> | B<set . => I<expression>
The first form binds a variable to the value of the expression. The variable
name may be written as a simple NCName, or as a lexical QName, or as an EQName
in C<Q{>I<uri>C<}>I<local> format; it may also be preceded by a “C<$>” sign
(which is ignored).
The variable may be used in XPath expressions and queries appearing later in
the script.
The second form: B<set . => I<expression> may be used to set the current
document. In this case the expression must evaluate to a single document node.
For example:
set . = doc('books.xml')
achieves the same effect as:
load books.xml
Note that all updating commands (such as B<delete> and B<rename>) create a new copy
of the current document. Variables that were set before the updating command
continue to reference the document in the state it was in before the update.
To display the value of a variable, use the B<show> or B<list> commands.
=item B<show> [I<expression>]
Outputs a representation of the result of the I<expression>.
If a node is selected, it is shown as an XML serialization of the content of
the node.
If an atomic value is selected, its string value is displayed.
The expression may be omitted, and defaults to C</>, which displays the current
document.
The document is always shown with indentation enabled, so it is not necessarily
the same as the document that will be written to filestore by the B<save> command.
Examples
show
Displays the current document, with indentation.
show //TITLE
Displays selected elements, for example:
<TITLE>Pride and Prejudice</TITLE>
<TITLE>Sense and Sensibility</TITLE>
<TITLE>Emma</TITLE>
<TITLE>Northanger Abbey</TITLE>
show count(//TITLE), sort(//TITLE)
Displays the result of an arbitrary XPath expression.
=item B<strip>
The effect is the same as:
delete //text()[not(normalize-space())]
That is, all text nodes consisting entirely of whitespace are removed from the
document.
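As an illustration of what this removes, here is a sketch in Python's standard C<xml.etree.ElementTree> module, where text nodes appear as the C<text> and C<tail> properties of elements. The sample document is invented; the sketch shows the effect only and is not Gizmo's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: indentation between elements, plus one city whose
# text has leading and trailing spaces but is not whitespace-only.
SOURCE = "<cities>\n  <city>Berlin</city>\n  <city> Paris </city>\n</cities>"

XML_WHITESPACE = " \t\r\n"  # the characters that normalize-space() treats as whitespace

def is_blank(text):
    """True for a text node that consists entirely of XML whitespace."""
    return text is not None and text.strip(XML_WHITESPACE) == ""

root = ET.fromstring(SOURCE)

# Emulates: delete //text()[not(normalize-space())]
for element in root.iter():
    if is_blank(element.text):
        element.text = None
    if is_blank(element.tail):
        element.tail = None

stripped = ET.tostring(root, encoding="unicode")
print(stripped)
```

Text that contains any non-whitespace character, such as C< Paris >, is left untouched, including its surrounding spaces.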
=item B<suffix> I<expression> B<with> I<query>
For every node I<N> selected by the expression, the query is evaluated (with
I<N> as the context item), and its result is inserted as the last child of
I<N>. The expression must return nodes that are capable of having children
(that is, document or element nodes). The query must return nodes that are
capable of having siblings (that is, element, text, comment, or processing
instruction nodes). But if the query returns an atomic value, it is treated as
a text node with the same string value.
Example
The command:
suffix //page with <copyright>Copyright (c) Saxonica 2020</copyright>
injects a copyright element at the end of every page.
=item B<transform> I<filename>
Transforms the current document using the XSLT stylesheet contained in the
specified file.
The I<filename> is handled in the same way as the B<load> command.
On completion, the result of the transformation becomes the new current
document.
Note: it is not currently possible to specify parameters to the transformation.
=item B<undo>
Reverts the most recent changes.
The current document is returned to the state it was in before the most recent
B<copy>, B<delete>, B<follow>, B<load>, B<precede>, B<prefix>, B<rename>,
B<replace>, B<strip>, B<suffix>, B<transform>, B<update>, or B<validate>
command.
It will also undo the effect of:
set . = expression
that is, it reverts to the previous current document.
It is not possible to undo the effect of B<save> or B<schema> commands.
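One way to picture this behaviour, together with the rule that every updating command creates a new copy of the current document, is the toy model below. It is a sketch in Python under stated assumptions: the class C<Session>, its methods, and the variable name C<before> are all invented for illustration and say nothing about how Saxon implements the feature.

```python
import copy
import xml.etree.ElementTree as ET

class Session:
    """Toy model: a current document, an undo history, and named variables."""

    def __init__(self, document):
        self.current = document
        self.history = []
        self.variables = {}

    def update(self, change):
        """Apply a change to a fresh copy, keeping the old document for undo."""
        self.history.append(self.current)
        new_document = copy.deepcopy(self.current)
        change(new_document)
        self.current = new_document

    def undo(self):
        """Return to the document as it was before the most recent update."""
        self.current = self.history.pop()

def delete_rome(document):
    for city in document.findall("city[@name='Rome']"):
        document.remove(city)

session = Session(ET.fromstring('<cities><city name="Rome"/><city name="Paris"/></cities>'))
session.variables['before'] = session.current  # like: set before = /

session.update(delete_rome)
count_in_variable = len(session.variables['before'])  # the variable still sees the old state
count_after_delete = len(session.current)

session.undo()
count_after_undo = len(session.current)
print(count_in_variable, count_after_delete, count_after_undo)
```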
=item B<update> I<expression> B<with> I<query>
The content of the nodes selected by the given expression is replaced by the
result of evaluating the query:
=over
=item ·
When an element or document node is selected, the existing children are
deleted, and replaced with the result of the query.
=item ·
When the selected node is of a kind that cannot have children, for example
a comment, text node, or attribute, then its content is replaced with the
string value of the query result.
=back
Examples
Given the source document:
<cities>
<city name="Berlin" country="DE"/>
<city name="Munich" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="IT"/>
</cities>
The command:
update //@country[.='IT'] with "ITALIA"
produces:
<cities>
<city name="Berlin" country="DE"/>
<city name="Munich" country="DE"/>
<city name="Paris" country="FR"/>
<city name="Lyon" country="FR"/>
<city name="Rome" country="ITALIA"/>
</cities>
Given a source document:
<cities>
<city><name>Berlin</name><country>DE</country></city>
<city><name>Munich</name><country>DE</country></city>
<city><name>Paris</name><country>FR</country></city>
<city><name>Lyon</name><country>FR</country></city>
<city><name>Rome</name><country>IT</country></city>
</cities>
The command:
update //city/name[.="Munich"] with "München"
produces the document:
<cities>
<city><name>Berlin</name><country>DE</country></city>
<city><name>München</name><country>DE</country></city>
<city><name>Paris</name><country>FR</country></city>
<city><name>Lyon</name><country>FR</country></city>
<city><name>Rome</name><country>IT</country></city>
</cities>
With the same source document, the command:
update //country with lower-case(.)
produces:
<cities>
<city><name>Berlin</name><country>de</country></city>
<city><name>Munich</name><country>de</country></city>
<city><name>Paris</name><country>fr</country></city>
<city><name>Lyon</name><country>fr</country></city>
<city><name>Rome</name><country>it</country></city>
</cities>
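The two cases distinguished above (nodes that can have children, and nodes that cannot) can be sketched in Python with the standard C<xml.etree.ElementTree> module. The documents are shortened versions of the samples above; the element case replaces the content of each C<country> element with its lower-cased string value, which is the result shown in the last example. This is an emulation of the outcome, not Gizmo's own code.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two sample documents used in the examples above.
ATTRIBUTE_STYLE = ('<cities><city name="Paris" country="FR"/>'
                   '<city name="Rome" country="IT"/></cities>')
ELEMENT_STYLE = ('<cities><city><name>Paris</name><country>FR</country></city>'
                 '<city><name>Rome</name><country>IT</country></city></cities>')

# Attribute case, like: update //@country[.='IT'] with "ITALIA"
attr_doc = ET.fromstring(ATTRIBUTE_STYLE)
for city in attr_doc.iter('city'):
    if city.get('country') == 'IT':
        city.set('country', 'ITALIA')  # only the attribute's value changes

# Element case: the children are deleted and replaced with the new content.
elem_doc = ET.fromstring(ELEMENT_STYLE)
for country in list(elem_doc.iter('country')):
    value = ''.join(country.itertext()).lower()
    for child in list(country):
        country.remove(child)
    country.text = value

attr_values = [city.get('country') for city in attr_doc.iter('city')]
elem_values = [country.text for country in elem_doc.iter('country')]
print(attr_values)
print(elem_values)
```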
=item B<validate>
Requires Saxon-EE
Validates the current document using the XSD schema(s) previously loaded using
the B<schema> command, supplemented with any schema found via an
C<xsi:schemaLocation> or C<xsi:noNamespaceSchemaLocation> attribute in the
source document.
On completion, the validated result (complete with type annotations) becomes
the new current document. Any expressions and queries executed against a typed
document are treated as schema-aware.
If validation fails, the validation error messages are output, and the current
document remains unchanged.
=back
=head1 SEE ALSO
L<https://www.saxonica.com/documentation10/index.html>, L<saxon10(1)>,
L<saxon10q(1)>, L<xsltproc(1)>.
=head1 AUTHOR
Michael H. Kay E<lt>L<mike@saxonica.com>E<gt>