API Documentation

Document level API

xmlschema.validate(xml_document, schema=None, cls=None, path=None, schema_path=None, use_defaults=True, namespaces=None, locations=None, base_url=None, defuse='remote', timeout=300, lazy=False)

Validates an XML document against a schema. When the schema argument is not already a schema instance, this function builds one (an XMLSchema object by default) for validating the XML document. Raises an XMLSchemaValidationError if the XML document is not valid against the schema.

Parameters
  • xml_document – can be an XMLResource instance, a file-like object, a path to a file, a URI of a resource, an Element instance, an ElementTree instance or a string containing the XML data. If the passed argument is not an XMLResource instance, a new one is built from it using the defuse, timeout and lazy arguments.

  • schema – can be a schema instance, a file-like object, a file path, a URL of a resource or a string containing the schema.

  • cls – class to use for building the schema instance (XMLSchema by default).

  • path – is an optional XPath expression that matches the elements of the XML data that have to be decoded. If not provided the XML root element is used.

  • schema_path – an XPath expression to select the XSD element to use for decoding. If not provided, the path argument or the source root tag is used.

  • use_defaults – defines whether to use element and attribute defaults for filling missing required values.

  • namespaces – is an optional mapping from namespace prefix to URI.

  • locations – additional schema location hints, in case a schema instance has to be built.

  • base_url – is an optional custom base URL for remapping relative locations. By default the directory where the XSD or, alternatively, the XML document is located is used.

  • defuse – optional argument passed on when constructing the schema and XMLResource instances.

  • timeout – optional argument passed on when constructing the schema and XMLResource instances.

  • lazy – optional argument passed on when constructing the XMLResource instance.

xmlschema.to_dict(xml_document, schema=None, cls=None, path=None, process_namespaces=True, locations=None, base_url=None, defuse='remote', timeout=300, lazy=False, **kwargs)

Decodes an XML document to a nested Python dictionary. The decoding is based on an XML Schema class instance. By default the document is validated during the decoding phase. Raises an XMLSchemaValidationError if the XML document is not valid against the schema.

Parameters
  • xml_document – can be an XMLResource instance, a file-like object, a path to a file, a URI of a resource, an Element instance, an ElementTree instance or a string containing the XML data. If the passed argument is not an XMLResource instance, a new one is built from it using the defuse, timeout and lazy arguments.

  • schema – can be a schema instance, a file-like object, a file path, a URL of a resource or a string containing the schema.

  • cls – class to use for building the schema instance (XMLSchema by default).

  • path – is an optional XPath expression that matches the elements of the XML data that have to be decoded. If not provided the XML root element is used.

  • process_namespaces – indicates whether to use namespace information in the decoding process.

  • locations – additional schema location hints, in case a schema instance has to be built.

  • base_url – is an optional custom base URL for remapping relative locations. By default the directory where the XSD or, alternatively, the XML document is located is used.

  • defuse – optional argument passed on when constructing the schema and XMLResource instances.

  • timeout – optional argument passed on when constructing the schema and XMLResource instances.

  • lazy – optional argument passed on when constructing the XMLResource instance.

  • kwargs – other optional arguments of XMLSchema.iter_decode(), passed as keyword arguments.

Returns

an object containing the decoded data. If the validation='lax' keyword argument is provided, the validation errors are collected and returned in a tuple together with the decoded data.

Raises

XMLSchemaValidationError if the object is not decodable by the XSD component, or if it is invalid when validation='strict' is provided.

xmlschema.to_json(xml_document, fp=None, schema=None, cls=None, path=None, converter=None, process_namespaces=True, locations=None, base_url=None, defuse='remote', timeout=300, lazy=False, json_options=None, **kwargs)

Serializes an XML document to JSON. By default the XML data is validated during the decoding phase. Raises an XMLSchemaValidationError if the XML document is not valid against the schema.

Parameters
  • xml_document – can be an XMLResource instance, a file-like object, a path to a file, a URI of a resource, an Element instance, an ElementTree instance or a string containing the XML data. If the passed argument is not an XMLResource instance, a new one is built from it using the defuse, timeout and lazy arguments.

  • fp – can be a write() supporting file-like object.

  • schema – can be a schema instance, a file-like object, a file path, a URL of a resource or a string containing the schema.

  • cls – schema class to use for building the instance (XMLSchema by default).

  • path – is an optional XPath expression that matches the elements of the XML data that have to be decoded. If not provided the XML root element is used.

  • converter – an XMLSchemaConverter subclass or instance to use for the decoding.

  • process_namespaces – indicates whether to use namespace information in the decoding process.

  • locations – additional schema location hints, in case a schema instance has to be built.

  • base_url – is an optional custom base URL for remapping relative locations. By default the directory where the XSD or, alternatively, the XML document is located is used.

  • defuse – optional argument passed on when constructing the schema and XMLResource instances.

  • timeout – optional argument passed on when constructing the schema and XMLResource instances.

  • lazy – optional argument passed on when constructing the XMLResource instance.

  • json_options – a dictionary with options for the JSON serializer.

  • kwargs – optional arguments of XMLSchema.iter_decode(), passed as keyword arguments to vary the decoding process.

Returns

a string containing the JSON data if fp is None, otherwise nothing is returned. If the validation='lax' keyword argument is provided, the validation errors are collected and returned, possibly in a tuple together with the JSON data.

Raises

XMLSchemaValidationError if the object is not decodable by the XSD component, or if it is invalid when validation='strict' is provided.

xmlschema.from_json(source, schema, path=None, converter=None, json_options=None, **kwargs)

Deserializes JSON data to an XML Element.

Parameters
  • source – can be a string or a read() supporting file-like object containing the JSON document.

  • schema – an XMLSchema instance.

  • path – is an optional XPath expression for selecting the element of the schema that matches the data that has to be encoded. By default the first global element of the schema is used.

  • converter – an XMLSchemaConverter subclass or instance to use for the encoding.

  • json_options – a dictionary with options for the JSON deserializer.

  • kwargs – Keyword arguments containing options for converter and encoding.

Returns

An ElementTree Element instance. If the validation='lax' keyword argument is provided, the validation errors are collected and returned in a tuple together with the Element instance.

Raises

XMLSchemaValidationError if the object is not encodable by the schema, or if it is invalid when validation='strict' is provided.

Schema level API

class xmlschema.XMLSchema10

The class for XSD v1.0 schema instances. It’s generated by the meta-class XMLSchemaMeta and exposes the same API as XMLSchemaBase.

xmlschema.XMLSchema

The default class for schema instances.

alias of xmlschema.validators.schema.XMLSchema10

class xmlschema.XMLSchemaBase(source, namespace=None, validation='strict', global_maps=None, converter=None, locations=None, base_url=None, defuse='remote', timeout=300, build=True, use_meta=True, loglevel=None)

Base class for an XML Schema instance.

Parameters
  • source (Element or ElementTree or str or file-like object) – a URI that references a resource, a file path, a file-like object, a string containing the schema, an Element or an ElementTree document.

  • namespace (str or None) – is an optional argument that contains the URI of the namespace. When specified it must be equal to the targetNamespace declared in the schema.

  • validation (str) – defines the XSD validation mode to use for building the schema; its value can be ‘strict’, ‘lax’ or ‘skip’.

  • global_maps (XsdGlobals or None) – is an optional argument containing an XsdGlobals instance, a mediator object for sharing declaration data between dependent schema instances.

  • converter (XMLSchemaConverter or None) – is an optional argument that can be an XMLSchemaConverter subclass or instance, used for defining the default XML data converter for the schema instance.

  • locations (dict or list or None) – schema location hints, which can include additional namespaces to import after processing the schema’s import statements. Usually filled with the (namespace, url) pairs extracted from xsi:schemaLocation attributes. Can be a dictionary or a sequence of (namespace URI, resource URL) pairs.

  • base_url (str or None) – is an optional base URL, used for the normalization of relative paths when the URL of the schema resource can’t be obtained from the source argument.

  • defuse (str or None) – defines when to defuse XML data. Can be ‘always’, ‘remote’ or ‘never’. By default only remote XML data is defused.

  • timeout (int) – the timeout in seconds for fetching resources. Default is 300.

  • build (bool) – defines whether to build the schema maps. Default is True.

  • use_meta (bool) – if True the schema processor uses the package meta-schema, otherwise the meta-schema is added at the end. In the latter case the meta-schema is rebuilt if any base namespace has been overridden by an import. Ignored if the argument global_maps is provided.

  • loglevel (int) – sets a different logging level for schema initialization and building. The default is WARNING (30); set it to 20 for the INFO level or to 10 for the DEBUG level. The default loglevel is restored after schema building, when exiting the initialization method.

Variables
  • XSD_VERSION (str) – stores the XSD version (1.0 or 1.1).

  • BUILDERS (namedtuple) – a namedtuple with attributes related to schema component classes. Used for building local components within parsing methods.

  • BUILDERS_MAP (dict) – a dictionary that maps from tag to class for XSD global components. Used for building global components within lookup functions.

  • BASE_SCHEMAS (dict) – a dictionary from namespace to schema resource for meta-schema bases.

  • FALLBACK_LOCATIONS (dict) – fallback schema location hints for other standard namespaces.

  • meta_schema (XMLSchema) – the XSD meta-schema instance.

  • attribute_form_default (str) – the schema’s attributeFormDefault attribute, defaults to ‘unqualified’.

  • element_form_default (str) – the schema’s elementFormDefault attribute, defaults to ‘unqualified’.

  • block_default (str) – the schema’s blockDefault attribute, defaults to ‘’.

  • final_default (str) – the schema’s finalDefault attribute, defaults to ‘’.

  • default_attributes (XsdAttributeGroup) – the XSD 1.1 schema’s defaultAttributes attribute, defaults to None.

  • target_namespace (str) – is the targetNamespace of the schema, the namespace to which the declarations/definitions of the schema belong. If it’s empty, no namespace is associated with the schema. In this case the schema declarations can be reused from other namespaces as chameleon definitions.

  • validation (str) – validation mode, can be ‘strict’, ‘lax’ or ‘skip’.

  • maps (XsdGlobals) – XSD global declarations/definitions maps. This is an instance of XsdGlobals that stores the global_maps argument, or a new object when this argument is not provided.

  • converter (XMLSchemaConverter) – the default converter used for XML data decoding/encoding.

  • locations (NamespaceResourcesMap) – schema location hints.

  • namespaces (dict) – a dictionary that maps the prefixes used by the schema to namespace URIs.

  • imports (dict) – a dictionary of namespace imports of the schema, that maps namespace URI to imported schema object, or None in case of unsuccessful import.

  • includes – a dictionary of included schemas, that maps a schema location to an included schema. It also includes schemas brought in by “xs:redefine” or “xs:override” statements.

  • warnings (list) – warning messages about failure of import and include elements.

  • notations (NamespaceView) – xsd:notation declarations.

  • types (NamespaceView) – xsd:simpleType and xsd:complexType global declarations.

  • attributes (NamespaceView) – xsd:attribute global declarations.

  • attribute_groups (NamespaceView) – xsd:attributeGroup definitions.

  • groups (NamespaceView) – xsd:group global definitions.

  • elements (NamespaceView) – xsd:element global declarations.

root

Root element of the schema.

get_text()

Gets the XSD text of the schema. If the source text is not available creates an encoded string representation of the XSD tree.

url

Schema resource URL. It is None if the schema is built from a string.

tag

Schema root tag. For compatibility with the ElementTree API.

id

The schema’s id attribute, defaults to None.

version

The schema’s version attribute, defaults to None.

schema_location

A list of location hints extracted from the xsi:schemaLocation attribute of the schema.

no_namespace_schema_location

A location hint extracted from the xsi:noNamespaceSchemaLocation attribute of the schema.

target_prefix

The prefix associated with the targetNamespace.

default_namespace

The namespace associated with the empty prefix ‘’.

base_url

The base URL of the source of the schema.

builtin_types = <bound method XMLSchemaBase.builtin_types of <class 'xmlschema.validators.schema.XMLSchemaBase'>>

root_elements

The list of global elements that are not used by reference in any model of the schema. This is implemented as a lazy property because it’s computationally expensive to build when the schema model is complex.

classmethod create_meta_schema(source=None, base_schemas=None, global_maps=None)

Creates a new meta-schema instance.

Parameters
  • source – an optional argument referencing or containing the XSD meta-schema resource. Required if the schema class doesn’t already have a meta-schema.

  • base_schemas – an optional dictionary that contains namespace URIs and schema locations. If provided, it’s used as a substitute for the class’s BASE_SCHEMAS. A sequence of (namespace, location) items can also be provided if there are several schema documents for one or more namespaces.

  • global_maps – is an optional argument containing an XsdGlobals instance for the new meta-schema. If not provided, a new map is created.

classmethod create_schema(*args, **kwargs)

Creates a new schema instance of the same class as the caller.

create_any_content_group(parent, any_element=None)

Creates a model group related to the schema instance that accepts any content.

Parameters
  • parent – the parent component to set for the any content group.

  • any_element – an optional any element to use for the content group. When provided it’s copied, linked to the group and the minOccurs/maxOccurs are set to 0 and ‘unbounded’.

create_any_attribute_group(parent)

Creates an attribute group related to the schema instance that accepts any attribute.

Parameters

parent – the parent component to set for the any attribute group.

get_locations(namespace)

Get a list of location hints for a namespace.

include_schema(location, base_url=None)

Includes a schema for the same namespace, from a specific URL.

Parameters
  • location – is the URL of the schema.

  • base_url – is an optional base URL for fetching the schema resource.

Returns

the included XMLSchema instance.

import_schema(namespace, location, base_url=None, force=False)

Imports a schema for an external namespace, from a specific URL.

Parameters
  • namespace – is the URI of the external namespace.

  • location – is the URL of the schema.

  • base_url – is an optional base URL for fetching the schema resource.

  • force – if set to True, imports the schema even if the namespace is already imported.

Returns

the imported XMLSchema instance.

resolve_qname(qname, namespace_imported=True)

QName resolution for a schema instance.

Parameters
  • qname – a string in xs:QName format.

  • namespace_imported – if this argument is True, an XMLSchemaNamespaceError is raised when the namespace of the QName is not the targetNamespace and the namespace is not imported by the schema.

Returns

an expanded QName in the format “{namespace-URI}local-name”.

Raises

XMLSchemaValueError if an invalid xs:QName is found, XMLSchemaKeyError if the namespace prefix is not declared in the schema instance.

iter_globals(schema=None)

Creates an iterator for XSD global definitions/declarations related to the schema namespace.

Parameters

schema – Optional argument for filtering only globals related to a schema instance.

iter_components(xsd_classes=None)

Creates an iterator for traversing all XSD components of the validator.

Parameters

xsd_classes – returns only a specific class/classes of components, otherwise returns all components.

classmethod check_schema(schema, namespaces=None)

Validates the given schema against the XSD meta-schema (meta_schema).

Parameters
  • schema – the schema instance that has to be validated.

  • namespaces – is an optional mapping from namespace prefix to URI.

Raises

XMLSchemaValidationError if the schema is invalid.

build()

Builds the schema’s XSD global maps.

built

Property that is True if the XSD validator has been fully parsed and built, False otherwise. For schemas the property is checked on all global components. For XSD components only the building of local subcomponents is checked.

validation_attempted

Property that returns the validation status of the XSD validator. It can be ‘full’, ‘partial’ or ‘none’.

validity

Property that returns the XSD validator’s validity. It can be ‘valid’, ‘invalid’ or ‘notKnown’.

all_errors

A list with all the building errors of the XSD validator and its components.

get_converter(converter=None, namespaces=None, **kwargs)

Returns a new converter instance.

Parameters
  • converter – can be a converter class or instance. If it’s an instance the new instance is copied from it and configured with the provided arguments.

  • namespaces – is an optional mapping from namespace prefix to URI.

  • kwargs – optional arguments for initialize the converter instance.

Returns

a converter instance.

validate(source, path=None, schema_path=None, use_defaults=True, namespaces=None)

Validates XML data against the XSD schema/component instance.

Raises

XMLSchemaValidationError if the XML data instance is not valid.

is_valid(source, path=None, schema_path=None, use_defaults=True, namespaces=None)

Like validate(), except that it does not raise an exception but returns True if the XML document is valid, False if it’s invalid.

iter_errors(source, path=None, schema_path=None, use_defaults=True, namespaces=None)

Creates an iterator for the errors generated by the validation of XML data against the XSD schema/component instance.

Parameters
  • source – the source of XML data. Can be an XMLResource instance, a path to a file, a URI of a resource, an opened file-like object, an Element instance, an ElementTree instance or a string containing the XML data.

  • path – is an optional XPath expression that matches the elements of the XML data that have to be decoded. If not provided the XML root element is selected.

  • schema_path – an alternative XPath expression to select the XSD element to use for decoding. Useful if the root of the XML data doesn’t match an XSD global element of the schema.

  • use_defaults – Use schema’s default values for filling missing data.

  • namespaces – is an optional mapping from namespace prefix to URI.

decode(source, path=None, schema_path=None, validation='strict', *args, **kwargs)

Decodes XML data. Takes the same arguments as the method XMLSchema.iter_decode().

iter_decode(source, path=None, schema_path=None, validation='lax', process_namespaces=True, namespaces=None, use_defaults=True, decimal_type=None, datetime_types=False, converter=None, filler=None, fill_missing=False, **kwargs)

Creates an iterator for decoding an XML source to a data structure.

Parameters
  • source – the source of XML data. Can be an XMLResource instance, a path to a file, a URI of a resource, an opened file-like object, an Element instance, an ElementTree instance or a string containing the XML data.

  • path – is an optional XPath expression that matches the elements of the XML data that have to be decoded. If not provided the XML root element is selected.

  • schema_path – an alternative XPath expression to select the XSD element to use for decoding. Useful if the root of the XML data doesn’t match an XSD global element of the schema.

  • validation – defines the XSD validation mode to use for decoding, can be ‘strict’, ‘lax’ or ‘skip’.

  • process_namespaces – indicates whether to use namespace information in the decoding process, using the map provided with the argument namespaces and the map extracted from the XML document.

  • namespaces – is an optional mapping from namespace prefix to URI.

  • use_defaults – indicates whether to use default values for filling missing data.

  • decimal_type – conversion type for Decimal objects (generated by XSD decimal built-in and derived types), useful if you want to generate a JSON-compatible data structure.

  • datetime_types – if set to True the datetime and duration XSD types are decoded, otherwise their original XML string is returned.

  • converter – an XMLSchemaConverter subclass or instance to use for the decoding.

  • filler – an optional callback function to fill undecodable data with a typed value. The callback function must accept one positional argument, which can be an XSD Element or an attribute declaration. If not provided, undecodable data is replaced by None.

  • fill_missing – if set to True the decoder also fills missing attributes. The filling value is None or a typed value if the filler callback is provided.

  • kwargs – keyword arguments with other options for converter and decoder.

Returns

yields a decoded data object, possibly preceded by a sequence of validation or decoding errors.

encode(obj, path=None, validation='strict', *args, **kwargs)

Encodes to XML data. Takes the same arguments as the method XMLSchema.iter_encode().

Returns

An ElementTree Element, or a list containing a sequence of ElementTree elements if the argument path matches multiple XML data chunks. If the validation argument is ‘lax’, a 2-item tuple is returned, where the first item is the encoded object and the second item is a list containing the errors.

iter_encode(obj, path=None, validation='lax', namespaces=None, converter=None, unordered=False, **kwargs)

Creates an iterator for encoding a data structure to an ElementTree’s Element.

Parameters
  • obj – the data that has to be encoded to XML data.

  • path – is an optional XPath expression for selecting the element of the schema that matches the data that has to be encoded. By default the first global element of the schema is used.

  • validation – the XSD validation mode. Can be ‘strict’, ‘lax’ or ‘skip’.

  • namespaces – is an optional mapping from namespace prefix to URI.

  • converter – an XMLSchemaConverter subclass or instance to use for the encoding.

  • unordered – a flag for explicitly activating unordered encoding mode for content model data. This mode uses content models for a reordered-by-model iteration of the child elements.

  • kwargs – Keyword arguments containing options for converter and encoding.

Returns

yields one or more Element instances, or validation/encoding errors.

ElementTree and XPath API

class xmlschema.ElementPathMixin

Mixin abstract class for enabling ElementTree and XPath API on XSD components.

Variables
  • text – The Element text. Its value is always None. For compatibility with the ElementTree API.

  • tail – The Element tail. Its value is always None. For compatibility with the ElementTree API.

tag

Alias of the name attribute. For compatibility with the ElementTree API.

attrib

Returns the Element attributes. For compatibility with the ElementTree API.

get(key, default=None)

Gets an Element attribute. For compatibility with the ElementTree API.

iter(tag=None)

Creates an iterator for the XSD element and its subelements. If tag is not None or ‘*’, only XSD elements whose name matches tag are returned from the iterator. Local elements are expanded without repetitions. Element references are not expanded because the global elements are not descendants of other elements.

iterchildren(tag=None)

Creates an iterator for the child elements of the XSD component. If tag is not None or ‘*’, only XSD elements whose name matches tag are returned from the iterator.

find(path, namespaces=None)

Finds the first XSD subelement matching the path.

Parameters
  • path – an XPath expression that considers the XSD component as the root element.

  • namespaces – an optional mapping from namespace prefix to namespace URI.

Returns

The first matching XSD subelement or None if there is no match.

findall(path, namespaces=None)

Finds all XSD subelements matching the path.

Parameters
  • path – an XPath expression that considers the XSD component as the root element.

  • namespaces – an optional mapping from namespace prefix to full name.

Returns

a list containing all matching XSD subelements in document order, an empty list is returned if there is no match.

iterfind(path, namespaces=None)

Creates an iterator for all XSD subelements matching the path.

Parameters
  • path – an XPath expression that considers the XSD component as the root element.

  • namespaces – is an optional mapping from namespace prefix to full name.

Returns

an iterable yielding all matching XSD subelements in document order.

XSD globals maps API

class xmlschema.XsdGlobals(validator, validation='strict')

Mediator class for related XML schema instances. It stores the global declarations defined in the registered schemas. Register a schema to add its declarations to the global maps.

Parameters
  • validator – the origin schema class/instance used for creating the global maps.

  • validation – the XSD validation mode to use, can be ‘strict’, ‘lax’ or ‘skip’.

build()

Builds the maps of XSD global definitions/declarations. The global maps are updated by adding and building the globals of registered schemas that are not yet built.

clear(remove_schemas=False, only_unbuilt=False)

Clears the instance maps and schemas.

Parameters
  • remove_schemas – also removes the schema instances.

  • only_unbuilt – removes only the objects/schemas that are not built.

copy(validator=None, validation=None)

Makes a copy of the object.

iter_globals()

Creates an iterator for XSD global definitions/declarations.

iter_schemas()

Creates an iterator for the schemas registered in the instance.

register(schema)

Registers an XMLSchema instance.

XML Schema converters

The base class XMLSchemaConverter is used for defining generic converters. The subclasses implement some of the most used conventions for converting XML to JSON data.

class xmlschema.converters.ElementData(tag, text, content, attributes)

Namedtuple for Element data interchange between decoders and converters. The field tag is a string containing the Element’s tag, text can be None or a string representing the Element’s text, content can be None, a list containing the Element’s children or a dictionary containing element name to list of element contents for the Element’s children (used for unordered input data), attributes can be None or a dictionary containing the Element’s attributes.

class xmlschema.XMLSchemaConverter(namespaces=None, dict_class=None, list_class=None, etree_element_class=None, text_key='$', attr_prefix='@', cdata_prefix=None, indent=4, strip_namespaces=False, preserve_root=False, force_dict=False, force_list=False, **kwargs)

Generic XML Schema based converter class. A converter is used to compose decoded XML data for an Element into a data structure and to build an Element from encoded data structure. There are two methods for interfacing the converter with the decoding/encoding process. The method element_decode accepts ElementData instance, containing the element parts, and returns a data structure. The method element_encode accepts a data structure and returns an ElementData that can be

Parameters
  • namespaces – map from namespace prefixes to URI.

  • dict_class – dictionary class to use for decoded data. Default is dict.

  • list_class – list class to use for decoded data. Default is list.

  • etree_element_class – the class that has to be used to create new XML elements, if not provided uses the ElementTree’s Element class.

  • text_key – is the key to apply to element’s decoded text data.

  • attr_prefix – controls the mapping of XML attributes, to the same name or with a prefix. If None the converter ignores attributes.

  • cdata_prefix – is used for including and prefixing the character data parts of a mixed content, that are labeled with an integer instead of a string. Character data parts are ignored if this argument is None.

  • indent – number of spaces for XML indentation (default is 4).

  • strip_namespaces – if set to True removes namespace declarations from data and namespace information from names, during decoding or encoding. Defaults to False.

  • preserve_root – if set to True the root element is preserved, wrapped into a single-item dictionary. Applicable only to default converter and to ParkerConverter.

  • force_dict – if set to True complex elements with simple content are decoded with a dictionary even if there are no decoded attributes. Applicable to default converter only. Defaults to False.

  • force_list – if set to True child elements are decoded within a list in any case. Applicable to default converter only. Defaults to False.

Variables
  • dict – dictionary class to use for decoded data.

  • list – list class to use for decoded data.

  • etree_element_class – Element class to use

  • text_key – key for decoded Element text

  • attr_prefix – prefix for attribute names

  • cdata_prefix – prefix for character data parts

  • indent – indentation to use for rebuilding XML trees

  • strip_namespaces – remove namespace information

  • preserve_root – preserve the root element on decoding

  • force_dict – force dictionary for complex elements with simple content

  • force_list – force list for child elements

lossless

The negation of the lossy property, preserved for backward compatibility.

losslessly

The XML data is decoded without loss of quality, neither on data nor on data model shape. Only converters that decode losslessly can always be used to encode to XML data that is strictly conformant to the schema.

copy(**kwargs)

map_attributes(attributes)

Creates an iterator for converting decoded attributes to a data structure with appropriate prefixes. If the instance has a non-empty map of namespaces, registers the mapped URIs and prefixes.

Parameters

attributes – A sequence or an iterator of pairs with the name of the attribute and the decoded value. Default is None (for simpleType elements, that don’t have attributes).

map_content(content)

A generator function for converting decoded content to a data structure. If the instance has a non-empty map of namespaces, registers the mapped URIs and prefixes.

Parameters

content – A sequence or an iterator of tuples with the name of the element, the decoded value and the XsdElement instance associated.

etree_element(tag, text=None, children=None, attrib=None, level=0)

Builds an ElementTree Element from the arguments, using the element class and the indent spacing stored in the converter instance.

Parameters
  • tag – the Element tag string.

  • text – the Element text.

  • children – the list of Element children/subelements.

  • attrib – a dictionary with Element attributes.

  • level – the level related to the encoding process (0 means the root).

Returns

an instance of the Element class set for the converter instance.
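
A rough sketch of how such a builder might use the level and indent values is shown below. The function build_element and its indent and element_class keyword arguments are hypothetical; in the converter those two values are stored on the instance rather than passed in.

```python
import xml.etree.ElementTree as ET


def build_element(tag, text=None, children=None, attrib=None,
                  level=0, indent=4, element_class=ET.Element):
    """Build an Element whose whitespace reflects its nesting level."""
    elem = element_class(tag, dict(attrib or {}))
    if children:
        elem.extend(children)
        # Without explicit text, start a new line indented one level deeper.
        elem.text = text or '\n' + ' ' * indent * (level + 1)
    else:
        elem.text = text
    # The tail places whatever follows this element on a new, indented line.
    elem.tail = '\n' + ' ' * indent * level
    return elem


item = build_element('item', text='pen', level=1)
order = build_element('order', children=[item], attrib={'id': '7'})
```

Only text and tail whitespace is set here; trimming the tail of the last child so that the closing tag lines up is left out of the sketch.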

element_decode(data, xsd_element, level=0)

Converts decoded element data to a data structure.

Parameters
  • data – ElementData instance decoded from an Element node.

  • xsd_element – the XsdElement associated with the decoded data.

  • level – the level related to the decoding process (0 means the root).

Returns

a data structure containing the decoded data.

element_encode(obj, xsd_element, level=0)

Extracts XML decoded data from a data structure for encoding into an ElementTree.

Parameters
  • obj – the decoded object.

  • xsd_element – the XsdElement associated with the decoded data structure.

  • level – the level related to the encoding process (0 means the root).

Returns

an ElementData instance.

class xmlschema.ParkerConverter(namespaces=None, dict_class=None, list_class=None, preserve_root=False, **kwargs)

XML Schema based converter class for Parker convention.

ref: http://wiki.open311.org/JSON_and_XML_Conversion/#the-parker-convention ref: https://developer.mozilla.org/en-US/docs/Archive/JXON#The_Parker_Convention

Parameters
  • namespaces – Map from namespace prefixes to URI.

  • dict_class – Dictionary class to use for decoded data. Default is dict for Python 3.6+ or OrderedDict for previous versions.

  • list_class – List class to use for decoded data. Default is list.

  • preserve_root – If True the root element will be preserved. By default the Parker convention removes the document root element, returning only the value.
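
To make the convention concrete, here is a small standard-library sketch of Parker-style decoding: attributes are discarded, repeated child tags become lists, and the root is dropped unless asked for. The functions parker and parker_decode are hypothetical names, and values remain strings because no schema typing is applied.

```python
import xml.etree.ElementTree as ET


def parker(elem):
    """Parker-style value of an element: text for leaves, a dict for parents."""
    if len(elem) == 0:
        return (elem.text or '').strip() or None
    result = {}
    for child in elem:
        value = parker(child)
        if child.tag in result:
            # Repeated tags are gathered into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def parker_decode(root, preserve_root=False):
    """Drop the root element unless preserve_root is requested."""
    data = parker(root)
    return {root.tag: data} if preserve_root else data


doc = ET.fromstring('<root id="1"><a>1</a><b>x</b><b>y</b></root>')
without_root = parker_decode(doc)
with_root = parker_decode(doc, preserve_root=True)
```

Note that the id attribute does not appear in either result, which is why this convention is lossy.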

class xmlschema.BadgerFishConverter(namespaces=None, dict_class=None, list_class=None, **kwargs)

XML Schema based converter class for Badgerfish convention.

ref: http://www.sklar.com/badgerfish/ ref: http://badgerfish.ning.com/

Parameters
  • namespaces – Map from namespace prefixes to URI.

  • dict_class – Dictionary class to use for decoded data. Default is dict for Python 3.6+ or OrderedDict for previous versions.

  • list_class – List class to use for decoded data. Default is list.
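
For comparison with the other conventions, the following sketch shows the BadgerFish shape using only the standard library: the root is always kept, attributes get an '@' prefix and text goes under '$'. The function badgerfish is a hypothetical helper, namespace handling ('@xmlns') is omitted, and values are untyped strings.

```python
import xml.etree.ElementTree as ET


def badgerfish(elem):
    """Return {tag: content} where content holds '@attr', '$' and child keys."""
    content = {'@' + name: value for name, value in elem.attrib.items()}
    text = (elem.text or '').strip()
    if text:
        content['$'] = text
    for child in elem:
        value = badgerfish(child)[child.tag]
        if child.tag in content:
            # Repeated tags are gathered into a list.
            if not isinstance(content[child.tag], list):
                content[child.tag] = [content[child.tag]]
            content[child.tag].append(value)
        else:
            content[child.tag] = value
    return {elem.tag: content}


alice = ET.fromstring('<alice charlie="david"><bob>edgar</bob><bob>frank</bob></alice>')
decoded = badgerfish(alice)
```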

class xmlschema.AbderaConverter(namespaces=None, dict_class=None, list_class=None, **kwargs)

XML Schema based converter class for Abdera convention.

ref: http://wiki.open311.org/JSON_and_XML_Conversion/#the-abdera-convention ref: https://cwiki.apache.org/confluence/display/ABDERA/JSON+Serialization

Parameters
  • namespaces – Map from namespace prefixes to URI.

  • dict_class – Dictionary class to use for decoded data. Default is dict for Python 3.6+ or OrderedDict for previous versions.

  • list_class – List class to use for decoded data. Default is list.

class xmlschema.JsonMLConverter(namespaces=None, dict_class=None, list_class=None, **kwargs)

XML Schema based converter class for JsonML (JSON Mark-up Language) convention.

ref: http://www.jsonml.org/ ref: https://www.ibm.com/developerworks/library/x-jsonml/

Parameters
  • namespaces – Map from namespace prefixes to URI.

  • dict_class – Dictionary class to use for decoded data. Default is dict for Python 3.6+ or OrderedDict for previous versions.

  • list_class – List class to use for decoded data. Default is list.
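
JsonML represents each element as a list of the form [tag, {attributes}, child, ...]. The sketch below illustrates that shape with the standard library; jsonml is a hypothetical helper, not the converter class, and text is kept as untyped strings.

```python
import xml.etree.ElementTree as ET


def jsonml(elem):
    """Return [tag, attributes (if any), text and children in document order]."""
    node = [elem.tag]
    if elem.attrib:
        node.append(dict(elem.attrib))
    text = (elem.text or '').strip()
    if text:
        node.append(text)
    for child in elem:
        node.append(jsonml(child))
        # Text following a child belongs to the parent's content.
        tail = (child.tail or '').strip()
        if tail:
            node.append(tail)
    return node


ul = ET.fromstring('<ul class="x"><li>one</li><li>two</li></ul>')
as_jsonml = jsonml(ul)
```

Because order and mixed content are kept as list items, this shape can represent documents that dictionary-based conventions flatten.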

Resource access API

class xmlschema.XMLResource(source, base_url=None, defuse='remote', timeout=300, lazy=True)

XML resource reader based on ElementTree and urllib.

Parameters
  • source – a string containing the XML document, a file path, a URL, a file-like object, an ElementTree or an Element.

  • base_url – is an optional base URL, used for the normalization of relative paths when the URL of the resource can’t be obtained from the source argument.

  • defuse – sets when SafeXMLParser is used for XML data. Can be ‘always’, ‘remote’ or ‘never’. Default is ‘remote’, which uses defusedxml only when loading remote data.

  • timeout – the timeout in seconds for the connection attempt in case of remote data.

  • lazy – if set to False the source is fully loaded into memory and processed from there. Default is True, which means that only the root element of the source is loaded. This is ignored if source is an Element or an ElementTree.

root

The XML tree root Element.

document

The ElementTree document, None if the instance is lazy or is not created from another document or from a URL.

text

The XML text source, None if it’s not available.

url

The source URL, None if the instance is created from an Element tree or from a string.

base_url

The base URL for completing relative locations.

namespace

The namespace of the XML document.

copy(**kwargs)

Resource copy method. Change init parameters with keyword arguments.

tostring(indent='', max_lines=None, spaces_for_tab=4, xml_declaration=False)

Generates a string representation of the XML resource.

open()

Returns an opened resource reader object for the instance URL.

load()

Loads the XML text from the data source. If the data source is an Element the source XML text can’t be retrieved.

is_lazy()

Returns True if the XML resource is lazy.

is_loaded()

Returns True if the XML text of the data source is loaded.

iter(tag=None)

XML resource tree iterator.

iter_location_hints()

Yields schema location hints from the XML tree.
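
Schema location hints live in the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes. The sketch below shows one way to collect them with the standard library; location_hints is a hypothetical function and may differ from what the method yields in detail.

```python
import xml.etree.ElementTree as ET

XSI = 'http://www.w3.org/2001/XMLSchema-instance'


def location_hints(root):
    """Yield (namespace, location) pairs found on any element of the tree."""
    for elem in root.iter():
        # schemaLocation holds whitespace-separated namespace/location pairs.
        tokens = elem.get(f'{{{XSI}}}schemaLocation', '').split()
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            yield namespace, location
        # noNamespaceSchemaLocation applies to the empty namespace.
        for location in elem.get(f'{{{XSI}}}noNamespaceSchemaLocation', '').split():
            yield '', location


sample = ET.fromstring(
    '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://example.com/ns ns.xsd" '
    'xsi:noNamespaceSchemaLocation="base.xsd"/>'
)
hints = list(location_hints(sample))
```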

get_namespaces()

Extracts namespaces with related prefixes from the XML resource. If a duplicate prefix declaration is encountered, the namespace is added using a different prefix, but only if the namespace URI is not already mapped by another prefix.

Returns

A dictionary for mapping namespace prefixes to full URI.
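
The duplicate-prefix rule can be sketched with the 'start-ns' events of ElementTree.iterparse. This is only an illustration: collect_namespaces is a hypothetical function, and the scheme used to generate the alternative prefix (a numeric suffix) is an assumption.

```python
import io
import xml.etree.ElementTree as ET


def collect_namespaces(source):
    """Map prefixes to URIs, renaming a prefix that is re-declared with a new URI."""
    namespaces = {}
    for _event, (prefix, uri) in ET.iterparse(source, events=('start-ns',)):
        if prefix not in namespaces:
            namespaces[prefix] = uri
        elif namespaces[prefix] != uri and uri not in namespaces.values():
            # Same prefix, different and still unmapped URI: generate a new prefix.
            n = 0
            while f'{prefix or "default"}{n}' in namespaces:
                n += 1
            namespaces[f'{prefix or "default"}{n}'] = uri
    return namespaces


xml_data = (b'<a xmlns="urn:one" xmlns:p="urn:two">'
            b'<b xmlns:p="urn:three"/><c xmlns:p="urn:two"/></a>')
found = collect_namespaces(io.BytesIO(xml_data))
```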

get_locations(locations=None)

Returns a list of schema location hints. The locations are normalized using the base URL of the instance. The locations argument can be a dictionary or a list of namespace resources, that are inserted before the schema location hints extracted from the XML resource.

static defusing(source)

Defuses an XML source, raising an ElementTree.ParseError if the source contains entity definitions or remote entity loading.

Parameters

source – a filename or file object containing XML data.
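
The general technique behind this kind of check is to install expat handlers that refuse entity declarations and external entity references. The sketch below shows that technique with the standard library; defuse_bytes and is_safe are hypothetical helpers and take bytes rather than a filename or file object.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def defuse_bytes(xml_bytes):
    """Parse the data once, raising ParseError on any entity declaration."""
    def reject(*_args):
        raise ET.ParseError('entity declarations and external entities are forbidden')

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = reject
    parser.UnparsedEntityDeclHandler = reject
    parser.ExternalEntityRefHandler = reject
    parser.Parse(xml_bytes, True)


def is_safe(xml_bytes):
    """Return True when the data passes the check above."""
    try:
        defuse_bytes(xml_bytes)
    except ET.ParseError:
        return False
    return True


plain_doc = b'<a>1</a>'
entity_doc = b'<!DOCTYPE a [<!ENTITY x "y">]><a>&x;</a>'
```

Malformed XML still surfaces as expat.ExpatError in this sketch, because only the entity handlers raise ParseError.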

parse(source)

An equivalent of ElementTree.parse() that can protect from XML entity attacks. When protection is applied, the XML data is loaded and defused before building the ElementTree instance.

Parameters

source – a filename or file object containing XML data.

Returns

an ElementTree instance.

iterparse(source, events=None)

An equivalent of ElementTree.iterparse() that can protect from XML entity attacks. When protection is applied, the iterator yields pure-Python Element instances.

Parameters
  • source – a filename or file object containing XML data.

  • events – a list of events to report back. If omitted, only “end” events are reported.
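
Since the method mirrors ElementTree.iterparse(), the standard-library function is a reasonable way to see how the events argument behaves. The example below uses only xml.etree.ElementTree; count_tags is a hypothetical helper showing the usual pattern of clearing elements during incremental parsing.

```python
import io
import xml.etree.ElementTree as ET

data = b'<r><i/><i/><j/></r>'

# With events omitted only 'end' events are reported.
default_events = [(event, elem.tag) for event, elem in ET.iterparse(io.BytesIO(data))]

# Asking for 'start' as well reports each element twice.
both_events = [
    (event, elem.tag)
    for event, elem in ET.iterparse(io.BytesIO(b'<r><i/></r>'), events=('start', 'end'))
]


def count_tags(source, tag):
    """Count matching elements, clearing each one to keep memory use low."""
    count = 0
    for _event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag == tag:
            count += 1
            elem.clear()
    return count


total = count_tags(io.BytesIO(data), 'i')
```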

fromstring(text)

An equivalent of ElementTree.fromstring() that can protect from XML entity attacks.

Parameters

text – a string containing XML data.

Returns

the root Element instance.

xmlschema.fetch_resource(location, base_url=None, timeout=30)

Fetches a resource by trying to access it. If the resource is accessible returns the URL, otherwise raises an error (XMLSchemaURLError).

Parameters
  • location – a URL or a file path.

  • base_url – reference base URL for normalizing local and relative URLs.

  • timeout – the timeout in seconds for the connection attempt in case of remote data.

Returns

a normalized URL.

xmlschema.fetch_schema(source, locations=None, **resource_options)

Fetches the schema URL for the source’s root of an XML data source. If an accessible schema location is not found raises a ValueError.

Parameters
  • source – an Element or an ElementTree with XML data, a URL or a file-like object.

  • locations – A dictionary or dictionary items with schema location hints.

  • resource_options – keyword arguments for providing XMLResource class init options.

Returns

A URL referring to a reachable schema resource.

xmlschema.fetch_schema_locations(source, locations=None, **resource_options)

Fetches the schema URL for the source’s root of an XML data source and a list of location hints. If an accessible schema location is not found raises a ValueError.

Parameters
  • source – an Element or an Element Tree with XML data or an URL or a file-like object.

  • locations – a dictionary or dictionary items with Schema location hints.

  • resource_options – keyword arguments for providing XMLResource class init options.

Returns

A tuple with the URL referring to the first reachable schema resource and a list of dictionary items with normalized location hints.

xmlschema.load_xml_resource(source, element_only=True, **resource_options)

Loads an XML data source into an Element tree, returning the root Element, the XML text and a URL, if available. Usable for XML data files of small or medium size, such as XSD schemas.

Parameters
  • source – a URL, a filename path or a file-like object.

  • element_only – if True the function returns only the root Element of the tree.

  • resource_options – keyword arguments for providing XMLResource init options.

Returns

a tuple with three items (root Element, XML text and XML URL) or only the root Element if ‘element_only’ argument is True.

xmlschema.normalize_url(url, base_url=None, keep_relative=False)

Returns a normalized URL, joining it with a base URL. The URL scheme defaults to ‘file’ and backslashes are replaced with slashes. For file paths os.path.join is used instead of urljoin.

Parameters
  • url – a relative or absolute URL.

  • base_url – the reference base URL for constructing the normalized URL from the argument. For compatibility between “os.path.join” and “urljoin” a trailing ‘/’ is added to non-empty paths.

  • keep_relative – if set to True keeps relative file paths, which are not strictly conformant to the URL format specification.

Returns

A normalized URL.
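
The rules described above (backslash replacement, a default ‘file’ scheme, and a trailing slash on the base before joining) can be sketched as follows. This is a simplified, POSIX-oriented illustration: sketch_normalize is a hypothetical function, Windows drive letters and the keep_relative option are not handled, and it uses posixpath rather than os.path so the result does not depend on the platform.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def sketch_normalize(url, base_url=None):
    """Join url with base_url and return an absolute URL ('file' by default)."""
    url = url.replace('\\', '/')
    if urlsplit(url).scheme:
        return url  # already absolute, e.g. http://... or file://...
    if base_url:
        base_url = base_url.replace('\\', '/')
        if urlsplit(base_url).scheme:
            # A trailing slash makes urljoin treat the base as a directory.
            return urljoin(base_url.rstrip('/') + '/', url)
        url = posixpath.join(base_url, url)
    # Plain file path: make it absolute and give it the default scheme.
    return 'file://' + posixpath.abspath(url)


remote = sketch_normalize('schemas/a.xsd', 'http://example.com/base')
local = sketch_normalize('..\\a.xsd', '/data/xml')
```

A relative path given without a base is resolved against the current working directory in this sketch.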

Errors and exceptions

exception xmlschema.XMLSchemaException

The base exception that lets you catch all the errors generated by the library.

exception xmlschema.XMLSchemaRegexError

Raised when an error is found when parsing an XML Schema regular expression.

exception xmlschema.XMLSchemaValidatorError(validator, message, elem=None, source=None, namespaces=None)

Base class for XSD validator errors.

Parameters
  • validator (XsdValidator or function) – the XSD validator.

  • message (str or unicode) – the error message.

  • elem (Element) – the element that contains the error.

  • source (XMLResource) – the XML resource that contains the error.

  • namespaces (dict) – is an optional mapping from namespace prefix to URI.

Variables

path – the XPath of the element, calculated when the element is set or the XML resource is set.

exception xmlschema.XMLSchemaNotBuiltError(validator, message)

Raised on an attempt to use an XSD validator that has not been built.

Parameters
  • validator (XsdValidator) – the XSD validator.

  • message (str or unicode) – the error message.

exception xmlschema.XMLSchemaParseError(validator, message, elem=None)

Raised when an error is found during the building of an XSD validator.

Parameters
  • validator (XsdValidator or function) – the XSD validator.

  • message (str or unicode) – the error message.

  • elem (Element) – the element that contains the error.

exception xmlschema.XMLSchemaModelError(group, message)

Raised when a model error is found during the checking of a model group.

Parameters
  • group (XsdGroup) – the XSD model group.

  • message (str or unicode) – the error message.

exception xmlschema.XMLSchemaModelDepthError(group)

Raised when recursion depth is exceeded while iterating a model group.

exception xmlschema.XMLSchemaValidationError(validator, obj, reason=None, source=None, namespaces=None)

Raised when the XML data is not validated with the XSD component or schema. It’s used by decoding and encoding methods. Encoding validation errors do not include the XML data element and source, so the error is limited to a message containing the object representation and a reason.

Parameters
  • validator (XsdValidator or function) – the XSD validator.

  • obj (Element or tuple or str or list or int or float or bool) – the not validated XML data.

  • reason (str or unicode) – the detailed reason of failed validation.

  • source (XMLResource) – the XML resource that contains the error.

  • namespaces (dict) – is an optional mapping from namespace prefix to URI.

exception xmlschema.XMLSchemaDecodeError(validator, obj, decoder, reason=None, source=None, namespaces=None)

Raised when an XML data string is not decodable to a Python object.

Parameters
  • validator (XsdValidator or function) – the XSD validator.

  • obj (Element or tuple or str or list or int or float or bool) – the not validated XML data.

  • decoder (type or function) – the XML data decoder.

  • reason (str or unicode) – the detailed reason of failed validation.

  • source (XMLResource) – the XML resource that contains the error.

  • namespaces (dict) – is an optional mapping from namespace prefix to URI.

exception xmlschema.XMLSchemaEncodeError(validator, obj, encoder, reason=None, source=None, namespaces=None)

Raised when an object is not encodable to an XML data string.

Parameters
  • validator (XsdValidator or function) – the XSD validator.

  • obj (Element or tuple or str or list or int or float or bool) – the not validated XML data.

  • encoder (type or function) – the XML encoder.

  • reason (str or unicode) – the detailed reason of failed validation.

  • source (XMLResource) – the XML resource that contains the error.

  • namespaces (dict) – is an optional mapping from namespace prefix to URI.

exception xmlschema.XMLSchemaChildrenValidationError(validator, elem, index, particle, occurs=0, expected=None, source=None, namespaces=None)

Raised when a child element is not validated.

Parameters
  • validator (XsdValidator or function) – the XSD validator.

  • elem (Element or ElementData) – the not validated XML element.

  • index (int) – the child index.

  • particle (ParticleMixin) – the validator particle that generated the error. May be the validator itself.

  • occurs (int) – the particle occurrences.

  • expected (str or list or tuple) – the expected element tags/object names.

  • source (XMLResource) – the XML resource that contains the error.

  • namespaces (dict) – is an optional mapping from namespace prefix to URI.

exception xmlschema.XMLSchemaIncludeWarning

Issued when a schema include fails.

exception xmlschema.XMLSchemaImportWarning

Issued when a schema namespace import fails.