Requirements for the Java Implementation Plan for the XBRLAPI.ORG XBRL API
Data representation requirements
The following requirements directly affect the data representation choices for the Java implementation of the XBRL API:
- It MUST be possible to use multiple data representations for the information in an XBRL DTS without needing to alter the code implementing the XBRL API. Rather, the XBRL API MUST be implemented in terms of a series of operations applied to an abstraction of the underlying data. With this abstraction layer, each data representation that underpins the XBRL API implementation only needs to provide an implementation of the functions defined in the abstraction layer.
- The XBRL data underlying the XBRL API implementation MUST contain a superset of the information in the original XML documents from which it is formed. In particular, it MUST preserve enough information to reconstitute the original XML documents with full accuracy, including structures such as XML comments, processing instructions and whitespace. The notable exception to this requirement is that document type declarations do not need to be recoverable from the data store. See the DTD related bug for more details on this issue.
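The abstraction-layer requirement above could be sketched roughly as follows. This is only an illustration: the names FragmentStore, InMemoryFragmentStore and DtsSummary are hypothetical and are not taken from the XBRL API itself. The point is that API-level code depends only on the interface, so a different data representation can be swapped in without touching that code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical abstraction over the underlying data representation. */
interface FragmentStore {
    /** Stores the serialised XML of a fragment under the given identifier. */
    void put(String id, String xml);

    /** Returns the stored XML, or an empty Optional if the id is unknown. */
    Optional<String> get(String id);

    /** Returns the number of fragments currently stored. */
    int size();
}

/** One possible data representation: a simple in-memory map. */
final class InMemoryFragmentStore implements FragmentStore {
    private final Map<String, String> fragments = new LinkedHashMap<>();

    @Override
    public void put(String id, String xml) {
        fragments.put(id, xml);
    }

    @Override
    public Optional<String> get(String id) {
        return Optional.ofNullable(fragments.get(id));
    }

    @Override
    public int size() {
        return fragments.size();
    }
}

/** API-level code written purely against the abstraction. */
final class DtsSummary {
    private final FragmentStore store;

    DtsSummary(FragmentStore store) {
        this.store = store;
    }

    String describe() {
        return "DTS with " + store.size() + " fragment(s)";
    }
}
```

A store backed by an XML database would implement the same three methods, and DtsSummary would not need to change. Note that the stored string keeps the comment verbatim, in line with the requirement to preserve the original lexical content.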
Handling of XML Standards
- The XLink processing of the original XML documents MUST be done by an XLink processor that is separable from the rest of the code base. This is intended to simplify interchanging the reference XLink processor implemented as part of XBRLAPI with other commercial and non-commercial XLink processors, such as the one released by Fujitsu.
- The XML Base resolution will be performed by a generic XML Base resolver that fully conforms to the XML Base 1.0 specification.
- The XPointer resolution will be performed by a generic XPointer resolver that fully conforms to the XPointer Framework 1.0 specification and to the xmlns() and element() schemes, which are W3C Recommendations.
- XPointer resolution using the element scheme will allow for recognition of any id attributes or elements that are identified explicitly to the XPointer resolver or that are identified as being of ID type in a DTD declaration or an XML Schema declaration.
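As an illustration of what XML Base resolution involves, the sketch below computes the base URI of an element by applying each xml:base attribute found on the element and its ancestors, outermost first, starting from the document URI. It is a minimal sketch, not a fully conforming resolver: it assumes the xml:base values are already valid URI references, and the class name XmlBaseSketch and its methods are hypothetical.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper illustrating xml:base resolution over a DOM tree. */
final class XmlBaseSketch {
    static final String XML_NS = "http://www.w3.org/XML/1998/namespace";

    /** Parses a string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Computes the base URI in effect for the given element. */
    static URI baseOf(Element element, URI documentUri) {
        // Walk up from the element, collecting xml:base values.
        // The innermost value is pushed first, so the outermost ends up on top.
        Deque<String> bases = new ArrayDeque<>();
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            Element e = (Element) n;
            if (e.hasAttributeNS(XML_NS, "base")) {
                bases.push(e.getAttributeNS(XML_NS, "base"));
            }
        }
        // Apply the values from the outermost ancestor inwards.
        URI base = documentUri;
        while (!bases.isEmpty()) {
            base = base.resolve(bases.pop());
        }
        return base;
    }

    /** Resolves an href (for example an xlink:href value) found on the element. */
    static URI resolveHref(Element element, URI documentUri, String href) {
        return baseOf(element, documentUri).resolve(href);
    }
}
```

Any fragment identifier on the href is carried through unchanged, so the result can be handed to a separate XPointer resolver.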
Data Discovery
- The data discovery process MUST NOT be memory intensive.
- The parser MUST be namespace aware.
- The parser MUST allow for both DTD and XML Schema validation during the streaming process and it MUST expose the PSVI resulting from XML Schema validation.
- The parser MUST allow for use of a lexical handler to ensure preservation of lexical content in the discovered XML.
- The parser MUST allow for use of a declaration handler to enable detection of DTD-declared ID type attributes (to support XPointer resolution).
- The data discovery process MUST enable all XML documents that are discovered to be written to a local cache, making them available on an ongoing basis regardless of changes to network topology. A custom entity resolver for the parser MUST ensure that the parsing is done on the cached documents rather than the original documents.
- The caching mechanism MUST use the local filesystem to store the cached documents in a manner that allows intuitive human exploration of the cache for purposes other than usage of the Java XBRL API implementation.
- The caching of the original XML documents MUST be entirely separable from the process of decomposing the discovered documents into meta-data for storage in an XML database.
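The parser requirements above map fairly directly onto the standard SAX2 extension interfaces. The sketch below, offered only as an illustration, configures a namespace-aware streaming parser with a lexical handler (to capture comments) and a declaration handler (to detect DTD-declared ID attributes), and shows one possible way of mapping an original URI to a human-browsable cache location. The names DiscoveryHandler, DiscoverySketch and cacheFileFor are hypothetical, the cache layout (scheme/host/path) is an assumption, and cacheFileFor assumes a well-formed absolute http or https URI. Validation, PSVI access and the caching entity resolver are not shown.

```java
import java.io.StringReader;
import java.net.URI;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical handler recording what the streaming parse reports. */
final class DiscoveryHandler extends DefaultHandler2 {
    final List<String> comments = new ArrayList<>();
    final List<String> idAttributes = new ArrayList<>();
    final List<String> elements = new ArrayList<>();

    // LexicalHandler callback: keeps comments that a plain ContentHandler never sees.
    @Override
    public void comment(char[] ch, int start, int length) {
        comments.add(new String(ch, start, length));
    }

    // DeclHandler callback: records attributes declared with type ID in the DTD.
    @Override
    public void attributeDecl(String eName, String aName, String type, String mode, String value) {
        if ("ID".equals(type)) {
            idAttributes.add(eName + "/@" + aName);
        }
    }

    // ContentHandler callback: records each element as {namespace}localName.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements.add("{" + uri + "}" + localName);
    }
}

final class DiscoverySketch {
    /** Streams the document through a namespace-aware SAX parser. */
    static DiscoveryHandler discover(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            DiscoveryHandler handler = new DiscoveryHandler();
            reader.setContentHandler(handler);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Discovery failed", e);
        }
    }

    /** Maps an original URI to cacheRoot/scheme/host/path (assumed layout). */
    static Path cacheFileFor(URI original, Path cacheRoot) {
        URI uri = original.normalize();
        String relative = uri.getPath().replaceFirst("^/+", "");
        return cacheRoot.resolve(uri.getScheme()).resolve(uri.getHost()).resolve(relative);
    }
}
```

Because the handler only appends to small lists as events arrive, the document is never held in memory as a tree, which is in the spirit of the requirement that discovery not be memory intensive.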
Validation
- The documents loaded into the XBRLAPI data store MUST be validated against relevant XML Schema and DTS validation rules.
- It MUST be possible to expose the data in an XBRLAPI data store to a third-party XBRL validation tool to enable the validation functions of that software to be used.
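The XML Schema part of the validation requirement can be illustrated with the standard javax.xml.validation API. The sketch below covers XML Schema validity only; XBRL-specific rules would need an XBRL validator and are not attempted here. The class name SchemaValidationSketch and the choice to collect problems into a list of messages are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical helper that validates a document against one XML Schema. */
final class SchemaValidationSketch {
    /** Returns the validation problems found; an empty list means valid. */
    static List<String> validate(String xsd, String xml) {
        List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException e) {
                    // Recoverable validity errors are collected so that all are reported.
                    problems.add(e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    // Fatal errors (for example malformed XML) stop the validation.
                    throw e;
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            problems.add(e.getMessage());
        }
        return problems;
    }
}
```

For example, with a schema declaring an amount element of type xs:decimal, a document containing 12.50 would yield an empty list, while one containing the text twelve would yield at least one message.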