Data exchange standards and protocols

Our definition of data exchange systems is intended to cover indirect communications between partners, usually through some hub mechanism. We exclude any direct peer-to-peer communication and any data exchange that can be implemented with in-house systems.

DNS as a reference

A reasonable analogy to use as a reference is the access to websites through the Domain Name System (DNS). Web addresses use one of the original Uniform Resource Identifier (URI) schemes for the Internet, namely 'http', the scheme for the Hypertext Transfer Protocol (HTTP). The basic DNS model can be explained using an example of a website address:

http://www.cen.eu

The basic idea of a URI such as this is to provide a name that is understandable by humans. For communication over the Internet, however, the name needs to be converted into an Internet Protocol (IP) address, in either IPv4 or IPv6 format. The character string 'http' identifies the URI scheme, and the DNS provides the appropriate algorithms for resolving names within that scheme. For reference, the common term "URL" should denote a specific location on the Internet but, as this is increasingly not the case, the term URL is deprecated within the Internet Engineering Task Force.

The character string 'eu' follows the final dot (full stop or period) and identifies what is known as a country code Top Level Domain (ccTLD). Examples of generic Top Level Domains are 'com', 'org', 'net', and 'info'. A TLD or ccTLD has a sponsoring organisation and registry; in the case of .eu, this is an organisation called EURid.

EURid has a number of name servers that are used to resolve (convert) a URL into an IP address. The string 'cen' is at the next level of the search, and any enquiry is routed from the DNS resolver for the domain eu to the IP address or addresses that host the CEN website. Once on the CEN website, individual pages can be called up either from the menu or directly from the enquirer's web browser by using the forward solidus (/) as part of the syntax to resolve and return the specific page being requested.
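The order in which the parts of the example address are consulted can be sketched in a few lines of Python. This is only an illustration of the hierarchy described above, not a resolver: it performs no network lookup, and the function name resolution_order is an assumption made for this sketch.

```python
from urllib.parse import urlsplit


def resolution_order(uri: str) -> dict:
    """Split a web address into the parts discussed in the text.

    Illustrative helper only (the name is hypothetical); it parses the
    string and does not contact any name server.
    """
    parts = urlsplit(uri)
    labels = parts.hostname.split(".")  # e.g. ['www', 'cen', 'eu']
    return {
        "scheme": parts.scheme,                     # identifies the URI scheme
        "labels_top_down": list(reversed(labels)),  # TLD first, then each lower tier
        "path": parts.path or "/",                  # page requested on the host
    }


example = resolution_order("http://www.cen.eu")
print(example["scheme"])           # http
print(example["labels_top_down"])  # ['eu', 'cen', 'www']
print(example["path"])             # /
```

The reversed label list mirrors the search order in the text: the enquiry starts with the registry for 'eu', moves to 'cen', and only then reaches the host that serves the pages.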

Figure: A typical web search

The figure shows that the number of pages that can be cached is relatively small at the browser level, and increases at the computer level. We have used the 'back' and 'forward' symbols to indicate the browser and the 'history' icon to indicate the computer; neither is technically accurate, but both point to the difference in scale. If the ISP cannot resolve the web address, then it sends a request to the TLD to start a recursive search process – all of which takes a fraction of a second.

The reason for giving this elaborate explanation around web pages is to indicate that, so long as a hierarchical structure exists, it is theoretically possible to resolve any form of address. Other URI schemes include mailto, used for e-mail addresses; ftp, for file transfer; and urn, the Uniform Resource Name, which is discussed in greater detail below.

Uniform Resource Names

As mentioned above, the Uniform Resource Name (URN) is a sub-set of the Internet URI scheme. A number of sub-divisions of URNs have been registered for use on the Internet under what is known as a URN namespace with a namespace ID (NID). Some that are relevant to this report include (in order of original registration): oid, isbn, swift, nfc, epc, epcglobal. These are a few of the 40 formally registered URN namespaces. There is even a proposal for a URN for ISO, which is not yet on the register.

The basic requirement is for some namespace code to have a hierarchical structure so that the code can be passed through a recursive resolution system. The resolver algorithm takes the highest level component of the code (which, as shown with the DNS system described above, does not have to be numeric) and resolves this to a root resolver. Then, the resolution process continues to the next tier in the hierarchy, with each tier separated by some syntax character.
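A minimal sketch of this tier-by-tier resolution is given here, assuming an in-memory table in place of real resolvers. The hierarchy, the component names and the addresses (taken from the reserved documentation range 192.0.2.0/24) are all invented for illustration, as is the choice of ':' as the syntax character.

```python
# Toy hierarchy: each tier maps the next component either to a deeper tier
# (a dict) or to a final address (a str). All entries are invented.
TOY_HIERARCHY = {
    "example-ns": {
        "acme": {"widgets": "192.0.2.10"},
        "globex": "192.0.2.20",
    },
}


def resolve(code: str, separator: str = ":", tiers: dict = TOY_HIERARCHY):
    """Walk the hierarchy one component at a time.

    Returns (address, remaining_components). The remaining components are
    those the resolver did not need; they would be passed to the host.
    """
    node = tiers
    components = code.split(separator)
    for index, component in enumerate(components):
        if component not in node:
            raise LookupError(f"no resolver for {component!r} at tier {index}")
        node = node[component]
        if isinstance(node, str):  # reached an address: resolution stops here
            return node, components[index + 1:]
    raise LookupError("code ended before an address was reached")


address, remainder = resolve("example-ns:acme:widgets:0001")
print(address)    # 192.0.2.10
print(remainder)  # ['0001']
```

Note that the highest level component here is a text string rather than a number, and that the final component ('0001') is never consulted by the resolver.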

Just as with a web browser, functions for processing a URN through a URN-resolving application may be shared between:

  • the resolver (generally – but not always – with public access, as will be discussed below) to gain access to the IP address
  • a host system that provides the detailed information (that might be partly or completely restricted except to those with permission for access).

Additional context for RFID and the supply chain

Web sites and information held with other URI schemes are not always required to be updated dynamically and close to real time. New web pages are added on an occasional basis, and e-mail addresses – even for large corporations – change fairly slowly.

In contrast, there are some significant expectations of making RFID data available on the internet. Some information may be fairly persistent in its association with a product (e.g. name, country of manufacture, ingredients). Other features are likely to be associated with the serialisation of a product or other object in terms of its unique item identifier, so the time, location and custody of an item change as it moves through the supply chain.

In turn, this means that information about any individual object is unlikely to be centralised, but distributed in various data repositories associated with the chain of custody. A retailer will probably know which individual customer purchased a domestic appliance, and either the customer or the retailer might share this with the manufacturer for warranty purposes. The customer would certainly have no access rights to know of other customers who purchased the product, nor the difference between the ex-factory and retail point-of-sale price. The net effect of this is that some new principles will need to be part of every data exchange system for objects:

  • A partitioning of a code structure for resolution, so that an IP address is identified (typically) to the product level but not to the unique instance of a product. This type of structure is already available as the difference between a web site URL and the address of a particular web page.
  • The need for authentication and permission to access particular types of data, something that is already available on the web and invoked at the web site level.
  • An increased level of distribution of data associated with a particular unique item identifier. This requirement (discussed in more detail below) means that individual organisations will have the dual role of making enquiries for more information associated with an object, and of being the source of information for external enquiries. Although this type of information is available between pairs or small groups of trading partners, the prospect of developing something like the Internet of Things requires a greater level of data exchange at a significantly lower level of granularity.
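The first two principles can be illustrated with a short sketch: the identifier is partitioned so that only the product-level part is used for resolution, and the host then filters what it returns according to who is asking. Everything here is hypothetical, including the '.' separator, the ItemId type, the role names and the field names; the permission table simply mirrors the retailer, manufacturer and customer example given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemId:
    product_part: str  # used by the resolver to find an IP address
    serial_part: str   # only meaningful to the host system


def partition(uii: str, separator: str = ".") -> ItemId:
    """Split an identifier at its last separator (an assumed convention)."""
    product, _, serial = uii.rpartition(separator)
    if not product or not serial:
        raise ValueError(f"cannot partition {uii!r}")
    return ItemId(product, serial)


# Hypothetical permission table held by one host: role -> visible fields.
PERMISSIONS = {
    "manufacturer": {"name", "ingredients", "ex_factory_price"},
    "retailer": {"name", "ingredients", "retail_price", "purchaser"},
    "customer": {"name", "ingredients"},
}


def visible_fields(record: dict, role: str) -> dict:
    """Return only the fields the given role is permitted to see."""
    allowed = PERMISSIONS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


item = partition("0614141.000024.400")
record = {"name": "Kettle", "retail_price": 30, "purchaser": "C-17"}
print(item.product_part)                   # 0614141.000024
print(visible_fields(record, "customer"))  # {'name': 'Kettle'}
```

An unknown role receives an empty result, which reflects the default of denying access unless permission has been granted.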

Key relationships with other components

The figure below shows the relationship of data exchange from the point of processing object data and sensor data.

Figure: Data exchange: interrelationships

The data management processes create whatever internal operational databases are required to make the organisation function properly. Details of this are beyond the scope of this report. However, such resources are the prime source of what is held within the organisation on the local data repository to share with others who have access on the particular name server. This name server could be part of an EPC system, or an IATA information system for baggage handling, or an automotive system for tracing components from manufacture to after-sales repairs, or any other data exchange system.

This local data repository requires an interface standard to determine what generic business data needs to be placed in the local repository. The repository may then require additional data, obtained by initiating queries through an access process system, a locally accessible name search, or some interface to an external root name server.

The process can also be applied in reverse, where an external source can request data that is on the local data repository. This part of the system is significantly more elaborate and complex than that of a web master adding new pages onto a web site.

All the external communications are through a name server root resolver that has the IP addresses of other name server resolvers that can provide the next level IP addresses. The process is recursive, moving from one resolver to another until an IP address is found that can provide all the information being sought by the original query.

Standards

EPCglobal is reasonably advanced with its development of standards for data exchange:

  • The EPC Information System Standard fulfils the functions defined in the figure above as the Data Repository Capture Interface and Query Interface.
  • The Object Name Service Standard is an interface standard that is equivalent to the Name Server Interface.

Another data exchange system pre-dates EPC. This is the Handle system, which provides a global name service, particularly for media products. This is part of a larger system known as the Digital Object Identifier (DOI) system, which provides a digital identifier for any object of Intellectual Property. Recently, the DOI system began to be standardised as ISO 26324 Information and documentation – Digital object identifier system.

Both the EPCglobal ONS system and the DOI system make use of various standards and protocols defined by the Internet Engineering Task Force (IETF). Documents are published in the form of a Request for Comments (RFC) and, even when approved, they retain this naming convention. Two particular RFCs are relevant for accessing data over the internet:

  • RFC 3986 Uniform Resource Identifier (URI): Generic Syntax.
  • RFC 3403 Dynamic Delegation Discovery System (DDDS) – Part 3: The Domain Name System (DNS) database.

For the internet to be used to resolve a Unique Item Identifier (UII), the hierarchical structure needs to be converted into a domain name, usually by stripping off any serialised component. The conversion algorithm from UII to domain name is part of the specification of the particular URI scheme. Once the local resolver has created this domain name format, a DNS query is issued for NAPTR records for that domain. The processes defined in RFC 3403 return URLs that point to services that can provide the information.
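The conversion and the final rewriting step can be sketched as follows. The conversion is loosely modelled on the approach published for EPC identifiers (strip the serial, reverse the remaining fields, append a root domain); the root domain is passed as a parameter and its default is an assumption, as are the function names, the sample identifier and the sample NAPTR regexp field. The DNS query itself is omitted, and Python's re module is used as an approximation of the regular expressions specified in RFC 3403.

```python
import re


def uii_to_domain(urn: str, root: str = "onsepc.com") -> str:
    """Convert a urn:epc: identifier into a domain name for a NAPTR query."""
    prefix = "urn:epc:"
    if not urn.startswith(prefix):
        raise ValueError("expected a urn:epc: identifier")
    *scheme_fields, value = urn[len(prefix):].split(":")
    value_fields = value.split(".")[:-1]  # strip the serialised component
    labels = list(reversed(scheme_fields + value_fields))
    return ".".join(labels + [root])


def apply_naptr_regexp(regexp_field: str, key: str) -> str:
    """Apply a NAPTR regexp field of the form <delim>pattern<delim>repl<delim>flags."""
    delimiter = regexp_field[0]
    pattern, replacement, flags = regexp_field[1:].split(delimiter)
    compiled = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
    return compiled.sub(replacement, key, count=1)


domain = uii_to_domain("urn:epc:id:sgtin:0614141.000024.400")
print(domain)  # 000024.0614141.sgtin.id.onsepc.com

# Hypothetical regexp field, as it might appear in a returned NAPTR record.
print(apply_naptr_regexp("!^.*$!http://example.com/epcis!", domain))
# http://example.com/epcis
```

Because the serial number is removed before the query is built, every instance of the same product resolves to the same service, which is then responsible for any item-level detail.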

Significant development areas

The EPCglobal Object Name Service (ONS) is a significant development for using the Internet as part of the data exchange system. Standards and tools are already in place to support this function.

Many observers have assumed that ONS is the only form that can be used to make use of the DNS resolution system. This is not true: any formally registered URI – including specific URNs (like the EPC codes) – can be used. The granularity both of the resolution system and of the associated data is defined within the IETF RFC for the particular URN. Whereas, for retail purposes, a book might need to be identified to an instance of a particular product, the urn:isbn only identifies to the title or specific edition of a title.

The Digital Object Identifier (DOI) has been mentioned in this clause, but readers need to be aware that it only deals with objects associated with Intellectual Property. However, we understand that there are over 4 million such objects as part of the system, and also understand that the Commission of the European Union uses the DOI system for identifying many documents. As such a significant system (probably with more records and more users than ONS) it provides an alternative model that might have components that are relevant for other domains.

Given that most Unique Item Identifiers that are compliant with ISO encoding rules are based on object identifiers, a solution needs to be developed for this major class. Some form of generic model is required. This then needs to be supplemented by the following:

  • Domain-specific syntax and mapping rules from the UII to the DNS format.
  • Based on this syntax, specific capture and query interface standards need to be developed to access information on the local data repository.
  • The Name Server Interface (equivalent to the EPCglobal ONS standard) also needs to be developed.

Our research has identified that the URI scheme can offer high degrees of flexibility, and that the amount of disclosure required to satisfy the IETF can vary, depending on the domain. For example, urn:swift provides very little information about the structure of its URI, except for the higher level of the hierarchy. This ensures that while this URN can be identified by a "non-SWIFT" resolver in the sense of delivering appropriate error codes, only compliant SWIFT resolvers are able to deal with secure banking data. Further research is required to establish whether some aspects of a resolution service need security, in addition to subscriber authentication and permission rules.