The Tech4Open summit is a one-day event built around unique design challenges: discussing and finding solutions for the infrastructure needed to enable massive re-use of open works. The focus will be on what needs to be done to automate and simplify re-use through proper use of metadata.
The host will present the current thinking within an area and lead the participants in collectively finding the answers for some of the problems faced in the area. While we don't charge for participation, we do ask that you register beforehand. We have a limited number of spaces available, and we want to make sure that everyone can find a place.
Commons Machinery has built tools that enable copy-pasting of metadata together with images from web pages with RDFa. Lessons learnt from this work will serve as the starting point for the discussions, and Jonas Öberg and Peter Liljenberg will showcase the tools and talk about the challenges discovered along the way.
The length of an attribution chain
Current standards, such as the Creative Commons Rights Expression Language (CC-REL), suggest that a work based on previous works should refer to them by placing the URI of each original work in its own metadata. This allows an application to use those URIs to look up the previous works and build an attribution tree, the combination of which might become the full attribution of the resulting work. With RDF and other standards for metadata, it would technically be possible to include part of that attribution tree within the resulting work, so that an application doesn’t need to look up the metadata in external sources. Key questions here are: when should this happen, and how many generations of source work metadata should be kept?
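To make the question about generations concrete, here is a minimal Python sketch of an application following source URIs to build an attribution tree, with a cap on how many generations are kept. The `WORKS` table, the `sources` key, and the `build_attribution_tree` function are hypothetical names invented for illustration; a real application would dereference each URI and parse the metadata it finds rather than read from an in-memory dictionary.

```python
from typing import Callable, Dict, FrozenSet, Optional

# Hypothetical stand-in for metadata that would normally be obtained by
# looking up each work's URI. The keys used here are assumptions.
WORKS: Dict[str, dict] = {
    "http://example.org/a": {"title": "A", "creator": "Alice", "sources": []},
    "http://example.org/b": {"title": "B", "creator": "Bob",
                             "sources": ["http://example.org/a"]},
    "http://example.org/c": {"title": "C", "creator": "Carol",
                             "sources": ["http://example.org/b"]},
}


def build_attribution_tree(
    uri: str,
    max_generations: int,
    lookup: Callable[[str], Optional[dict]] = WORKS.get,
    _seen: FrozenSet[str] = frozenset(),
) -> dict:
    """Follow source URIs and keep at most `max_generations` levels of sources."""
    record = lookup(uri)
    # Unknown works and cycles are kept as a bare URI reference only.
    if record is None or uri in _seen:
        return {"uri": uri, "sources": []}
    node = {"uri": uri, "title": record["title"],
            "creator": record["creator"], "sources": []}
    if max_generations > 0:
        node["sources"] = [
            build_attribution_tree(src, max_generations - 1, lookup, _seen | {uri})
            for src in record["sources"]
        ]
    return node


# Keep only one generation of source metadata for work C.
tree = build_attribution_tree("http://example.org/c", max_generations=1)
```

With `max_generations=1` the tree for C contains B's metadata but not A's; an application wanting the full chain would have to look up B again. Where that cut-off should sit is exactly the open question.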
The minimum metadata needed
Most metadata standards define, within their particular domain, a large number of terms, not all of which are always used or always needed. For a photograph, relevant details might include information about the photographer and the persons (if any) photographed. A book might need details about the author, but also the translator (if any). Are there any minimum requirements for what information needs to be conveyed about a work that are common to every type of work, regardless of what domain it’s from (music, science, literature, etc.)?
Lunch will be provided at the venue.
Where is metadata about works stored?
Metadata about a work can be stored embedded in a file, in an XMP sidecar, as RDFa in a containing web page, or in a registry (and exposed as e.g. XMP or RDF/XML). When multiple places can store metadata properties, how should they be used? Should all information be duplicated to all possible storage places, or should some only reside in e.g. a registry or as RDFa? When a metadata property is stored in multiple places, which source takes precedence when reading, so that conflicts can be resolved?
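One possible answer to the precedence question can be sketched in Python as a fixed ordering of storage places, where the first place that defines a property wins and disagreements are recorded rather than silently dropped. The ordering in `PRECEDENCE`, the source labels, and the `resolve_metadata` function are assumptions made for illustration only; the right ordering is one of the things up for discussion.

```python
from typing import Dict, List, Tuple

# Assumed ordering, highest precedence first. This is not an agreed rule.
PRECEDENCE: List[str] = ["embedded", "xmp_sidecar", "rdfa", "registry"]


def resolve_metadata(
    by_source: Dict[str, Dict[str, str]],
    precedence: List[str] = PRECEDENCE,
) -> Tuple[Dict[str, str], Dict[str, List[Tuple[str, str]]]]:
    """Merge properties from several storage places.

    Returns the merged properties and, per property, the losing
    (value, source) pairs that disagreed with the winning value.
    """
    winners: Dict[str, str] = {}
    conflicts: Dict[str, List[Tuple[str, str]]] = {}
    for source in precedence:
        for key, value in by_source.get(source, {}).items():
            if key not in winners:
                winners[key] = value
            elif winners[key] != value:
                conflicts.setdefault(key, []).append((value, source))
    return winners, conflicts


merged, conflicts = resolve_metadata({
    "registry": {"creator": "Alice Example", "license": "CC BY 4.0"},
    "embedded": {"creator": "A. Example"},
})
```

Here the embedded creator name wins over the registry's, while the license, present only in the registry, is still picked up. Reversing the list would model the opposite policy, in which the registry is treated as authoritative.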
How is metadata about a work found?
Assuming registries of works exist, what’s the protocol to access them? If registries get widespread use, it will be necessary to have a machine-usable way to discover the registry information for an image. How do you determine what registry to query, when there might be any number of ad-hoc registries? Is it viable to rely on semantic web standards using the HTTP Accept header, e.g. either request the HTML version and get embedded RDFa, or request RDF/XML? What should be the common namespaces and usages that all registries complying with this scheme should support?
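The Accept-header approach mentioned above could look roughly like the following Python sketch, which only builds the request and does not contact any server. The `metadata_request` function and the example URI are hypothetical; `application/rdf+xml` and `text/html` are the registered media types for RDF/XML and HTML, and the `q` values express which one the client prefers.

```python
import urllib.request


def metadata_request(work_uri: str, prefer_rdf_xml: bool = True) -> urllib.request.Request:
    """Build a GET request that asks for RDF/XML or for HTML with embedded RDFa."""
    if prefer_rdf_xml:
        accept = "application/rdf+xml, text/html;q=0.5"
    else:
        accept = "text/html, application/rdf+xml;q=0.5"
    return urllib.request.Request(work_uri, headers={"Accept": accept})


# Hypothetical registry URI, used only to show the shape of the request.
req = metadata_request("http://example.org/works/42")
```

Passing `req` to `urllib.request.urlopen` would send the request; the `Content-Type` of the response then tells the client whether it received RDF/XML or an HTML page to scan for RDFa. Whether every registry can be expected to honour the header is part of the question.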
Automatically generating attributions
In an ideal world, the creator of a work would be attributed automatically whenever their work is used. And while giving attribution is a requirement of many licenses, not least the Creative Commons suite of licenses, there is no clear definition of how this should happen. The Creative Commons licenses give some instructions as to what information should be included in the credit, but state that this “may be implemented in any reasonable manner.” What would an automatically generated attribution ideally look like, and how do we get there from the metadata at hand, especially if there are multiple creators or generations?
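As a starting point for that discussion, the following Python sketch turns a small metadata record into a credit line, handling multiple creators and nested source works. The record layout (`title`, `creators`, `license`, `sources`), the `format_attribution` function, and the wording of the credit are all assumptions made for illustration, not a format prescribed by any license.

```python
from typing import Dict


def format_attribution(work: Dict) -> str:
    """Render one possible credit line, recursing into source works."""
    creators = work.get("creators", [])
    if len(creators) > 1:
        names = ", ".join(creators[:-1]) + " and " + creators[-1]
    else:
        names = "".join(creators)

    text = '"' + work["title"] + '"'
    if names:
        text += " by " + names
    if work.get("license"):
        text += ", licensed under " + work["license"]

    # Each earlier generation is credited in parentheses.
    sources = [format_attribution(src) for src in work.get("sources", [])]
    if sources:
        text += " (based on " + "; ".join(sources) + ")"
    return text


collage = {
    "title": "Collage",
    "creators": ["Carol", "Dan"],
    "license": "CC BY-SA 4.0",
    "sources": [
        {"title": "Photo", "creators": ["Alice"], "license": "CC BY 4.0"},
    ],
}
print(format_attribution(collage))
```

Even this small example shows the problem of scale: every additional generation lengthens the credit, so a practical format probably needs a rule for when to abbreviate or link out instead.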