provenance

See also roundup

ml, January 25th, 2012

The Learning Resource Metadata Initiative specification (which Creative Commons is coordinating) has entered its final public commenting period. Please take a look if you’re at all interested in education metadata and/or how efforts spurred by schema.org (which LRMI is) will shape up.

The W3C recently published drafts that ought to be of great interest to the Creative Commons technology community: a family of documents regarding provenance and a guide to using microdata, microformats, and RDFa in HTML. I mentioned these on my personal blog here and here.

Speaking of things mentioned on my personal blog, a couple days ago I posted some analysis of how people are deploying CC-related metadata, based on structured data extracted by the Web Data Commons project from a sample of the Common Crawl corpus. Earlier this month I posted a marginally technical explanation of using CSS text overlays to provide attribution and a brief historical overview of ‘open hardware licensing’, something which the CC technology team hasn’t been involved in, but is vaguely labs-ish, and needs deep technical attention.

Other things needing deep technical attention: how CC addresses Digital Restrictions Management in version 4.0 of its licenses is being discussed. We don’t know enough about the technical details of various restricted systems (see last sentence) that CC-licensed works are being distributed on/to/with every day, and ought to. Another needs-technical-attention issue is ‘functional content’, for example in games and 3D printing. And we’re still looking for a new CTO.

Finally, Jonathan Rees just posted on how to apply CC0 to an ontology. You should subscribe to Jonathan’s blog as almost every post is of great interest if you’ve read this far.

Addendum: It seems remiss to not mention SOPA, so I’m adding it. Thanks to the technology community for rising up against this bad policy. CC promoted the campaign on its main website through banners and a number of blog posts. Don’t forget that SOPA/PIPA may well rise again, the so-called Research Works Act is very different but is motivated by the same thinking, and ACTA threatens globally. Keep it up! In the long term, is not building a healthy commons (and thus technology needed to facilitate building a healthy commons) a big part of the solution? On that, see yet another post on my personal blog…


Creative Commons: Using Provenance in the Context of Sharing Creative Works

ml, October 3rd, 2011

I provided a brief non-technical writeup on Creative Commons and provenance for the W3C Provenance Working Group’s Connection Task Force documenting “Communities Addressing Important Issues in Provenance”.

See the writeup on the Provenance WG wiki (please suggest edits in comments below); the current version follows.


Creative Commons

Creative Commons (CC) provides licenses and public domain tools that can be used for any kind of creative works like texts, images, websites, or other media, as well as databases. CC tools are well known and used, especially in online publications. Each CC license and public domain tool is identified by a unique URL, allowing proper identification and reference of these as part of a work’s provenance information.

Additionally, Creative Commons provides a vocabulary to describe its tools and works licensed or marked with those tools in a machine interpretable way: The Creative Commons Rights Expression Language (CC REL). CC REL can be expressed in RDF.
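As a rough illustration of what "expressed in RDF" can look like, here is a minimal sketch that emits N-Triples using two CC REL terms (cc:license and cc:attributionName, in the http://creativecommons.org/ns# namespace). The helper name cc_rel_triples, the work URL, and the author name are made up for this example; real deployments more often embed the same statements in a page as RDFa.

```python
# Sketch only: builds N-Triples by hand, with no RDF library.
# cc_rel_triples and the example.org work URL are hypothetical.
CC_NS = "http://creativecommons.org/ns#"


def cc_rel_triples(work_url, license_url, attribution_name=None):
    """Return N-Triples lines describing a work's license with CC REL terms."""
    lines = [f"<{work_url}> <{CC_NS}license> <{license_url}> ."]
    if attribution_name is not None:
        # Escape backslashes and double quotes for an N-Triples string literal.
        escaped = attribution_name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{work_url}> <{CC_NS}attributionName> "{escaped}" .')
    return lines


triples = cc_rel_triples(
    "http://example.org/photo",                     # hypothetical work
    "http://creativecommons.org/licenses/by/3.0/",  # CC BY 3.0 license URL
    attribution_name="Example Author",              # hypothetical licensor
)
print("\n".join(triples))
```

Because the license is identified by its URL, the triple about the work can travel with copies of the work and still point at the same license.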

The provenance of assertions about a work’s license or public domain status is of great importance for licensors, licensees, curators, and future potential users. All CC licenses legally require certain information (attribution and license notice) be retained; even in the case of its public domain tools, retaining such information is a service to readers and in accordance with research and other norms. To the extent license and related information is not retained or cannot be trusted, users’ ability to find and rely upon freedoms to use such works is degraded. In many cases, the original publication location of a work will disappear (linkrot) or rights information will be removed, either unintentionally (eg template changes) or intentionally (here especially, provenance is important; CC licenses are irrevocable). In the degenerate case, a once CC-licensed work becomes just another orphan work.

The core statements needed are who licensed, dedicated to the public domain, or marked as being in the public domain, which work, and when? Each of these statements has sub-statements, eg the relationship of “who” to rights in the work or knowledge about the work, and exactly what work and at what granularity?
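One way to picture these core statements is as a small record. The sketch below is only an illustration, not a CC or W3C data model: the class names, field names, and sample values are all assumptions, and the tool field simply reuses the idea that each license or public domain tool has its own URL.

```python
# Illustrative sketch only: RightsAssertion and its fields are hypothetical,
# not part of CC REL or any W3C provenance vocabulary.
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    LICENSED = "licensed"
    DEDICATED = "dedicated to the public domain"
    MARKED = "marked as being in the public domain"


@dataclass(frozen=True)
class RightsAssertion:
    who: str                          # who made the assertion
    action: Action                    # licensed, dedicated, or marked
    work: str                         # which work (identifier or URL)
    tool: str                         # URL of the license or public domain tool
    when: datetime                    # when the assertion was made
    who_relation: str = "unknown"     # sub-statement: relation of "who" to the work
    granularity: str = "whole work"   # sub-statement: how much of the work is covered

    def summary(self):
        day = self.when.date().isoformat()
        return (f"{day}: {self.who} ({self.who_relation}) "
                f"{self.action.value} {self.work} [{self.tool}]")


assertion = RightsAssertion(
    who="Example Author",                                # hypothetical
    action=Action.LICENSED,
    work="http://example.org/photo",                     # hypothetical
    tool="http://creativecommons.org/licenses/by/3.0/",
    when=datetime(2011, 10, 3, tzinfo=timezone.utc),     # illustrative date
    who_relation="rights holder",
)
print(assertion.summary())
```

Making the record immutable (frozen=True) loosely mirrors the point that an assertion, once made, is a historical fact worth preserving even if the page that carried it changes.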

Provenance information is also necessary for discovering the uses of shared works and building new metrics of cultural relevance, scientific contribution, etc, that do not strictly rely on centralized intermediaries.

Finally, in CC’s broader context, an emphasis on machine-assisted provenance aligns with renewed interest in copyright formalities (eg work registries), puts a work’s relationship to society’s conception of knowledge in a different light (compare intellectual provenance and intellectual property), and is in contrast with technical restrictions which aim to make works less useful to users rather than more.
