metadata

See also roundup

ml, January 25th, 2012

The Learning Resource Metadata Initiative specification (which Creative Commons is coordinating) has entered its final public commenting period. Please take a look if you’re at all interested in education metadata and/or in how efforts spurred by schema.org (LRMI being one of them) will shape up.

The W3C published drafts recently that ought to be of great interest to the Creative Commons technology community: a family of documents regarding provenance and a guide to using microdata, microformats, and RDFa in HTML. I mentioned these on my personal blog here and here.

Speaking of things mentioned on my personal blog, a couple of days ago I posted some analysis of how people are deploying CC-related metadata, based on structured data extracted by the Web Data Commons project from a sample of the Common Crawl corpus. Earlier this month I posted a marginally technical explanation of using CSS text overlays to provide attribution and a brief historical overview of ‘open hardware licensing’, something the CC technology team hasn’t been involved in, but which is vaguely labs-ish and needs deep technical attention.

Other things needing deep technical attention: how CC addresses Digital Restrictions Management in version 4.0 of its licenses is being discussed. We don’t know enough about the technical details of the various restricted systems (see the previous sentence) that CC-licensed works are being distributed on/to/with every day, and we ought to. Another needs-technical-attention issue is ‘functional content’, for example in games and 3D printing. And we’re still looking for a new CTO.

Finally, Jonathan Rees just posted on how to apply CC0 to an ontology. You should subscribe to Jonathan’s blog as almost every post is of great interest if you’ve read this far.

Addendum: It seems remiss to not mention SOPA, so I’m adding it. Thanks to the technology community for rising up against this bad policy. CC promoted the campaign on its main website through banners and a number of blog posts. Don’t forget that SOPA/PIPA may well rise again, the so-called Research Works Act is very different but is motivated by the same thinking, and ACTA threatens globally. Keep it up! In the long term, is not building a healthy commons (and thus technology needed to facilitate building a healthy commons) a big part of the solution? On that, see yet another post on my personal blog…


LRMI tech WG CFP

ml, July 18th, 2011

If you know your stuff, you might be able to guess from the subject what this is about. Perhaps LR = Learning Resource is not obvious. More on the main CC blog…


New Guide on Publishing CC License Metadata

akozak, January 7th, 2011

You may have noticed that the copy-and-paste HTML you get from the CC license chooser includes some strange attributes you’re probably not familiar with. Those attributes are RDFa metadata, which allows the CC license deeds, search engines, Open Attribute, and other tools to discover metadata about your work and generate attribution HTML. Many platforms, such as Connexions and Flickr, have implemented CC REL metadata in their CC license marks, and it’s our recommended way to mark works with a CC license.
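For reference, the chooser output for an Attribution 3.0 license looks roughly like the following (the title, name, and non-creativecommons.org URLs here are placeholders; use the chooser itself for authoritative markup):

  <a rel="license" href="http://creativecommons.org/licenses/by/3.0/"><img
    alt="Creative Commons License" style="border-width:0"
    src="http://i.creativecommons.org/l/by/3.0/88x31.png" /></a><br />
  <span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">My Photo</span> by
  <a xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName"
    rel="cc:attributionURL" href="http://example.com/me">A. Author</a> is licensed
  under a <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Creative
  Commons Attribution 3.0 Unported License</a>.

The rel="license" link and the cc:attributionName/cc:attributionURL attributes are what the deeds and other tools read in order to build attribution HTML automatically.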

In an effort to make CC license metadata (or CC REL metadata) much easier to implement, we’ve created CC REL by Example. It includes many example HTML pages, as well as explanations and links to more information.

We’re hoping this guide will serve as a useful set of examples for developers and publishers who want to publish metadata for CC licensed works. Even if you just use CC licenses for your own content, now is a great time to take a first step into structured data and include information about how you’d like to be attributed.

You can find the source to the guide in git. Feedback and suggestions can be sent to webmaster@creativecommons.org.


Thesis on metadata interoperability: RDF

ml, December 15th, 2010

From Interoperability to Harmonization in Metadata Standardization – Designing an Evolvable Framework for Metadata Harmonization (pdf) by Mikael Nilsson:

The analytical framework is used to analyze and compare seven metadata specifications, and a concrete set of harmonization issues is presented. These issues are used as a basis for a metadata harmonization framework where a multitude of metadata specifications with different characteristics can coexist. The thesis concludes that the Resource Description Framework (RDF) is the only existing specification that has the right characteristics to serve as a practical basis for such a harmonization framework, and therefore must be taken into account when designing metadata specifications. Based on the harmonization framework, a best practice for metadata standardization development is developed, and a roadmap for harmonization improvements of the analyzed standards is presented.

(emphasis added)

Nilsson will defend his thesis on 2010-12-15, and you can watch (details).

Thinking in RDF is the natural thing to do at Creative Commons, whether modeling license attributes, work registration, or domain-specific descriptions that add value to licensed works. Nice to see in-depth academic backing for this intuition.


Toward expressive and interoperable Common Core metadata

akozak, December 10th, 2010

It’s been suggested with increasing frequency that whether an educational resource complies with the new Common Core standards is the kind of thing that could be published as metadata on the web. This metadata could provide a platform upon which tools could be built. For example, an educational search tool could allow anyone to search for learning objects that satisfy one or more Common Core standards.

The CCSSO, which publishes and stewards the Common Core standards through adoption, has not yet proposed a format for this metadata. In that vacuum, others are proposing their own solutions.

Karen Fasimpaur (who is awesome and was interviewed for a CC Talks With feature) recently published a set of tags to identify the Common Core standards. These tags are strings of text that uniquely identify the standards. For example, the College and Career Readiness Anchor Standards for Reading, Key Ideas and Details, Standard 1 is identified as “cc-k-5e-r-ccr-1”.

The goal, it seems, is to publish unique identifiers for the Common Core standards so that those unique identifiers could be attached to objects on the web as metadata, identifying which educational standards those objects meet.

We applaud efforts to identify when educational resources meet educational standards, and projects to catalog or tag resources with that data. This is one step forward in providing human-readable tags that encode that data, much like the string “by-sa” identifies the Creative Commons Attribution ShareAlike license.

The next step would be to provide stable URIs as identifiers for those tags such that machines, in addition to humans, can parse that metadata. These URIs could be maintained by authoritative organizations such as the CCSSO, but that isn’t technically necessary.

In addition, the URIs for the Common Core standards ought to be self-descriptive. That is, metadata about each standard should be discoverable by dereferencing its URI. For example, the CC licenses are self-descriptive: the license deeds contain metadata about the licenses, so that when someone marks up a work as CC licensed, a machine can discover facts about that license by visiting the URL. This metadata is encoded in RDFa, and can be seen by looking at the source of the deed or viewing it through an RDFa distiller.
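To give a flavor of what the deeds carry, here is a paraphrased fragment of the kind of RDFa statements a BY-SA deed makes about itself (condensed; see the actual deed source for the authoritative markup):

  <div xmlns:cc="http://creativecommons.org/ns#"
       about="http://creativecommons.org/licenses/by-sa/3.0/">
    <span rel="cc:permits" resource="http://creativecommons.org/ns#Reproduction"></span>
    <span rel="cc:permits" resource="http://creativecommons.org/ns#DerivativeWorks"></span>
    <span rel="cc:requires" resource="http://creativecommons.org/ns#Attribution"></span>
    <span rel="cc:requires" resource="http://creativecommons.org/ns#ShareAlike"></span>
  </div>

Each statement says, machine-readably, what the license permits or requires; a Common Core standard URI could likewise describe the standard it identifies.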

A URI identifying each standard brings other benefits. When used in subject-predicate-object expressions in metadata standards like RDFa, the expressive power of the identifier increases greatly. One could, for example, identify an arbitrary URI as being standards-aligned and make complex statements about the standard, whereas with a human-readable tag, interpretation is left to the reader. For example, you could place metadata referencing an educational resource on a “landing page” rather than on the resource itself, or mark up specific blocks of text as meeting certain standards. Stable URIs for the Common Core standards, coupled with a metadata standard like RDFa, would allow for subject precision that is lacking in the K12 OpenEd metadata proposal.
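As a sketch of what such a statement could look like, assuming a hypothetical stable URI for the standard and borrowing the existing dct:conformsTo predicate as one plausible (though not agreed-upon) choice:

  <div xmlns:dct="http://purl.org/dc/terms/"
       about="http://example.org/lessons/fractions"
       rel="dct:conformsTo"
       resource="http://standards.example.org/cc-k-5e-r-ccr-1"></div>

This asserts that the lesson at one URI conforms to the standard identified by the other, and the statement can live on a landing page rather than in the resource itself.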

Efforts like K12 OpenEd’s to publish Common Core standards metadata for educational resources are good progress. They give us all a starting point that can inform future work.


Google News and Source Citation

nathan, December 10th, 2010

Last month the Google News team announced two new meta tags publishers can use to mark up their content with source information. The first, syndication-source, is similar to rel="canonical": it lets Google know which version is authoritative (at least as far as Google News is concerned). The other, original-source, mirrors questions we’ve been thinking about here at CC. Google’s description in the announcement reads:

Indicates the URL of the first article to report on a story. We encourage publishers to use this metatag to give credit to the source that broke the story…the intent of this tag is to reward hard work and journalistic enterprise.
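In practice, the two tags are plain meta elements in a page’s head; per Google’s announcement they look something like this (URLs are placeholders):

  <meta name="syndication-source" content="http://example.com/wire-story.html" />
  <meta name="original-source" content="http://example.com/first-report.html" />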

Most Creative Commons licenses allow derivative works, and the question of how you cite (attribute) the source of your derivative is worth exploring. While it’s enough to include the attribution information, explicitly labeling the link to the source as the basis of your work not only allows others to discover that content, but also allows tools to begin drawing the graph of content reuse and repurposing.

Google’s suggestion for news articles is a good start: it lets publishers indicate the original source in a machine readable way. However it’d be even better if that information were also visible to readers of the article by default. Creative Commons licenses require that adaptations credit the use of the original in the adaptation (see §4b, CC BY 3.0, for example). You can imagine using the Dublin Core Terms to annotate this credit information using RDFa. For example:

This article originally appeared in <a xmlns:dc="http://purl.org/dc/terms/" rel="dc:source" href="http://example.org/original-article">example.org</a>.

This also opens up the possibility of annotating the type of adaptation that occurred, such as translation, format change, etc.

Publishing machine readable information about sources and re-use is exactly where we want to go. Until the tools are ubiquitous, however, making that information visible to readers will be very important.


XMP FileInfo panel for Adobe Creative Suites 4 and 5 now available!

akozak, December 6th, 2010

This is a special guest post by John Bishop of John Bishop Images.

Prior to Adobe’s Creative Suite 4, adding Creative Commons license metadata via the FileInfo… dialog (found in Photoshop, Illustrator, InDesign, and more) meant coding a relatively simple text-based XML panel definition; one such definition has been available from the Creative Commons wiki since 2007.

Starting with Creative Suite 4, Adobe migrated the XMP FileInfo panel to a Flash-based application, meaning that adding Creative Commons metadata became much more complex, requiring Adobe’s XMP SDK and the ability to develop applications in Flash, C++, or Java.

After significant development and testing john bishop images is pleased to announce the availability of a custom Creative Commons XMP FileInfo Panel for Creative Suite 4 and Creative Suite 5 – free of charge.

This comprehensive package offers the ability to specify Creative Commons license metadata directly in first-class, industry-standard tools, placing Creative Commons licensing metadata on the same footing as standardized metadata sets like Dublin Core (DC), IPTC, and usePLUS, and tightly integrating all the metadata fields required for a Creative Commons license in one panel.
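For readers unfamiliar with XMP, the license information embedded in a file boils down to a small RDF/XML packet. A minimal sketch, following the general pattern documented on the CC wiki (the example.com URL is a placeholder):

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/">
      <xmpRights:Marked>True</xmpRights:Marked>
      <xmpRights:WebStatement>http://example.com/my-photo.html</xmpRights:WebStatement>
    </rdf:Description>
    <rdf:Description rdf:about=""
        xmlns:cc="http://creativecommons.org/ns#">
      <cc:license rdf:resource="http://creativecommons.org/licenses/by/3.0/"/>
    </rdf:Description>
  </rdf:RDF>

Here xmpRights:Marked indicates the work has rights attached, xmpRights:WebStatement points to a page describing those rights, and cc:license identifies the license itself.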

Also included is a metadata panel definition that exposes the Creative Commons license metadata in the mini metadata panels found in Bridge, Premiere Pro, etc. Finally, the package includes a set of templates, accessible from Acrobat, that can be customized for the various license types and more.

For more information and to download the Creative Commons XMP FileInfo panel visit john bishop images’ Creative Commons page.

Note: The panels are localized, and an English (US) language file is supplied. To contribute localization files in other languages, please contact john bishop images.


Draft metadata for Public Domain Mark

nathan, August 19th, 2010

As announced on the CC blog earlier this month, we’re working on a new tool to complement CC0, the Public Domain Mark (PDM). We have a set of open issues for PDM (and related improvements) that we’re working through, including developing the marking metadata that will be generated (Issue 640). I’ve put together a set of examples in the wiki that we’re looking for feedback on.

There are a couple of minor variations from past practice to note:

  • We support labeling both the creator of the work (dct:creator), as well as the person who identified it as being in the public domain and made it available. We chose dct:publisher for the latter, as it’s defined as “An entity responsible for making the resource available.”
  • We will support the scenario where there are two license statements: PD Mark and CC0. This is generated in the case that the labeler chooses to waive any rights they may have acquired in the work as part of restoration, digitization, etc. After some discussion, we’re simply using two license assertions for this (see the sketch after this list). There are arguably two subjects at work here — the actual work and the digital representation of it — but in the interest of simplicity and consistency with our past recommendations, we’re treating them as one.
  • We’re planning to support non-binding usage guidelines on the deed at launch time, but not through the chooser. The example metadata is what the deeds would consume to display that link. I didn’t find a good existing predicate for this, so I propose we use cc:usageGuidelines and define it as a refinement of dct:relation.
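A condensed sketch of what the generated marking metadata might look like, with placeholder names and URLs (the canonical examples are the ones in the wiki):

  <p xmlns:dct="http://purl.org/dc/terms/"
     xmlns:cc="http://creativecommons.org/ns#"
     about="http://example.org/scans/old-map.jpg">
    This work (<span property="dct:title">Old Map</span>, by
    <span property="dct:creator">A. Cartographer</span>), identified by
    <span property="dct:publisher">Example Library</span>, is free of known
    copyright restrictions.
    <a rel="license" href="http://creativecommons.org/publicdomain/mark/1.0/">Public
    Domain Mark</a>
    <a rel="license" href="http://creativecommons.org/publicdomain/zero/1.0/">CC0</a>
    <a rel="cc:usageGuidelines" href="http://example.org/guidelines">usage
    guidelines</a>
  </p>

Note the two rel="license" assertions (PD Mark plus CC0) and the proposed cc:usageGuidelines link, corresponding to the list above.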

If you have comments or suggestions, you can leave them as comments on this post, or leave a comment on the Public Domain Mark discussion page.


In the interest of completeness, I’m also planning to define cc:morePermissions, used for CC+, as a refinement of dct:relation.


Search and Discovery for OER

nathan, January 14th, 2010

Last summer the Open Society Institute generously provided funding for CC to host a meeting about search and discovery for OER. The goal was to bring together people with experience and expertise in different areas of OER discovery and see if there was common ground: pragmatic recommendations publishers could follow to achieve increased visibility right now.

After many months of inactivity, I’m happy to announce that we’ve published a draft of an initial document, a basic publishing guide for OER. “Towards a Global Infrastructure For Sharing Learning Resources” describes steps creators and publishers of OER can take today to make sure their work spreads as widely as possible. This draft was developed by attendees of the meeting, and is currently being reviewed; as such it may (and probably will) change.

As you can see from the meeting notes, this isn’t the last thing we hope to get from this meeting. The next steps involve getting our hands a little dirtier — figuring out how we link registries of OER repositories and implementing code to do so. It should be interesting to see how this work develops, and how it influences our prototype, DiscoverEd.


New validator released!

asheesh, January 6th, 2009

This past summer, Hugo Dworak worked with us (thanks to Google Summer of Code) on a new validator. This work was long overdue, and we are very pleased that Google could fund Hugo to work on it. Our previous validator had not been updated to reflect our new metadata standards, so we disabled it some time ago to avoid creating further confusion. The textbook on CC metadata is the “Creative Commons Rights Expression Language”, or ccREL, which specifies the use of RDFa on the web. (If this sounds like keyword soup, rest assured that the License Engine generates HTML that you can copy and paste; that HTML is fully compliant with ccREL.) We hoped Hugo’s work would let us offer the Creative Commons community a validator so that publishers could test their web pages and make sure they encode the information they intended.

Hugo’s work was a success; he announced in August 2008 a test version of the validator. He built on top of the work of others: the new validator uses the Pylons web framework, html5lib for HTML parsing and tokenizing, and RDFlib for working with RDF. He shared his source code under the recent free software license built for network services, AGPLv3.

So I am happy to announce that the test period is complete, and we are now running the new code at http://validator.creativecommons.org/. Our thanks go out to Hugo, and we look forward to the new validator gaining some use as well as hearing your feedback. If you want to contribute to the validator’s development or check it out for any reason, take a look at the documentation on the CC wiki.


