I’ve been considering integrating licensing into a personal project of mine, and thoughts about that have spilled over into work. So we’ve been talking at Creative Commons recently about the current methods for licensing content managed by applications and what the future might be. The purpose of this post is to document the present state of licensing options. (A post on the future of licensing tools may come shortly afterward.)
Present, CC supported tools
To begin with, there are these three CC-hosted options:
- CC licensing web API — A mostly-RESTful interface for accessing CC licensing information. Some language-specific abstraction layers are provided. Supported and kept up to date. Lacking a JSON layer, which people seem to want. Making a request for every licensing action in your application may be a bit heavy.
- Partner interface — Oldest thing we support, part of the license engine. Typical case is that you get a popup and when the popup closes the posting webpage can access the info that’s chosen. Still gets you your chooser based interface but on your own site. Internet Archive uses it, among others.
All of these have the problem that the CC license chooser is only useful if you want exactly the choices we offer (and specifically the most current version of the licenses we provide). You need to track those choices in your own database anyway, which means either you are not keeping track of the version used, or you are, and when we change versions you might be in for a surprise.
Going it alone
So instead there are these other routes that sites take:
- Don’t use any tools and store license choices locally — What Flickr and every other major option does: reproduce everything yourself. In the case of Flickr, the six core licenses at version 2.0. In YouTube, just one license (CC BY 3.0). That works when you have one service, when you know what you want, and what you want your users to use. It doesn’t work well when you want people to install a local copy and you don’t know what they want to use.
- Allow any license as long as it fits site policy — you don’t facilitate the choice, so it falls somewhat outside the workflow of the main CMS you’re using. Wiki sites are an example of this, but they usually have a mechanism for adding a license to the footer of uploaded media. The licenses are handled by wiki templates, and anyone can make a template for any license they choose.
None of those are really useful for software you expect other people to install, where you want to provide some assistance to the administrators installing it, or where you want the administrator to give users some choice or choices relevant to that particular site.
The liblicense experiment
This brings us to another solution that CC has pursued:
- liblicense — Packages all the licenses we provide and gives an API for users to get info and metadata about them. Allows for web-request-free access to the CC licenses. It doesn’t address non-CC licenses, however, and is mostly unmaintained.
So, these are the present options that application developers have at their disposal for licensing application-managed content. There’s a tradeoff with each one of them, though: either you rely on web requests to CC for each licensing decision you make, you go it alone, or you use something unmaintained which is CC-specific anyway. Nonetheless, cc.api and the partner interface are supported if you want something from CC, and people do tend to make do with doing things offline. But none of the tools we have are all that flexible, so what can software like MediaGoblin or an extension for WordPress do?
There’s one more option, one that to my knowledge hasn’t really been explored, and that would be extremely flexible but also well structured.
The semantic web / linked data option?
It goes like this: let either users or admins specify licenses by their URL. Assuming the page at that URL describes itself via some metadata (be it RDFa, a rel="alternate" link to an RDF document in the page headers, or microdata), information about that license could be extracted directly from the URL and then cached or recorded in the database. This provides a flexible way of adding new licenses, is language-agnostic, and allows for a canonical set of information about said licenses. Libraries could be written to make the extraction of said information easier, and could even ship cached metadata for common licenses (including common licenses which don’t provide any metadata at their canonical URLs).
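As a rough illustration of the idea, here is a minimal Python sketch of the extraction step using only the standard library. It is not an existing CC tool: the names `LicenseMetadataFinder`, `describe_license` and `SAMPLE_PAGE` are made up for this example, the sample page and its `example.org` URL are hypothetical, and a real implementation would want a proper RDFa parser rather than this attribute scan.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LicenseMetadataFinder(HTMLParser):
    """Collect hints about how a license page describes itself (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rdf_alternates = []  # absolute URLs of rel="alternate" RDF documents
        self.properties = {}      # RDFa-style property -> content pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").split()
        is_rdf = attrs.get("type") == "application/rdf+xml"
        if tag == "link" and "alternate" in rel and is_rdf and attrs.get("href"):
            # Resolve relative links against the license URL itself.
            self.rdf_alternates.append(urljoin(self.base_url, attrs["href"]))
        if attrs.get("property") and attrs.get("content") is not None:
            self.properties[attrs["property"]] = attrs["content"]


def describe_license(url, html):
    """Return a small record that an application could cache in its database."""
    finder = LicenseMetadataFinder(url)
    finder.feed(html)
    finder.close()
    return {
        "url": url,
        "rdf_alternates": finder.rdf_alternates,
        "properties": finder.properties,
    }


# Hypothetical license page; fetching the real page is left out on purpose.
SAMPLE_PAGE = """
<html><head>
<link rel="alternate" type="application/rdf+xml" href="rdf" />
<meta property="dc:title" content="Example License 1.0" />
</head><body></body></html>
"""

record = describe_license("https://example.org/licenses/example/1.0/", SAMPLE_PAGE)
print(record)
```

Keying the stored record on the license URL is what would let an application accept licenses it has never seen before, CC or otherwise.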
I’m hoping that in the near future I’ll have a post up here demonstrating how this could work with a prototypical tool and use case.
Thanks to Mike Linksvayer, for most of this post was just transforming a braindump of his into a readable blogpost.
Another delayed report, late by a day. This time, however, I can deliver; the current version of the plugin sports the filter system I unsuccessfully tried to implement the week before: While previous versions of the plugin inserted HTML directly into the post (example screenshot), the new iteration only inserts a shortcode containing the attachment ID and a caption (e.g. [[cc:18|some caption]]). The actual markup is then generated when the page is requested. This satisfies use cases in which a blogger wishes to modify media metadata later on, like changing license or alt text.
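The render-time expansion can be sketched roughly as follows. This is Python for illustration only, not the plugin’s actual PHP code: the `ATTACHMENTS` store, its field names, the `render_shortcodes` function and the generated markup are all assumptions made for the example. Only the `[[cc:ID|caption]]` shortcode shape comes from the plugin.

```python
import re
from html import escape

# Matches shortcodes of the form [[cc:18|some caption]].
SHORTCODE = re.compile(r"\[\[cc:(\d+)\|([^\]]*)\]\]")

# Hypothetical attachment metadata, standing in for the CMS database.
ATTACHMENTS = {
    18: {
        "src": "/uploads/photo.jpg",
        "alt": "A photo",
        "license_uri": "http://creativecommons.org/licenses/by/3.0/",
    }
}


def render_shortcodes(post_body, attachments):
    """Replace each shortcode with markup built from the current metadata."""

    def expand(match):
        meta = attachments.get(int(match.group(1)))
        if meta is None:
            return match.group(0)  # leave unknown shortcodes untouched
        return (
            '<figure><img src="{src}" alt="{alt}" />'
            '<figcaption>{caption} '
            '(<a rel="license" href="{uri}">license</a>)</figcaption></figure>'
        ).format(
            src=escape(meta["src"], quote=True),
            alt=escape(meta["alt"], quote=True),
            caption=escape(match.group(2)),
            uri=escape(meta["license_uri"], quote=True),
        )

    return SHORTCODE.sub(expand, post_body)


print(render_shortcodes("Intro. [[cc:18|some caption]] Outro.", ATTACHMENTS))
```

Because the stored post only contains the shortcode, changing `license_uri` or `alt` for attachment 18 changes the markup on the next request without touching the post itself.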
Less visible to the user, I was able to unify the two saving functions triggered on saving and inserting media, and to add a metadata field which holds the exact license URI for every attachment (determined using the Creative Commons API). I recommend checking out the repository.
For the coming week, I will look into post thumbnails, which require no inline markup for purely decorative pictures, like those used at Spreeblick and Breitband. I will also explore alternate content and plugin directories again, as my last attempt regarding that issue was a complete failure.
This week, development on the plugin proceeded at a faster pace. Shortly after I posted the last report, Nathan Kinkade pointed out the fix to the bug that prevented saving, a simple type error. On the next day, I implemented stylesheet support, adapting three styles I originally made for my defunct microdata plugin, and an admin interface to switch between them (screenshot). Additional more or less notable changes are:
- metadata is not only saved now, but will also be retrieved to populate form fields
- multiple RDFa fixes, machine-readable data should be correct now
- the plugin has a directory structure, earlier versions were just a single file
- there is now a sample file for stylesheet development
- metadata is also saved when the media item is inserted into the post
- the plugin now uses the Creative Commons API to get the current license version
I consider this version of the plugin not finished, but functional enough for testers, who are encouraged to check out the Git repository. For the coming week, I will look into the issue surrounding alternate content and plugin directories and proceed to polish the existing features.
We had a request come in from a developer using the CC web services for better contextual information about the choices users make when selecting a license. In particular, they wanted to present users with information about what “ShareAlike” means. As I dug into it I realized that the existing <description> elements we provided for the questions themselves are… well, bad. I can only imagine how hard it’d be to craft a user interface using those as help text.
I just pushed an update to the development version of the API that adds <description> elements to the individual <enum> elements. These map to the help pop-ups we use on the main license chooser. If all seems well we’ll push this down to 1.5 as well.
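For client developers, picking up the per-choice help text might look something like the sketch below. The XML in `SAMPLE_FIELD` is a hypothetical fragment written for this example, not a captured API response; it assumes only what is described above, namely that a <description> can sit inside an individual <enum>. The function name `enum_help` is likewise made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element ids and help text are illustrative only.
SAMPLE_FIELD = """
<field id="derivatives">
  <label>Allow modifications of your work?</label>
  <description>Field-level help text.</description>
  <enum id="y">
    <label>Yes</label>
    <description>Help text for Yes.</description>
  </enum>
  <enum id="sa">
    <label>ShareAlike</label>
    <description>Help text explaining ShareAlike.</description>
  </enum>
  <enum id="n">
    <label>No</label>
  </enum>
</field>
"""


def enum_help(field_xml):
    """Map each enum id to its label and (possibly missing) description."""
    field = ET.fromstring(field_xml.strip())
    help_by_choice = {}
    for enum in field.findall("enum"):
        help_by_choice[enum.get("id")] = {
            "label": enum.findtext("label", default=""),
            # None when the enum carries no <description> of its own.
            "description": enum.findtext("description"),
        }
    return help_by_choice


for choice, info in enum_help(SAMPLE_FIELD).items():
    print(choice, info["label"], info["description"])
```

Treating a missing description as `None` lets a user interface fall back to the field-level text for choices that have no help of their own.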
I should also note that this update includes two new, very nice (for me) improvements:
- I finally landed Frank’s test suite work from last summer. We had intended to replace the whole API with a leaner version, but that’s still in the works. So in the interim, we have a test suite and I’m not afraid of change anymore (on this project at least).
- I’ve updated the documentation to use Sphinx. reStructuredText is the secret weapon of many a Python project; Sphinx makes it even more powerful, providing a set of directives and tools to generate something that looks, well, at least decent.
I wake up in the morning to find that eternal funny man George Carlin has died. Add to that the fact that one of my recently-ported api tests has failed, and you just know it’s going to be one of those days.
Work on porting the test suite was slow at first. I had thought it would be easy enough just to jump in and start coding. Later this pipe dream evaporated and I forced myself to learn CherryPy, Python Paste, and a bit of WSGI. Then Asheesh and Nathan filled me in on the intricacies of the “buildout” build system (which is really quite nice when you get to know it), and I was ready to go.
So the porting has begun. Check out the branch where all the fun is happening, if you happen to be so inclined. And as the fates would have it, the lucky number seventh test I ported just so happens to fail. This leads to the real challenge every software tester must eventually face: fixing a broken test. Getting to the bottom of this means a fun-filled day ahead.