license engine

More helpful 404 pages

cwebber, January 28th, 2011

This is one of those small, helpful features that tend to go into the license engine that runs on the website but go unnoticed unless someone points them out. I usually do a pretty bad job of making note of these when they go out, but this time, I’m doing better!

Even most people who don’t know anything about HTTP know that a 404 status code on the web somehow means that the thing you were looking for isn’t actually there. How frustrating! But if it’s not there, maybe we have enough information to help you find what you actually wanted.

That’s the idea behind the work that went into Issue 255: “Smart” 404 pages. Maybe we didn’t find a license (or public domain tool) under the URL you put in, but we might be able to help you find a license that does exist. For example, licenses listed under /licenses/ are parsed out like /licenses/{code}/{version}/ or /licenses/{code}/{version}/{jurisdiction}/. Knowing that, we can list the licenses that someone might have meant when they:

The pages mostly look like a normal 404 page, but with just a bit more contextually helpful information (the “were you looking for” section). And, of course, they still return a 404 status code!
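
To make the idea concrete, here is a minimal Python sketch of how a "were you looking for" list could be built from a URL that missed. It is not the actual license engine code: the KNOWN_LICENSES sample, the regular expression, and the function names suggest_licenses and license_url are all assumptions made up for illustration.

```python
import re

# Hypothetical sample of licenses that exist: (code, version, jurisdiction).
# A jurisdiction of None stands for the generic (non-jurisdiction) license.
KNOWN_LICENSES = {
    ("by", "3.0", None),
    ("by", "3.0", "us"),
    ("by-sa", "3.0", None),
    ("by-sa", "2.5", "ca"),
}

# Matches /licenses/{code}/, /licenses/{code}/{version}/ and
# /licenses/{code}/{version}/{jurisdiction}/ (trailing slash optional).
LICENSE_PATH = re.compile(
    r"^/licenses/(?P<code>[^/]+)"
    r"(?:/(?P<version>[^/]+))?"
    r"(?:/(?P<jurisdiction>[^/]+))?/?$"
)


def suggest_licenses(path):
    """Return known licenses that share the license code in a missing URL."""
    match = LICENSE_PATH.match(path)
    if match is None:
        return []
    code = match.group("code")
    version = match.group("version")
    same_code = [lic for lic in KNOWN_LICENSES if lic[0] == code]
    # Prefer licenses that also match the requested version, if any do.
    same_version = [lic for lic in same_code if lic[1] == version]
    candidates = same_version or same_code
    return sorted(candidates, key=lambda lic: (lic[1], lic[2] or ""))


def license_url(lic):
    """Rebuild the /licenses/... path for a (code, version, jurisdiction)."""
    code, version, jurisdiction = lic
    parts = ["licenses", code, version]
    if jurisdiction:
        parts.append(jurisdiction)
    return "/" + "/".join(parts) + "/"
```

With this sample data, a request for /licenses/by/3.0/xx/ (a jurisdiction that does not exist) would suggest the two "by" 3.0 licenses that do. The page would still be served with a 404 status; the suggestions only change what the page body says.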


Understanding the State of Sanity (via whiteboards and ascii art)

cwebber, December 18th, 2009

Since I started working at Creative Commons a number of months ago, I’ve been primarily focused on something we refer to as the “sanity overhaul”. In this case, sanity refers to an effort to simplify the long and complicated code history surrounding Creative Commons’ licenses, both in terms of the internal tooling for modifying, deploying, and querying licenses and the public-facing web interfaces for viewing and downloading them. Efforts toward the sanity overhaul started before I began working here, led by both Nathan Yergler and Frank Tobia, but for a long time they sat in a kind of limbo because technical effort had to go to other important tasks. The good news is that my time has been (almost) entirely dedicated to the sanity overhaul since I started, and we are reaching a point where all of those pieces are falling into place and we are very close to launch.

To give an idea of the complexity of things as they were and how much that complexity has been reduced, it is useful to look at some diagrams.  When Nathan Kinkade first started working at Creative Commons (well before I did), Nathan Yergler took some time to draw on the whiteboard what the present infrastructure looked like:

as well as what he envisioned the “glorious future” (sanity) would look like:

When I started, the present infrastructure had shifted a little bit further still, but the vision of the “glorious future” (sanity) had mostly stayed the same.

This week (our “tech all-hands week”) I gave a presentation on the “State of Sanity”.  Preparing for that presentation I decided to make a new diagram.  Since I was already typing up notes for the presentation in Emacs, I thought I might try and make the most minimalist and clear ASCII art UML-like diagram that I could (my love of ASCII art is well known to anyone who hangs out regularly in #cc on Freenode).  I figured that I would later convert said diagram to a traditional image using Inkscape or Dia, but I was so pleased with the end result that I just ended up using the ASCII version:


     ( o_o)
     |USER| --.
     '----'   |
         ___   .---.
       .'   ','     '.
     -'               '.
    (     INTARWEBS     )
     '_.     ____    ._'
        '-_-'    '--'
      +---------------+  Web interface user
      |   cc.engine   |  interacts with
      +---------------+  Abstraction layer for
      |  cc.license   |  license querying and
      +---------------+  pythonic license API
      +---------------+  Actual rdf datastore and
      |  license.rdf  |  license RDF operation tools


  |  cc.i18npkg  |
  | .----------. |
  | | i18n.git | |


  +------------+  +-----------+  +---------+  +-------------+
  |    old     |  | old zope  |  | licenze |  | license_xsl |
  | cc.license |  | cc.engine |  +---------+  +-------------+
  +------------+  +-----------+

This isn’t completely descriptive on its own, and I will be annotating it as I include it in the Sphinx developer docs we are bundling with the new cc.engine. But I think that even without annotation, it is clear how much cleaner the new infrastructure is than the old “present infrastructure” whiteboard drawing, which means that we are making good progress!


New validator released!

asheesh, January 6th, 2009

This past summer, Hugo Dworak worked with us (thanks to Google Summer of Code) on a new validator. This work was greatly overdue, and we are very pleased that Google could fund Hugo to work on it. Our previous validator had not been updated to reflect our new metadata standards, so we disabled it some time ago to avoid creating further confusion. The textbook on CC metadata is the “Creative Commons Rights Expression Language”, or ccREL, which specifies the use of RDFa on the web. (If this sounds like keyword soup, rest assured that the License Engine generates HTML that you can copy and paste; that HTML is fully compliant with ccREL.) We hoped Hugo’s work on a new validator would let us offer a validator to the Creative Commons community so that publishers can test their web pages to make sure they encode the information they intended.

Hugo’s work was a success; in August 2008 he announced a test version of the validator. He built on top of the work of others: the new validator uses the Pylons web framework, html5lib for HTML parsing and tokenizing, and RDFLib for working with RDF. He shared his source code under AGPLv3, the recent free software license built for network services.

So I am happy to announce that the test period is complete, and we are now running the new code. Our thanks go out to Hugo, and we look forward to the new validator gaining some use as well as hearing your feedback. If you want to contribute to the validator’s development or check it out for any reason, take a look at the documentation on the CC wiki.


Internet Explorer and Internationalized JsWidget (0.3)

asheesh, August 1st, 2007

The scrubbing bubbles have been at work again on JsWidget. JsWidget is an attempt to let web application developers insert the Creative Commons “Choose a license” questions into their application by just including one file from

I just released a new pre-release version of JsWidget, version 0.3. You can read about the project on its wiki page, including learning how to use it. There are some interesting new features:

First of all, it’s compatible with Internet Explorer. My generated JavaScript code was suffering from a correctness issue that using XHTML in Firefox showed me, and fixing that made it render in Internet Explorer. (Then I had to switch from using onChange to onClick, again a subtle correctness issue.)

Secondly, it supports a cool form of internationalization called HTTP Content Negotiation. Web browsers optionally (but usually) send headers to the web server indicating the sort of content they can accept, including, in the Accept-Language header, what languages the user wants to read. So in addition to the old ?locale= form of specifying a language, JsWidget now looks at that header, and the text should come back translated into the user’s native language. (Unfortunately not all of the strings are translated yet, but try hovering over an info box or looking through the list of jurisdictions. In all cases where we don’t have a translation, we fall back to US English.)
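
For the curious, here is a rough Python sketch of how language negotiation along these lines can work. It is not JsWidget's server code: the function names, the "en-us" default tag, and the choice to let an explicit ?locale= value win over the header are all assumptions for illustration.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into tags ordered by preference."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        piece = item.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_locale(header, available, explicit=None, default="en-us"):
    """Pick a locale: explicit ?locale= first, then the header, then default."""
    available = {tag.lower() for tag in available}
    # Assumption: an explicit ?locale= value takes precedence over the header.
    if explicit and explicit.lower() in available:
        return explicit.lower()
    for tag in parse_accept_language(header or ""):
        if tag in available:
            return tag
        # Fall back from a regional tag such as "fr-ca" to plain "fr".
        primary = tag.split("-")[0]
        if primary in available:
            return primary
    return default
```

Given a header of "fr-CA, fr;q=0.8, en;q=0.5" and translations for en-us, fr, and de, this picks fr. A browser asking only for a language we have no translation for gets the US English default, matching the fallback described above.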

Finally, it supports a feature called “license seeding.” By default, the UI offers the user the Attribution license in the generic jurisdiction. By passing in a URL, you can change that starting point. This is especially useful for letting a user revisit a license choice he made in the past and consider changing it.

You can read more about these features on the wiki page for this project, and if you’re sly you could even look at our plan for the future. But the most fun thing to do always is to play with our demos! Now there are three:


The easiest way yet to integrate CC licensing into a web app (preview)

asheesh, July 19th, 2007

I’ve been working for the past week or so on a JavaScript licensing widget that has been suggested on our wiki. It’s a new way to integrate CC licensing into your web application. It’s really as easy as pie: Just add the following tag somewhere in the body:

<script src="" />

and a CC licensing widget will appear. Your web application can then use regular DOM queries to determine the user’s choice.

NOTE that this is not ready for prime-time use! I want feedback on what people would like us to add or change. Right now it serves only English-language text; in the future you will be able to add ?locale=, stick your language code at the end, and get text back in your language. Beyond translation, tell me how else I can be of service!

You can download a trivial sample application and a long-ish README at our SourceForge project. For y’all’s convenience here’s a link to the README.

It’ll take you all of five minutes to deeply understand what’s going on, so I suggest you take a look if you’re thinking about offering (or already offer) CC licensing to users of web applications you work on.

(P.S. This is cross-posted to the cc-devel list.)


Liblicense has licenses! 376 of them…

jakin, July 6th, 2007

Prepping for the 0.1 release, I’ve generated RDF descriptions of all CC licenses in all available jurisdictions, as well as the GPL, LGPL, and Public Domain.

Available here:

Each license, if applicable, has all the attributes laid out on the wiki, including localization. One problem, however, is getting localized descriptions of the licenses. That isn’t available at

Licenses were generated with this python script, which reads the relevant information from and cctools svn.


More Summer of Code — Liblicense, Tracker, Beagle,…

jakin, June 30th, 2007

Let’s see, where am I at.

Code in GStreamer to read the license URI is getting pushed through. Now there’s Bug #451939, which updates the GStreamer API with license and copyright URI tags. When this all gets pushed through, access to the license URI will be available through GST_TAG_LICENSE_URI and/or GST_TAG_COPYRIGHT_URI.

In Tracker, I’ve written code to handle generic indexing of embedded/sidecar XMP. Previously it just extracted the license, and now any elements can be pulled out and indexed. Currently, Dublin Core and CC elements are indexed. The code is still local, and yet to be committed.

In another direction, I’ve been lending a hand to liblicense. As mentioned in Scott’s previous post, I’ve got two i/o modules ready. Both are based on Exempi. One reads/writes license metadata directly into QuickTime, AVI, PDF, PNG, TIFF, and JPEG formats. The other reads/writes sidecar XMP for any format. There’s more to come.

I also want to look into a liblicense config module and frontend for KDE4. I figure I can put my KDE programming experience to good use.

And in yet another direction, I’m looking into indexing licenses in Beagle. After browsing the code, I can adapt most of what I learned about license metadata while working with Tracker to extending Beagle. I even notice that their image format filters already support extracting XMP, so adding the extra license checks is straightforward. A preliminary patch and request for feedback have been posted on their mailing list.

All in all, I’ve done some work here and there, for this project and that…


Enhanced Metadata Graduates from Labs

nathan, June 21st, 2007

Early this morning we launched some functionality on the main license chooser that was previously available only on Labs. As many (ok, at least a few) people have noted, we previously stopped embedding RDF in the HTML generated by the chooser. As we’ve noted in the past, RDF in a comment has several drawbacks, not the least of which is that it’s opaque to parsers. The new update to the license chooser restores the embedded metadata using RDFa.

As the name implies, RDFa is a way of expressing RDF using _attributes_ in the HTML. This is similar to microformats, but different in that any RDFa parser can read any RDFa information — no special knowledge required. So the new metadata once again allows you to encode the name of your work, your name, and the type of work, all in the HTML. A full example (with all fields filled in) is shown here:

Creative Commons License

CC TechBlog by
Creative Commons is licensed under a
Creative Commons Attribution 3.0 License.

Based on a work at

Permissions beyond the scope of this license may be available at

So how do you know the metadata is there? Check out the RDFa Bookmarklets, which demonstrate how you can expose the information using some simple JavaScript.

*UPDATE* Unfortunately WordPress MU strips out attributes it doesn’t recognize, so the example above isn’t as complete as it could be.
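
Since the attributes don't survive in the rendered example, here is a small Python sketch of what attribute-based metadata can look like and how a program might read it back. The markup is illustrative only: the attribute names follow ccREL conventions (rel="license", cc:attributionName, dc:title), but it is not the exact output of the license chooser. The LicenseMetadataParser class is a made-up helper, and it is not a full RDFa parser (it does not resolve the cc: and dc: prefixes to their namespaces).

```python
from html.parser import HTMLParser

# Illustrative markup only; not the chooser's exact output.
SAMPLE = """
<div xmlns:cc="http://creativecommons.org/ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <span property="dc:title">CC TechBlog</span> by
  <span property="cc:attributionName">Creative Commons</span>
  is licensed under a
  <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
    Creative Commons Attribution 3.0 License</a>.
</div>
"""


class LicenseMetadataParser(HTMLParser):
    """Collect rel="license" links and property="..." text from HTML."""

    def __init__(self):
        super().__init__()
        self.licenses = []      # href values of rel="license" links
        self.properties = {}    # property name -> element text
        self._current_property = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel may hold several space-separated values.
        if "license" in (attrs.get("rel") or "").split() and attrs.get("href"):
            self.licenses.append(attrs["href"])
        self._current_property = attrs.get("property")

    def handle_data(self, data):
        if self._current_property and data.strip():
            self.properties[self._current_property] = data.strip()

    def handle_endtag(self, tag):
        self._current_property = None
```

Feeding SAMPLE to the parser (parser = LicenseMetadataParser(); parser.feed(SAMPLE)) leaves the license URL in parser.licenses and the title and attribution name in parser.properties, all recovered from attributes alone rather than from a comment block.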
