This morning, the Creative Commons REST API was updated to include support for the Public Domain Mark and to deprecate the issuing of the retired Public Domain Certification and Dedication. We have also added a new element in license responses that denotes whether or not the issued license has been deprecated by Creative Commons. These changes affect all versions of the REST API except 1.0.
Over the past year, the CC Tech team has made a collaborative effort to perform a “sanity overhaul” on the tools and libraries that much of the CC Tech infrastructure relies on. Among the projects slated for overhaul was the CC REST API, which was trailing all of our other services in currency and ease of release iteration. It needed a complete re-engineering so that it could be maintained and supported at the same level as our other newly overhauled projects. We’re glad to announce that the reworking of the REST API has been a success, and our focus on refactoring over new features can now come to a close.
To get started using the CC REST API in your own projects, consult the development version documentation, or the 1.5 version documentation if your software is ready for production. If you have any issues, feature requests, or questions about the REST API, feel free to send a message to the cc-devel mailing list. To file a bug report, please submit your issue to the API project in our roundup bug tracker.
Included in last week’s redesign is an updated Licenses page, describing the CC licenses and what makes them unique. The combination of machine-readable metadata, human-readable deeds, and the legal code is unique (to my knowledge) in the public licensing world, and this approach enables interesting applications of the licenses and broadens their accessibility. That said, the approach is not always easy to describe.
After seeing Alex’s updated graphics for the page, I roped him into helping me create an interactive version, based on CSS3 Transitions and Transforms. You can find the visualization here on CC Labs. More information about why and how I built it is on my personal blog. Note that this demonstration requires Chrome, Safari, or Firefox 4. Opera 11 sort of works. IE, well, doesn’t.
Yesterday we launched a refresh of the site design for creativecommons.org. Among the changes pushed was one small one originally suggested by our international Affiliate Network: the inclusion of the license identifier on the deeds.
Anyone who’s been in the CC community for any length of time has seen people refer to the licenses by their short-hand names: CC BY for Attribution, BY-SA for Attribution-ShareAlike, etc. But that short-hand, while useful, has been a bit of inside baseball: it’s part of the URL, but never appeared on the deeds, which we want to be the human-readable summary of the license. As of yesterday, the short-hand name is on the deed. We’ve also annotated it with RDFa, so the licenses self-describe their short name (software can dereference the license URI and look for information describing it there). Thanks again to Alek and the affiliate network for suggesting this change.
You may have noticed that the copy-and-paste HTML you get from the CC license chooser includes some strange attributes you’re probably not familiar with. That is RDFa metadata, and it allows the CC license deeds, search engines, Open Attribute, and other tools to discover metadata about your work and generate attribution HTML. Many platforms have implemented CC REL metadata in their CC license marks, such as Connexions and Flickr, and it’s our recommended way to mark works with a CC license.
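To give a feel for what that metadata looks like, here is a minimal sketch of CC REL markup for a licensed work. The work URL, license version, and attribution name are placeholders, not actual output of the license chooser:

```html
<!-- Hypothetical example; the work URL, license version, and
     attribution name below are placeholders. -->
<div xmlns:cc="http://creativecommons.org/ns#"
     about="http://example.com/photo.jpg">
  <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
    Creative Commons Attribution 3.0
  </a>
  Photo by
  <a rel="cc:attributionURL" property="cc:attributionName"
     href="http://example.com/jane">Jane Doe</a>.
</div>
```

The `rel="license"` link identifies the license, while the `cc:attributionURL` and `cc:attributionName` properties tell tools how the author wants to be credited.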
In an effort to make CC license metadata (or CC REL metadata) much easier to implement, we’ve created CC REL by Example. It includes many example HTML pages, as well as explanations and links to more information.
We’re hoping this guide will serve as a useful set of examples for developers and publishers who want to publish metadata for CC licensed works. Even if you just use CC licenses for your own content, now is a great time to take a first step into structured data and include information about how you’d like to be attributed.1 Comment »
Right before the winter break I came across Scott Wilson’s blog post on a CETIS blog about license discovery in RSS and Atom feeds. Scott provides a pseudo-algorithm for how they’ve approached license discovery. It’s a good approach, and I’m very happy to see people publishing about how they’ve approached this sort of issue. Reading it reminded me of a few points that are often glossed over or forgotten.
Scott points out that there are two CC namespaces — http://creativecommons.org/ns# and http://web.resource.org/cc/. Due to hysteric^W historical reasons, web.resource.org was the first host of the CC REL schema, which we later moved to creativecommons.org (its appropriate home). This came up on another thread late last year, and we’ve taken the first step toward making the situation a little easier to deal with, redirecting the old home, web.resource.org, to creativecommons.org/ns. We’ll be publishing equivalency assertions soon to further clarify the situation for processors.
Scott also points out that the RDF included with licensed works is sometimes redundant. Yes, absolutely. Our previous recommendation suggested including RDF describing the license in an HTML comment. As mentioned previously, we have since realized this is redundant and of minimal value: it’s not clear under what circumstances a processor would trust RDF about a license published alongside the work (or anywhere else) over the RDF available at creativecommons.org itself. Hindsight, 20/20, etc.
Finally, when discussing how to handle the extracted license URIs, Scott’s approach marks a license URI as “unknown” if it is not recognized. This is a situation where self-describing documents can be useful to processors: an alternative approach would be to dereference the URI and attempt to extract details about the license. We use this approach ourselves in several situations, most recently with OpenAttribute, a prototype Firefox add-on for displaying license and attribution information.
Highlights from December 2010:
- We concluded our annual campaign. Thanks to everyone who helped us raise over $500,000; your support is greatly appreciated. (And it’s not too late to contribute!)
- We verified that our OpenOffice.org plugin is compatible with LibreOffice, so obviously it needs a better name than CCOOo. Suggestions? Leave a comment.
- John Bishop shipped updated CC + XMP support for Adobe CS 4 and CS 5.
- We began transcoding our videos into WebM, a free format for HTML5 video.
- Metadata interoperability and harmonization continues to be an area we’re paying attention to, particularly with respect to OER, where there’s no clear winner [yet].
- And while technically from November, it bears highlighting that Technical Case Studies are now in the CC wiki.
When I started at CC a number of years ago and began reviewing Logwatch output on a daily basis, I quickly tired of the massive list of failed SSH login attempts in the output. I care much less about who failed to log in than about who actually did log in. So the first thing I did was reduce the verbosity of the SSH filters for Logwatch, by creating the file /etc/logwatch/conf/services/sshd.conf containing only “Detail = 0”. However, I still found it annoying to have thousands of failed login attempts on virtually all servers. Granted, I wasn’t really worried that anyone would get in by brute-forcing a login. It was more a matter of principle, plus the small fact that every failed login attempt uses some tiny amount of resources that could better be used for legitimate traffic. So I implemented connection rate limiting via Netfilter.

That didn’t work for our then software engineer Asheesh, who generally has around 30 open terminals and as many SSH connections to remote hosts, and who kept hitting the connection rate limit. He started using the ControlMaster feature of SSH to get around the limitation. Some time later I removed the rules altogether, on the theory that they weren’t doing anything useful and were probably detrimental, since the kernel had to inspect a bunch of incoming packets and track connections. Around the same time, Asheesh recommended a program called fail2ban instead of tackling the issue with Netfilter. I didn’t like the idea: something seemed hackish about inserting Netfilter rules via a daemon process that scrapes the log files of various services. I’m also an advocate of running as few services as possible on any given server; the less that runs, the less chance that something will fail in a service-impacting way. Then the whole thing was forgotten, until a few days ago.
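For reference, here is a sketch of the two techniques mentioned above. The thresholds, names, and paths are illustrative assumptions, not the exact rules we ran:

```shell
# Illustrative Netfilter rate limiting (not our exact rules): track new
# SSH connections per source address, and drop a source that opens more
# than 4 new connections within 60 seconds.
iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
    -m recent --set --name SSH
iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
    -m recent --update --seconds 60 --hitcount 4 --name SSH -j DROP

# The ControlMaster workaround: multiplex all sessions to a host over a
# single TCP connection, so the limiter only ever sees one connection.
cat >> ~/.ssh/config <<'EOF'
Host *
    ControlMaster auto
    ControlPath ~/.ssh/master-%r@%h:%p
EOF
```

With ControlMaster enabled, the first ssh to a host becomes the master; subsequent sessions reuse its connection via the ControlPath socket, which is why the rate limit stops firing.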
A few days ago I was looking over the Logwatch output of our servers, as I do every day, and was offended to find that one server in particular had logged nearly 30,000 failed SSH login attempts in a single day. Sure, in terms of network traffic and machine resources it’s just a drop in the bucket, but it aggravated me. I revisited the idea of fail2ban and did a bit more research, concluding that it was pretty stable and worked well for most people. So I decided to install it on one server, and was really happy to find that it was as easy as apt-get install fail2ban. Done! On Debian, fail2ban works for SSH out of the box; I didn’t have to do a thing, just another testament to the awesomeness of package management in Debian. I was so impressed that I went ahead and installed it on all CC servers. It has been running nicely for about a week, and failed SSH login attempts are now down to a few dozen a day on each machine. Are the machines more secure? Probably not. But it’s just one of those things that makes a sysadmin happy.
The analytical framework is used to analyze and compare seven metadata specifications, and a concrete set of harmonization issues is presented. These issues are used as a basis for a metadata harmonization framework where a multitude of metadata specifications with different characteristics can coexist. The thesis concludes that the Resource Description Framework (RDF) is the only existing specification that has the right characteristics to serve as a practical basis for such a harmonization framework, and therefore must be taken into account when designing metadata specifications. Based on the harmonization framework, a best practice for metadata standardization development is developed, and a roadmap for harmonization improvements of the analyzed standards is presented.
Nilsson will defend his thesis on 2010-12-15, and you can watch the defense (details).
Thinking in RDF is the natural thing to do at Creative Commons, whether modeling license attributes, work registration, or domain-specific descriptions that add value to licensed works. Nice to see in-depth academic backing for this intuition.
It’s been suggested with increasing frequency that whether an educational resource complies with the new Common Core standards is the kind of thing that could be published as metadata on the web. This metadata could provide a platform upon which tools could be built. For example, an educational search tool would allow anyone to search for learning objects that satisfy one or more Common Core standards.
The CCSSO, which published and stewards the Common Core standards through their adoption, has not yet proposed a format for this metadata. In that vacuum, others are proposing their own solutions.
Karen Fasimpaur (who is awesome and was interviewed for a CC Talks With feature) recently published a set of tags to identify the Common Core standards. These tags are strings of text that uniquely identify the standards. For example, the College and Career Readiness Anchor Standards for Reading, Key Ideas and Details, Standard 1 is identified as “cc-k-5e-r-ccr-1”.
The goal, it seems, is to publish unique identifiers for the Common Core standards so that those unique identifiers could be attached to objects on the web as metadata, identifying which educational standards those objects meet.
We applaud efforts to identify when educational resources meet educational standards, and projects to catalog or tag resources with that data. This is one step forward in providing human-readable tags that encode that data, much like the string “by-sa” identifies the Creative Commons Attribution ShareAlike license.
The next step would be to provide stable URIs as identifiers for those tags such that machines, in addition to humans, can parse that metadata. These URIs could be maintained by authoritative organizations such as the CCSSO, but that isn’t technically necessary.
In addition, the URIs for the Common Core standards ought to be self-descriptive; that is, metadata about the standard should be discoverable at the URI itself. The CC licenses, for example, are self-descriptive: they contain metadata about the licenses, so that when someone marks up a work as CC licensed, a machine can discover facts about the license by visiting the URL. This metadata is encoded in RDFa, and can be seen by looking at the source of the deed or viewing it through an RDFa distiller.
A URI identifying each standard brings other benefits. When used in subject-predicate-object expressions in metadata standards like RDFa, the expressive power of the identifier increases greatly. One could, for example, identify an arbitrary URI as being standards-aligned and make complex statements about the standard, whereas with a human-readable tag, interpretation is left to the reader. You could place metadata referencing an educational resource on a “landing page” rather than on the resource itself, or mark up specific blocks of text as meeting certain standards. Stable URIs to the Common Core standards, coupled with a metadata standard like RDFa, would allow for a subject precision that is lacking in the K12 OpenEd metadata proposal.
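To make this concrete, here is a sketch of what such alignment markup could look like in RDFa, using the Dublin Core conformsTo term. The standard URI shown is hypothetical, since no authority publishes URIs for the Common Core standards yet:

```html
<!-- Hypothetical: no authoritative URIs for Common Core standards
     exist yet, so the standards.example.org URI is an assumption. -->
<div about="http://example.org/lesson-plan"
     xmlns:dcterms="http://purl.org/dc/terms/">
  <a rel="dcterms:conformsTo"
     href="http://standards.example.org/cc-k-5e-r-ccr-1">
    Reading: Key Ideas and Details, Standard 1
  </a>
</div>
```

The `about` attribute makes the subject explicit, so the alignment statement can live on a landing page while still referring precisely to the resource itself.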
Efforts like K12 OpenEd’s to publish Common Core standards metadata for educational resources are good progress. It gives us all a starting point that can inform future work.
Last month the Google News team announced two new meta tags publishers can use to mark up their content with source information. The first, syndication-source, is similar to rel="canonical": it lets Google know which version is authoritative (at least as far as Google News is concerned). The other, original-source, mirrors questions we’ve been thinking about here at CC. Google’s description in the announcement reads:
Indicates the URL of the first article to report on a story. We encourage publishers to use this metatag to give credit to the source that broke the story…the intent of this tag is to reward hard work and journalistic enterprise.
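Based on Google’s announcement, the markup looks roughly like this; the URLs are placeholders:

```html
<!-- Placeholder URLs; syndication-source names the authoritative copy
     of a syndicated article, original-source the article that broke
     the story. -->
<meta name="syndication-source"
      content="http://example.com/wire-version-of-story.html">
<meta name="original-source"
      content="http://example.com/article-that-broke-the-story.html">
```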
Most Creative Commons licenses allow derivative works, and the question of how you cite (attribute) the source of your derivative is worth exploring. While including the attribution information is enough, explicitly labeling the link to the source as the basis of your work not only allows others to discover that content, but also allows tools to begin drawing the graph of content reuse and repurposing.
Google’s suggestion for news articles is a good start: it lets publishers indicate the original source in a machine-readable way. However, it’d be even better if that information were also visible to readers of the article by default. Creative Commons licenses require that adaptations credit the use of the original in the adaptation (see §4b, CC BY 3.0, for example). You can imagine using the Dublin Core Terms to annotate this credit information using RDFa. For example:
This article originally appeared in <a xmlns:dc="http://purl.org/dc/terms" rel="dc:source" href="http://example.org/original-article">example.org</a>.
This also opens up the possibility of annotating the type of adaptation that occurred, such as translation, format change, etc.
Publishing machine-readable information about sources and re-use is exactly where we want to go. Until the tools are ubiquitous, however, making that information visible to readers will be very important.