I’ve been asked, as a tech intern here at Creative Commons, to create a way of locally tracking file licenses on a system. A while back Jon wrote down his ideas about system-wide license tracking on the Creative Commons wiki. The purpose of this system would be to provide an interface for developers to access the available licenses on a system. Additionally, like the existing online license chooser, this library, called libLicense, will feature a way to choose a license through toggling certain flags available for a family of licenses. Naturally, the first family available will be the Creative Commons licenses. The larger goal for the summer is to utilize this library in a few initial systems. Currently, I’m looking at integration into Gnome and Sugar (from the One Laptop Per Child project). This further work will occur after libLicense is working.
To run libLicense, the data for all the licenses will need to be stored on the system in some form. My initial thought is this:
- All data will be stored in a directory. On Linux this directory would be /usr/share/licenses . (This is borrowed from Jon’s thoughts.)
- Families of licenses will be stored in a subdirectory of the licenses directory. For example, the Creative Commons licenses would be stored within creative_commons.
- Within these family directories each specific license will be stored in a file with the naming scheme <bitcode>-<short name>-<jurisdiction>-<locale>.license . These files will store the license uri, name, status (active or retired), description and legal text. How this will be stored is up in the air. My initial thoughts include putting each attribute on its own line or using a format similar to .desktop files.
- In addition to storing license data, some family information must be stored, namely the family bit flags. In the case of the Creative Commons licenses, the bit flags would be Attribution, Share-Alike, Non-Commercial and No Derivatives. They would combine to create the bitcode present in the license filename. These bit flags would be the heart of the license chooser logic. If the combination does not exist, the flags are incompatible.
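To make the naming scheme and bit flags a bit more concrete, here is a minimal sketch in C. The numeric flag values, the decimal rendering of the bitcode, the function name license_filename and the sample short name, jurisdiction and locale are all assumptions for illustration; none of them is decided yet.

```c
#include <stdio.h>

/* Hypothetical bit values: the four flags are named above, but no
   numeric values have been assigned to them. */
enum cc_flag {
    CC_ATTRIBUTION    = 1 << 0,
    CC_SHARE_ALIKE    = 1 << 1,
    CC_NON_COMMERCIAL = 1 << 2,
    CC_NO_DERIVATIVES = 1 << 3
};

/* Builds "<bitcode>-<short name>-<jurisdiction>-<locale>.license" in buf.
   The bitcode is written as a decimal number (an assumption).
   Returns 0 on success, -1 if buf is too small or formatting fails. */
int license_filename(char *buf, size_t size, unsigned bitcode,
                     const char *short_name, const char *jurisdiction,
                     const char *locale)
{
    int n = snprintf(buf, size, "%u-%s-%s-%s.license",
                     bitcode, short_name, jurisdiction, locale);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With these assumed values, Attribution plus Share-Alike gives bitcode 3, so license_filename(buf, sizeof buf, CC_ATTRIBUTION | CC_SHARE_ALIKE, "by-sa", "us", "en") would produce 3-by-sa-us-en.license. One thing the sketch exposes: a short name such as "by-sa" contains the same hyphen used as the field separator, so parsing these filenames back into fields would need some care.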
The library would potentially have these functions:
get_jurisdiction(uri) - returns the jurisdiction for the given license.
get_jurisdictions(short or bitcode) - returns the available jurisdictions for the given short name or bitcode.
get_locale(uri) – returns the locale for the given license.
get_locales(jurisdiction, short or bitcode) – returns the available locales for the given jurisdiction and short name or bitcode.
get_name(uri) – returns the name of the license.
get_version(uri) – returns the version of the license.
get_versions(short or bitcode, jurisdiction) - returns the available versions for the given short name or bitcode and jurisdiction.
get_short(uri) - returns the short name for the given uri.
has_flag(attribute,uri) – returns if the flag is set for the given uri.
family_flags(family) - returns the flags available for a given family.
family(uri) – returns the family the given uri belongs to.
get_notification(uri[,url]) - returns the notification string for the given uri, with the option of providing a verification url.
verify_uri(uri) - returns whether or not the given uri is recognized by the system.
get_license(family,flags, jurisdiction,locale) – returns the uri which satisfies the given attributes.
get_all_licenses() - returns all general licenses available.
get_general_licenses(family) - returns all general licenses in a family.
get_families() – returns a list of available families.
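As a rough sketch of how two of these functions might behave, here is has_flag and get_license in C over a small in-memory table. Everything about the representation is an assumption: the struct layout, the flag values, the "unported" jurisdiction label and the sample rows stand in for whatever ends up being read from the license files. The point is only to show the chooser rule that a flag combination with no matching license is treated as incompatible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit values for the four Creative Commons flags. */
enum { FLAG_BY = 1, FLAG_SA = 2, FLAG_NC = 4, FLAG_ND = 8 };

/* Hypothetical in-memory record; the real library would load this
   from the .license files. */
struct license {
    const char *uri;
    const char *family;
    const char *jurisdiction;
    const char *locale;
    unsigned flags;
};

/* Sample rows for illustration only. */
static const struct license LICENSES[] = {
    { "http://creativecommons.org/licenses/by/3.0/",
      "creative_commons", "unported", "en", FLAG_BY },
    { "http://creativecommons.org/licenses/by-sa/3.0/",
      "creative_commons", "unported", "en", FLAG_BY | FLAG_SA },
    { "http://creativecommons.org/licenses/by-nc/3.0/",
      "creative_commons", "unported", "en", FLAG_BY | FLAG_NC },
    { "http://creativecommons.org/licenses/by-nd/3.0/",
      "creative_commons", "unported", "en", FLAG_BY | FLAG_ND },
};
static const size_t LICENSE_COUNT = sizeof LICENSES / sizeof LICENSES[0];

/* Returns 1 if the flag is set for the given uri, 0 otherwise
   (including when the uri is not recognized). */
int has_flag(unsigned attribute, const char *uri)
{
    for (size_t i = 0; i < LICENSE_COUNT; i++) {
        if (strcmp(LICENSES[i].uri, uri) == 0)
            return (LICENSES[i].flags & attribute) != 0;
    }
    return 0;
}

/* Returns the uri matching all four attributes exactly, or NULL when
   no such license exists, i.e. the flags are incompatible. */
const char *get_license(const char *family, unsigned flags,
                        const char *jurisdiction, const char *locale)
{
    for (size_t i = 0; i < LICENSE_COUNT; i++) {
        const struct license *l = &LICENSES[i];
        if (l->flags == flags &&
            strcmp(l->family, family) == 0 &&
            strcmp(l->jurisdiction, jurisdiction) == 0 &&
            strcmp(l->locale, locale) == 0)
            return l->uri;
    }
    return NULL;
}
```

Asking for Share-Alike together with No Derivatives finds no row in this table, so get_license returns NULL and a chooser interface could grey out that combination.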
Did I miss something? Does something not make sense? Please post a comment.
I’ve made progress extracting licenses from the following formats: Vorbis, MP3, FLAC, PDF, JPEG, TIFF, PNG, HTML, and MSOffice. They are by no means all done, but for several formats I have patches and am awaiting approval from Tracker.
I’ve written a GStreamer bug report and submitted a patch to allow reading the WCOP (License URI) id3v2 tag. Discussion continues there.
No luck with video metadata (AVI, Matroska, OGM, Quicktime). Things are just too ad-hoc in that arena to get anything worthwhile done. For Tracker, GStreamer is doing all the work on extracting video metadata, but as far as I can tell, nothing relating to licenses ever gets extracted and passed on to Tracker. GStreamer would need to be updated to read the tags, but that can’t be done unless there are consistent specs on how to do so. Exempi can embed XMP into MOV and AVI, but I don’t know how to get it back out. It may or may not be feasible to write an extractor that only extracts XMP using Exempi.
Information on various file formats’ metadata is available at http://wiki.creativecommons.org/Tracker_CC_Indexing. While Tracker won’t specifically be indexing every format mentioned, I’m trying to document the formats relevant to Creative Commons. If I’m missing any important formats, please let me know.
Overall, things are progressing well. At the rate things are going, by the end of the summer I’ll have become a manual for file format specifications :-/
Cheers
Week 1 of Google Summer of Code is complete and already I’m seeing much progress. There’s a mess of formats to embed licenses into and a mess of ways to embed them. My first task has been straightening out where licenses are embedded in each format and how exactly to go about extracting them. Here’s where I’m at:
| Format | Form of Metadata | Location of Metadata | Extraction with Tracker | Test content |
|---|---|---|---|---|
| MP3 | | | Extracting MP3 tags has moved from an ID3 parser to handing off the work to GStreamer/MPlayer/Totem. As far as I can tell, this prevents me from extracting the XMP. | XMP embedded with Exempi |
| PDF | XMP | metadata field | Extend the current PDF extractor (which uses Poppler) to read the metadata field. Reading the metadata field isn’t wrapped in Poppler’s glib bindings, but I have written and submitted a patch. | XMP embedded with Exempi |
| | | | Extend the GStreamer extractor to check for the presence of an XMP comment field. GStreamer places this within the EXTENDED_COMMENTS tag (requires GStreamer 0.10.10). | XMP embedded with vorbiscomment |
| JPEG | XMP | Exif XML Packet field | Extend the ImageMagick extractor, using ‘convert file.jpg xmp:-’ to read XMP | XMP embedded with Exempi |
| PNG | XMP | iTXt, XML:com:adobe:xmp field | Extend the PNG extractor, adding a check for XML:com:adobe:xmp. (For backwards compatibility, the ability to read iTXt in libpng is disabled by default until version 1.3.) | XMP embedded with Exempi |
| HTML | RDFa | <a rel="license" href="…"></a> | Write a new HTML extractor, using libxml2, and scan for RDFa | Various actual sites, including creativecommons.org |
| SVG | RDF | /svg/metadata/rdf | I could specifically parse the XML, checking for the RDF schema used by Inkscape. Should I check for XMP also? | Inkscape |
| Any XML | XMP | Wherever valid | Write a generic XML extractor (and/or extractor for each particular format), scanning with libxml2 | |
| OpenOffice.org (OASIS) | OO.org CC License Add-In SoC Project is working on the spec | | | OO.org Add-In |
| MS Office | | DocumentSummaryInformation Infile, CreativeCommons_LicenseURL property | Extend existing msoffice extractor | MSOffice Add-in |
If this is all well and good, I’d like to help update the CC Wiki with updated embedding specifications.
As far as coding goes, I wrote the code for Tracker to check for and extract metadata from XMP sidecar files. XMP is parsed by Hubert’s XMP library. The timing of Adobe’s release of their XMP Toolkit and Hubert’s subsequent release of Exempi 1.99.x has been an early boon to the project. The ‘license’ tag in the CC namespace is the only metadata extracted at the moment.
I’ve also been hacking the extractors of the above list of formats to determine the feasibility and processes of extracting license metadata from each.
At this point, feedback on the above would be much appreciated. If all is well, I can soon get my XMP sidecar code pushed into Tracker’s Subversion repository.
Happy hacking, indeed.
His blog post includes a nice snippet of code showing how to apply a CC license to a PDF:
f = xmp_files_open_new("test.pdf", XMP_OPEN_FORUPDATE);
XmpPtr xmp = xmp_files_get_new_xmp(f);
xmp_set_property(xmp, NS_XAP_RIGHTS, "Copyright", "(c) ACME Inc., some rights reserved"
" - This work is licensed to the public under the Creative Commons Attribution-ShareAlike "
Excellent news for the community, and for the continuing saga of XMP.
(But don’t get too excited: ccPublisher is actually only at version 2.2.1 and Move My Data looks like vaporware at this point.)
I was working this morning and noticed that #cc (http://wiki.creativecommons.org/IRC), our IRC channel, was particularly active. But I couldn’t figure out what they were talking about. It didn’t look like any conversation about licensing I’d ever seen. And then I realized: it was a botnet rental negotiation. What I especially loved is the question, “you a fed?” Presumably K_Soze is under the false impression that if a law enforcement officer answers the question dishonestly, they’re guilty of entrapment.
Presented for your enjoyment, the logs:
[09:19am] K_Soze: Evenin all
[09:20am] K_Soze: I’m wondering what seller sells?
[09:20am] Seller: hi K_Soze
[09:20am] Seller: that depends…
[09:20am] Seller: what you want to buy?
[09:21am] K_Soze: Well that’s always the question… but it’s also a matter of whether people can supply… and people can supply the right quality and quantity?
[09:21am] Seller: K_Soze: you will have to be more specific?
[09:22am] K_Soze: I thought a channel called cc , talking a guy called seller was specific enough, but maybe not….
[09:23am] Seller: you are not a fed are you?
[09:23am] K_Soze: There’s no need to get spooked… I’ve dealt with people on this chan before who I’m sure will vouch for me
[09:24am] Seller: lets just say I like to help people out
[09:24am] Seller: you got a problem – i got a solution
[09:25am] K_Soze: Yeah but not everyone is looking for such a high quality solution… the cc bit is generally easy… it’s other more…… “mechanical” things I’m interested in…
[09:25am] Seller: hmmm
[09:25am] Seller: i sell by the thousands
[09:25am] Seller: top quality
[09:26am] K_Soze: Thousands? Well I’ve spoken to people who sell by the thousands… they ask if I want one or two thousand….
[09:26am] K_Soze: but that’s obviously not what’ i’m interested in
[09:26am] Seller: you dont want a couple thousand mechanical friends?
[09:26am] Seller: to help out?
[09:26am] K_Soze: No, I want more than a couple…
[09:26am] Seller: how much more?
[09:27am] K_Soze: Well I guess it’s not so much the number as the commotion such a bunch of friends could induce…
[09:28am] Seller: can do pings, can do http, can do smtp
[09:28am] K_Soze: Geographically dispersed?
[09:28am] Seller: very very effective
[09:28am] Seller: all over the place – china, russia, usa
[09:29am] Seller: australia
[09:29am] Seller: i can mix them up for ya
[09:29am] K_Soze: Hrmmmm…..
[09:29am] K_Soze: by the job or by day / week?
[09:29am] Seller: you rent them per week
[09:29am] Seller: web interface – very easy
[09:30am] K_Soze: web? no IRC?
[09:30am] Seller: yep web – very easy
[09:30am] K_Soze: hrmm interesting…. would be curious to give them a run…. this collection… they attracting much attention?
[09:31am] Seller: barely used so far
[09:31am] nathany: uh, do you guys realize this channel is for Creative Commons license-related discussion?
[09:31am] Seller: oh oops
[09:31am] Seller: K_Soze: see ya later
[09:31am] nathany: yeah, oops
[09:31am] Seller left the chat room.
[09:31am] K_Soze: cc = Creative commons? laters.
[09:31am] K_Soze left the chat room. (“leaving”)
Start with Elias’ XTech 2007 post and presentation.
Although the previous Adobe open source license is quite open, we decided that it was best to use a standard open source license that is respected in the open source community. Opensource.org was invaluable in reviewing the many different open source licenses that are available.
The 4.1.1 XMP release is significant because it includes the source code for developers to read, write and update XMP in popular image, document and video file formats including JPEG, PSD, TIFF, AVI, WAV, MPEG, MP3, MOV, INDD, PS, EPS and PNG.
Also, please help digg this so more can find out about it!
In a follow-up to Mike’s post about XMP, I (through CC) have been working with Adobe XMP’s product manager, Gunar Penikis, on how CC and Adobe can work together on XMP. Along the same lines, I’m friends with and working with Cyrille Berger and Hubert Figuiere, who have each noted how positive a step releasing the XMP SDK/Toolkit under a BSD license is for the larger community.
I’m having some other discussions with all the above-mentioned folks about how this is going to pan out, but all I can say is that it is going to encourage XMP to flourish, and in return help smooth out metadata and embedding across the board.
This really frees up the space for more developments.
CC staff Jon Phillips and Alex Roberts attended the Libre Graphics Meeting in Montreal last weekend. Jon posted his slides (PDF). Alex posted a heartening update titled Libre on his personal blog, which I’ll repost here:
The title is a bit of a misnomer, since my laptop runs OSX and I use Adobe software at work, at least right now. Various parts of my workflow (and I sure do dislike that term) will be changing soon. Here’s why.
Last weekend I attended the second annual Libre Graphics Meeting in Montreal. A time of firsts for me: my first time visiting Canada; my first LGM; and the first time I’d met, in person, many of the hackers and artists in the F/OSS community, some of whom inspired me to become a designer and gave me the lofty goal of working in free culture. So that was incredible.
LGM featured a lot of talks going over new developments in the community; it was great to see the directions all the art tools are going: Scribus, Krita, Inkscape, etc. But what really got me hyped, and excited about the future, were the demos. I’ll admit I’ve been out of the OSS art loop for a while, having little time to check out the latest trunk builds of everything, but seeing them all in use really convinced me that yes, these tools work, and yes, they can in fact be used in place of the commercial giants. Scribus has, by far, come the farthest since I first saw it. I believe it’s fairly safe to assume that I could use it, instead of InDesign, for much of the print work I do. I’ll be sure to report on my progress in that regard.
That being said, while you can use the open source tools for production, you do need an open mind and ability to learn what they can do, both similarly and differently. For instance, Inkscape has some incredible features that you won’t find in Illustrator: gradients on strokes; advanced object linking, allowing you to create complex effects that remain completely editable; full access to the underlying XML, so you can directly edit any content. But unlike Illustrator, Inkscape doesn’t yet handle CMYK or spot colours, and has no support for any kind of blending modes (coming soon). So I doubt I’d be able to move 100% away from non-free tools for the foreseeable future, but it really isn’t too often I find myself tasked with print work. So a minor inconvenience at worst.
The whole experience makes me extremely excited over these improved possibilities, of using the tools in the real world, and the joy of contributing back where I can. This includes Serif and FontView, my font manager and viewer apps respectively. A good amount of hacking on FontView went on over the weekend, solving a number of large bugs, which also made me happy.