RDFaCE: an RDFa-enhanced TinyMCE rich editor

ml, July 9th, 2011

The idea that there should be an RDFa plugin for WordPress has been around for a long time; it feels like much longer than the 28 months the RDFa Plugin for WordPress tech challenge has been on the wiki. I recall multiple Summer of Code applications proposing to tackle the problem. However, it is a really hard UI problem.

I’m really happy to see the announcement of RDFaCE, which does most of the hard work.

Without reading any documentation or watching their screencast (still haven’t watched it, no idea if it is any good!) I was able to add a cc:attributionName annotation specific to the image in their demo on my first try:

  • select the photographer name, insert cc:attributionName annotation with literal value already in the text. RDFaCE seems to already know the correct cc: namespace mapping.
  • select content around photo, set subject to photo URL
  • verify that triples produced are correct
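The triple those steps should yield can be sketched in a few lines of Python. This is only an illustration of the subject/name/value idea, not RDFaCE's own code: the photo URL and photographer name are placeholders, and `expand_curie` and `annotate` are hypothetical helper names. The cc: namespace URI is the real Creative Commons one.

```python
# Prefix mapping from short names to namespace URIs.
PREFIXES = {"cc": "http://creativecommons.org/ns#"}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a compact URI such as 'cc:attributionName' to a full URI."""
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local


def annotate(subject, prop, literal):
    """Return one (subject, predicate, object) triple for a literal value."""
    return (subject, expand_curie(prop), literal)


# Placeholder values: the subject is the photo, the object is the selected text.
triple = annotate(
    "http://example.org/photo.jpg",
    "cc:attributionName",
    "Jane Photographer",
)
print(triple)
```

The subject is the photo URL set in the second step, the name is the expanded cc:attributionName, and the value is the literal already present in the text.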

Granted I more or less know what I’m doing. But, so do lots of other people. Contrary to some impressions, annotating stuff on the web with name-value pairs (“stuff” is the subject in the “triple”) is hardly brain-twisting.

I look forward to seeing RDFaCE bundled in a WordPress plugin with some awareness of the WordPress media manager, and to using it on this very blog.

TinyMCE is the free software rich text editor used in lots of projects in addition to WordPress, so this is a great step forward!

Thesis on metadata interoperability: RDF

ml, December 15th, 2010

From Interoperability to Harmonization in Metadata Standardization – Designing an Evolvable Framework for Metadata Harmonization (pdf) by Mikael Nilsson:

The analytical framework is used to analyze and compare seven metadata specifications, and a concrete set of harmonization issues is presented. These issues are used as a basis for a metadata harmonization framework where a multitude of metadata specifications with different characteristics can coexist. The thesis concludes that the Resource Description Framework (RDF) is the only existing specification that has the right characteristics to serve as a practical basis for such a harmonization framework, and therefore must be taken into account when designing metadata specifications. Based on the harmonization framework, a best practice for metadata standardization development is developed, and a roadmap for harmonization improvements of the analyzed standards is presented.

(emphasis added)

Nilsson will defend his thesis on 2010-12-15, and you can watch (details).

Thinking in RDF is the natural thing to do at Creative Commons, whether modeling license attributes, work registration, or domain-specific descriptions that add value to licensed works. It is nice to see in-depth academic backing for this intuition.

Creative Commons Open Office Plugin. Here it is with some new features….

akila87, May 20th, 2010

Hi there!

I was selected for the Google Summer of Code for the above project, under the guidance of Christopher Webber & Nathan Yergler. I am a 22-year-old student at the University of Moratuwa, Sri Lanka, where I am doing my BSc Engineering degree in Electronic & Telecommunication. What I am doing in this project is updating the OpenOffice plugin developed by Cassio Melo in GSoC 2007.

I have been working on the project since last April. I was able to add many requested features and some new ideas of my own.

So let’s look at the progress of the add-on and those “New Features”.

Support for OpenOffice 3.1, 3.2

The add-on currently in the extension repository only supports OpenOffice 2. I recompiled the add-on to make it work in OpenOffice 3.1 and 3.2.

Support for adding images from Flickr, Open Clip Art and Wikimedia Commons.

In the 2008 Summer of Code, Mihai Husleag made the Flickr Image Re-Use add-on. I used this code to add this feature. Now you can search for images in Flickr, Open Clip Art and Wikimedia Commons and add them to your document. The license and attribution data will also be added with the image. This works in Writer, Calc, Impress and Draw.

Support for OpenOffice Draw.

This task was included in the developer challenges. It is the first task that I completed in the project. Now you can add a visible license statement and the metadata to a Draw document.

Showing a notification when opening CC licensed documents.

Speed up first-time license insertion.

The RDF database is now loaded into memory when OpenOffice starts. This does not freeze OpenOffice, because the loading is done in a separate thread. So the initial license insertion delay, which was due to RDF loading, is no longer there.
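The idea of loading in the background can be sketched as follows. The add-on itself is not written in Python, so this is only an illustration of the pattern: `LicenseStore`, its methods, and the simulated delay are all hypothetical stand-ins, not the add-on's code.

```python
import threading
import time


class LicenseStore:
    """Hypothetical stand-in for the add-on's in-memory RDF license data."""

    def __init__(self):
        self._ready = threading.Event()
        self._licenses = {}

    def _load(self):
        time.sleep(0.1)  # simulate slow RDF parsing
        self._licenses = {"by": "http://creativecommons.org/licenses/by/3.0/"}
        self._ready.set()  # signal that the data can now be read

    def start_loading(self):
        """Kick off loading on a worker thread and return immediately."""
        thread = threading.Thread(target=self._load, daemon=True)
        thread.start()
        return thread

    def get(self, code, timeout=5.0):
        """Wait for loading to finish if needed, then look up a license URI."""
        if not self._ready.wait(timeout):
            raise TimeoutError("license data is still loading")
        return self._licenses[code]


store = LicenseStore()
store.start_loading()   # called at application start; does not block
print(store.get("by"))  # the first insertion waits only if loading is unfinished
```

The application stays responsive because the caller of `start_loading` never waits; only a lookup that arrives before loading finishes has to pause.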

Auto update visible license notice when license changes.

In the previous version this worked only for Writer documents. Now Calc, Impress and Draw also have this feature.

Adding RDF metadata.

This only works for Writer. Other OpenOffice applications don’t support this currently.

This is what I’ve done so far. You can get the add-on at http://extensions.services.openoffice.org/en/project/ccoootest and check the source code at: http://code.creativecommons.org/viewsvn/ccooo/branches/akila-gsoc-2010/

If you have any suggestions about this project (new functionality, things you don’t like, etc.) feel free to leave a comment. And if you find any bugs, please let me know.


New validator released!

asheesh, January 6th, 2009

This past summer, Hugo Dworak worked with us (thanks to Google Summer of Code) on a new validator. This work was greatly overdue, and we are very pleased that Google could fund Hugo to work on it. Our previous validator had not been updated to reflect our new metadata standards, so we disabled it some time ago to avoid creating further confusion. The textbook on CC metadata is the “Creative Commons Rights Expression Language”, or ccREL, which specifies the use of RDFa on the web. (If this sounds like keyword soup, rest assured that the License Engine generates HTML that you can copy and paste; that HTML is fully compliant with ccREL.) We hoped Hugo’s work on a new validator would let us offer a validator to the Creative Commons community so that publishers can test their web pages to make sure they encode the information they intended.

Hugo’s work was a success; in August 2008 he announced a test version of the validator. He built on top of the work of others: the new validator uses the Pylons web framework, html5lib for HTML parsing and tokenizing, and RDFLib for working with RDF. He shared his source code under the recent free software license built for network services, AGPLv3.

So I am happy to announce that the test period is complete, and we are now running the new code at http://validator.creativecommons.org/. Our thanks go out to Hugo, and we look forward to the new validator gaining some use as well as hearing your feedback. If you want to contribute to the validator’s development or check it out for any reason, take a look at the documentation on the CC wiki.

License-oriented metadata validator and viewer: summertime is winding up

hugo dworak, August 16th, 2008

Google Summer of Code 2008 is approaching its end: fewer than forty-eight hours are left to submit the code that the mentors will then evaluate. It is therefore fitting to pause for a moment, sum up the work that has been done on the license-oriented metadata validator and viewer, and compare it with the original proposal for the project.

A Web application capable of parsing and displaying license information embedded in both well-formed and ill-formed Web pages has been developed. It supports the following means of embedding license information: Dublin Core metadata, RDFa, RDF/XML linked externally or embedded (utilising the data URL scheme) using the link and a elements, and RDF/XML embedded in a comment or as an element (the last two being deprecated). This functionality has been proven by unit testing. A user can upload or paste the source code of a Web page, or provide a URI for the Web application to analyse. The software has been written in Python and uses the Pylons Web Framework and the Genshi toolkit. Should you be willing to test this Lynx-friendly application, please visit its Web site.
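To give a feel for one of these cases, here is a minimal sketch that pulls license links out of an ill-formed page. It is not the validator's code: it uses only Python's standard `html.parser` rather than the libraries the application relies on, it handles only the common `rel="license"` pattern on the a and link elements, and the class name and sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LicenseLinkFinder(HTMLParser):
    """Collect href values of <a> and <link> elements marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rels = (attr.get("rel") or "").lower().split()
        if "license" in rels and attr.get("href"):
            self.licenses.append(attr["href"])


# Deliberately ill-formed sample: the <a> and <p> elements are never closed.
SAMPLE = (
    '<p>Photo by Jane, <a rel="license" '
    'href="http://creativecommons.org/licenses/by/3.0/">CC BY<p>unclosed'
)

finder = LicenseLinkFinder()
finder.feed(SAMPLE)
finder.close()
print(finder.licenses)
```

A tolerant, event-based parser like this never needs the document to be well-formed, which is why the missing closing tags do not matter here. Full RDFa processing involves much more than this single attribute check.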

The Web application itself uses a library called “libvalidator”, which in turn is powered by cc.license (a library developed by Creative Commons that returns information about a given license), pyRdfa (a distiller that generates the RDF triples from an (X)HTML+RDFa file), html5lib (an HTML parser/tokenizer), and RDFLib (a library for working with RDF). The choice of this set of tools has not been obvious and the library has undergone several redesigns, which included removing the code that employed encutils, XML canonicalization, µTidylib, and BeautifulSoup. The idea of using librdf, librdfa, and rdfadict has been abandoned. The source code of both the Web application (licensed under the GNU Affero General Public License version 3 or newer) and its core library (licensed under the GNU Lesser General Public License version 3 or newer) is available through the Git repositories of Creative Commons.

In contrast to the contents of the original proposal, the following goals have not been met: traversal of special links, syndication feeds parsing, statistics, and cloning the layout of the Creative Commons Web site. However, these were never mandatory requirements for the Web application. It is also worth noting that the software has been written from scratch, although a now-defunct metadata validator existed. Nevertheless, the development does not end with Google Summer of Code — these and several new features (such as validation of multimedia files via liblicense and support for different language versions) are planned to be added, albeit at a slower pace.

After the test period, the validator will be available at http://validator.creativecommons.org/.

liblicense 0.8 (important) fixes RDF predicate error

asheesh, July 30th, 2008

Brown paper bag release: liblicense claims that the RDF predicate for a file’s license is http://creativecommons.org/ns#License rather than http://creativecommons.org/ns#license. Only the latter is correct.
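Why a single capital letter matters can be shown with a small sketch. URIs are compared character by character, so the two spellings are different identifiers (in ccREL, the capitalized form names the License class rather than the property). The code below is plain Python over tuples, not the liblicense API; `licenses_for` and the file URL are hypothetical.

```python
# The property that links a work to its license (correct predicate).
CC_LICENSE_PREDICATE = "http://creativecommons.org/ns#license"
# Same URI except for one capital letter; this is what the buggy versions wrote.
CC_LICENSE_WRONG = "http://creativecommons.org/ns#License"


def licenses_for(triples, subject):
    """Return objects of triples that use the correct license predicate."""
    return [
        obj
        for subj, pred, obj in triples
        if subj == subject and pred == CC_LICENSE_PREDICATE
    ]


FILE = "file:///tmp/song.ogg"  # placeholder subject
BY = "http://creativecommons.org/licenses/by/3.0/"

written_correctly = [(FILE, CC_LICENSE_PREDICATE, BY)]
written_by_buggy_version = [(FILE, CC_LICENSE_WRONG, BY)]

print(licenses_for(written_correctly, FILE))         # license is found
print(licenses_for(written_by_buggy_version, FILE))  # nothing is found
```

Any consumer looking for the correct predicate would simply not see license data written with the capitalized one.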

Any code compiled with liblicense between 0.6 and 0.7.1 (inclusive) contains this mistake.

This time I have audited the library for other insanities like the one fixed here, and there are none. Great thanks to Nathan Yergler for spotting this. I took this chance to change ll_write() and ll_read() to *NOT* take NULL as a valid predicate; this makes the implementation simpler (and more correct).

Sadly, I have bumped the API and ABI numbers accordingly. It’s available on SourceForge at http://sf.net/projects/cctools, and will be uploaded to Debian and Fedora shortly (and will follow from Debian to Ubuntu).

I’m going to head to Argentina for a vacation and Debconf shortly, so there’ll be no activity from me on liblicense for a few weeks. I would love help with liblicense in the form of further unit tests. Let’s squash those bugs by just demonstrating all the cases the license should work in.

RDFa for Semantic MediaWiki [GSoC 2008]

davemccabe, July 1st, 2008

Hello, world!

My name is David McCabe, and this summer I am adding RDFa support to Semantic MediaWiki, as part of the Google Summer of Code 2008. I am an undergraduate in Mathematics at Portland State University. For the Google Summer of Code 2006, I wrote Liquid Threads, a MediaWiki extension that replaces talk pages with a threaded discussion system.

Semantic MediaWiki (SMW) is the software used for the CC wiki and many other wikis. SMW allows authors to mark up wiki pages so that their contents and relationships are machine-readable. SMW already publishes this machine-readable data in RDF/XML format.

You can read about RDFa on the CC Wiki. There is also a Google Tech Talk on RDFa.

Metadata work of interest

ml, August 3rd, 2007

Some of these could turn out to be useful for describing licensed content on the web; all are rather interesting.

hAudio proposed microformat.

Proposed hAudio to RDFa mapping.

RDFa-deployed Multimedia Metadata (ramm.x) may be an effort to map and standardize use of existing and upcoming media description standards in RDFa … I had to skim “ramm.x in 10 sec” and “what ramm.x is NOT” a few times to gather that, but the key description on that page seems to be:

Does ramm.x replace RDF-based multimedia vocabularies, as, e.g., the Music Ontology Specification?
No! ramm.x aims at bringing existing formats, as MPEG-7 and the like, into the Semantic Web. It acts as a bridge using a certain formalisation of an existing vocabulary.

Getting a bit more esoteric, Protocol for Web Description Resources (POWDER):

facilitates the publication of descriptions of multiple resources such as all those available from a Web site.

Which is a bit of an understatement.

Sidecar XMP and License Extractors in Tracker

jakin, July 10th, 2007

Tracker has accepted my patches to read XMP sidecar files, as well as patches to extract licenses from MS Office (old format), TIFF, HTML, PNG, and PDF. This support will be available in the 0.6 release, which may be released later this week.
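For readers unfamiliar with sidecar files: an XMP sidecar is a small RDF/XML document stored next to the file it describes. The sketch below shows one way a license could be read from such a file. It is not the Tracker patch, which is not written in Python; the function name and sample packet are made up, and it assumes the license is recorded as a cc:license element with an rdf:resource attribute in the http://creativecommons.org/ns# namespace.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC_NS = "http://creativecommons.org/ns#"  # assumed namespace for cc:license

# Made-up sidecar content for illustration.
SAMPLE_XMP = """
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:cc="http://creativecommons.org/ns#">
      <cc:license rdf:resource="http://creativecommons.org/licenses/by/3.0/"/>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
"""


def license_from_xmp(xmp_text):
    """Return the first cc:license URI found in an XMP packet, or None."""
    root = ET.fromstring(xmp_text.strip())
    for element in root.iter("{%s}license" % CC_NS):
        resource = element.get("{%s}resource" % RDF_NS)
        if resource:
            return resource
    return None


print(license_from_xmp(SAMPLE_XMP))
```

Real sidecar files may record rights in other properties as well, so an extractor would normally check more than this one element.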

My final set of patches will additionally add support for extracting licenses from JPEG, SVG, and OpenOffice’s OASIS. Also, through GStreamer, Tracker already recognizes licenses of Vorbis and FLAC.

This marks the half-way point of Summer of Code 2007.

Liblicense has licenses! 376 of them…

jakin, July 6th, 2007

Prepping for the 0.1 release, I’ve generated RDF descriptions of all CC licenses in all available jurisdictions, as well as the GPL, LGPL, and Public Domain.

Available here:

Each license, if applicable, has all the attributes laid out on the wiki, including localization. One problem, however, is getting localized descriptions of the licenses. Those aren’t available at https://cctools.svn.sourceforge.net/svnroot/cctools/i18n/trunk/i18n/

Licenses were generated with this Python script, which reads the relevant information from creativecommons.org and cctools svn.
