<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Labs &#187; nutch</title>
	<atom:link href="http://labs.creativecommons.org/category/nutch/feed/" rel="self" type="application/rss+xml" />
	<link>http://labs.creativecommons.org</link>
	<description>by Creative Commons</description>
	<lastBuildDate>Mon, 09 Nov 2009 17:29:34 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Summer of Code Project: &#8220;Including RDFa support in Nutch: Updating the ccNutch plug-in&#8221;</title>
		<link>http://labs.creativecommons.org/2007/04/13/summer-of-code-project-including-rdfa-support-in-nutch-updating-the-ccnutch-plug-in/</link>
		<comments>http://labs.creativecommons.org/2007/04/13/summer-of-code-project-including-rdfa-support-in-nutch-updating-the-ccnutch-plug-in/#comments</comments>
		<pubDate>Fri, 13 Apr 2007 14:58:14 +0000</pubDate>
		<dc:creator>Alan Kelon</dc:creator>
				<category><![CDATA[nutch]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[summer of code]]></category>

		<guid isPermaLink="false">http://techblog.creativecommons.org/2007/04/13/summer-of-code-project-including-rdfa-support-in-nutch-updating-the-ccnutch-plug-in/</guid>
		<description><![CDATA[Hello!
I&#8217;m one of the selected students for Google Summer of Code 2007 and I&#8217;m pleased to be joining the Creative Commons community this summer. My project title is Including RDFa support in Nutch: Updating the ccNutch plug-in, under the mentorship of Nathan R. Yergler. The abstract (with the hyperlinks missing from the SoC page) is:
RDFa is an emerging standard from [...]]]></description>
			<content:encoded><![CDATA[<p>Hello!</p>
<p>I&#8217;m one of the <a href="http://techblog.creativecommons.org/2007/04/12/summer-of-code-projects/">selected students</a> for Google Summer of Code 2007 and I&#8217;m pleased to be joining the Creative Commons community this summer. My project title is <a href="http://code.google.com/soc/cc/appinfo.html?csaid=33FD37DB5A5F2E4C">Including RDFa support in Nutch: Updating the ccNutch plug-in</a>, under the mentorship of <a href="http://yergler.net/">Nathan R. Yergler</a>. The abstract (with the hyperlinks missing from the SoC page) is:</p>
<blockquote><p><a href="http://www.w3.org/TR/2007/WD-xhtml-rdfa-primer-20070312">RDFa</a> is an emerging standard from the W3C that provides a <a href="http://www.w3.org/2001/sw/BestPractices/HTML/2005-rdfa-syntax">syntax</a> for expressing the semantics of structured data, using a set of elements and attributes that embed RDF in HTML, such as the license on a document or a photo&#8217;s creator name and camera settings.</p>
<p><a href="http://lucene.apache.org/nutch">Nutch</a> is an open source search engine that uses <a href="http://lucene.apache.org/">Lucene</a> for searching the Web (or a subset of it) or in a customized form for an intranet. <a href="http://wiki.creativecommons.org/CcNutch">ccNutch</a> is a plug-in for Nutch to search Creative Commons content. Currently, ccNutch indexes only text documents and does not support RDFa very well.</p>
<p>Including RDFa support in ccNutch would be a significant step forward for the Semantic Web, because we could index the images, audio, and video contained in web pages through their RDFa metadata and then search them. This would widen the range of searchable works available under Creative Commons licenses, which makes it worth trying.</p></blockquote>
<p>My first step is to update ccNutch with the source code from the Lucene repository. Then I&#8217;ll start writing the Requirements and Architecture documents to define precisely what I&#8217;ll do. To do so, I&#8217;m going to study the ccNutch and Nutch code bases more deeply, as well as the RDFa standard. After that, I&#8217;ll write the Project Plan document to define our schedule and milestones and to assess risks.</p>
<p>Now let me introduce myself. My name is Alan Kelon, I&#8217;m 23 years old, and I live in <a href="http://en.wikipedia.org/wiki/Recife">Recife</a>, Brazil. I am a first-year Ph.D. student in Computer Science (in Portuguese) at the <a href="http://www.cin.ufpe.br">Informatics Center</a> (in Portuguese), <a href="http://www.ufpe.br">Federal University of Pernambuco</a> (in Portuguese), a.k.a. CIn/UFPE. The university and my house are very close to the <a href="http://en.wikipedia.org/wiki/Ricardo_Brennand_Institute" title="Ricardo Brennand Institute">Ricardo Brennand Institute</a> :-) I also hold an M.Sc. in Computer Science from the Federal University of Pernambuco (2005-2007), with a dissertation entitled &#8220;A Software Process Proposal to Open Source Software Factories&#8221;, and a B.Sc. in Computer Science from the <a href="http://www.ufpb.br">Federal University of Paraíba</a> (2005). In 2006, I was a teaching assistant in a graduate-level Software Engineering class entitled &#8220;Software Engineering: Building Open Source Software Factories&#8221;. This year&#8217;s edition of the course starts at the end of this month, and I&#8217;ll be lecturing again. This year, I also lectured in an undergraduate-level class entitled &#8220;Advanced Topics in Software Engineering: Open Source Software&#8221;.</p>
<p>I have been involved with free software since my undergraduate studies. My first contact was building and maintaining a Beowulf Linux cluster and developing a high-availability system, from 10/2002 to 01/2005. In the past, I was also involved with <a href="http://www.debian.org">Debian</a> in my local community, played with <a href="http://www.andromda.org">AndroMDA</a> in the very early stages of <a href="http://openerp.persapiens.org">OpenERP</a>, developed <a href="http://vensso.sourceforge.net">VENSSO CRM</a> (in Portuguese), and mentored/founded <a href="http://sourceforge.net/projects/gvsproject">GVS</a> and <a href="http://sourceforge.net/project/telescope">Telescope</a>. The last of these is my active research project as part of my Ph.D. Finally, I lead the research group on Open Source and Distributed Software Development at the Informatics Center, Federal University of Pernambuco, and <a href="http://salu.cesar.org.br/ncm_cesar/servlet/newstorm.ns.presentation.NavigationServlet?publicationCode=15&amp;pageCode=1229">C.E.S.A.R</a> (Recife Center for Advanced Studies and Systems), with strong collaboration from the local software industry, where I have the opportunity to advocate the open source development model and philosophy.</p>
<p>All in all: &#8220;Talk is cheap, show me the code&#8221;. Let&#8217;s do it now ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://labs.creativecommons.org/2007/04/13/summer-of-code-project-including-rdfa-support-in-nutch-updating-the-ccnutch-plug-in/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
