RDFa button (unofficial)

Update/Note: This is not an official RDFa button. Those (in the known colours) will be provided by W3C's Communications Team once RDFa is a Rec or CRec.

A couple of days ago I created an RDFa technology button, and I was asked to share it, so here it is:

RDFa
(PNG, GIF, SVG source file)

Please see the W3C Semantic Web Logos and Policies page for license details. This button is derived from the original W3C ones.

Adding (partial) RDFa support to the Firefox HTML Validator extension

Update (2008-04-24): I managed to get rid of the xmlns-related errors (.replace() to the rescue ;), so the extension now accepts markup that follows the latest RDFa DTD (including @typeof). And while at it, I created versions for Windows and Mac.

One of the reasons I haven't been using RDFa in production is the problem of quality assurance (a.k.a. plain old HTML validation). Not because RDFa isn't valid markup as such, but because the main tool I use during development is Marc Gueury's excellent HTML Validator Extension for Firefox. RDFa is valid XHTML+RDFa, but XHTML+RDFa is not HTML, so the extension reports dozens of errors, starting with the unrecognized Doctype declaration. The W3C Markup Validator supports RDFa, but I often develop while I'm offline, or on a non-public Web server, and the little "0 errors / 0 warnings" message in the status bar is more convenient than having to send markup to an online service.

Yesterday, however, I started working on an RDFa generator for one of Intellidimension's projects (very interesting to see them use RDF big time, while many of us are still experimenting and thinking about potential markets, BTW). So, now that the RDFa-caused messages made it almost impossible to spot real HTML errors, I wondered if the add-on could perhaps be hacked to accept RDFa as well. Long story short: It can, to a certain extent. I don't know if arbitrary XML namespace prefixes (xmlns:foo="...") can be supported by a pure DTD/SGML-based validator (the FF extension uses OpenSP). FWIW, I couldn't get it to work.

Apart from that, RDFa-enabling the extension was mainly a matter of copying the RDFa DTD and a set of modules to the plug-in's SGML library. It now happily accepts RDFa attributes (about, resource, property, datatype, content, etc.) and makes my life a little bit easier. If anyone has an idea how I could make it accept (non-predefined) namespace prefixes as well, I'd appreciate hints.
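A rough sketch of the kind of pre-processing that can hide arbitrary prefix declarations from a DTD-based validator: strip every xmlns:prefix="..." attribute before the markup is checked. This is not the extension's code (that is a Firefox add-on and uses a .replace() call). The function name and the regular expression are assumptions made up for illustration, and the default xmlns="..." declaration is deliberately left alone.

```php
<?php
// Hypothetical helper: removes xmlns:prefix="..." declarations so that a
// DTD-based validator does not report them as unknown attributes.
function stripPrefixDeclarations(string $markup): string
{
    // whitespace, "xmlns:", a prefix name, "=", then a quoted value
    $pattern = '/\s+xmlns:[A-Za-z_][\w.-]*\s*=\s*("[^"]*"|\'[^\']*\')/';
    return preg_replace($pattern, '', $markup);
}

$in = '<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me" typeof="foaf:Person">Ben</div>';
echo stripPrefixDeclarations($in), "\n";
// prints: <div about="#me" typeof="foaf:Person">Ben</div>
```

Note that this only silences the validator. The prefixes are still needed by an RDFa parser, so the stripped copy is for validation only.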

The tweaked extension is so far just a hack. I haven't even pinged Marc yet or changed the internal ID, so any extension update will remove the RDFa functionality. You can try/download it if you like (Windows version), but I may have to take it offline should Marc not be happy about the re-distribution.

New ARC2 plugins

If there was a "most productive SemWeb coder" category in Danny's "This Week's Semantic Web", this week's turn would probably be Keith Alexander's. Last week, he provided no fewer than three ARC2 Plugins:
While at it, he also implemented a SPARQL+ wrapper for Talis Platform stores.

I think I blogged about Morten's RemoteEndpoint plugin a while back (this one should really become part of the core codebase), but did I mention Peter Krantz' File System Synchronizer? It keeps an RDF Store in sync with a file system directory, which enables a really nice option for implementing larger RDF editing systems on top of ARC: by using editing tools that work with small RDF files (quick response times and everything) together with his plugin, it becomes possible to provide rich query functionality over the whole dataset without the store getting in the way of the publishing tools. RDF index rebuilding can be slow, so de-coupling read from write operations and introducing an asynchronous update process is a nice solution.

Awesome stuff.

RDFAuth, with less Story-telling

Update: Dan Brickley suggested (in a private mail to Henry and me) that "RDFAuth" is most probably not a very smart name anyway, as something that contains official/generic technologies (RDF and oAuth in this case) may send wrong signals and cause misunderstanding. And that we shouldn't waste time fighting. He suggests more specific names (BeatnikAuth/knoweeAuth) for the time being, as this is all still premature stuff, and because no one should claim to have created an "RDFAuth", especially not if it isn't backed by the whole community. Well, what can I say, he's of course right. I apologize and will s/RDFAuth/knoweeAuth/ from now on.

You may have read Henry Story's recent post about RDFAuth, an RDF-oriented mechanism to access (partly) protected web resources. He's not describing the RDFAuth protocol, though. I've tried to clarify things a couple of times on the semantic-web list, but somehow he seems to prefer to hijack the name instead, together with parts of the idea and claim it as his invention (it's not mine either, to make things clear). Now, innovation is always based on a combination of prior work and improvements, but his "following my strict architectural guidelines, I came across what I am just calling RDFAuth" preening goes a tiny bit too far to not trigger a comment.

What he describes (a PGP-based authentication protocol) is clearly interesting, but it's simply not what RDFAuth, an idea that was developed in the knowee project, is about. For knowee (which just released the alpha version, btw), we needed something that can be implemented on basic, shared web servers. PGP is simply not an option (if considered mandatory). People won't upload their private keys to 3rd party servers, and PGP libs are not necessarily available in those environments either.

Final clarifications:
  • RDFAuth may support PGP, it's just not a requirement.
  • I'm pretty sure that Henry's PGP-only approach will attract more SemWeb geeks than RDFAuth, it just wouldn't necessarily work for knowee's target audience.
  • The RDFAuth idea is in no way special or new. It more or less predates oAuth, but long-term-ish I'll most probably have to replace it with oAuth, once there is a way to generate tokens without the browser redirect dance (fully server-side token generation is another knowee requirement).
  • I read about a token-based, decentralized identification mechanism on a very early OpenID FAQ page that described a non-browser-dependent way to log into web sites. I can't find the link anymore, but this is basically what RDFAuth is based on. So, this is not my idea either.
  • The possibility of combining 200 OK response headers with WWW-Authenticate was suggested by Etan Wexler on the FOAF mailing list.
  • Dan Brickley explored SPARQL-based group membership discovery a while back. I like this idea of de-coupling data exchange decisions from the identification/authorisation process very much (RDFers don't need things like sReg or Attribute Exchange).
  • The only thing that RDFAuth adds is light-weight, personal token services (as a replacement of OpenID's browser-based identification), and the re-use of straight HTTP BasicAuth, so that partly protected resources can more easily be discovered by both server-side and client-side tools (e.g. Tabulator), and also to allow widely deployed modules like mod_php to access the login token and client identifier using built-in functionality. And I doubt that layering a protocol on top of HTTP BasicAuth hasn't been done before, so, again, nothing special to brag about here.
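To illustrate the BasicAuth re-use mentioned in the last point, here is a minimal sketch of how a token and a client identifier could travel in a standard Basic Authorization header, and how they would show up on the server. The field layout (token as user, identifier URI as password) is purely an assumption: Basic auth splits at the first colon, so the colon-free value has to come first. The function names are hypothetical, and the actual knowee layout may well differ.

```php
<?php
// Assumption: token in the user field, identifier URI in the password field.
function buildBasicAuthHeader(string $token, string $identifier): string
{
    return 'Authorization: Basic ' . base64_encode($token . ':' . $identifier);
}

// Mimics what mod_php exposes as $_SERVER['PHP_AUTH_USER'] / ['PHP_AUTH_PW'].
function parseBasicAuthHeader(string $header): ?array
{
    if (!preg_match('/^Authorization:\s*Basic\s+(\S+)$/i', $header, $m)) {
        return null; // not a Basic auth header
    }
    $decoded = base64_decode($m[1], true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null; // malformed credentials
    }
    // Split at the first colon only; the identifier URI contains colons itself.
    list($user, $pw) = explode(':', $decoded, 2);
    return array('PHP_AUTH_USER' => $user, 'PHP_AUTH_PW' => $pw);
}

$header = buildBasicAuthHeader('a1b2c3', 'http://example.org/me#i');
print_r(parseBasicAuthHeader($header));
```

On a real mod_php setup no parsing is needed: the two values are available directly in $_SERVER, which is the "built-in functionality" referred to above.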
OK, enough geek whining ;), don't want to waste more time of my precious weekend.

Semantic Web Aliases

Update: Kingsley provides a number of Web references for most of the buzzwords below.

I just had an interesting Twitter dispute with Ian, triggered by his invention of another alias for the Semantic Web. For last year's webinale I created a slide with which I tried to "de-confuse" people a little bit; I guess I'll need several slides this year. This is mostly just for future reference, not many of those are going to stick anyway:

A list of terms people use to name (a subset/aspect/whatever of) the "Semantic Web":
  • Semantic Web (by timbl)
  • SemWeb (by the developer community)
  • Web of Data (by timbl)
  • Data Web (by timbl)
  • The Web as a Database (by timbl)
  • Web of Knowledge (by stefandecker)
  • lowercase semantic [wW]eb (by tantek)
  • Knowledge Web (by ?)
  • Semantic Web 2.0 (by stefandecker)
  • Web 3.0 (by nova)
  • Semantic Graph (by nova)
  • Hyperdata (by danja)
  • Linked Data (by timbl)
  • Linked Data Web (by kidehen)
  • Structured Web (by the structured blogging community and mkbergman)
  • Semantic Data Web (by kidehen)
  • GGG - The Giant Global Graph (by timbl)
  • Web 3G (by iand)

See also: Interblag

And I fear there are more (even w/o considering "Pipe Dream", "Ivory Towers Inc." and similar ones). I like "Structured Web" and "Hyperdata" very much. But at the end of the day (yes, I know silly jargon as well), I think we'll just call it the [sS]emantic [wW]eb ;-)

SPARQLBot 101

While SPARQLBot was mostly a fun hack for last week's SemanticCamp, there is still a lot of activity on the #sparqlbot channel (it actually seems to increase). More than 30 SPARQL commands have been created. Michael Hausenblas now kindly created an introduction that gives a nice overview of the stuff that has been added to the command collection so far: SPARQLBot 101. Have fun, and thanks, Michael!

Back from SemanticCamp London

Like many other SemWeb weavers, I followed Tom's call for last week-end's SemanticCamp London. So much fun! I used the opportunity to discuss a number of ideas I've been pondering for quite some time, and it was great to be able to get more direct insights from microformats community members. I had the impression that the event helped bring RDFers and microformateers a little closer together. At least in the conversations I had. There was no childish "your approach is flawed/too limited/doomed to fail", and I think I didn't hear a single (serious) "fundamentally". A lot of "I prefer", "I don't like", and a number of tongue-in-cheek comments, but that's cool as part of a starting dialog. Much better than the progress-blocking arrogance we've seen for much too long (in both camps, btw).

I tried to substantiate this "common goal, complementary tech" notion with two little interactive demos and a tech pitch:
  • On Saturday we created SPARQLBot, an IRC Bot based on ARC/Trice that aggregates XFN, hCard, and FOAF data, and lets you explore your "online social graph" with simple IRC commands. SPARQLBot is a nice example of how the huge amount of high-quality microformats data can be combined with RDF technologies such as flexible storage and simple querying. (And that it only took a few hours to implement a working demo also shows how SemWeb technologies can significantly improve Web app development.)
  • I pulled an all-nighter from Sat to Sun and managed to demo the knowee beta on Sunday. There are still a few bugs to fix, but an official announcement should come soon now. knowee allows you to consolidate portable social network data (XFN, hCard, FOAF, feeds) and to manage the collected information via a freebase-like hyperdata editor.
  • The third thing is what might be called "micrordf". I didn't run a session, but discussed the idea with a couple of people and think it's worthwhile pursuing. Although certain RDF solutions could be really handy for the µF community, there are a couple of things that are considered deal breakers. Among those are the namespace prefix mechanism (esp. in any of the current RDF-in-HTML proposals, where non-predictable prefixes break reliable self-containment and CSS styling) and the need to map HTML-encoded information to non-identical and unstable RDF Schemas. What I was trying to figure out during SemanticCamp was the possibility of creating a simplified, but still RDF-compatible mechanism that would be acceptable to microformateers. It's essentially a simple, intermediate structure to represent any microformat (no need for a different syntax), and possibly also POSH data. What that would bring to the microformats community is the ability to auto-create universal parsers, a unified mofo-style API, and a proper test suite, which still seems to be lacking. The RDF crowd would get a way to access microformats as resource descriptions, with the ability to map those to their RDF vocab of choice. It could perhaps even be possible to auto-generate GRDDL XSLTs from the micrordf definitions. More on this soon.

As Yves put it already: Yay for SemanticCamp!

New ARC features: Triggers and MySQL extensions

The latest ARC revision got two new features: SPARQL Triggers and MySQL function extensions for SPARQL.

SPARQL Triggers

Triggers in ARC were suggested by Dan Brickley, who is experimenting with dynamically populated/updated Group definitions. What you can effectively do now in ARC is associate custom trigger classes with SPARQL query types. The triggers are then called automatically after each query of a registered type, for example to refresh inferred Graphs:
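$config = array(
  /* ... (other configuration options) */
  'store_triggers' => array(
    /* register LOAD triggers: these run after each LOAD query */
    'load' => array('updateFriendsList', 'crawlXFNLinks'),
  ),
);
/* create the store endpoint with the triggers registered, then handle the request */
$ep = ARC2::getStoreEndpoint($config);
$ep->go();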
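The dispatch idea behind such a configuration can be pictured with a small stand-alone sketch. This is not ARC's implementation: the runTriggers() function, the callback array, and the log messages are all hypothetical and only show the "look up the query type, then call each registered trigger" step.

```php
<?php
// Not ARC's code: a stand-alone sketch of "run registered triggers after a
// query of a given type". All names below are hypothetical.
function runTriggers(array $registry, string $queryType, array $callbacks): array
{
    $called = array();
    $names = isset($registry[$queryType]) ? $registry[$queryType] : array();
    foreach ($names as $name) {
        if (isset($callbacks[$name])) {
            $callbacks[$name](); // e.g. refresh an inferred graph
            $called[] = $name;
        }
    }
    return $called;
}

$registry = array('load' => array('updateFriendsList', 'crawlXFNLinks'));
$log = array();
$callbacks = array(
    'updateFriendsList' => function () use (&$log) { $log[] = 'friends list refreshed'; },
    'crawlXFNLinks'     => function () use (&$log) { $log[] = 'XFN links crawled'; },
);

runTriggers($registry, 'load', $callbacks);   // both triggers fire, in order
runTriggers($registry, 'select', $callbacks); // nothing registered, nothing fires
print_r($log);
```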

MySQL Extension Functions

Morten Frederiksen did it again. He sent in about 10 lines of code which he suggested adding to ARC's SQL rewriter. The effect? ARC suddenly has access to dozens of MySQL functions. That's CONCAT, CURDATE, MD5, UNIX_TIMESTAMP, and many more. A namespace for MySQL function URIs is now online, and queries look like this:
PREFIX mysql: <http://web-semantics.org/ns/mysql/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person WHERE {
  ?person foaf:givenname ?n1 ;
          foaf:family_name ?n2 .
  FILTER (mysql:concat(?n1, " ", ?n2) = "Alec Tronnick") .
}
I talked a little bit more about these things with Danny Ayers in a recent podcast.
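For the curious, the general idea of mapping a function URI from the mysql: namespace to a SQL function call might look roughly like the sketch below. It is not Morten's patch or ARC's actual rewriter. The function name, the identifier check, and the column aliases in the example call (T1.val, T2.val) are assumptions for illustration only.

```php
<?php
// Hypothetical sketch, not ARC's actual rewriter: turn a function URI from
// the mysql: namespace into a SQL function call.
const MYSQL_NS = 'http://web-semantics.org/ns/mysql/';

function rewriteExtensionFunction(string $functionUri, array $sqlArgs): ?string
{
    if (strpos($functionUri, MYSQL_NS) !== 0) {
        return null; // not a mysql: function, leave it to other handlers
    }
    $name = substr($functionUri, strlen(MYSQL_NS));
    // Only accept plain identifiers so nothing else can leak into the SQL.
    if (!preg_match('/^[a-z_][a-z0-9_]*$/i', $name)) {
        return null;
    }
    return strtoupper($name) . '(' . implode(', ', $sqlArgs) . ')';
}

echo rewriteExtensionFunction(MYSQL_NS . 'concat', array('T1.val', "' '", 'T2.val')), "\n";
// prints: CONCAT(T1.val, ' ', T2.val)
```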

Project offer: Part-time RDF/OWL Modeling

Here is a nice project offer I received, but that I won't have enough time to work on myself. The project is about analyzing a set of statistical reports and creating an RDF Schema or OWL Ontology for them. With help from friendly #swig folks, a first selection of probably re-usable schemas could already be identified. The next task would be picking the right terms, and maybe some thoughts about additionally needed glue terms.

If you are interested, please send a short mail to fwd_1 at semsol dot com. It will be auto-forwarded to the offerer.

Looking for paid (Semantic Web) Projects

Update 2: Yay, I think I'm safe for the next couple of months, should have blogged much earlier. Now I'm starting to think we could really use a job site for SemWeb people.

Update: Ah, the blogosphere. I already received some replies. One to share: Aduna is looking for a Java Engineer.

About a year ago, I received some funds which allowed me to re-write the ARC toolkit, and also to bring Trice (a semantic web application framework for PHP) to production-readiness. However, Semantic Web Development is generally still very new, especially in the Web Agency market where I'm coming from. It's not that easy yet to keep things self-sustaining.

It may well be that I should blog less about bleeding-edge experiments, and more about how RDF and SPARQL allow me to deploy extensible websites in a fraction of the time it used to take in the past. "Release Early", "Data First", "Evolve on the Fly", and all those patterns that SemWeb technology enables in a web development context.

Anyway, to keep things short: I'm actively (read: urgently ;-) looking for more paid projects. I'm a Web development all-rounder with particular interest in scripting languages and quite some experience in delivering RDF and frontend solutions (more details on my profile page). While it would of course be great to work on stuff where I can use my tools, I'm available for more general web development as well. I'm most productive when I can work from my office, but temporary travelling is basically fine, too. The Düsseldorf Airport is just minutes away.

Cheers in advance for suggestions,

Grawiki - A Wiki (and aggregator) for graph-shaped data

In case you watched the "DriftR" screencast I created in December, there is now a live version online. (I dropped the initial name because my blog posts suddenly showed up in CrunchBase. ;-)

Grawiki is a SPARQL-based Data Wiki, a little bit inspired by freebase: less impressive, less feature-rich, less scalable and all that, but, well, OpenSource, SemWeb-enabled, and decentralized (each Grawiki installation can import selected graphs from other ones; back-POSTing is in the works). As it seems that I forgot to write-protect the instance mentioned above, you can play with it if you like. You'll most probably encounter bugs: the built-in inferencer is still at alpha stage, and editing of consolidated bnodes is quite tricky to implement. I'll tweak things in a day or two.

With Grawiki, I think I finally have (the start of) a tool that could work nicely for ad-hoc RDF editing and aggregation (it can import RDF and certain microformats). Oh, and a personal URI, and a FOAF file. At last ;-)

I'm now considering the addition of RDFa injections as a possible next step. The current editor uses a home-grown mechanism to activate the editing hooks and stuff, which was easier to implement and debug in my XHTML 1.0 development environment. Stay tuned; a download site probably won't be up before next week, gotta focus on urrrgent SWEO/knowee todos first...

ARC Remote Endpoint Plugin

OK, you're probably already wondering if Morten and I have a link exchange contract, but anyway: He just announced a plugin for ARC that provides "access to remote SPARQL endpoints as if they were local stores." Cool stuff :-)

SPARQL is a W3C Recommendation

I guess I already pushed out enough ARC spam today, so I'll keep things short: SPARQL is now a W3C Recommendation!

What I'm personally very happy about is the Implementation Survey which features two pure-PHP implementations*. This really opens the door for mainstream Web Developers to start exploring RDF and SPARQL on off-the-shelf hosted web servers. Everything I create these days (e.g. the ARC site, including the bots and archive generators there, or this blog) is powered by SPARQL. It's an amazing productivity booster as you never have to worry about complicated JOINs or evolving database schemas again. You can just code away and it's great fun to work with. Want more Testimonials? The Data Access Working Group collected quite a number of them from W3C member organizations.

* Don't let yourself be fooled by RAP's low report scores, their SPARQL engine is quite mature, they just didn't run the whole test suite.

RDF Tools - An RDF Store for WordPress

Together with Morten Frederiksen and Dan Brickley (who is revisiting his SparqlPress idea), I've created a WordPress extension (called "RDF Tools") that adds an (ARC-based) RDF Store and SPARQL Endpoint to the blogging system. The store is kept separate from the WP tables (i.e. it's not a wrapper), but you can use WP's nice admin screens to configure it (screenshot), and given the amount of developer-friendly hooks that WP offers, I'm curious what can be done now, possibly in combination with other extensions such as those Alexandre Passant is working on. It could perhaps also be handy as a deployment accelerator for knowee.

ARC Data Wiki Plugin

I'm blessed with a small but first-class community around ARC that helps me with bug reports, patches, encouraging feedback, and nifty ideas. One example of the latter was Morten Frederiksen's idea to allow ARC to be extended with third-party plugins. He even demonstrated the utility by enhancing the toolkit with a remote SPARQL endpoint for his named graph exchange work. ARC plugins are not bundled with the core codebase (which is meant to stay compact), but can easily be integrated in any ARC installation (developer documentation is now online, too).

My first own plugin was triggered by Tim Berners-Lee's suggestion to write a lightweight request handler for an RDF-powered Data Wiki, as described in a recent Tech Report (PDF) and already implemented with Algae. I had to tweak the SPARQL+ spec and ARC's Query Parser to make it compatible with Eric Prud'hommeaux's SPARQL/Update flavor. This had the nice side-effect that all three SPARQL Write proposals (SPARUL, SPARQL/Update, SPARQL+) now (almost) share a common subset for basic INSERTs and DELETEs. After these updates, writing the plugin itself became almost trivial.

The code is still experimental and limited, but it's available for download, together with setup instructions. The Data Wiki plugin doesn't require a database (unlike the other SPARQL components in ARC) and supports update calls sent by RDF editors such as the Tabulator. I've set up a demo RDF wiki and will try to add remote update functionality to my own editor (to be renamed) now as well. Hmmm, would be cool to have a selection of generic tools to collaboratively read from and write to shared RDF spaces one day.
