I'm currently working on a new release of Knowee. This is another (long-promised) item on my ToDo list before I can finally concentrate on paggr (although it has taken too long already and hopefully won't break my neck). All the planned paid projects for bootstrapping paggr didn't happen, due to frozen budgets and politics. I hope the situation here improves soon.
So, while I was trawling the vocabulary market, trying to gather terms for the stuff that Knowee works with (people, their profiles, contacts, accounts, and activities), I remembered OpenSocial, the effort to standardize basic interactions between social networking sites. I can use a good amount of FOAF, but OpenSocial has very handy things such as a generic "tags" field and a clean vCard mapping. And it's a super-set of Portable Contacts, too.
Today, I wrote a converter that extracts the field definitions from the JavaScript specification files, together with their labels, comments, domains, and value types. (A little too late, I found out that Dan Brickley had already done part of this a couple of months ago, which could have saved me some work, d'oh.)
I've just added the osoc spec to web-semantics.org/ns. I hope it might be of use to others as well. Funnily, the "relationship" term was not part of any of the source files, so I may still have to invent a property (a foaf:knows equivalent that also works with organizations).
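To give an idea of what the output of such a converter could look like, here is a minimal Python sketch that renders one extracted field definition as Turtle. It is not the actual converter: the namespace URI, the function name, and the sample field values are made up for illustration, and the real osoc terms may be modelled differently.

```python
# Hypothetical namespace; the real osoc namespace URI is not given in this post.
OSOC = "http://example.org/osoc#"

PREFIXES = (
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
)


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def field_to_turtle(name, label, comment, domain, value_type):
    """Render one field definition as an rdf:Property description."""
    lines = [
        f"<{OSOC}{name}> a rdf:Property ;",
        f'    rdfs:label "{escape_literal(label)}" ;',
        f'    rdfs:comment "{escape_literal(comment)}" ;',
        f"    rdfs:domain <{OSOC}{domain}> ;",
        f"    rdfs:range <{OSOC}{value_type}> .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample values are invented, not taken from the OpenSocial spec files.
    print(PREFIXES)
    print(field_to_turtle("tags", "Tags", "Free-form tags", "Person", "String"))
```

Labels and comments map naturally to rdfs:label and rdfs:comment; treating the value type as an rdfs:range is an assumption of this sketch.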
OpenSocial in RDF
I've created an RDF converter for the OpenSocial field definitions.
Posted on 2008-12-16 at 16:45 UTC
Zemanta releases LOD-connected NLP API
Results from Zemanta's new Open Semantic API are interlinked with DBPedia, Freebase, MusicBrainz, Semantic CrunchBase, etc.
When I read OpenCalais' pre-announcement of Calais 4 a couple of weeks ago, I got pretty excited about their plan to offer an NLP API that can be combined with entities from the LOD cloud. It seems we don't have to wait any longer: Today, Zemanta beat the Calais team to it with the release of a new Semantic API (More details on TC). Andraž Tori already (and kindly) sent me a file with hundreds of mappings for Semantic CrunchBase which I'm going to include in the coming days.
I think these APIs have the potential to sweep away feature-poor or closed services in favor of personal DIY SemWeb apps (my first reaction to the Calais 4 post was "bye bye Twine"). Think of a simple RDF/SPARQL tool that, based on a set of tags, subscribes to feeds from the major bookmarking (or other) services, pumps the links through Zemanta's API, and then delivers all the things that might interest you. It wouldn't require a new bookmarking service, it could let you filter by company or product, or even limit results to suggestions by people in your social network. Such an app could provide rich add-on information from LOD datasets like DBPedia. In a very light-weight, loosely coupled, "On Demand" fashion.
Posted on 2008-12-09 at 18:50 UTC
poshRDF - RDF extraction from microformats and ad-hoc markup
poshRDF is a new attempt to extract RDF from microformats and ad-hoc markup
I've been thinking about this since Semantic Camp where I had an inspiring dialogue with Keith Alexander about semantics in HTML. We were wondering about the feasibility of a true microformats superset, where existing microformats could be converted to RDF without the need to write a dedicated extractor for each format. This was also about the time when "scoping" and context issues around certain microformats started to be discussed (What happens for example with other people's XFN markup, aggregated in a widget on my homepage? Does it affect my social graph as seen by XFN crawlers? Can I reuse existing class names for new formats, or do we confuse parsers and authors then? Stuff like that).
A couple of days ago I finally wrote up this "poshRDF" idea on the ESW wiki and started with an implementation for paggr widgets, which are meant to expose machine-readable data from RDFa, microformats, but also from user-defined, ad-hoc formats, in an efficient way. PoshRDF can enable single-pass RDF extraction for a set of formats. Previously, my code had to walk through the DOM multiple times, once for each format.
A poshRDF parser is going to be part of one of the next ARC revisions. I've just put up a site at poshrdf.org to host the dynamic posh namespace. For now the site links to a possibly interesting by-product: A unified RDF/OWL schema for the most popular microformats: xfn, rel-tag, rel-bookmark, rel-nofollow, rel-directory, rel-license, hcard, hcalendar, hatom, hreview, xfolk, hresume, address, and geolocation. It's not 100% correct, poshRDF is after all still a generic mechanism and doesn't cover format-specific interpretations. But it might be interesting for implementors. The schema could be used to generate dedicated parser configurations. It also describes the typical context of class names so that you can work around scoping issues (e.g. the XFN relations are usually scoped to the document or embedded hAtom entries).
I hope to find some time to build a JSON exporter and microformats validator on top of poshRDF in the not too distant future. Got to move on for now, though. Dear Lazyweb, feel free to jump in ;)
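The single-pass idea can be illustrated with a small Python sketch. This is not the poshRDF algorithm or the ARC parser, just a toy extractor under stated assumptions: the namespace URI and the term list are hypothetical, every statement is attached to the document URI (no scoping rules), and only `rel` links and text content of classed elements are handled.

```python
from html.parser import HTMLParser

POSH_NS = "http://example.org/posh#"  # hypothetical namespace URI
KNOWN_TERMS = {"fn", "url", "friend", "met", "tag"}  # illustrative subset
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class SinglePassExtractor(HTMLParser):
    """Collects triples for all known terms while walking the markup once."""

    def __init__(self, doc_uri):
        super().__init__()
        self.doc_uri = doc_uri
        self.triples = []
        self.stack = []  # open elements as (tag, matching class terms, text parts)

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel values on links become statements about the document
        for rel in (attr.get("rel") or "").split():
            if rel in KNOWN_TERMS and attr.get("href"):
                self.triples.append((self.doc_uri, POSH_NS + rel, attr["href"]))
        if tag in VOID_TAGS:
            return
        terms = [c for c in (attr.get("class") or "").split() if c in KNOWN_TERMS]
        self.stack.append((tag, terms, []))

    def handle_data(self, data):
        # feed text to every open element that carries a known class name
        for _tag, terms, parts in self.stack:
            if terms:
                parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or all(t != tag for t, _, _ in self.stack):
            return  # ignore void and stray end tags
        while self.stack:
            open_tag, terms, parts = self.stack.pop()
            text = "".join(parts).strip()
            for term in terms:
                if text:
                    self.triples.append((self.doc_uri, POSH_NS + term, text))
            if open_tag == tag:
                break


def extract(html, doc_uri):
    parser = SinglePassExtractor(doc_uri)
    parser.feed(html)
    parser.close()
    return parser.triples
```

Adding another format to this sketch only means adding its class and rel names to the term set; the markup is still traversed once, which is the point of the approach.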
Posted on 2008-11-10 at 12:25 UTC
paggr wins Semantic Web Challenge 2008
ISWC 2008 in Karlsruhe was just great. Even won the Semantic Web Challenge.
What can I say? I'm still smiling like on the pic on the left (credits: Keith Alexander). And you have no idea how urgently I need the money ;-)
paggr has received very encouraging feedback (or premature praise, rather), so I'm busily working on getting the beta out as soon as possible. Especially given that paggr wouldn't have had a chance to convince the judges without the great amount of Linked Data and all the painful spec work by the Semantic Web Community. The ball's in my court to actually deliver now.
There are some items left on my todo list before I dare send out more invitation codes (some were added after feedback at ISWC):
- improved RDF exporter for portals and individual widgets (just finished the first version, using a new thingy called poshRDF)
- the widget and agent builders should be visual, more like the cool SPARQLMotion editor (I'm working on that now).
- dropping a widget item on the canvas should auto-open a corresponding details widget
- widgets should be able to "listen to" other widgets for auto-refreshes
- a setup wizard that lets you specify initial accounts and data sources
I assume that a fully generic semantic widget and agent platform might be either over- or underwhelming, so I plan to provide a set of ready-to-run apps for paggr. Here are some ideas:
- feed reader with rich filtering and bookmarking (to del.icio.us?) and rating
- microblog and aggregator (twitter + identi.ca + groups + filters + posting)
- address book
- semantic email client
- calendaring
- decentralized social network (portable personal profile + lifestream aggregation)
Posted on 2008-11-06 at 10:50 UTC
paggr teaser video and pre-registration site online
paggr teaser video and landing page
I've been semi-silently working on something new. A combination of many semwebby things I came across and played with during the last 3 years or so:
- semantic markup
- smart data
- an rdf clipboard
- ajax
- sparql sparql sparql
- sparql + scripting
- sparql + templates
- sparql + widgets
- lightweight, federated semweb services and bots
- UIs for open data
- semwikis
- agile and collaborative web development
So, what happens when you put this all together? At least something interesting, and perhaps semsol's first commercial service. (Or product, this is all just LAMP stuff and can easily be run in an intranet or on a hosted server). Anyway, still some way to go. It's called paggr, the landing page is up, and today I created a first teaser/intro video.
I'll demo the beta (launch planned for November) at the upcoming ISWC during the poster session (my poster is about SPARQL+ and SPARQLScript, the two SPARQL extensions that paggr is based on). I may have early invites by then.
In preparation for the hopefully busy fall and winter months, though, I'll be on vacation for the next two weeks. No Email, no Web, no Phone. Yay!
HQ version (quicktime, 130MB)
Posted on 2008-10-09 at 21:50 UTC
Getting Real with RDF & SPARQL at DevX
DevX article about combining the Getting Real approach with SemWeb technologies
My "Getting Real" with RDF and SPARQL article is now available in DevX' Semantic Web zone:
"Getting Real" is an agile approach to web application development. This article explains how it can be successfully combined with the flexibility of semantic web technologies. The article is a look behind the scenes of dooit's first iteration (and an introduction to Trice, code included). The focus is not so much on the Web aspect of RDF, but rather on its ability to accelerate software development ("Data First", etc).
Any feedback is welcome, in comments here or over at the DevX site.
Posted on 2008-09-24 at 09:45 UTC
SPARQLBot - Your Semantic Web Commandline
SPARQLBot is now officially launched
Update: I added a Ubiquity script after a suggestion by Gautier.
SPARQLBot, the weekend project we started at SemanticCamp London, is now finally online at a proper home, and with a more solid toolset. I've ported the essential commands from the old site, and the "Getting Started" manual should be online later today as well.
What is SPARQLBot?
SPARQLBot is a web-based service that reads and writes Semantic Web data based on simple, human-friendly commands received via IRC or the Web. The command base can be freely extended using a browser-based editor. SPARQLBot can process microformats, RSS, several RDF serializations, and results from parameterized SPARQL queries.
New Features
SPARQLBot was more or less rewritten from scratch. Compared to the earlier version, things have become much more powerful, but also simpler and more stable in many cases. The system can now:
- operate on multiple freenode IRC channels (just send "join #channel" to "sparqlbot"),
- reply to private IRC messages,
- be accessed via the Ubiquity plugin
- reuse other commands,
- call web APIs via GET or POST,
- access arbitrary SPARQL endpoints,
- help you cut your way through the growing Linked Data cloud,
- use a single command to combine results from federated SPARQL endpoints and datasets such as DBPedia, DBLP, the SemWeb Conference Corpus, GeoNames, CrunchBase, or flickr wrappr,
- produce highly customizable output via SPARQL result templates,
- OpenID-protect your commands,
- cache results in a local SPARQL+-enabled store.
If you happen to be at ISWC next month and would like to have a look behind the scenes, I'll present SPARQL+ and SPARQLScript with SPARQL result templates during the poster session.
Posted on 2008-09-22 at 18:20 UTC
Writing Inference Rules with SPARQLScript
SPARQLScript can be used for forward chaining, including string manipulations on the run.
In order to keep data structures in Semantic CrunchBase close to the source API, I used a 1-to-1 mapping between CrunchBase JSON keys and RDF terms (with only a few exceptions). This was helpful for people knowing the JSON API, but it wasn't easy to interlink the converted information with existing SemWeb data such as FOAF, or the various LOD sources.
SPARQLScript is already heavily used by the Pimp-My-API tool or the TwitterBot, but yesterday I added a couple of new features and finally had a go at implementing a (forward chaining) rule evaluator (for the reasons mentioned some time ago).
A first version ("LOD Linker") is installed on Semantic CB, with initially 9 rules (feel free to leave a comment here if you need some additional mappings). With SPARQLScript being a superset of SPARQL+, most inference scripts are not much more than a single INSERT + CONSTRUCT query (you can click on the form's "Show inference scripts" button to see the source code):
$ins_count = INSERT INTO <${target_g}>
CONSTRUCT {?res a foaf:Organization } WHERE {
{ ?res a cb:Company }
UNION { ?res a cb:FinancialOrganization }
UNION { ?res a cb:ServiceProvider }
# prevent dupes
OPTIONAL { GRAPH ?g { ?res a foaf:Organization } }
FILTER(!bound(?g))
}
LIMIT 2000
But with the latest SPARQLScript processor (ARC release 2008-09-12) you can run more sophisticated scripts, such as the one below, which infers DBPedia links from wikipedia URLs:
$rows = SELECT ?res ?link WHERE {
{ ?res cb:web_presence ?link . }
UNION { ?res cb:external_link ?link . }
FILTER(REGEX(?link, "wikipedia.org/wiki"))
# prevent dupes
OPTIONAL { GRAPH ?g { ?res owl:sameAs ?v2 } . }
FILTER(!bound(?g))
}
LIMIT 500
$triples = "";
FOR ($row in $rows) {
# extract the wikipedia identifier
$id = ${row.link.replace("/^.*\/([^\/\#]+)(\#.*)?$/", "\1")};
# construct a dbpedia URI
$res2 = "http://dbpedia.org/resource/${id}";
# append to triples buffer
$triples = "${triples} <${row.res}> owl:sameAs <${res2}> . "
}
#insert
if ($triples) {
$ins_count = INSERT INTO <${target_g}> { ${triples} }
}
(I'm using a similar script to generate foaf:name triples by concatenating cb:first_name and cb:last_name.)
Inferred triples are added to a graph directly associated with the script. Apart from a destructive rule that removes all email addresses, the reasoning can easily be undone again by running a single DELETE query against the inferred graph.
I'm quite happy with the functionality so far. What's still missing is a way to rewrite bnodes; I don't think that's possible yet. But INSERT + CONSTRUCT will leave bnode IDs unchanged, so the inference scripts don't necessarily require URI-denoted resources.
Another cool aspect of SPARQLScript-based inferencing is the possibility to use a federated set of endpoints, each processing only a part of a rule. The initial DBPedia mapper above, for example, uses locally available wikipedia links. However, CrunchBase only provides very few of those. So I created a second script which can retrieve DBPedia identifiers for local company homepages, using a combination of local queries and remote ones against the DBPedia SPARQL endpoint (in small iterations and only for companies with at least one employee, but it works).
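For readers who don't have a SPARQLScript processor at hand, the string manipulation step of the DBPedia rule above can be mirrored in plain Python. This is only a sketch of the same regular expression and URI construction; the function names and the sample resource URI are invented, and it does not talk to any store or endpoint.

```python
import re

# Same pattern as in the script: last path segment, without an optional fragment
WIKI_ID = re.compile(r"^.*/([^/#]+)(#.*)?$")


def dbpedia_uri(wikipedia_url):
    """Derive a DBPedia resource URI from a wikipedia URL, or None if it doesn't fit."""
    match = WIKI_ID.match(wikipedia_url)
    if match is None:
        return None
    return "http://dbpedia.org/resource/" + match.group(1)


def same_as_triples(rows):
    """Build owl:sameAs statements from (resource URI, link) rows."""
    triples = []
    for res, link in rows:
        if "wikipedia.org/wiki" not in link:
            continue  # mirrors the REGEX filter in the SELECT query
        target = dbpedia_uri(link)
        if target:
            triples.append(f"<{res}> owl:sameAs <{target}> .")
    return triples


if __name__ == "__main__":
    # The resource URI below is a made-up example, not a real Semantic CrunchBase URI.
    rows = [
        ("http://example.org/company/techcrunch",
         "http://en.wikipedia.org/wiki/TechCrunch#History"),
        ("http://example.org/company/techcrunch", "http://www.techcrunch.com/"),
    ]
    for triple in same_as_triples(rows):
        print(triple)
```

The resulting statements correspond to what the script collects in its triples buffer before the final INSERT.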
Posted on 2008-09-12 at 12:00 UTC
dooit - a live Getting Real experiment
I created an RDF app following the Getting Real approach
I've probably read Getting Real half a dozen times since the release of the free online version last year. The agile process seems to fit quite nicely with RDF-based tools (Semantic CrunchBase was the most recent proof of concept for me). I'm currently writing a DevX article about using RDF and SPARQL in combination with Getting Real and wanted some actual numbers for such an approach. As I usually don't record hours for personal projects, I had to create a new one: sillily named "dooit", a to-do list manager.
dooit follows a lot of GR suggestions such as "UI first", not wasting too much time on a name, accepting that less may be enough for 80% of the use cases, or that usage patterns may evolve as "just-as-good" replacements of features ("mm-dd" tags could for example enable calendar-like functionality).
I started the live experiment on Friday and finished the first iteration on Saturday. Below is a twitter log of the individual activities. I was using Trice as a Web framework, otherwise I would of course have spent much more time on generating forms and implementing AJAX handlers etc. So, the numbers only reflect the project-specific effort, but that's what I was interested in.
- (Fr 08:24) trying the "Getting Real" approach for a small RDF app
- (Fr 10:51) idea: a siiimple to-do list with taggable items
- (Fr 11:02) nailing down initial feature set: ~15mins: add, edit, tick off taggable to-do items
- (Fr 11:02) finding a silly product name: ~5mins: "dooit"
- (Fr 11:27) creating paper sketches: ~20mins (IIRC, done yesterday evening)
- (Fr 11:42) got unreal by first spending ~30mins on a logo
- (Fr 12:07) Setting up blank Trice instance and basic layout to help with HTML creation: ~25mins
- (Fr 13:52) first dooit HTML mock-up and CSS stylesheet: ~90mins
- (Fr 17:14) JavaScript/AJAX hooks for editing in place, forms work, too, but w/o data access on the server: ~3h
- (Fr 18:12) identifying RDF terms for the data structures: ~30min
- (Fr 18:13) gotta run. time spent so far for creating RDF from a submitted form: 20mins
- (Sa 14:40) continuing Getting Real live experiment
- (Sa 14:41) "URIs everywhere" is one of the main issues for agile development of rdf-based apps. Will try to auto-gen them directly from the forms..
- (Sa 19:04) rdf infrastructure work to auto-generate RDF from forms and to auto-fill forms from RDF: ~2h
- (Sa 19:07) functions to send form data to RDF store via SPARQL DELETE/INSERT calls: ~1h
- (Sa 19:09) replacing mockup template sections with SPARQL-generated snippets: ~1h (CRUD and filter-by-tag now in place, just ticking off items doesn't work yet)
- (Sa 20:09) implementing rest of initial feature set, tests, fine-tuning: ~1 h. done :)
- (Sa 20:14) Result of Getting Real experiment: http://semsol.org/dooit Got Real in ~12 hours
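As a quick sanity check, the per-activity estimates from the log above can be added up. The snippet below only uses the durations stated in the tweets (hours converted to minutes); the variable names are mine.

```python
# Durations in minutes, in the order they appear in the log
FRIDAY = [
    15,   # initial feature set
    5,    # product name
    20,   # paper sketches
    30,   # logo
    25,   # blank Trice instance and basic layout
    90,   # HTML mock-up and CSS
    180,  # JavaScript/AJAX hooks
    30,   # identifying RDF terms
    20,   # creating RDF from a submitted form
]
SATURDAY = [
    120,  # RDF <-> forms infrastructure
    60,   # SPARQL DELETE/INSERT functions
    60,   # SPARQL-generated template snippets
    60,   # rest of feature set, tests, fine-tuning
]

total_minutes = sum(FRIDAY) + sum(SATURDAY)
total_hours = total_minutes / 60

if __name__ == "__main__":
    print(f"{total_minutes} minutes, about {total_hours:.1f} hours")
```

The individual estimates add up to 715 minutes, which is a little under twelve hours of project-specific work.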
Posted on 2008-09-08 at 10:40 UTC
CrunchBase Interview
I've been interviewed by the CrunchBase team.
Semantic CrunchBase seems to be worth the time I'm putting into it. Thanks to TechCrunch's and CrunchBase's great move to open their data and encourage reuse (and writing about the apps that use their API), I've had the chance to do a couple of SemWeb demos and reach out to the audience that could benefit as much (or maybe even more) from RDF & Co. as the groups we already have on board: Web app developers.
I also got an offer to write some related articles for DevX, and the CrunchBase team just published an interview where I (shamelessly) promote SemWeb development. I am already noticing an increased number of mails asking for RDF introductions, and people are even starting to just figure things out on their own, with friendly SPARQL paving the path.
This might be the right time for a SWEO II (with a focus on the "E") or a similar effort driven by the RDF community.
Posted on 2008-08-27 at 09:30 UTC
How to use CBbot
Some simple instructions for the CrunchBase Twitter bot
OK, looks like the CrunchBase bot got some attention after Fred Wilson's post about possibly handy phone apps. So, if you discovered Semantic CrunchBase or the related bot via non-SemWebby paths, the whole "Define your own API commands with SPARQL" is probably a bit too much, tech-wise. Here are some short instructions for using the bot:
The syntax is basically just "@cbbot, command" where command has to match one of the user-defined commands. Some of them are generating HTML and might therefore not be suited for Twitter access. The main, Twitter-optimized commands are:
- ${role} (of|at) ${company}: This command is for requests like "@cbbot, Intern at TechCrunch", "@cbbot, founder of facebook", "@cbbot, board of Pandora", "@cbbot, ceo of twitter", etc. (I made the command case-insensitive today, BTW)
- link to ${keyword}: This command returns a CrunchBase link for the given keyword. The latter can be a company (name or CB identifier), product (name or CB identifier), or person (CB identifier). Examples: "@cbbot, Link to foodzie", "@cbbot, link to EC2", "@cbbot, link to michael-arrington".
Would you like to see additional twitter commands, but don't know about SPARQL or how to use the command editor? Please send command requests to me or to the bot and I'll try to add them.
Posted on 2008-08-25 at 15:30 UTC
CrunchBase Twitter Bot
Semantic CrunchBase features a bot that replies to "Pimp My API" commands via twitter
Update (2008-08-25): I've written a follow-up post explaining the main commands.
Heh, as John Crow points out, a cool way to look at work is to think in terms of being "between vacations". So, here is a fun/experimental hack from between my previous (bday on Santorini) and next (family visit at Lake Constance) vacation: It's a tweetsphere cousin of SPARQLBot who can give answers to user-defined CrunchBase API commands. (Instructions for the "Pimp My API" tool.)
In order to use the bot, just send a known command call to cbbot (using the @-convention), for example "@cbbot, founder of Flickr". A tweet with the answer should appear on your "Replies" tab (or under "Recent", if you are following cbbot, see screenshot below).
The bot is implemented as a long-running PHP process (*cough*), you may have to re-start the script in case you don't get an answer within a few minutes.
Posted on 2008-08-20 at 13:30 UTC
A Faceted Browser for ARC
One of the first Trice components is probably going to be a faceted browser for ARC
I'm going on vacation in a couple of days, and before that, I'm trying to tick off at least a few of the bigger items on my ToDo list. I was hoping for a first Trice preview (now that ARC is slowly getting stable), but this will have to wait until September. However, I managed to get another component that's been on my list for ages into a demo-able state today: A SPARQL/ARC-based faceted browser (test installation at Semantic CrunchBase).

It's an early, but working (I think ;) version. A template mechanism for the item previews is still missing, but I'm already quite happy with the facet column. The facets are auto-generated (based on statistical info and scope-detection), but it's also possible to define custom filters (for more complicated graph patterns, see screenshot below). Once again, SPARQLScript simplified development, thanks to its placeholders for parameterized queries.

I think I'm going to use the browser for a first Trice bundle. It's not too sophisticated, but builds on several core features such as request dispatching, RDF/SPARQL-based views and forms, basic AJAX calls, and cached template sections.
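The post doesn't go into how the facets are derived, so the following Python sketch is only one plausible reading of "based on statistical info": count, per property, how many distinct values it has and how many items use it, and keep the properties that are both common and low in distinct values. The thresholds, the function name, and the sample data are assumptions, and scope detection is not covered at all.

```python
from collections import defaultdict


def suggest_facets(triples, max_values=10, min_coverage=0.5):
    """Pick properties that look like useful facets from (s, p, o) triples."""
    values = defaultdict(set)         # property -> distinct values
    subjects_with = defaultdict(set)  # property -> subjects that use it
    all_subjects = set()
    for s, p, o in triples:
        all_subjects.add(s)
        values[p].add(o)
        subjects_with[p].add(s)
    if not all_subjects:
        return {}
    facets = {}
    for p, distinct in values.items():
        coverage = len(subjects_with[p]) / len(all_subjects)
        # a facet should split the result set into a handful of groups
        if len(distinct) <= max_values and coverage >= min_coverage:
            facets[p] = sorted(distinct)
    return facets


if __name__ == "__main__":
    # Invented sample data; the property names are illustrative only.
    data = [
        ("c1", "cb:name", "Alpha"), ("c1", "cb:category_code", "web"),
        ("c2", "cb:name", "Beta"), ("c2", "cb:category_code", "mobile"),
        ("c3", "cb:name", "Gamma"), ("c3", "cb:category_code", "web"),
    ]
    print(suggest_facets(data, max_values=2))
```

In this toy data set the names are unique per item and therefore useless as a facet, while the category property groups the items and gets selected.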

Posted on 2008-08-11 at 16:45 UTC
ARC Triples Visualizer Plugin
Luis Paulo implemented a graphviz access plugin for ARC
Luis Paulo created a graphviz plugin for ARC that generates .dot files and also SVG or bitmap graphics from triple sets (available features depend on the graphviz libraries installed on your machine). Thanks, Luis, great stuff!
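To show what the .dot output of such a tool can look like, here is a small Python sketch that serializes a triple set into the Graphviz DOT language. It is not Luis' plugin (which is written for ARC); the function name and the sample triple are made up.

```python
def quote(value):
    """Quote a string as a DOT identifier."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def triples_to_dot(triples, graph_name="triples"):
    """Turn (subject, predicate, object) tuples into a directed DOT graph."""
    lines = [f"digraph {graph_name} {{"]
    for s, p, o in triples:
        # one labelled edge per triple; nodes are created implicitly
        lines.append(f"  {quote(s)} -> {quote(o)} [label={quote(p)}];")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(triples_to_dot([("ex:paggr", "ex:maker", "ex:semsol")]))
```

With Graphviz installed, the resulting file can be rendered with something like `dot -Tsvg graph.dot -o graph.svg`.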
Posted on 2008-08-05 at 10:50 UTC
Pimp My (CrunchBase) API
Define your own CrunchBase API commands with SPARQL
In the Semantic CrunchBase announcement, we saw how SPARQL can be used to retrieve fine-grained information from the CrunchBase graph. This follow-up post explains "Pimp My API", a browser-based tool for creating tailored API calls by combining SPARQL with input parameters and output templating. The command editor consists of three tabs: "Define it", "Test it", and "Activate it".
In the 1st field ("Command") you define a (human-readable) command, with input parameters set via the ${parameter_name} notation. In the screenshot on the left, we created "${role} of ${comp_name}" which we are going to use to retrieve persons with a specific role at a given company. The command processor will automatically assign variables for a matching input string, e.g. "Editor of TechCrunch" will set the variable ${role} to "Editor", and ${comp_name} to "TechCrunch".
Now on to the 2nd field ("SPARQLScript code"):
SPARQLScript is an experiment to extend SPARQL with scripting language features such as variable assignments, loops, etc. (think Pipes for SemWeb developers). If you are familiar with SPARQL, you will notice only three differences to a standard SPARQL query: In the first line, we are setting a target SPARQL service for the following script blocks. In the second line, we assign the results from the SELECT query to a variable, and the third difference is the use of placeholders in the query. These placeholders will be filled from matching variables before the query is sent to the target endpoint.
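The two mechanisms described so far, matching an input string against a command pattern and filling placeholders in a query, can be sketched in a few lines of Python. This is not how the Pimp My API processor is implemented; the function names are invented, case-insensitive matching is an assumption, and the sketch does no escaping of the inserted values (a real tool would have to).

```python
import re

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def compile_command(pattern):
    """Turn a pattern like "${role} of ${comp_name}" into a regular expression."""
    parts, pos = [], 0
    for m in PLACEHOLDER.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))  # literal text
        parts.append(f"(?P<{m.group(1)}>.+?)")           # named parameter
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


def match_command(pattern, text):
    """Return the variable assignments for a matching input, or None."""
    m = compile_command(pattern).match(text.strip())
    return m.groupdict() if m else None


def fill_placeholders(template, variables):
    """Replace ${name} with its value; unknown placeholders are left untouched."""
    return PLACEHOLDER.sub(
        lambda m: str(variables.get(m.group(1), m.group(0))), template
    )


if __name__ == "__main__":
    variables = match_command("${role} of ${comp_name}", "Editor of TechCrunch")
    print(variables)
    print(fill_placeholders('?comp cb:name "${comp_name}" .', variables))
```

For the input "Editor of TechCrunch", the sketch assigns "Editor" to role and "TechCrunch" to comp_name, which is the behaviour described for the command field.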
If you don't know SPARQL at all, here is a pseudo-translation of the query: Find resources (?comp) with a cb:name (cb is the CrunchBase namespace used for CB attributes) that equals the input parameter "comp_name", and a relationship (?rel). The relationship should have an attribute ?role which regex-matches the input parameter "role". The relationship should also have a cb:person attribute (linking to ?person). The ?person node should have the cb:first_name and cb:last_name attributes. Those should be returned by the query as "fname" and "lname" variables. The whole result set is then assigned to a variable named "rows" (Hmm, maybe the SPARQL is easier to read than my explanation ;)
The third form field lets us define an output template. Each stand-lone section surrounded by quotation marks will fill the output buffer. Thus, looping through the "rows" will create a small name snippet for each row. Again, placeholders will be filled with values from the current script scope.
Using the Test form, we can see if our command pattern works, and if the result is formatted as desired. Should anything go wrong, we can select "Show raw output" to get some debugging information. Please note, even though we are using a browser, simple HTML forms, and a friendly pattern language, the commands are sent to real Web services. A broken script usually just hurts your local machine. A distributed Semantic Web processor like this, however, may harm other people's servers, so we should be careful, start small, and improve our script incrementally. In this case, the output result is a little ugly, so we could improve the output template and inject commas:


Instead of the sort-of natural language command, the API expects GET or POST arguments.
The example above generates a plain text result, but it's also possible to return markup or other formats. SPARQLScript can access GETvariables via

Step 1: Define a new API command
In the 1st field ("Command") you define a (human-readable) command, with input parameters set via the ${parameter_name} notation. In the screenshot on the left, we created "${role} of ${comp_name}" which we are going to use to retrieve persons with a specific role at a given company. The command processor will automatically assign variables for a matching input string, e.g. "Editor of TechCrunch" will set the variable ${role} to "Editor", and ${comp_name} to "TechCrunch".
Now on to the 2nd field ("SPARQLScript code"):
SPARQLScript is an experiment to extend SPARQL with scripting language features such as variable assignments, loops, etc. (think Pipes for SemWeb developers). If you are familiar with SPARQL, you will notice only three differences from a standard SPARQL query: In the first line, we are setting a target SPARQL service for the following script blocks. In the second line, we assign the results from the SELECT query to a variable, and the third difference is the use of placeholders in the query. These placeholders will be filled from matching variables before the query is sent to the target endpoint.
If you don't know SPARQL at all, here is a pseudo-translation of the query: Find resources (?comp) with a cb:name (cb is the CrunchBase namespace used for CB attributes) that equals the input parameter "comp_name", and a relationship (?rel). The relationship should have an attribute ?role which regex-matches the input parameter "role". The relationship should also have a cb:person attribute (linking to ?person). The ?person node should have the cb:first_name and cb:last_name attributes. Those should be returned by the query as "fname" and "lname" variables. The whole result set is then assigned to a variable named "rows" (Hmm, maybe the SPARQL is easier to read than my explanation ;)
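Since the original script is only visible in the screenshot, here is a rough Python sketch of the placeholder step, with the query rebuilt from the description above. Several things in it are assumptions on my side and not taken from the real script: the `cb:` prefix URI, the property names `cb:relationship` and `cb:title`, the helper names, and the quote escaping (I don't know whether the SPARQLScript processor escapes values this way).

```python
import re

# Query rebuilt from the prose description. ASSUMPTIONS: the prefix URI and
# the cb:relationship / cb:title property names are placeholders, not the
# real CrunchBase vocabulary terms.
QUERY_TEMPLATE = """
PREFIX cb: <http://example.org/crunchbase/ns#>
SELECT ?fname ?lname WHERE {
  ?comp cb:name "${comp_name}" ;
        cb:relationship ?rel .
  ?rel cb:title ?role ;
       cb:person ?person .
  ?person cb:first_name ?fname ;
          cb:last_name ?lname .
  FILTER regex(?role, "${role}", "i")
}
"""


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for use inside a quoted literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def fill_placeholders(template: str, variables: dict) -> str:
    """Replace each ${name} with its escaped value; unknown names stay as-is."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)
        return escape_literal(str(variables[name]))

    return re.sub(r"\$\{(\w+)\}", substitute, template)


if __name__ == "__main__":
    print(fill_placeholders(QUERY_TEMPLATE,
                            {"role": "Editor", "comp_name": "TechCrunch"}))
```

The filled-in string is what would be sent to the target endpoint; the result set would then be assigned to "rows".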
The third form field lets us define an output template. Each stand-alone section surrounded by quotation marks will fill the output buffer. Thus, looping through the "rows" will create a small name snippet for each row. Again, placeholders will be filled with values from the current script scope.
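The looping-and-buffering idea can be sketched in a few lines of Python. This is only an illustration of the concept, not SPARQLScript syntax; the function names, the `separator` argument, and the two sample rows are made up for the example.

```python
import re


def render(template: str, scope: dict) -> str:
    """Fill ${name} placeholders from the given scope; unknown names stay as-is."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: str(scope.get(m.group(1), m.group(0))),
        template,
    )


def render_rows(rows, row_template: str, separator: str = "") -> str:
    """Render one snippet per row and collect the snippets in an output buffer."""
    buffer = []
    for row in rows:
        buffer.append(render(row_template, row))
    return separator.join(buffer)


if __name__ == "__main__":
    # Placeholder data, not real query results.
    rows = [
        {"fname": "Jane", "lname": "Doe"},
        {"fname": "John", "lname": "Roe"},
    ]
    print(render_rows(rows, "${fname} ${lname}", ", "))
```

Each row acts as its own little scope here, which mirrors the way the template placeholders are resolved against the current script scope.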
Step 2: Test your new Command
Using the Test form, we can see if our command pattern works, and if the result is formatted as desired. Should anything go wrong, we can select "Show raw output" to get some debugging information. Please note, even though we are using a browser, simple HTML forms, and a friendly pattern language, the commands are sent to real Web services. A broken script usually just hurts your local machine. A distributed Semantic Web processor like this, however, may harm other people's servers, so we should be careful, start small, and improve our script incrementally. In this case, the output result is a little ugly, so we could improve the output template and inject commas:
Step 3: HTTP access activation
Our command is now defined and successfully tested, so let's turn it into a public API call.
Instead of the sort-of natural language command, the API expects GET or POST arguments.
The example above generates a plain text result, but it's also possible to return markup or other formats. SPARQLScript can access GET variables via ${GET.var_name}. This feature can be used to create different output, depending on e.g. a "format" parameter. I'm also working on support for content negotiation, where you'd simply create a "${rows}" template and the SPARQLScript processor would auto-generate an appropriate serialization including correct HTTP headers.
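To give an idea of what a "format" switch could look like, here is a small Python sketch. It is not how the SPARQLScript processor works internally; the function name, the supported format values, and the sample rows are assumptions for illustration only.

```python
import json
from html import escape


def format_rows(rows, get_params: dict):
    """Return (content_type, body), chosen by a hypothetical 'format' GET parameter."""
    fmt = get_params.get("format", "text")
    names = [f"{row['fname']} {row['lname']}" for row in rows]
    if fmt == "json":
        return "application/json", json.dumps(rows)
    if fmt == "html":
        items = "".join(f"<li>{escape(name)}</li>" for name in names)
        return "text/html", f"<ul>{items}</ul>"
    # Fall back to plain text for missing or unknown format values.
    return "text/plain", ", ".join(names)


if __name__ == "__main__":
    # Placeholder data, not real query results.
    rows = [
        {"fname": "Jane", "lname": "Doe"},
        {"fname": "John", "lname": "Roe"},
    ]
    print(format_rows(rows, {"format": "html"}))
```

Returning the content type together with the body keeps the HTTP header decision next to the serialization, which is roughly what content negotiation would automate.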
Step 4: Have some fun
You may wonder why the command editor allows the definition of a human-friendly pattern, when the API itself just needs the parameters. The patterns allow the implementation of an API call detector, i.e. depending on the input stream at a generic service URL, we can auto-detect the right script to run. I've test-implemented a Twitter bot that can reply to messages that match a stored API command on Semantic CrunchBase (Inactive during the weekend, it's not tested enough. Stay tuned ;). Here is a teaser screenshot for next week:
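The detector idea itself is easy to sketch. The Python below is a minimal illustration, not the actual bot or command processor: the script ids, the second "competitors of" command, and the function names are hypothetical, and the first-match-wins ordering is just one possible design.

```python
import re
from typing import Optional, Tuple


def to_regex(pattern: str) -> "re.Pattern":
    """Turn '${role} of ${comp_name}' into a regex with one named group per parameter."""
    # re.split with a capture group alternates: literal, name, literal, name, ...
    parts = re.split(r"\$\{(\w+)\}", pattern)
    body = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+?)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{body}$", re.IGNORECASE)


# Stored commands, most specific first, because the first match wins.
# Both script ids and the "competitors" command are hypothetical.
COMMANDS = {
    "competitors of ${comp_name}": "competitors",
    "${role} of ${comp_name}": "people_by_role",
}


def detect(text: str) -> Optional[Tuple[str, dict]]:
    """Return (script_id, variables) for the first matching command, else None."""
    for pattern, script_id in COMMANDS.items():
        match = to_regex(pattern).match(text.strip())
        if match:
            return script_id, match.groupdict()
    return None


if __name__ == "__main__":
    print(detect("Editor of TechCrunch"))
```

Ordering matters in this sketch: "competitors of Twitter" would also fit the generic "${role} of ${comp_name}" pattern, so the more specific command has to be checked first.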
Posted on 2008-08-01 at 16:20 UTC

