finally a bnode with a uri

Posts tagged with: swo

SemSol Site launched

Basic description and some screenshots.
I've put up a little preview site for the SemSol framework today. Not much yet, just a basic description and some screenshots. A first public release is planned for Q1/2007, I'd like to test it with some other projects first.

Speaking of projects: you may have noticed that my 10/2006 re-re-launch version of the site has been removed (if you noticed the relaunch at all); DERI is going to put more internal resources into the portal from now on. Although it would have been a great stress-test project, I have to admit that using already mature tools like WordPress, MediaWiki, and Drupal reduces risks on their side and also frees a lot of resources here. The new site is going to get a conceptual change, but I'll try to make the already aggregated and manually created data available via some other community project. The new site I plan to test SemSol with is going to be a Semantic Social Networking Service which will also provide some editorial content (my personal little non-W3C SWEO initiative). The SNS will of course be RDF-based, so there is still a back-integration option for the portal, should it get a SPARQL upgrade at some later stage.

Sparqlets and more updates: new editors and sparqlets.
OK, this is hour 32 without sleep, so I probably can't write much. But blogging still seems to work as opposed to coding...

There are some new components on the eternal beta site. (Probably not only) the DERIans couldn't really get used to my generic tools, so I went back to the drawing board and more or less rewrote the whole thing (fewer URIs and QNames, more "classical web site" structure). The new browsers are still based on a shared standard class, but I divided the "resources" section into 5 customized pages (events, projects, documents, people, software) and developed a set of resource-specific SPARQL queries to generate better-tailored views. The filter forms are more or less fixed, but I already had a lot of fun with them (although no one seems to owe me beer..).

It took me some time to work around several issues: not only the bnode reference problem I mentioned, but I also needed a better index structure for the RDF store and basic inferencing capabilities in my query engine. I'm currently trying an experimental mechanism to pass OWLchestra-generated class trees to the SPARQL2SQL rewriter, which then generates queries that allow me to retrieve conf:Workshops even when I only specified a type of ical:Vevent. It seems to work; I'm not too sure about scalability yet, but I didn't like the idea of rdf:type forward-chaining either, so I'll stick to this approach for the moment.
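
The idea can be sketched in a few lines: instead of materializing inferred rdf:type triples, the rewriter expands a type constraint to the subclass closure of the requested class before generating SQL. This is only an illustrative sketch in Python (the actual code is PHP, and the OWLchestra/SPARQL2SQL names below are placeholders, not the real API):

```python
# Rewrite-time subclass expansion (illustrative, not the actual
# OWLchestra / SPARQL2SQL code).

SUBCLASSES = {  # class tree as exported by the ontology editor
    "ical:Vevent": ["conf:Workshop", "conf:Conference"],
    "conf:Workshop": [],
    "conf:Conference": [],
}

def subclass_closure(cls, tree):
    """Return cls plus all of its (transitive) subclasses."""
    result = {cls}
    for sub in tree.get(cls, []):
        result |= subclass_closure(sub, tree)
    return result

def type_condition(cls, tree):
    """Build the SQL condition for an rdf:type constraint."""
    classes = sorted(subclass_closure(cls, tree))
    placeholders = ", ".join("'%s'" % c for c in classes)
    return "o.value IN (%s)" % placeholders

print(type_condition("ical:Vevent", SUBCLASSES))
# the generated condition matches conf:Workshop rows even though
# only ical:Vevent was asked for
```

The trade-off against forward-chaining is that nothing extra is stored, at the cost of (possibly long) IN lists in the generated SQL.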

There's a lot new (and hopefully more usable) stuff in the front-end. I've parameterized several of the framework's interface components and can now re-use what I call "Sparqlets": interactive page items that are based on SPARQL queries and JavaScript glue code. A "quick search" sparqlet in the upper right corner of the front page provides a suggest-as-you-type feature and allows you to easily search the whole RDF store. More sophisticated interaction possibilities are provided by the browsers, where you can now decide if a related resource description should be displayed in an inline sparqlet or as a separate summary. I've also integrated my unsmusher/resmusher in resource context menus, so that it's now possible to re-consolidate wrongly merged resource descriptions with a single mouse click, and without having to stop browsing.
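
A "quick search" sparqlet of this kind essentially boils down to building a label-matching SPARQL query from the typed-in characters. A hypothetical sketch (the real sparqlet's query and property choices aren't shown here; rdfs:label and the regex filter are my assumptions):

```python
# Hypothetical query builder behind a suggest-as-you-type sparqlet:
# match the typed-in text against resource labels, case-insensitively.

def suggest_query(term, limit=10):
    term = term.replace('"', '\\"')  # keep the literal well-formed
    return """
      SELECT DISTINCT ?res ?label WHERE {
        ?res rdfs:label ?label .
        FILTER regex(?label, "%s", "i")
      } LIMIT %d""" % (term, limit)

q = suggest_query("sem")
```

The JavaScript glue code would then fire this query on each keystroke (usually debounced) and render the bindings as a suggestion list.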

but first: sleep.

Portal updates
After struggling for three days with Unicode and server availability issues, I've now updated both the editor and the browser at the beta site, and I'm really starting to have fun with the system.

New components and features

  • Auto-Smushing
    Resources are automatically consolidated based on owl:FunctionalProperties and owl:InverseFunctionalProperties.
  • Manual Smusher
    It's possible to manually invoke the Smusher via an HTML interface and to see how resources were merged.
  • Provenance information
    The summaries now have a "sources" tab showing the different RDF files that were used to generate the selected resource view. The provenance information contains the URL of the source, the number of relevant statements, and a list of properties. Each source can be updated or removed online.
  • Un-Smusher
    As I expect there to be inconsistent data in the database sooner or later, I wanted to be able to split incorrectly merged resources. So I extended the store to not only keep track of sources, but also to save the initial identifier of subjects and objects. This makes the indexes larger but allows me to efficiently reset merged URIs and bnode IDs. The Un-smusher is linked from the RDF browser (after selecting a resource).
  • Add-URL form
    You can import remotely maintained resource descriptions by simply entering the URL of your RDF/XML file in the "Add source" form on the browser page. At the moment, the reader doesn't follow redirects, and the number of imported statements is limited.
  • Incoming relations
    The browser doesn't process owl:inverseOf information. Given that my inferencer is written in PHP, I was a little bit too shy to stress the server even more. But what I've added is an "Incoming relations" tab which lists all the resources which reference the selected individual. I found this feature to be very helpful as the browsing flow no longer stops due to dead-ends. It's especially nice for all the properties that don't have an explicit owl:inverseOf definition.
  • Browsing history
    The RDF browser maintains a clickable list of the last 7 resource summaries viewed.
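
The auto-smushing and un-smushing pair can be sketched like this: two nodes sharing the same value for an inverse functional property (e.g. foaf:mbox) get folded into one identifier, and the merge map keeps the original IDs so a wrong merge can be reversed. This is an illustrative Python sketch, not SemSol's actual (PHP) code:

```python
# Merge nodes that share a value for an inverse functional property.

IFPS = {"foaf:mbox", "foaf:homepage"}

def smush(triples):
    """Return (rewritten triples, merge map). The merge map keeps the
    replaced original IDs, which is what makes un-smushing possible."""
    canonical = {}   # IFP value -> canonical node id
    merge_map = {}   # replaced id -> canonical id
    for s, p, o in triples:
        if p in IFPS:
            if o in canonical and canonical[o] != s:
                merge_map[s] = canonical[o]
            else:
                canonical[o] = s
    rewrite = lambda node: merge_map.get(node, node)
    return [(rewrite(s), p, rewrite(o)) for s, p, o in triples], merge_map

triples = [
    ("_:a", "foaf:mbox", "mailto:alec@example.org"),
    ("_:a", "foaf:name", "Alec"),
    ("_:b", "foaf:mbox", "mailto:alec@example.org"),
    ("_:b", "foaf:weblog", "http://example.org/blog"),
]
smushed, merge_map = smush(triples)
# _:b is folded into _:a; merge_map records the replaced id
```

Un-smushing is then just replaying the stored original identifiers for the statements of a wrongly merged source.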

Updated components:

  • Person views
    The experimental "Person" view I mentioned in my last post is now activated. It tries to find a person's title and name (via either foaf:name or combinations of given/first + sur/family/last name), an image, mailboxes, weblogs, homepages, and workplace homepages. Any other properties are rendered by the "Resource" template.
  • Improved editor
    I've re-arranged the RDF editor's layout. Instead of displaying the resource description list at the top of the main column, it's now in the right column. Much much less scrolling now, and less flickering in Gecko-based browsers. Additionally, the "quick-add" and "quick-edit" tabs are no longer nested in the "overview" tab, they are in the same tab box now. The nested tab boxes didn't really work usability-wise. The property editor still uses them, I haven't found a better way yet. But with the resource list out of the way, the property editor works better, too.
I tested the site with Mozilla/Firefox/Netscape 8, IE6, and Opera 7.23 on a Windows machine. It also seems to work with Firefox on Linux and Mac. It would be great to get some additional compatibility feedback (and also to see the repository grow ;). Unfortunately, this blog doesn't have comments, so my email address is the only thing I can offer for bug reports.

When Baywatch isn't Betawatch..

Playing with RDF browsing templates during downtime.
The server was flaky again today and I couldn't access it via FTP. So I played a little bit with a local version and custom templates for the RDF browser.

I'm experimenting with a CSS-inspired approach that allows templates to partly cascade. This means that there is always a matching generic template for each resource type. Before rendering, the system walks down the class hierarchy of the given resource, looking for annotation properties pointing to a more specific template (e.g. a template for foaf:Persons). Additionally, how to render certain properties can be defined at any template level, so that a foaf:img is rendered not only when a resource is typed as a foaf:Person but every time a resource has such an image assigned. Customizing resource-specific views is not really easy, though. You can't be sure that certain assumed properties are present or that image URLs don't 404, so I ended up with lots of if...then branches and other ugly stuff. But I think templates for more constrained resource types such as RSS channels or feed items could make a lot of sense.
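
The cascade resolves roughly like CSS specificity: the most specific class with a template annotation wins, and the generic resource template is the guaranteed fallback. A minimal Python sketch (the real code is PHP; class tree, template names, and lookup direction are illustrative assumptions):

```python
# Cascading template lookup: fall back along the class hierarchy until
# a template annotation is found, so every resource type always
# resolves to at least the generic template.

SUPERCLASS = {                      # rdfs:subClassOf relations
    "foaf:Person": "foaf:Agent",
    "foaf:Agent": "rdfs:Resource",
    "rss:item": "rdfs:Resource",
}
TEMPLATES = {                       # class -> annotated template
    "rdfs:Resource": "generic.tpl",
    "foaf:Person": "person.tpl",
}

def find_template(cls):
    """Most specific template for cls, falling back to superclasses."""
    while cls is not None:
        if cls in TEMPLATES:
            return TEMPLATES[cls]
        cls = SUPERCLASS.get(cls)
    return "generic.tpl"

print(find_template("foaf:Person"))  # person.tpl
print(find_template("rss:item"))     # generic.tpl
```

Per-property render rules work the same way, which is why a foaf:img rule defined at the generic level applies to any resource carrying that property.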

mr baywatch

Finally browsing

Adding browsing functionality to OWLchestra's RDF Store
Yesterday, I finished a first version of an RDF browser on top of OWLchestra's RDF store. This means that I am finally getting close to the relaunch (at least as far as my components are concerned). I still have to implement a SPARQLing query builder and some other stuff, but with an editor and browser in place the portal's base functionality is there.

My concept for the browser included support for facet restrictions, but given that a) I'm behind schedule, b) funding ended in 12/2004, c) /me is heading towards insolvency, and d) the API to my RDF store is still very basic, I decided I'd better start with something simple. And as displaying the retrieved information is still complicated enough, I happily focused on the front-end stuff.

(Unfortunately, DERI seems to have problems with the beta version server at the moment. It was offline for 3 days and my FTP account is still gone. It might take some time before I can upload the browser, so I'm going to annoy you with my clumsy screenshots once again...)

In order to reduce the result set even without selectable facets, I implemented a low-end fulltext search on potential resource labels, plus the possibility to restrict results by resource type and by the modification time of the resource description:

rdf browser restrictions
Clicking on one of the result list's entries...
rdf browser result list
...will show a resource summary (and hide the list for reduced server load):
rdf browser summary

Depending on the type of related resources and the existence of resource descriptions, a couple of icons are displayed next to each entry. In the image above we can see that there is a summary available for Danny's weblog.

The blog summary shows an RDF import icon next to an rdfs:seeAlso link:
rdf browser summary
(I plan to offer resource description discovery for any URI but I still have to program the client to do MGETs, interpret headers, follow redirects etc., so at the moment, imports are only possible on "probably RDF/XML" links.)

Ok, what happens if I click the import link?
First, the script checks if the source is already known, otherwise it creates an entry in the system's list of RDF sources. Then, an ARC parser instance retrieves the RDF document and adds the parsed triples to the RDF store. In case the source has been crawled before, the database will only be updated if the document changed.

rdf browser import
Switching the resource filter to rss:item now and searching for "danny" returns a list of 10 current entries. Of course, I can't resist viewing the latest SPARQL photo:
rdf browser summary
Next thing to do is to create custom templates for selected resource types, e.g. a nice profile page for foaf:Persons or an aggregated view for rss:channels. The mechanism is already activated, but I needed a little break, and browsing-importing-browsing-importing was too much fun to directly continue coding.

Hm, and perhaps I should write a smusher first...

Custom auto-generated custom auto-... RDF forms

Still looking for the best approach...
After my last post, I spent some more time on the custom RDF templates, trying to generalize and automate as much of the form generation process as possible.

The first step was to create a re-usable parent class for custom templates, with methods for standard tasks such as retrieving property labels and comments, as well as automatically creating simple properties.
Additionally, the list of available templates is now created from a little configuration file where each template definition consists of three simple fields: a label, a target class, and an identifying name to later find the custom template class. The target class is used to make sure that the template can be used with resources of a certain type (or inferred sub-type) only.

Upon form creation, the custom template class is instantiated, and a "get_fld_infos" method returns an array of form field parameters that can be used to generate input fields, textareas, checkboxes, and the related labels and property comments. E.g. the following code snippet creates a simple input field:
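
A minimal sketch of what such a field definition and its rendering might look like (the actual code is PHP and isn't reproduced here; the dictionary keys and the render helper below are my own illustrative assumptions):

```python
# Hypothetical shape of a get_fld_infos() result and a renderer that
# turns one entry into a labelled HTML input field.

def get_fld_infos():
    return [
        {
            "property": "doap:name",
            "label": "Project name",
            "comment": "A name for the project.",
            "widget": "input",   # vs. "textarea", "checkbox"
        },
    ]

def render_field(fld):
    """Generate a labelled HTML input field from a field definition."""
    return '<label>%s</label><input name="%s" title="%s" />' % (
        fld["label"], fld["property"], fld["comment"])

html = render_field(get_fld_infos()[0])
```

The property comment ends up in the field's title attribute here, which is one cheap way to get the tooltip-like help texts mentioned earlier.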

When a custom form is submitted, the generic template class checks, for each form field, whether the custom class has a "custom creation" method defined; otherwise it assumes that the field is a standard field with either a literal or a URI value. (The script then does ontology look-ups to determine the value type of the property.)

For any non-simple field/property, the custom template has to provide a special method to create the triples "manually". I still had several almost identical methods that did nothing but split "one-entry-per-line" textarea values into single values. So I added an option that lets me use textareas for a list of URIs/values without having to write custom creation methods (by marking a field definition with a simple "is_list" => true).

So, the only thing that actually remained were blank nodes, i.e. cases where the template just "asks" for a single value (e.g. a doap:revision identifier), but the script has to create a bnode of a certain type, plus a property with the submitted value (e.g. doap:release -> doap:Version -> doap:revision -> value).

However, that's also a pattern I could generalize by moving the create_bnode_property method into the parent class. I still need the custom methods, but they are rather small now: only the connecting property, the class, and the properties of the bnode have to be defined.
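
The generalized pattern amounts to expanding one submitted value into three triples. An illustrative Python sketch (the actual method is PHP; the signature below is an assumption):

```python
# One submitted value expands into a typed blank node plus two
# connecting triples, e.g. doap:release -> doap:Version -> doap:revision.

import itertools

_ids = itertools.count(1)

def create_bnode_property(subject, connecting_prop, bnode_class,
                          value_prop, value):
    bnode = "_:b%d" % next(_ids)
    return [
        (subject, connecting_prop, bnode),   # link subject to the bnode
        (bnode, "rdf:type", bnode_class),    # type the bnode
        (bnode, value_prop, value),          # attach the submitted value
    ]

triples = create_bnode_property(
    "http://example.org/myproject",
    "doap:release", "doap:Version", "doap:revision", "0.9")
```

With that in the parent class, a custom method only has to name the connecting property, the bnode's class, and the value property.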

I've built a custom DOAP template today; it is generated from a single 7K PHP class. And the form still looks OK, too. Give it a try and describe your SemWeb project; the template is now available in the "Quick-add" section, after creating/selecting a foaf:Project or a doap:Project.

Pimping my RIDE, and shifting down

RDF Instance Data Editing
You all know MTV's Pimp My Ride show, where they tune a car until you almost forget its original purpose. While working on my RIDE, I sometimes got the impression that I, too, might be adding too many features and losing overall usability.

My main problem was that I had a list of requirements that couldn't be implemented with custom forms in a reasonable amount of time:
  • auto-generating tooltip-like information for each form field
  • separating the RDF editing front-end from the (OWL) model, so that changing the model doesn't force me to adjust the editor
  • offering editing forms for more than 15 different resource types and at the same time making sure that any possibly "fitting" property can be attached to a selected resource
  • being able to define validation and auto-adjust procedures at the model- and not at the forms-level.
  • keeping the editing UI scalable by offering multi-page forms, and filters on namespaces, properties, and property values

However, other requirements were also that
  • people not too familiar with all those RDF characteristics (blank nodes, striping, etc.) and the various different RDF vocabularies should also be able to create and maintain resource descriptions.
  • the UI should not get too big and should not have too many options which rather confuse than help

The latter appeared trickier to me to implement than the more technical requirements, but let's start at the beginning of the list:

auto-generating tooltip-like information

The editor needs an API that allows it to retrieve rdfs:comment annotations for a given property URI. Once such an interface to the OWL model is available (I'm using OWLchestra for this), generating the info texts is easy. Even the creation of a multilingual interface is straight-forward as long as the annotations are available in different languages.

separating the RDF editing front-end from the (OWL) model

The editing forms are created on the fly. This requires the availability of label and comment information for each property. It's also necessary (for my tool) that each property has a defined domain and range (this can be rdfs:Resource, though). Literal properties should have a datatype set, which is needed to auto-generate the different widgets (single-line text fields vs. checkboxes vs. multi-line textareas). I've tweaked my OWL editor so that custom datatypes (with possibly custom widgets) can be created. (I'm still considering a switch to Dan Connolly's new rdfIcal design, which uses timezone datatypes. If I moved to the new design, I could create a custom datatype for dtstart/dtend and a widget where users could enter date, time, and timezone.)
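
The datatype-to-widget step is essentially a lookup with a fallback. A tiny illustrative sketch (the concrete datatype/widget pairs are assumptions, not OWLchestra's actual table):

```python
# Datatype-driven widget selection for the auto-generated forms.

WIDGETS = {
    "xsd:boolean": "checkbox",
    "xsd:string": "input",        # single-line text field
    "xsd:date": "date_picker",
}

def widget_for(datatype, default="textarea"):
    """Pick a form widget for a literal property's datatype."""
    return WIDGETS.get(datatype, default)
```

A custom datatype then only needs a new entry here (plus, optionally, a custom widget implementation) to show up correctly in every generated form.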

offering editing forms for many resource types with different sets of possible properties

As the forms are generated on the fly, there is no need to adjust the editor when new resource types are added to the model. However, I'm using drop-down lists to choose a class or property, so there is a limit for the number of classes and properties the system can support before things get clumsy. The initial model at the Beta site contains about 30 classes and 200 properties (collected on a relaxed term-shopping day ;). I managed to keep the drop-downs somewhat small by defining mappings between the different vocabularies and adjusting domains and ranges where possible, but I don't think I'll be able to add many more. At least not many generic properties with a domain of rdfs:Resource.

defining validation and auto-adjust procedures

Validation functions can be defined in two ways: OWLchestra's datatype editor allows me to specify an external validation function for a datatype, and an annotation property can be used to "assign" an auto-adjust function to a property. I'm still not sure whether this use of annotation properties is considered a hack, but they are very handy for defining little hints whenever needed. They allow me to standardize form validation, but also to add custom features such as auto-converting an entered email address to a valid foaf:mbox or foaf:mbox_sha1sum.
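
The email auto-adjust is simple to sketch: FOAF's mbox values are mailto: URIs, and foaf:mbox_sha1sum is the SHA1 hash of that URI. A minimal Python version (the real auto-adjust function is PHP):

```python
# Turn a plain email address into a valid foaf:mbox URI, or into the
# SHA1 hash FOAF uses for foaf:mbox_sha1sum.

import hashlib

def to_mbox(email):
    email = email.strip()
    return email if email.startswith("mailto:") else "mailto:" + email

def to_mbox_sha1sum(email):
    return hashlib.sha1(to_mbox(email).encode("utf-8")).hexdigest()

print(to_mbox("alec@example.org"))             # mailto:alec@example.org
print(len(to_mbox_sha1sum("alec@example.org")))  # 40
```

Hashing the mailto: form (not the bare address) matters, since both variants would otherwise produce different, non-smushable hashes.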

keeping the editing UI scalable

In order to keep response times acceptable, I'm using two DHTML-generating PHP classes: A table class and a Tab box class. Using the table class, I don't have to worry too much about sorting and splitting the result sets on multiple views. The tab box class makes it possible to provide a consistent look and feel (I hope) for the lists, filters, details forms, and creation forms. The tabs can also be used to only load/refresh a certain part of the whole form, and to hide functionality that's not needed for the task at hand.

OK, now to the less-technical requirements:

"Easy editing" forms, more compact UI

I was happy when the forms started to work, but there surely is much room for improvement. Making certain fieldsets collapsible helped a little, but having to add properties one by one wasn't really cool. Although that is the only way to manage resources with many properties, and even after adding a "multi-edit" feature for properties, I'd still have preferred the foaf-a-matic over my tool for creating an initial resource description, or a one-page form with limited functionality for a small resource description.

So I went back to the drawing board and redesigned the main "overview" tab you see when you select a resource. It now shows a "quick-edit" form where you can edit up to 20 properties at once. Properties of the same type are grouped. It's not possible to adjust the language of literals, to change the publication setting, or to update blank nodes. But the form is nice and small, and each field still has rdfs:comment information where available.

Next to the "quick-edit" tab is another new tab labeled "quick-add". The quick-add tab enables the creation of semi-hard-coded *-a-matics. Selecting it will let you choose from a list of templates such as "Basic FOAF description", "A list of depictions", "Import PPD", or "Basic DOAP description". These templates are defined at the form level, but can still use labels and tooltip infos from the model. Each template field can be defined as being a "standard" field or a custom field. Standard fields will create simple triples; custom fields can be used to manually create bnodes or more complex triples (e.g. turning a simple IATA code value into a resource property contact:nearestAirport linking to an air:Airport resource with the entered air:iataCode).

So, after building this auto-generate-everything editor, I'm returning to (at least partly) custom forms. Funny. (Well, sooner or later I'm going to need the advanced features, though, the UI scalability argument still holds true.)

The templates are still experimental; some don't even work at all. But I dared to upload the whole thing anyway, and if you'd like to give it a try and send me some feedback, comments, or bug reports, I'd really appreciate that. I can only test the editor with Windows browsers, so I don't know how the app looks or works on other systems. (By the way, it's not possible to browse resource descriptions or site content yet; all you can do is sign up, play with the profile and the RDF editor, and generate RDF/XML.) If you don't want to register, you can alternatively use the account alec (with pwd tronnick). Alec is an RDF crash test dummy and will be around during the pre-beta phase.

Some thoughts on Semantic Web site interoperability

Semantic Web site interop ideas
There is a workshop on Semantic Web site interoperability somewhere in the UK (at KMi?) next week. I won't be able to attend (my mission for the next days will be to find out if David is still life-guarding the foaflets of the Caribbean ;), but DERI's SemWeb cluster will be there and asked me to provide some slides. While scribbling them, I (again) came across the issue of resource description discovery. I think I have some ideas for how to basically implement the stuff, but there are still two things that I'd really like to see addressed by the Best Practices or Data Access Working Group:
  • A standard way for site developers to point scutters and other SemWeb agents to resource descriptions (MGET support vs. response headers etc.) from a given URI
  • A standard way to tell SemWeb agents how to find a site's SPARQL endpoint / query interface (something less complicated than UDDI, please).

(I only had a quick glance at the draft of the SPARQL protocol, maybe there is already a proposal on its way for the second issue.)

Oh yes, and the hash vs. slash issue, of course ;) (I guess that's somehow related to interoperability, too...)
But first: holiday!

Relaunch project announced

The site is going to be relaunched.
Today I finally announced the upcoming relaunch. I'm using a little RDF-enhanced project documentation app I wrote during the last week. Luckily, I can use the DOAP vocabulary, as Edd Dumbill confirmed today that his vocabulary may be used with non-open-source software projects as well.
The pieces are slowly falling into place ;)

Project website:

Project weblog:

Kick-off Meeting

DERI Kick-off.
I had the kick-off meeting with Stefan Decker and Andreas Harth today, where I proposed an iterative approach for the relaunch in order to have an updated website by November 7. The schedule is very tight, with the first deadline being just a couple of weeks away.

The reason is that there may (hopefully) be some media awareness (or at least some people who'd like to find out more about the SemWeb) after this year's International Semantic Web Conference (ISWC), so we should really put up something by then (it doesn't exactly help disseminate the Semantic Web when the second- or third-best result in a Google search points to a website that hasn't been updated for 16 months ;)

Directly after that I joined a face2face meeting of the DERI Semantic Web Portal Working Group (SWP WG) which was quite interesting. The portal was one of their initial use cases, so my work is perhaps going to be mentioned in one of the SWP WG's deliverables. Nice. (Well, only if I don't screw it up...;)

Oh, and there was a social event at Galway's Bazaar Bar afterwards:

