After learning about Disqus this week, I became quite enamored of it. Drupal's comment system is swell and all, but I know that I don't have time to set it up to be as nifty as it could be. So I think it's swell if someone wants to take it off my hands for free. The downside is that if I upgrade to D7, I wouldn't get those swell forthcoming RDFa comments. But there's a Disqus API, so I can solve that later if it's important to me.
The problem is that having both Disqus and comments enabled makes things confusing. Each post has a link at the bottom with '13 comments 13 comments and 0 Reactions' or something similar. Not swell. I don't want to abandon my existing comments--in particular, my Puppet vs Chef and SPARQL posts got good feedback from their respective communities. The Puppet post is #1 on Google for 'puppet vs chef' largely because the authors of both projects commented there. Hopefully, Disqus would let people track those authors to comments on my site in the future.
So the obvious solution is to just import my comments into Disqus! I'll just download some module or converter and...and...nuts. Wordpress and Blogger have the goods, but not Drupal. So there went another afternoon: I wrote a basic importer in Ruby. Ruby is where the good Disqus library was (even if I did have to fix a bug).
Drupal comments have features that do not map cleanly to Disqus, so the importer has weaknesses. All the comments are anonymous, and I really miss markdown. But it worked for me. Hopefully it helps someone else, too.
The W3C SPARQL working group (previously the Data Access Working Group) has recently released their first versions of the updated SPARQL standards, or SPARQL 1.1. The group's roadmap has these finalized a year from now, but they have asked for comments and I suppose these are mine.
I believe that these documents are a step further down a wrong path for SPARQL and, to a lesser degree, for RDF in general.
The latest round includes a number of changes to SPARQL: aggregate functions, subqueries, projection expressions, negations, updates and deletions, more specific HTTP protocol bindings, service discovery, entailment regimes, and a RESTful protocol for managing RDF graphs (the last one is not really just SPARQL, but it's in the updates).
So I'll start with my comments, which are mostly critical.
To start, an RDF-specific complaint, not really related to the rest of the post. Why would the one mandated format to be supported in the new RESTful RDF graph management interface be RDF/XML? What would it take for the semweb community to move on from this failed standard, which has had known issues for more than 5 years? (those two issues were raised in 2001 and are currently marked 'postponed') Why should such an increasingly irrelevant standard as RDF/XML be chosen instead of the widely-supported and easy to implement N3, N-Triples, or Turtle?
As for SPARQL, the 1.1 standards continue to give named graphs first class citizen status, both in the web APIs and in more SPARQL syntax than they had before. It's not so much triples as quads these days. Other meta-metadata, such as time of assertion or validity time, are not covered. While named graphs are admittedly a particularly often-found case, why does it need to invade the syntax of SPARQL? Not every use case needs named graphs, but every SPARQL implementor must support them. The 1.1 standard now includes precedence rules for named graph and base URIs when those given in the HTTP query options conflict with those inside the query itself, attempting to solve this self-created problem.
How about subqueries? What about variables during insertions? What about subqueries during insertions? Do we really need implementors to consider these kinds of things for every SPARQL endpoint on the web?
None of these things is really all that bad by itself, but one must consider the bigger picture. SPARQL 1.0 was released in January of 2008 (with some comment period before that) and there is still no implementation of a SPARQL engine in PHP or Ruby (exceptions apply, see [1]). One does not increase the participation of that ecosystem by adding a selection of entailment regimes to the standard.
While a SPARQL implementation exists for the excellent RDFLib in Python, Python is only one of the current big 3 in web development (with Ruby and PHP), and that is its only implementation. The fact that no SPARQL engines exist for Ruby or PHP should be considered a failure of the standard. Why are we adding complexity when there is no SQLite for SPARQL? Why are there at least 3 monolithic Java implementations (Jena, Sesame, Boca), all financially sponsored to some degree or another, but so little 'in the wild'? How long can RDFLib herd 16 cats as committers on the project? While I don't have a lot of direct experience with RDFLib, I pity the project 'leads' (I cannot find evidence that the project is sponsored or that anyone is 'in charge') trying to look towards the future of implementing 6 working papers of new standards.
One of the biggest success stories for semweb in widespread use is the Drupal RDF module, which has found wide acceptance in the Drupal community and started an ecosystem of modules. Drupal 7 will output RDFa by default and Drupal 6 supports a ton of wonderful features, including reversing the RSS 1.0 to 2.0 downgrade back to RDF. But Drupal remains a producer of simple triples and a consumer of SPARQL queries generated by other endpoints. Data in those sites remains locked down. Why? Because implementing SPARQL in PHP is nontrivial, and in a chicken-egg problem, nobody's paying for it before someone has a need for SPARQL.
I could go on, but these are symptoms (well, not that RDF/XML thing, I don't think there's a good reason for that). I feel that the working group is attempting to solve the wrong problem. Namely, it is attempting to define a somewhat-human-readable query language, SPARQL, that works for almost all use cases. But why must the whole 'kitchen sink' be well-defined? Such a standards body should be attempting to define the easiest possible thing to implement and extend, not the last tool anyone would ever use.
The SPARQL 1.0 standard's grammar was well-defined as a context free grammar. It also had extension functions, which were uniquely defined by URIs. Why the distinction between CFG elements and extension functions? Why not make syntax elements like named graphs and aggregate functions as discoverable as extensions? Well, the reason is that it's hard to write a parser of a human-readable format and make those things optional and discoverable. (Here's a SPARQL parser implementation in Scala, a language with powerful pattern matching features for good parsing, and it's 500 lines of code. It compiles to S-expressions, the parsing of which is about 30 lines. Hmm.)
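To give a feel for why the S-expression side stays so small, here is a minimal sketch in Python. It is not the Scala parser mentioned above, and the query notation in the example is made up for illustration (the operator names `project`, `vars`, `filter`, `regex`, `bgp`, and `triple` are my assumptions, not anything from the SPARQL documents). It shows the whole round trip: parse, serialize, and list which operators a query relies on.

```python
import re

# Tokens: a paren, a double-quoted string, or a run of other non-space characters.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()]+')


def parse(text):
    """Parse one S-expression into nested Python lists of string atoms."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1  # consume the closing paren
            return items
        if tok == ")":
            raise SyntaxError("unexpected ')'")
        return tok

    expr = read()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after expression")
    return expr


def serialize(expr):
    """Turn nested lists back into S-expression text."""
    if isinstance(expr, list):
        return "(" + " ".join(serialize(e) for e in expr) + ")"
    return expr


def operators(expr):
    """Collect the head symbol of every list: the features a query depends on."""
    found = set()
    if isinstance(expr, list) and expr:
        if isinstance(expr[0], str):
            found.add(expr[0])
        for child in expr:
            if isinstance(child, list):
                found |= operators(child)
    return found


# Hypothetical query notation, for illustration only.
query = '(project (vars ?name) (filter (regex ?name "^A") (bgp (triple ?person foaf:name ?name))))'
tree = parse(query)
print(serialize(tree))
print(sorted(operators(tree)))
```

An endpoint that advertised its supported operators could compare that list against the output of `operators()` and refuse a query up front, with no special grammar for any of it.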
If the protocol had been defined as S-expressions, the distinction would not exist and the syntax could be as expandable as the current functions (the current syntax would just be more functions). The new 1.1 service discovery mechanism is excellent and extendible and would allow the standard to grow dynamically instead of becoming bogged down in features for particular use cases. New baseline implementations of SPARQL would be easy to implement and grow incrementally, and the current human-readable format can be implemented in terms of these expressions.
The web of ontologies has grown with ad-hoc definitions created by people to fill their needs. Standards grow organically around the ones that are needed most, others languish. Why should SPARQL functions have this kind of flexibility, but not the syntax? The distinction makes implementation overly difficult and is slowing the expansion of the Semantic Web.
In fact, it turns out that Jena has been parsing to S-expressions for some time. If you're an implementor, why would you do it any other way, especially when the standard can change as much as it does in 1.1? Any implementation will have to come up with something equivalent to S-expressions if its engine is going to be upgradable to meet standards like this when they are finalized. If people are doing it anyway, why not just make it the standard?
The SPARQL Working Group should be working on a definition for a function list and discovery protocol for S-expressions, and not for what we currently call SPARQL. What we call SPARQL is something that should compile to a simpler standard if various vendors want to implement it. S-expressions allow maximally simple parsing, maximally simple serialization, and the ability to do feature discovery on core features of the language, not just portions which are blessed with the ability to be extended. S-expressions are easier for machines to generate for a wide variety of automated use cases, far wider, I would venture, than the set of use cases for the human-readable queries.
Please, please, please do not doom the world to write the SPARQL equivalent of SQLAlchemy and ActiveRecord for the next 20 years! We can define a standard that machines can use natively. Now's the time.
At any rate, that's my beef in a nutshell. The working group won't come up with a successful standard until it's easy enough to implement it that workable implementations appear in the languages that are defining the web today. And when people can use those languages to implement that standard without an army of VC-funded engineers.
The SPARQL 1.1 proposals make the standard better than before, but it's not the standard we need. The SPARQL algebra is what needed expansion and specification, not the syntax.
[1]: The PHP ARC project has an implementation, but it attempts to directly convert SPARQL to an SQL query on a particular table layout in MySQL, and is difficult to convert to general use. Despite SPARQL's complexity, ARC managed to implement this in just 6400 lines of code. The parser alone is 2000 lines and the engine another 4400. The serialization/parsing libraries, however, are fine, and were integrated successfully into the Drupal RDF module. The PHP RAP project has also done some good work and is perhaps more wrappable than ARC, but implements only a subset of SPARQL.
I've submitted my session proposal for Drupalcon Paris 2009. After some consternation about the title, I went with 'Replacing your Internal Apps with Drupal'. I think that's where this hits home--we're replacing an entire backoffice of work-process and communications tools with Drupal, and we think it's great.
This is what we've been doing at Openband for the last few years. I've written a number of small, internal apps for a lot of small companies over the last few years and I would do them all in Drupal if I could do it all over again. It's too easy to get data in and out without buying some $20,000 piece of software to link sharepoint and your work-process system.
I gave Openband's presentation at Drupalcon DC 2009, and we got a lot of feedback that what we did was a lot more impressive than the title and session description gave away.
This talk will be a bit of a continuation of this topic. This platform is our baby, and we have a lot to say now that we've started using it internally. We're hoping to aim this at a slightly broader audience, and hopefully the session description is a bit more attractive this time. And even if you were at the session last time, we've come a long way in the last 6 months.
Anyways, go vote for the session!
scor did the hard work on putting together a presentation on RDFa for the keynote at DCDC. Just like last year, I did the legwork on the screencast while smarter, more productive people did the hard parts. It would seem that I am the voice of RDF in Drupal.
But alas, due to timeframe it was not to be. Fortunately, Boris played a version during his RDFa presentation at Drupalcon DC 2009, and scor ran two BoFs which I couldn't really participate in, having to give Openband's presentation on Thursday and being cut down by food poisoning on Friday.
Now scor did a writeup on the semweb group, and Dries noted it on his blog. I won't belabor the point when others have already worded it better by reiterating it all here, but the more exposure RDFa gets, the better! More credits for the work are in both of those links. You can view the video in the above links flashy style, and I've also put up the higher-res original here.
My food poisoning at DCDC (don't eat rare meat in DC!) abated just long enough for me to attend an excellent BoF towards the end of Friday at Drupalcon DC 2009. The goal was workable configuration management in Drupal, and in no particular order, attending were:
In terms of configuration management, having these folks in a room is a bit like the G7, but with Voltron attending.
In addition, one or two more interested parties showed up, including myself (I've tried to limit rattling off names to people who contributed to the projects listed; if I missed one, sorry). Openband is trying to stand up our stack with more than 200 modules, and configuration management is probably our biggest challenge right now. We'll be applying some resources to the problem and want to make sure that whatever solution happens is one that provides us a workable upgrade path for the future.
Probably the biggest trick about all of this is separating what's needed from the use cases. It's difficult to separate what really needs to happen from the end result, and the community is so quick to jump on any potential solution to this widespread problem that it's hard to work on something that enables solutions without talking about solutions. The short version of what needs to happen is that 'each module needs to do its own part', but there's no straightforward answer.
Going over my notes, the discussion had two main themes: requirements and implementation.
The context module does something really sweet, in that it provides a 'context' for a page load or a feature (a 'space' from the spaces module is more or less a context definition for a feature). This is an important idea, in that it maps settings to functionality and not to modules. It was generally agreed that this idea needs to be incorporated in any end solution, but perhaps not agreed that it was core's job to deal with it. Along related lines, there was also some question on whether or not core should handle ideas like dependencies, and I believe the general consensus was no.
Most everyone agreed that nobody wants to throw D6 to the wolves, and that the solution should come in the form of a hook/hooks added to D6 in contrib which can hopefully be added to D7 core. In either case, anything that builds on these hooks is a contrib space thing--this hook needs to be implemented by core, but won't be called by it. So the degree to which D6/D7 get supported after the hook's implementation can be left up to whoever has the resources and inclination to do it.
Related to roadmap, we need to be cognizant of what we're doing. It would be easy to replace a Data API by being too broad--the truth is that config and content have no inherent difference (what's a group, after all?). The Data API (or something else) should eventually take care not only of this problem, but of the content problem as well. Thus, this is not something that will live forever, nor is it something that will be perfect. It will pick some low-hanging fruit in the problem space, and do it broadly, but it's not going to be all things to all people. This is frustrating, as where to draw the line between 'config' and 'data' is something that varies from site to site and it's going to be a problem for people for whom the line is drawn just a tad bit off.
It was generally agreed that there needs to be some kind of hook, or set of hooks, that *each module* can implement. The work needs to be associated with each module, so that it only need be done once. The exact nature of that hook has some argument left; there are two basic paths, outlined below. There's also consideration for a hook_default_config hook.
One idea is that modules simply export a configuration and import it: a module can eat whatever it outputs. But this turns out to be a Godzilla task for most modules to implement. Forms API has a couple of layers of validation in a couple of places, some of which, such as what kind of data something is, or whether it's in a list of allowed values, are never implemented by a developer. Such a high-level import/export API would force modules to write specific validation for everything, and a lot of modules would need to refactor custom validation from form code into reusable code. But even that would not be enough--thanks to the magic of form_alter, modules do not even necessarily know what constitutes a valid configuration for themselves! To match existing functionality 100%, we'd have to have a hook_export_alter and hook_import_alter to allow modules to clobber each other as much as they already do. Thus, not only would each module need to implement the hook, it would have to re-alter every module it already alters. Fun stuff!
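To make the alter problem concrete, here is a rough sketch in Python rather than PHP. Nothing in it is a real Drupal API: `ConfigRegistry`, the toy `blog` and `teaser` modules, and every function name are hypothetical stand-ins I made up for the export/import hooks and their alter counterparts. The point is only that the blog module's own validation rejects a key it never defined, so the round trip works only when the altering module also undoes its change on import.

```python
import copy


class ConfigRegistry:
    """Hypothetical stand-in for the proposed hooks; not a Drupal API."""

    def __init__(self):
        self.modules = {}        # module name -> (export_fn, import_fn)
        self.export_alters = []  # callables (module_name, config) that edit config in place
        self.import_alters = []

    def register(self, name, export_fn, import_fn):
        self.modules[name] = (export_fn, import_fn)

    def export_all(self):
        snapshot = {}
        for name, (export_fn, _) in self.modules.items():
            config = copy.deepcopy(export_fn())
            for alter in self.export_alters:   # other modules get to clobber the export
                alter(name, config)
            snapshot[name] = config
        return snapshot

    def import_all(self, snapshot):
        for name, config in snapshot.items():
            config = copy.deepcopy(config)
            for alter in self.import_alters:   # ...and must undo it on the way back in
                alter(name, config)
            _, import_fn = self.modules[name]
            import_fn(config)


# A toy "blog" module that owns one setting and validates it on import.
blog_settings = {"posts_per_page": 10}


def blog_export():
    return dict(blog_settings)


def blog_import(config):
    unknown = set(config) - {"posts_per_page"}
    if unknown:
        raise ValueError(f"blog: unknown keys {sorted(unknown)}")
    value = config.get("posts_per_page")
    if not isinstance(value, int) or value < 1:
        raise ValueError("blog: posts_per_page must be a positive integer")
    blog_settings["posts_per_page"] = value


# A toy "teaser" module that bolts its own field onto the blog's config,
# the way form_alter lets one module extend another module's form.
teaser_settings = {"blog_teaser_length": 200}


def teaser_export_alter(name, config):
    if name == "blog":
        config["teaser_length"] = teaser_settings["blog_teaser_length"]


def teaser_import_alter(name, config):
    if name == "blog" and "teaser_length" in config:
        teaser_settings["blog_teaser_length"] = config.pop("teaser_length")


registry = ConfigRegistry()
registry.register("blog", blog_export, blog_import)
registry.export_alters.append(teaser_export_alter)
registry.import_alters.append(teaser_import_alter)

snapshot = registry.export_all()
registry.import_all(snapshot)   # round trip: the site eats what it output
```

Remove `teaser_import_alter` from the registry and the same import fails inside `blog_import`, which is exactly the situation where a module does not know what a valid configuration for itself looks like.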
The only API that *every* Drupal module supports is Forms API. Everyone's a bit nervous about this, since the Forms API does not technically support macros, and even if it did, it's not the best way to visualize configuration for a module; a pretty form does not automatically transfer to a data structure. But as things stand, this is the only API that is 100% compatible.
Spaces is a module that provides context for a feature, with import/export functionality. I understood that it implements its config import/export mostly on its own, without using Forms API. Associating features with a context allows them to be quickly enabled or disabled, or for modules/themes to change their behavior based on the context, and probably more. I need to play with this more.
Gravitek Labs have written Patterns, which convert snippets of YAML and XML into Form API calls. They have support for most of core, plus views and cck, and it's fairly trivial to write patterns for modules that don't support it by identifying fields in their forms.
They have also started some work on something called the Configuration Framework for D6, which appears to be a standardized way to write data for Forms API and some magic for processing it. It provides some hooks for modules to implement which are a bit like import/export, but designed to provide input to their forms. It's also got the idea that modules should be able to ship with their default config in a text file. It uses patterns as the representation of config, which means it can be XML or YAML (with more to come).
Greg Dunlap uses both methods (for some things he used drupal_execute, others not). Much of the deployment framework is solving another problem, however, content, and I think it was generally agreed that this system should not attempt to create a layer that would be used for passing around content. It's a significantly more complicated problem anyway, as Greg discovered when he added a lot of things I suspect the Data API folks will end up having to do anyway, in particular indexing content by both auto-incremented id's *and* unique identifiers.
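The double-indexing idea is easy to show in miniature. This is a sketch in Python under my own assumptions, not how the deployment framework actually does it: `ContentStore` and `deploy` are hypothetical names, and the store is just a pair of dictionaries. Each site hands out its own auto-incremented ids, while a UUID travels with the content so that the same item can be recognized on another site.

```python
import uuid


class ContentStore:
    """One site's content, indexed by a local serial id and by a global UUID."""

    def __init__(self):
        self._next_id = 1
        self._by_id = {}         # local auto-incremented id -> record
        self._id_for_uuid = {}   # UUID string -> local id

    def save(self, title, uid=None):
        """Insert a new record, or update the existing one with the same UUID."""
        uid = uid or str(uuid.uuid4())
        local_id = self._id_for_uuid.get(uid)
        if local_id is None:
            local_id = self._next_id
            self._next_id += 1
            self._id_for_uuid[uid] = local_id
        self._by_id[local_id] = {"id": local_id, "uuid": uid, "title": title}
        return local_id

    def get(self, local_id):
        return self._by_id[local_id]

    def get_by_uuid(self, uid):
        return self._by_id[self._id_for_uuid[uid]]

    def records(self):
        return list(self._by_id.values())


def deploy(source, target):
    """Push every record across, matching on UUID and ignoring local ids."""
    for record in source.records():
        target.save(record["title"], uid=record["uuid"])


staging = ContentStore()
production = ContentStore()
production.save("Old production-only post")   # takes local id 1 on production

staging_id = staging.save("About us")          # also local id 1, but on staging
shared_uuid = staging.get(staging_id)["uuid"]

deploy(staging, production)
deploy(staging, production)                    # re-running updates, never duplicates
```

The local ids collide between the two sites from the very first record, which is why matching on them alone cannot work for moving content around.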
Other issues were mentioned, including, but not limited to, the D7 variables patch, and context and how important it is (solving, for example, the global uid problem).
At any rate, we agreed that the next phase of deliverables are:
I spent some time today setting up the SCM and issue tracking for the seasteading website, having recently fooled them into thinking I'm qualified to administrate their site. This was the first time I put an existing project into git.
I was pretty disappointed with how it went. When I finally learned SCM, it was on subversion (I can't use CVS, and don't intend to learn). I've shoved a system down my developers' throats in which vendor code is saved off in vendor land, just like all vendor code is done in all subversion repos, and that works fine. I've also recently been using some git for some ruby stuff on the side, and I'm way impressed. The Github model, with public pushing and pulling, makes it so ridiculously easy to contribute back to a project it's almost easier to give a patch back than not to. I'm going to start ordering Github Kool-Aid by the case.
But today, the first thing I wanted to do was update the Drupal core, so it's time for vendor code. In subversion, I'd put both versions of Drupal (current and previous) and copy the changeset of the upgrade into my working copy:
    $ cd vendor/drupal/core
    $ svn copy 6.2 6.3
    $ cd 6.3
    $ # (unpack the Drupal 6.3 release over this copy)
    $ svn ci -m "Update drupal core to 6.3"
    Committed revision 5
    $ cd ../../../../trunk/drupal/core
    $ svn merge -c 5 .
This copies the same set of changes from the vendor upgrade to the trunk upgrade. I can even generate changesets between whatever version I want:
    $ cd trunk/drupal/core
    $ svn merge http://svn/vendor/drupal/core/6.0 http://svn/vendor/drupal/core/6.4 .
Backwards merges (merge a changeset back):
    $ cd trunk/drupal/core
    $ svn merge -c -5 .
You get the idea. I can merge any changeset, or the difference between two versions of any two files at any revision, and apply it to any set of files that can accept that changeset; the original ancestry is irrelevant.
Git's not letting me do this. The folks on IRC are helpful, and I can do what I need, but I feel it's a lot more awkward. There's no way to create a changeset from the difference between two files at different places in the repo, and there's no way to apply a change to anything but the file in which said change was originally made.
In the git model, you would start from scratch, with core drupal, and commit it. Then you'd code away. When it comes time to update drupal, you branch and apply. If you've edited core, you branch from your very first commit--from naked drupal--and commit there. Then you merge back to your master and merge that changeset back.
It works pretty well, but what if I have a project that's already halfway done? I can do it backwards, by starting from Drupal, then exploding the new project on top of it, but that feels awkward to me. What I ended up doing was creating a branch, installing the base version of the version of Drupal I wanted to upgrade, committed, exploded the most recent version of Drupal, and committed. Then I can switch back to master and apply the difference between the two commits:
    $ git checkout master
    $ git branch drupal-core
    $ git checkout drupal-core
    $ # (install the base version of Drupal here)
    $ git commit -a
    $ # (explode the most recent version of Drupal on top)
    $ git commit -a
    $ git checkout master
    $ git diff 89dca98 0f1ac4s | git apply
This is uncomfortable to me; perhaps I'll get used to it. I don't really think it's any more or fewer steps than putting things in vendor, but when integrating several pieces of vendor code, I think this would get confusing. I don't like how vendor code has to exist in the exact same directory on the trunk branch as the branch where it's unedited. Especially considering that there is an excellent copy of Drupal on Github, it drives me nuts that there's no way to do this. Why can't I tell Github 'Give me the difference between Drupal 6.1 and Drupal 6.4'? Even if I download this repo, there's no way to do it; it's a different repository and I cannot find a way to copy the information into mine (I would not want to, anyway: the Drupal repo is some 22 megs).
All of that being said, I still prefer git. Now that I've climbed this little hurdle, it's completely appropriate for the project and hopefully we can make it public (waiting to hear back from the original devs about licensing) and get some fixes and all. But I'm now skeptical of git for a project that is mainly one of integration, which is my day job.
To be fair, this doesn't have to be a problem. This is as much a problem with PHP and/or Drupal as it is with git. Rails neatly solves the problem by having vendor code in /vendor right there in the live copy of the software, rather than only in the fantasy world of SCM. Code you write is here. Code you download is over there; it's like apartheid software development. Need to edit it? Re-open the classes and monkey patch it. Compare this with PHP, in which you re-open classes after typing ?> and tossing in some css style definitions in the middle of a constructor.
Git provides very nice little submodules; they work quite well if vendor code is completely separate from user code as in rails. They're useless for the usual PHP model of throwing everything in the main directory. Things like config.inc.php. Seriously, there are still projects with this model, Drupal included. Why? Why do Drupal sites live, from index.php, in ./sites/sitename? Why is there any sharing of directory space at all? I'm continually amazed that something as genuinely useful as Drupal comes from such miserable beginnings.
I hope I'm missing something obvious, and I'll bet that 10 seconds after I post this, someone is going to tell me how to do what I want. But oh well--such embarrassment is occasionally the price of knowledge.
I made a choice to go into system administration about a year ago, after a hiatus from web development. I'm getting back into coding again in a hurry, and rebuilding my toolchain is a pretty painful process. The biggest part of that toolchain has been Drupal 6, and it's both a wonderful revelation and a hugely frustrating experience.
This will make more sense with some background. I stopped doing web development in '03, after using the Perl on Rails framework, which is exactly what we did not call the combination of Perl 5, Class::DBI, Template Toolkit, and CGI::Application those days, long before Rails was a blip on the radar. I got out of it because I saw I wasn't capable of doing bigger projects alone, not with those tools, and I supported myself on making minor updates to other applications and the occasional small-scale bookkeeping app. When I came back to trying to find some full-time work, I went for system administration, which involves more interaction with people, or at least it used to--coding is much more social than it used to be.
I lucked into a job running one of the most complicated Drupal installations out there, with some 600k lines of code across 130 modules. I work with some terribly smart people, and we're doing some very cool stuff. It took me quite some time to start getting into the Drupal stuff itself, but once I started to figure it all out, it clicked pretty well. It's a welcome kind of uncomfortable not to be the smartest person in the room.
My previous toolchain consisted of some of Perl's copious libraries, the windows ports of MySQL and Apache, cygwin, and vi. In hindsight, this was childish, pathetic, comical, the kind of toolchain that wears clown shoes. It still let me do an awful lot, compared to the scripts I was writing in '99 that amounted to little more than CGI to SQL wrappers, but it was clear to me how limited they were.
What a difference a few years makes: now I'm on a Macbook Pro with a ton of handy Macports, a native set of system tools equivalent to or better than the Linux equivalents I'm used to, and tons of useful apps, all set at the low, low price of $20-$40 apiece, because the cult of Mac is a true religion, complete with tithes. I'm finally getting practice with source control in the kind of complicated environment where it matters, and the devs around me know a ton of nifty tricks and tools. All of that stuff is a vast improvement, but the single biggest change to the toolchain is Drupal.
Before I get into what I like and don't like, there's a question about how one should see Drupal: as part of a toolchain, or as an end-user piece of software. It's designed, from the ground up, to let people start with nothing and end with a website, so perhaps considering it just a link in a chain is inappropriate, but I don't think so. No website, no matter how little code is involved, should be considered to exist in a vacuum; that's not how the web works today. Each site should be considered a node on a graph in addition to a site in and of itself, just like a library written for a project should be considered in the scope of reusability. In this respect, I'm weighing Drupal against web frameworks like Ruby on Rails or Django as opposed to other CMS's, such as that recurring villain of the Drupal graphic novel, Wordpress.
At any rate, wow oh wow, what an improvement! I really used to write menu templates that had to check for what page was currently being loaded and make that option have 'current page' css? Really? Write SQL for anything that required a join and an order or limit at the same time? Really? Write out form HTML? Hand-write javascript for the most basic form validation? Write templates at all? Drupal lets me worry about coding and not display, and that makes me actually interested in writing software again. It's an amazing tool, and as the browser becomes a platform, it's great to have so many of the basic tools in the web designer's toolkit be done better than most desktop client frameworks do desktop.
So the move to Drupal really is a groundbreaker for me, and now I'm busy with web development and writing silly blogs when I get home from busy days of the kinds of things sysadmins fill their days with. Unfortunately, I still see a lot of problems. The gains make it all very worthwhile, but some of the parts of Drupal are exceedingly frustrating when it's seen as a strong link in a toolchain and not an end-all, be-all website.
The first place this toolchain could be improved is an automated module install system. There are about 40 competing solutions, but on the whole, it's ridiculous that I can't tell my website to install the google analytics module into itself. The Drupal modules site doesn't really help a newbie find the kinds of modules that every site should have, things like pathauto and google analytics. It's a library distribution system that's technologically behind pear, ruby gems, easy install, and even behind CPAN, which is about a decade old. There's a lot of competing solutions to this problem, and hopefully one emerges as a workable base soon.
I find that some parts of the system are configured in strange places. Blocks, for example, are edited in a special area for blocks. That's appropriate. The location a block appears on, including page-by-page exceptions, is edited with the content of the block. Meanwhile, the node's own page selects menus, URL aliases, and publishing options, just inline options if the viewer is an administrator. While the ability to make a block display based on some PHP code is perfect sometimes, other times, it's more maintainable for a page to control what blocks it has, and not the other way around. Besides, it's rarely appropriate to give a non-administrator PHP rights. Breaking up the configuration of a page like this makes it harder to reuse any special configuration or code that particular page might have. I think a few instances of this kind of change might go a long way towards maintainability.
Hand in hand with maintainability is reusability. I have a lot of experience with this as a Drupal sysadmin. All of those great CCK types and views and whatnot require significant overhead to export and import properly, and keeping them in version control is even more difficult. It's a huge problem when you manage as many sites as we do, and lots of people are trying to solve it: there's a lot of good work going on in the change management group, tools like CoCKtail solve the problem by letting CCK types be a simple bit of text, and the ephemeral autopilot threatened to put the entire database under version control to deal with the problem.
That problem looks like it's being solved: good. It's a shame that so much work is being spent on the simple problem that Drupal litters the definition of a site across code, a database, and a filesystem, and that it does it in a more or less unrepeatable way. Point to a particular file in a site's files directory: is it in use, or is it some poor lost inode, left adrift from 2 upgrades ago? Best just to leave it be.
None of these problems are terribly hard to solve for D7--it's just a point of view change. And really, at the end of the day, that's a long blog post for a bunch of problems that aren't going to make me pick something else, so let's not assume anyone's about to abandon all of those useful contributed modules in order to do something drastic. Let's just keep an open mind.
We finished the video for Dries' keynote just under the wire, as pretty much all such things seem to be. Arto, Miglius and I had stayed up until past sunup for the last few days to make it happen. First Dan left, and then Miglius left on Saturday morning so that he could get stuck in Frankfurt for 24 hours. Once he got into Boston, he logged on quick like a bunny and went back at it. Arto and I worked another 30-odd hours during Saturday and Sunday. Sometime during Monday, which I largely slept through, some of the office folk sent out a message noting that our pile of pizza remains, chicken bones and coffee stains was not particularly helpful to the kitchen's ambiance. I don't think they know who did it, and I'm kind of afraid to fess up. Sorry, ladies.
Unlike most demo work, a ton of what went into this will be useful later. If our organizations were not keen on using RDF, we'd not have worked on this so hard. Arto's module stuff is anything but smoke and mirrors, and we figured out a lot of limitations to Exhibit and Potluck that will be important to understand later. These are now posted in our internal wiki and I will go and post them on the Simile project's site if I ever get a chance. It's worth a whole post in and of itself.
While Arto busied himself turning Drupal into the world's easiest to use RDF endpoint, Miglius and I combed through datasets for ones that would make for a decent demo and messed with Exhibit views. There's a lot of RDF data out there, but it doesn't all lend itself to being shown on a map, and people can only read so much on a video screen during a presentation. At the end of the day, I'm the only one with Leopard (and thus ScreenFlow), so I ended up doing the actual screencast.
Screencasting is an interesting thing. It's easier to script than a regular movie, but difficult to properly realize. There's a fine line between showing too little data and too much, and between leaving awkward pauses and skipping over things too quickly. You have to take into account that different viewers have different levels of experience with the material, different reading speeds, whatever. I made a detailed narration that was a bit too fast paced for the keynote; that wasn't a problem, as Dries had already communicated that he'd prefer to do the narration himself.
On Monday, Arto and I woke up about a half hour before the talk and got on IM. As the talk began, we realized that we really needed to have this data up where people could get it. And we really wanted them to be able to get it--we'd worked ridiculous hours on this thing. So that's when we decided the site needed to be public.
We started to make that happen. There was a fair bit of configuration to be done to make it useful; Arto got the video onto S3 while I messed about with some permissions and redirects. I typoed just about everything I did related to that--I don't think I did a single thing only once. Halfway through the whole thing I realized I had stage fright; I couldn't type because my hands were shaking. The video I had worked so hard on was about to be put up to awe or bore a sizable number of people, on whom much depends. And in my mind, there was still a possibility that Dries would use my narration, as we'd given him the final cut of the video with extremely little time to rehearse anything he wanted to say. So there I was, still in bed, with the door shut and the window blocking out what passes for sunshine in Stuttgart, and I was nervous as hell about being up in front of a crowd.
Stage frightened of nobody at all. What a cool world we live in, that such a feeling can now be transferred over the wire.
Anyways, we did a good job (well, mostly Arto did a good job) of getting the video out there for anyone who wanted it, and at least a couple of people did. Here's another copy, if you're curious: