Yesterday here at QCon I attended an informal get-together to discuss the BBC’s “tech refresh”, which turns out to mean the rebuilding of its web platform.
Apparently the budget has just been approved, which means the BBC will be going ahead with a new content platform built on Java supplemented by a lightweight PHP layer. The primary goal is flexibility. Recently the BBC went live with a new widgety home page which demonstrates its interest in personalization; ambitions include more extensive customization, more of a social platform (possibly using OpenSocial and OpenID), a platform more amenable to mash-ups, and data-only APIs.
As an aside, the BBC home page right now is a bit broken; it says “due to technical problems we are displaying a simplified version of the BBC homepage.” After yesterday’s session, I know a bit about why this is. The BBC’s current site is mostly based on Perl scripts and static pages. It’s not really a content management system. The new home page features, which I blogged about recently, are not hosted on the new platform, but are a somewhat hacky affair built on the old platform using SSI and parsing cookies with regular expressions. The new home page went live, but is currently not very reliable. It also uses more CPU, which ultimately means more servers are needed.
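To give a flavour of what “parsing cookies with regular expressions” can look like, here is a minimal Python sketch. It is not the BBC’s code (theirs is built on Perl and SSI), and the cookie name hp_modules, its pipe-separated format and the default module list are all invented for illustration.

```python
import re

# Matches one "name=value" pair inside a raw Cookie header.
COOKIE_PAIR = re.compile(r"\s*([^=;\s]+)\s*=\s*([^;]*)")


def parse_cookie_header(header):
    """Return a dict mapping cookie names to values from a raw Cookie header."""
    return {name: value.strip() for name, value in COOKIE_PAIR.findall(header)}


def homepage_modules(header, default=("news", "weather")):
    """Pick the home page modules a visitor asked for.

    Assumption: preferences live in a hypothetical cookie called
    "hp_modules" holding pipe-separated module names.
    """
    cookies = parse_cookie_header(header)
    modules = [m for m in cookies.get("hp_modules", "").split("|") if m]
    # Fall back to the default layout when the cookie is missing or empty.
    return modules or list(default)
```

Doing this on every request is part of why a personalized page costs more CPU than a static one: each hit now involves parsing and assembling the page rather than simply serving a file.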
So what is the BBC’s backup plan for when its site fails? Well, it has a “big red button” which is really designed for moments of crisis when the whole world descends on the BBC to find out breaking news – an example was the London bombings in July 2005. At such times, scalability trumps everything, so the big red button switches on a simple home page which removes non-critical features like user tracking or smart widgets. The same procedure is handy for fallback if there are technical problems.
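The big red button is essentially a global switch that trades features for scalability. A minimal sketch of the idea in Python follows; the class, the function and the feature names are hypothetical, and the session did not go into how the BBC actually implements it.

```python
from dataclasses import dataclass

# Hypothetical feature lists, loosely based on the examples mentioned above.
FULL_FEATURES = ["headlines", "user_tracking", "smart_widgets"]
CRITICAL_FEATURES = ["headlines"]


@dataclass
class BigRedButton:
    """A global switch: pressed means serve the simplified home page."""
    pressed: bool = False

    def press(self):
        self.pressed = True

    def release(self):
        self.pressed = False


def features_to_render(button):
    """Return the features the home page should include right now."""
    # When the button is pressed, everything non-critical is dropped.
    return list(CRITICAL_FEATURES if button.pressed else FULL_FEATURES)
```

The point of keeping the switch this simple is that the same mechanism serves two purposes: shedding load during a traffic spike, and falling back to something safe when the fancier page misbehaves.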
Another thing which interested me: apparently BBC pages are designed in Photoshop and handed over to HTML coders for implementation. Unfortunately this doesn’t fit well with what I would like, which is pages that reflow nicely when you resize the browser window.
The BBC is conscious of its archival responsibilities and works with the Internet Archive. One of its problems is having to keep old material online, including some driven by old Perl scripts or even in some cases C programs where the source code has been lost. It is considering the use of virtualization to host old versions of Perl for content like this.
There is a bit of Ruby on the site but this has been problematic because of memory leaks. Maybe JRuby would help.
The current/old BBC site may be built on old and unfashionable technology, but I’ve personally appreciated its great availability and performance. And the lack of ads, of course.
I’m attending a further session on the BBC news site later today, so perhaps another post later.
It’s worth bearing in mind that not all of bbc.co.uk is built this way.
Notably, news is different, as is /programmes, which is dynamically published. Nor (on /programmes and /music) do we design from Photoshop – the designers work directly with the mark-up in CSS and Photoshop (to create the page assets).
Obviously lots of bbc.co.uk is as you describe.
I’ve been working on some large content sites recently dealing with some similar issues. I’d be curious to learn a bit more about the underlying Java platform and the delivery tier being authored in PHP. It sounds a bit like they’ve implemented Alfresco as an ECM/WCM platform and then used, or are experimenting with using, the PHP APIs for delivery.
Then again, when looking at the current portal/portlet-type implementation on bbc.co.uk, it looks quite like a Liferay implementation, which, curiously enough, plays well with Alfresco as the underlying content repository, since it is JSR-168 compliant.